Air Pollution and Mortality in Seven Million Adults
Methods
The Study Cohort
In the Netherlands, population statistics are compiled by Statistics Netherlands (http://www.cbs.nl/en-GB/menu/home/default.htm) and are based on digital municipal population registers (Prins 2000). This registration system is known as the GBA (Gemeentelijke Basis Administratie), the municipal basic registration of population data. The GBA was implemented on 1 October 1994.
Statistics Netherlands combines the data from the GBA into a longitudinal file for each individual registered in the GBA (de Bruin et al. 2004). These records start on 1 January 1995. Changes in demographic attributes (e.g., death, address, marital status, emigration) are updated yearly by adding information on the nature and the date of the change. In these files, the individual identification number of the GBA is replaced by a meaningless, but unique, identification number. This identification number is used to enrich the individual files with information from other central data sources maintained by Statistics Netherlands, such as the social statistical database, which contains, among other things, data from the tax authorities and data on employment status (Arts and Hoogteijling 2002).
From the database with the longitudinal files of all Dutch inhabitants, we selected all individuals who were ≥ 30 years of age on 1 January 2004 and had lived at the same residential address since 1 January 1999. We used data on sex, age, marital status, and region of origin. The data on origin distinguish between Dutch, Western origin, and non-Western origin. Individuals of non-Western origin are those born in, or with a parent born in, Africa, Asia (except Japan and Indonesia, which are categorized as "Western origin"), or Latin America. Given the relatively large size of some groups of non-Western origin, a further distinction is made for those of Turkish, Moroccan, and Surinamese origin. Furthermore, we enriched the database with standardized disposable household income. This individual socioeconomic indicator is adjusted for differences in household size and composition.
We also used a socioeconomic indicator at four-digit postal code level. These postal code areas comprise on average about 4,000 inhabitants. This social status indicator is derived every 4 years by the Netherlands Institute for Social Research (http://www.scp.nl/english/) (Knol 1998). Each postal code area receives a unique ranking for social status according to the income level, unemployment rate, and education level of its inhabitants. The ranking is transformed to a 0–1 scale, with 1 being the lowest possible ranking on social status within the Netherlands. We took the indicator from 2002 and linked it to the cohort through the postal code of the residential addresses.
The follow-up period of the cohort was from 1 January 2004 to 1 January 2011. Subjects were lost to follow-up if their final record in the longitudinal file ended before 1 January 2011 and death was not registered as a reason for termination. Emigration was the main cause of censoring.
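The follow-up rule described above can be illustrated with a minimal sketch. The function name, its arguments, and the status labels are hypothetical and chosen for illustration only; they do not reproduce the code used for the actual analyses.

```python
from datetime import date
from typing import Optional, Tuple

FOLLOW_UP_START = date(2004, 1, 1)
FOLLOW_UP_END = date(2011, 1, 1)


def follow_up(record_end: Optional[date], died: bool) -> Tuple[int, str]:
    """Return (days at risk, status) for one subject.

    record_end: date on which the longitudinal record ended,
                or None if the record was still open (hypothetical encoding).
    died:       True if death was registered as the reason for termination.
    """
    # Records that run to the end of the study period are administratively censored.
    if record_end is None or record_end >= FOLLOW_UP_END:
        return (FOLLOW_UP_END - FOLLOW_UP_START).days, "alive at end of follow-up"
    days = (record_end - FOLLOW_UP_START).days
    if died:
        return days, "died"
    # Record ended early without a registered death (e.g., emigration).
    return days, "lost to follow-up"
```

In this sketch, a subject whose record ends on 1 January 2006 without a registered death contributes 731 days at risk and is treated as lost to follow-up.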
The Dutch population registers are intended primarily for municipal administrative purposes. However, many national and nongovernmental organizations benefit from them as well. Given the confidential character of the data, there is no free access to the population registers. Each organization interested in receiving data on a regular basis may request use of the data from the Ministry for the Interior. The Ministry decides which data the organization may access.
All our analyses were performed under strict privacy rules; that is, only researchers who had received a signed permit were allowed to perform analyses, and only within a secured environment at our Institute. Before publication, Statistics Netherlands verified that none of the analysis results could be traced back to individuals.
Mortality Outcomes
A database with mortality data was available from Statistics Netherlands (Harteloh et al. 2010). We selected nonaccidental mortality [International Classification of Diseases, 10th Revision (ICD-10) codes A00–R99], circulatory disease mortality (ICD-10 codes I00–I99), respiratory disease mortality (ICD-10 codes J00–J99), and lung cancer mortality (ICD-10 codes C33–C34). A study of cause-of-death coding showed high reliability for these specific causes (> 90% for major causes of death such as cancers and acute myocardial infarction and about 85% for respiratory disease mortality) (Harteloh et al. 2010).
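As an illustration of how the code ranges above partition causes of death, the following sketch maps a single ICD-10 code to the outcome groups it belongs to. The function name and the group labels are hypothetical, and the sketch assumes codes in the WHO three-character form (one letter followed by two digits, with an optional decimal part).

```python
def classify_cause(icd10: str) -> set:
    """Return the outcome groups that an ICD-10 underlying-cause code falls into."""
    code = icd10.strip().upper().replace(".", "")
    if len(code) < 3 or not code[0].isalpha() or not code[1:3].isdigit():
        raise ValueError(f"not a recognizable ICD-10 code: {icd10!r}")
    chapter, block = code[0], int(code[1:3])

    groups = set()
    if "A" <= chapter <= "R":                # A00-R99
        groups.add("nonaccidental")
    if chapter == "I":                       # I00-I99
        groups.add("circulatory")
    if chapter == "J":                       # J00-J99
        groups.add("respiratory")
    if chapter == "C" and block in (33, 34):  # C33-C34
        groups.add("lung cancer")
    return groups
```

Note that the groups overlap by construction: circulatory, respiratory, and lung cancer deaths are all also nonaccidental deaths, whereas external causes (e.g., codes starting with V) fall into none of the groups.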
Air Pollution Exposure Assessment
We used previously published land use regression (LUR) models to produce high-resolution air pollution maps (100 m × 100 m grids) of annual mean concentrations of PM10 and NO2 in 2001. Details of the development and validation of the LUR models are presented elsewhere (Vienneau et al. 2010). Briefly, for both pollutants, regression models were derived from annual mean concentrations for the year 2001 based on routine measurement data from the Dutch National Air Quality Monitoring Network (van Elzakker and Buijsman 2014). Predictor variables used for the modeling were traffic, land use, and topography, integrated in a geographical information system. Addresses at baseline were linked to the estimated PM10 and NO2 concentrations in the corresponding grid cell.
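The linkage of an address to its grid cell can be sketched as a simple lookup. This is an illustration under stated assumptions, not the GIS procedure actually used: it assumes address coordinates in metres in a projected coordinate system, a grid whose cell edges fall on multiples of 100 m, and a concentration surface stored as a dictionary keyed by cell index. All names are hypothetical.

```python
import math

CELL_SIZE_M = 100  # grid resolution stated in the text


def grid_cell(x_m: float, y_m: float) -> tuple:
    """Return the (column, row) index of the 100 m x 100 m cell containing a point."""
    return (math.floor(x_m / CELL_SIZE_M), math.floor(y_m / CELL_SIZE_M))


def assign_exposure(address_xy, concentration_grid):
    """Look up the modeled concentration for an address.

    address_xy:         (x, y) coordinates in metres (assumed projection).
    concentration_grid: dict mapping (column, row) -> annual mean concentration.
    Returns None when the address falls outside the modeled grid.
    """
    return concentration_grid.get(grid_cell(*address_xy))
```

For example, a point at (155250.0, 463049.9) falls in cell (1552, 4630) and receives whatever concentration the map holds for that cell.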
We did not assign PM2.5 (PM with diameter ≤ 2.5 μm) concentrations to cohort addresses, because in 2001 PM2.5 was not measured in the national monitoring network. Two monitoring studies (Cyrys et al. 2003; Eeftens et al. 2012) showed that the spatial variation of PM10 in the Netherlands is largely driven by PM2.5 (R = 0.76 and 0.72) with a median ratio between PM2.5 and PM10 of 0.66 that was stable over time from 2000 to 2009.
Statistical Analyses
Statistical analyses were performed with SAS version 9.1 (SAS Institute Inc., Cary, NC, USA). We applied age-stratified Cox proportional hazards regression models to estimate the associations [hazard ratio (HR) and 95% confidence interval (CI)] between (cause-specific) mortality and long-term exposure to PM10 or NO2. We used 1-year age strata. We analyzed the data with a) models adjusted for age and sex ("unadjusted" model); b) models adjusted for age, sex, marital status, region of origin, and household income (individual confounder model); and c) the individual confounder models extended with a socioeconomic status indicator of postal code areas (full model). In addition, we d) extended the full model with the second pollutant to analyze the robustness of the one-pollutant estimate to adjustment for the second pollutant. Statistical significance was defined as p < 0.05.
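For readers less familiar with Cox models, the step from a fitted coefficient to a reported HR can be sketched as follows. The sketch assumes a Wald-type confidence interval and a user-chosen exposure increment; neither is specified in the text above, and the function name is hypothetical.

```python
import math


def hazard_ratio(beta: float, se: float, increment: float = 1.0,
                 z: float = 1.959964) -> tuple:
    """Return (HR, lower, upper) for an `increment`-unit rise in exposure.

    beta: Cox log-hazard coefficient per unit of exposure.
    se:   standard error of beta.
    z:    normal quantile for the interval (1.959964 gives 95%; Wald interval assumed).
    """
    hr = math.exp(beta * increment)
    lower = math.exp((beta - z * se) * increment)
    upper = math.exp((beta + z * se) * increment)
    return hr, lower, upper
```

Because the model is linear on the log-hazard scale, rescaling the increment simply raises the HR and its bounds to the corresponding power.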
We explored nonlinearity in the relationships between PM10 and NO2 exposure and mortality with natural splines (2 degrees of freedom). We used the likelihood ratio test (LRT) (p < 0.05) to compare spline models with linear models. We analyzed these models with R version 2.15.1 (R Core Team 2014).
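The likelihood ratio comparison can be sketched with the standard library alone. The sketch assumes that the spline model with 2 degrees of freedom uses one more parameter than the linear model, so that the test statistic is referred to a chi-square distribution with 1 degree of freedom; this is our reading and is not stated explicitly above. The function name and arguments are hypothetical.

```python
import math


def lrt_p_value(loglik_linear: float, loglik_spline: float, df_diff: int = 1) -> float:
    """P-value of a likelihood ratio test between nested models.

    loglik_linear: maximized (partial) log-likelihood of the linear model.
    loglik_spline: maximized (partial) log-likelihood of the spline model.
    df_diff:       difference in number of parameters (only 1 or 2 supported here).
    """
    # Test statistic: twice the gain in log-likelihood, floored at zero.
    stat = max(2.0 * (loglik_spline - loglik_linear), 0.0)
    if df_diff == 1:
        # Chi-square(1) survival function: P(Z^2 > stat) = erfc(sqrt(stat / 2)).
        return math.erfc(math.sqrt(stat / 2.0))
    if df_diff == 2:
        # Chi-square(2) survival function has the closed form exp(-stat / 2).
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only supports df_diff of 1 or 2")
```

A p-value below 0.05 would indicate that the spline fits significantly better than the linear term, that is, evidence of nonlinearity.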
To assess the sensitivity of our relative risk estimates to missing individual lifestyle factor data, we examined the association between our air pollution exposure estimates and lifestyle factors in a separate survey of adults across the Netherlands. We obtained data from health surveys conducted by Community Health Services (GGD GHOR Nederland 2014) in 2003–2005. We included data from 11 Community Health Services with available information on self-reported four-digit postal code, age, sex, marital status, level of education, region of origin, smoking, body mass index (BMI), alcohol consumption, and exercise. Criteria for alcohol consumption were defined by a national working group of experts for the purpose of the Community Health Services health surveys. We calculated the age- and sex-adjusted mean PM10 and NO2 concentrations at four-digit postal code level for different categories of smoking (current smoker, former smoker, never smoker), BMI (< 18.5, 18.5–25, 25–30, > 30), alcohol consumption (different categories of compliance with three criteria for responsible alcohol use), and exercise (compliance with 30 min of moderate exercise per day on at least 5 days per week). Subsequently, we additionally adjusted for the individual and neighborhood confounders that were also included in the Cox proportional hazards regression models [i.e., marital status, region of origin, individual socioeconomic status (using level of education in place of standardized household income, which was not available), and social status]. In addition to these regression analyses, we calculated the prevalence of the different variables under study for different categories (deciles) of PM10 and NO2 exposure.
Furthermore, in the full population we additionally adjusted for area-level smoking-related mortality, estimated from observed lung cancer rates (Janssen and Spriensma 2012).
In addition, we assessed effect modification by stratifying our analyses by sex, age (30–65 or > 65 years), socioeconomic status (five categories), and degree of urbanization (five categories). Results are presented graphically.
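The prevalence-by-decile calculation can be sketched as follows. This is an unadjusted illustration only: it assumes one exposure value and one binary lifestyle indicator (e.g., current smoker yes/no) per respondent, uses the default cut-point method of Python's statistics.quantiles, and uses hypothetical names throughout.

```python
from statistics import quantiles


def prevalence_by_decile(exposure, indicator):
    """Return the prevalence of a binary indicator within each exposure decile.

    exposure:  sequence of exposure values, one per respondent.
    indicator: sequence of truthy/falsy flags, aligned with `exposure`.
    Returns a list of 10 proportions (None for an empty decile).
    """
    cuts = quantiles(exposure, n=10)       # nine decile cut points
    counts = [[0, 0] for _ in range(10)]   # [respondents, respondents with flag]
    for x, flag in zip(exposure, indicator):
        decile = sum(x > c for c in cuts)  # index 0 (lowest) to 9 (highest)
        counts[decile][0] += 1
        counts[decile][1] += int(bool(flag))
    return [hits / n if n else None for n, hits in counts]
```

A roughly flat profile across deciles would suggest that the lifestyle factor is not strongly related to exposure and is therefore unlikely to confound the mortality associations appreciably.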