Detailed Script.
In this document, we describe the data sources and provide the codebook of the variables.
Should you have any questions, please do not hesitate to contact us at leo.zabrocki@gmail.com.
To reproduce the data_sources_codebook.html document exactly, we first need to have installed:
Once everything is set up, we load the following packages:
We finally load the data:
The dataset we use in our tutorial was gathered for previous work by Tarik Benmarhnia et al. (2015).
All non-accidental deaths that occurred in the summers (June, July and August) of 1990-2007 were retrieved for the island of Montreal, Canada. The Quebec life table for Montreal for the years 2000 to 2002 was used to compute the total number of years of life lost (YLL).
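The YLL computation described above can be sketched as follows, assuming each death contributes the remaining life expectancy at the age of death and that these contributions are summed by day. The life table values, the death records, and all object and column names other than `yll` are invented for illustration; they are not taken from the Quebec life table or from the original analysis, which may also have stratified by sex.

```r
# Hypothetical life table: remaining life expectancy (years) by age at death.
# These values are made up and are NOT the Quebec life table.
life_table <- data.frame(
  age = c(60, 70, 80, 90),
  life_expectancy = c(24.0, 16.0, 9.0, 4.5)
)

# Hypothetical individual death records (one row per death)
deaths <- data.frame(
  date = as.Date(c("2000-07-01", "2000-07-01", "2000-07-02")),
  age = c(70, 80, 60)
)

# Attach to each death the remaining life expectancy at the age of death
deaths <- merge(deaths, life_table, by = "age")

# Sum the years of life lost over all deaths occurring on the same day
daily_yll <- aggregate(life_expectancy ~ date, data = deaths, FUN = sum)
names(daily_yll)[2] <- "yll"
daily_yll
```

In this toy example, the two deaths on the first day contribute 16 + 9 = 25 years, and the single death on the second day contributes 24 years.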
Daily mean outdoor temperatures (°C) and daily relative humidity (%) were obtained for the period 1981–2010 from the Environment Canada meteorological observation station at the Montreal Pierre Elliott Trudeau International Airport. We defined a heat wave day as any day with a daily maximum temperature exceeding 30°C, following the threshold that triggers the “active watch” level in the Montreal Heat Action Plan.
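As a minimal sketch of this definition, the indicator can be built by comparing the daily maximum temperature with the 30°C threshold. The toy values and the `heat_wave` column name are assumptions made for illustration; only `temperature_maximum` is a variable name that appears in the dataset.

```r
# Hypothetical daily weather records; values are made up for illustration
weather <- data.frame(
  date = as.Date("2000-07-01") + 0:4,
  temperature_maximum = c(28.4, 30.0, 31.2, 33.5, 29.9)
)

# A heat wave day is a day whose maximum temperature exceeds 30°C
# (strictly greater than the threshold, so exactly 30.0 does not count)
threshold <- 30
weather$heat_wave <- as.integer(weather$temperature_maximum > threshold)

# Number of heat wave days in the toy data
n_heat_wave_days <- sum(weather$heat_wave)
n_heat_wave_days
```

Reading "exceeding" as a strict inequality is a choice made here; a day at exactly 30°C is therefore not flagged.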
We retrieved air pollution concentrations from the National Air Pollution Surveillance network of fixed-site monitors in Montreal (https://www.ec.gc.ca/rnspa-naps/). We averaged hourly concentrations over all stations and calculated daily (and lagged) mean concentrations for ozone (O3) and nitrogen dioxide (NO2).
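The aggregation steps above might look like the following in base R. This is only a sketch under stated assumptions: the station names, the readings, and the `o3_lag_1` column name are hypothetical, and the order of operations (averaging across stations first, then across hours) is one plausible reading of the description rather than the confirmed procedure.

```r
# Hypothetical hourly O3 readings from two stations over three days
dates <- as.Date("2000-07-01") + 0:2
hourly <- data.frame(
  date = rep(dates, each = 24, times = 2),
  hour = rep(0:23, times = 6),
  station = rep(c("A", "B"), each = 72)
)
# Made-up concentrations: rise with the hour and the day, higher at station B
hourly$o3 <- 20 + hourly$hour +
  10 * as.numeric(hourly$date - dates[1]) +
  ifelse(hourly$station == "B", 10, 0)

# Step 1: average over all stations for each date and hour
by_hour <- aggregate(o3 ~ date + hour, data = hourly, FUN = mean)

# Step 2: average the hourly values to obtain a daily mean
daily <- aggregate(o3 ~ date, data = by_hour, FUN = mean)
daily <- daily[order(daily$date), ]

# Step 3: lagged concentration (previous day's mean); the first day has none
daily$o3_lag_1 <- c(NA, head(daily$o3, -1))
daily
```

The same pattern applies to NO2, and further lags can be obtained by shifting the daily series by more than one day.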
The final dataset contains 1376 daily observations for the summers of the 1990-2007 period and 23 variables. Over that period, 122 heat waves occurred. Below are summary statistics for the variables:
# Summarise the outcome, weather, and air pollution variables
data %>%
  dplyr::select(yll, temperature_average, temperature_maximum, humidity_relative:o3, no2) %>%
  # Reshape to long format: one row per variable-value pair
  pivot_longer(cols = everything(), names_to = "Variable", values_to = "value") %>%
  group_by(Variable) %>%
  summarise(
    Mean = mean(value),
    SD = sd(value),
    Min = min(value),
    Max = max(value)
  ) %>%
  # Round all summary statistics to one decimal place
  mutate(across(Mean:Max, ~ round(.x, 1))) %>%
  kable(align = c("l", "c", "c", "c", "c"))
| Variable | Mean | SD | Min | Max |
|:---|:---:|:---:|:---:|:---:|
| humidity_relative | 68.7 | 10.4 | 38.5 | 95.8 |
| no2 | 25.3 | 8.6 | 2.3 | 62.4 |
| o3 | 25.5 | 11.7 | 0.8 | 76.1 |
| temperature_average | 20.4 | 3.3 | 9.6 | 29.2 |
| temperature_maximum | 24.9 | 3.8 | 12.0 | 35.4 |
| yll | 2661.1 | 503.7 | 1075.7 | 5208.3 |
Below, we load the codebook of the data:
If you see mistakes or want to suggest changes, please create an issue on the source repository.