A. He and T. Munasinghe, "Chronic Respiratory Disease: Risk Modeling Potential and Limitations," in 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 2021 pp. 1045-1053.
doi: 10.1109/BigData52589.2021.9672074
keywords: {microorganisms;temperature;pulmonary diseases;big data;water pollution;data models;spatiotemporal phenomena}
URLs:
- https://ieeexplore.ieee.org/document/9672074
- https://doi.ieeecomputersociety.org/10.1109/BigData52589.2021.9672074
- Get original data:
- Download original GFED data from https://www.geo.vu.nl/~gwerf/GFED/GFED4/, place in folder "GFED4s"
- Construct GFED data with appropriate datasets multiplied by cell area:
- Copy all .hdf5 files from the original GFED data (from folder "GFED4s") into new folder "GFED4s_timesArea"
- Run "code/multiply_by_area_gfed4s.py"
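
The area-weighting idea can be sketched as follows. This is a minimal illustration, not the contents of "code/multiply_by_area_gfed4s.py": the function name `multiply_by_area` and the toy values are hypothetical, and it assumes a per-unit-area grid and a cell-area grid of identical shape.

```python
import numpy as np

def multiply_by_area(per_area, cell_area):
    """Convert a per-unit-area grid into per-cell totals (hypothetical helper)."""
    per_area = np.asarray(per_area, dtype=float)
    cell_area = np.asarray(cell_area, dtype=float)
    if per_area.shape != cell_area.shape:
        raise ValueError("grids must have the same shape")
    # Element-wise product: each cell's value times that cell's area.
    return per_area * cell_area

# Toy 2x2 grids with made-up values: emissions per m^2 and cell areas in m^2.
emissions_per_m2 = np.array([[1.0, 2.0], [0.5, 0.0]])
cell_area_m2 = np.array([[10.0, 10.0], [20.0, 20.0]])
totals = multiply_by_area(emissions_per_m2, cell_area_m2)
```

Working on totals per cell rather than per-area values makes it possible to sum cells when aggregating to larger regions such as counties.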
- Corresponding article: https://pubs.acs.org/doi/full/10.1021/acs.est.0c01764
- More info + instructions: https://sites.wustl.edu/acag/datasets/surface-pm2-5/
- CDC WONDER - Underlying Cause of Death - Chronic lower respiratory diseases - (saved query): https://wonder.cdc.gov/controller/saved/D76/D133F078
- Run "adjust_sup_deaths_data_by_pop.py"
- Run "read_acag_pm2-5.py", "read_gfed4s.py"
- Run "write_county_month_pm2-5.py"
- Run, in any order:
- Run "write_county_month_gfed.py"
- Run "write_county_month_clim.py"
- Run "write_county_month_median-income.py"
- To write AQI data (optional):
- Run "impute_county_month_AQI.py"
- Run "write_county_month_AQI_main.py"
If you encounter an error message saying that a directory does not exist, create that directory at the path given in the message.
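
Missing-directory errors can also be avoided up front by creating output folders before writing. A minimal sketch using only the standard library; the helper name `ensure_dir` and the folder names are hypothetical and not taken from the scripts:

```python
import os
import tempfile

def ensure_dir(path):
    """Create the directory (and any missing parents) if it does not exist."""
    os.makedirs(path, exist_ok=True)
    return path

# Demo inside a temporary folder; in practice, pass the path named in the error.
base = tempfile.mkdtemp()
out_dir = ensure_dir(os.path.join(base, "output", "county_month"))
# Calling it again is harmless because exist_ok=True suppresses the error.
ensure_dir(out_dir)
```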
- To tune/test hyperparameters and static combinations of features, run "random_forest.py"
- Set hyperparameters by editing "param_grid" variable
- Set combinations of features by editing "columns_list" variable
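
As a rough illustration of how these two variables could drive a search, the sketch below enumerates every pairing of a feature combination with a hyperparameter setting. Only the variable names "param_grid" and "columns_list" come from the instructions above; the values, the column names, and the `expand_grid` helper are hypothetical, and "random_forest.py" may organize its search differently.

```python
from itertools import product

# Hypothetical values; edit these in the script to match your experiment.
param_grid = {"n_estimators": [100, 300], "max_depth": [None, 10]}
columns_list = [["pm25"], ["pm25", "median_income"]]

def expand_grid(grid):
    """Return one dict per combination of the values in the grid."""
    keys = sorted(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]

# Every (feature combination, hyperparameter setting) pair to evaluate.
runs = [(cols, params) for cols in columns_list for params in expand_grid(param_grid)]
```

The number of runs is the product of the number of feature combinations and the number of hyperparameter settings, so both lists are worth keeping short.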
- To perform feature selection, run "random_forest_RFECV.py"
- Adjust starting features by editing "columns" variable
- Note: this script is not intended for hyperparameter tuning because of its runtime; hyperparameters ("param_grid" variable) can be set for a single iteration of RFECV
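
For orientation, recursive feature elimination with cross-validation can be sketched with scikit-learn as below. This is an assumption-laden example on synthetic data, not the contents of "random_forest_RFECV.py": the column names, the regressor settings, and the scoring choice are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV

# Synthetic data: only the first (hypothetical) column drives the target.
rng = np.random.default_rng(0)
columns = ["pm25", "gfed", "temperature", "median_income"]
X = rng.normal(size=(120, len(columns)))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=120)

# Fixed hyperparameters for this single RFECV run (no tuning here).
estimator = RandomForestRegressor(n_estimators=25, max_depth=5, random_state=0)

# Drop one feature per step and keep the subset with the best CV score.
selector = RFECV(estimator, step=1, cv=3, scoring="neg_mean_squared_error")
selector.fit(X, y)

selected = [name for name, keep in zip(columns, selector.support_) if keep]
```

Each elimination step refits the forest once per fold, which is why combining RFECV with a hyperparameter grid quickly becomes expensive.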