add green data #131

Draft: wants to merge 5 commits into base: main
1 change: 1 addition & 0 deletions .gitignore
@@ -10,3 +10,4 @@ inst/doc
/doc/
/Meta/
/.env
/data-raw/
2 changes: 1 addition & 1 deletion inst/codec_catalog/make_all_codec_dpkg.R
@@ -14,7 +14,7 @@ dpkgs <-
get_codec_dpkg("traffic-v0.1.2"),
get_codec_dpkg("drivetime-v0.2.2"),
get_codec_dpkg("environmental_justice_index-v0.1.0"),
- get_codec_dpkg("landcover-v0.1.0"),
+ get_codec_dpkg("green-v0.1.0"),
get_codec_dpkg("parcel-v0.1.1"),
get_codec_dpkg("property_code_enforcements-v0.2.0"),
get_codec_dpkg("xx_address-v0.2.0")
29 changes: 29 additions & 0 deletions inst/codec_data/green/README.md
@@ -0,0 +1,29 @@
# Green

[![latest github release for green dpkg](https://img.shields.io/github/v/release/geomarker-io/codec?sort=date&filter=green-*&display_name=tag&label=%5B%E2%98%B0%5D&labelColor=%238CB4C3&color=%23396175)](https://github.com/geomarker-io/codec/releases?q=green&expanded=false)

## About

Annual measures of greenness and built environment. The codec dpkg includes the most recently available vintage for each data product (2023 greenspace and impervious surface, 2021 tree canopy, 2024 EVI, and 2020 city tree canopy).

## Data

### greenspace

The National Landcover Database (NLCD) classifies land at a 30 m x 30 m resolution into one of [16 categories](https://www.mrlc.gov/data/type/land-cover). Here, a grid cell is considered greenspace if its NLCD land cover classification is in any category other than water, ice/snow, developed medium intensity, developed high intensity, or rock/sand/clay. Grid cells are aggregated to census tracts by calculating the percentage of overlapping "green" grid cells in each tract.
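
The tract-level aggregation described above can be sketched in base R. This is only an illustration: the `pct_green()` helper and the `cell_values` vector are hypothetical and not part of the codec package, which extracts the grid cell values with terra.

```r
# NLCD codes treated as not green: water, ice/snow, developed medium
# intensity, developed high intensity, and rock/sand/clay
not_green <- c(11, 12, 23, 24, 31)

# hypothetical helper: percentage of a tract's grid cells that are green
pct_green <- function(cell_values) {
  round(sum(!cell_values %in% not_green) / length(cell_values) * 100)
}

# made-up NLCD codes for the grid cells overlapping one tract
cell_values <- c(21, 22, 23, 24, 41, 41, 11, 81, 90, 31)
pct_green(cell_values)
```

Six of the ten made-up cells fall in a green category, so the helper reports 60 percent for this tract.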

### tree canopy and impervious surface

Percentage [tree canopy](https://www.mrlc.gov/data/type/tree-canopy) and [impervious surface](https://www.mrlc.gov/data/type/fractional-impervious-surface) coverage at 30 m x 30 m resolution were obtained from NLCD and aggregated to census tracts by calculating the mean coverage percentage of the grid cells overlapping each tract.

### EVI

The Enhanced Vegetation Index (EVI) is a measure of greenness that ranges from -0.2 to 1, with higher values corresponding to more vegetation. A cloud-free composite EVI raster at a resolution of 250 m x 250 m was created by assembling individual images collected via remote sensing between June 9 and June 24, 2024. Tract averages were calculated as the mean EVI of overlapping grid cells. The EVI raster file (`MOD13Q1.A2024161.h11v05.061.2024181211403.hdf`) was downloaded from https://search.earthdata.nasa.gov/search on 2025-03-12, scaled by 0.0001, and rounded to 3 decimal places.
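
As a small worked example of the scaling step, the sketch below averages raw EVI values for one tract, applies the 0.0001 scale factor, and rounds to three places. The `raw_evi` values are made up for illustration and do not come from the MOD13Q1 file.

```r
# hypothetical raw EVI values (stored as scaled integers) for the grid
# cells overlapping one tract
raw_evi <- c(4521, 3877, 6104)

# tract mean of the raw values, scaled by 0.0001 and rounded to 3 places
evi <- round(mean(raw_evi) * 0.0001, 3)
evi
```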

### park service area coverage and park greenspace coverage

The percentage of area covered by at least one park service area (defined as a 10-minute walk or 0.25 mile buffer around each park) and the percentage of area covered by park greenspace were calculated for each tract. Data were obtained through correspondence with Cincinnati Parks in August 2024.

### city tree canopy

Tree canopy area measured by LiDAR in 2020 was provided for 2010 census block groups. These areas were summed to 2010 census tracts, expressed as a percentage of tract area, and then interpolated to 2020 census tracts using area weighting. City tree canopy is only available for census tracts within the City of Cincinnati.
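
The idea behind area weighting can be sketched with a toy example. This is a generic area-weighted mean for a percentage variable under assumed overlap areas, not the `cincy::interpolate()` implementation; the `overlap` data frame and its values are hypothetical.

```r
# hypothetical overlaps between two 2010 source tracts and one 2020 target tract
overlap <- data.frame(
  source_tract = c("A", "B"),
  canopy_pct = c(40, 20),   # percent tree canopy in each source tract
  overlap_area = c(3, 1)    # area each source tract shares with the target
)

# area-weighted mean: each source value counts in proportion to its overlap area
target_pct <- with(overlap, sum(canopy_pct * overlap_area) / sum(overlap_area))
target_pct
```

Here source tract A covers three quarters of the target tract, so the target value (35) sits closer to A's 40 percent than to B's 20 percent.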
45 changes: 45 additions & 0 deletions inst/codec_data/green/dataverse.R
@@ -0,0 +1,45 @@
#' get_dv_url
#' get the download URL for a file hosted on a dataverse instance
#' @param persistent_id the dataset's unique, persistent identifier
#' @param filename the name of the file in the dataset
#' @param version the dataset's version
#' @param server_url the dataverse instance's URL
#' @export
get_dv_url <- function(persistent_id, filename = NULL, version = "latest", server_url = "https://dataverse.harvard.edu") {

stopifnot(is.character(persistent_id))
stopifnot(is.character(version))
stopifnot(is.character(server_url))
if (! substr(persistent_id, 1, 4) == "doi:") stop("`persistent_id` must begin with 'doi:'", call. = FALSE)
if (version == "latest") version <- ":latest"

req <-
httr2::request(server_url) |>
httr2::req_user_agent("pcog (https://github.com/geomarker-io/pcog)") |>
httr2::req_url_path_append("api", "datasets", ":persistentId", "versions", version) |>
httr2::req_url_query("persistentId" = persistent_id) |>
httr2::req_error(
is_error = function(resp) httr2::resp_status(resp) != 200,
body = function(resp) glue::glue("version {version} of {persistent_id} not found at {server_url}")
)
resp <- httr2::req_perform(req)

the_files <-
httr2::resp_body_json(resp)$data$files |>
vapply(\(.) .$dataFile[["id"]], integer(1))

names(the_files) <-
httr2::resp_body_json(resp)$data$files |>
vapply(\(.) .$dataFile[["filename"]], character(1))

if (length(filename) == 1 && filename %in% names(the_files)) {
file_id <- the_files[[filename]]
} else {
message("available files for ", persistent_id, " include: \n ", paste(names(the_files), collapse = "\n "))
if (length(filename) == 0) stop("no filename requested", call. = FALSE)
stop("filename ", filename, " not found.", call. = FALSE)
}

# build the download URL from `server_url` instead of a hardcoded host
cog_url <- glue::glue("{server_url}/api/access/datafile/{file_id}")
return(as.character(cog_url))
}
177 changes: 177 additions & 0 deletions inst/codec_data/green/green.R
@@ -0,0 +1,177 @@
# fall back to the installed package if DESCRIPTION is missing or unreadable
if (tryCatch(read.dcf("DESCRIPTION")[1, "Package"] == "codec", error = function(e) FALSE)) {
devtools::load_all()
} else {
library(codec)
}
message("Using CoDEC, version ", packageVersion("codec"))
source("inst/codec_data/green/dataverse.R")

nlcd_legend <-
tibble::tribble(
~value, ~landcover_class, ~landcover, ~green,
11, "water", "water", FALSE,
12, "water", "ice/snow", FALSE,
21, "developed", "developed open", TRUE,
22, "developed", "developed low intensity", TRUE,
23, "developed", "developed medium intensity", FALSE,
24, "developed", "developed high intensity", FALSE,
31, "barren", "rock/sand/clay", FALSE,
41, "forest", "deciduous forest", TRUE,
42, "forest", "evergreen forest", TRUE,
43, "forest", "mixed forest", TRUE,
51, "shrubland", "dwarf scrub", TRUE,
52, "shrubland", "shrub/scrub", TRUE,
71, "herbaceous", "grassland", TRUE,
72, "herbaceous", "sedge", TRUE,
73, "herbaceous", "lichens", TRUE,
74, "herbaceous", "moss", TRUE,
81, "cultivated", "pasture/hay", TRUE,
82, "cultivated", "cultivated crops", TRUE,
90, "wetlands", "woody wetlands", TRUE,
95, "wetlands", "emergent herbaceous wetlands", TRUE
)

green_rast <-
dpkg::stow("https://s3-us-west-2.amazonaws.com/mrlc/Annual_NLCD_LndCov_2023_CU_C1V0.tif") |>
terra::rast()

hc <-
codec::cincy_county_geo() |>
sf::st_as_sfc() |>
terra::vect() |>
terra::project(green_rast)

tract <-
codec::cincy_census_geo("tract", "2020") |>
terra::vect() |>
terra::project(green_rast)

out <- tibble::tibble(census_tract_id_2020 = tract$geoid)

# land cover - greenspace
out$greenspace_2023 <-
green_rast |>
terra::crop(hc) |>
terra::extract(tract) |>
dplyr::rename(value = 2) |>
dplyr::left_join(nlcd_legend, by = "value") |>
dplyr::group_by(ID) |>
dplyr::summarize(greenspace = round(sum(green) / dplyr::n() * 100)) |>
dplyr::pull(greenspace)

# tree canopy
out$treecanopy_2021 <-
dpkg::stow("https://s3-us-west-2.amazonaws.com/mrlc/nlcd_tcc_CONUS_2021_v2021-4.zip") |>
unzip(files = "nlcd_tcc_conus_2021_v2021-4.tif", exdir = tools::R_user_dir("stow", "data")) |>
terra::rast() |>
terra::crop(hc) |>
terra::extract(tract, fun = mean, ID = FALSE) |>
dplyr::pull(1)

# impervious
out$impervious_2023 <-
get_dv_url(
persistent_id = "doi:10.7910/DVN/KXETFC",
filename = "Annual_NLCD_FctImp_2023_CU_C1V0_COG.tif",
version = "latest"
) |>
terra::rast(vsi = TRUE) |>
terra::extract(tract, fun = mean, ID = FALSE) |>
dplyr::pull(1)

# EVI
out$evi_2024 <-
terra::sds("data-raw/MOD13Q1.A2024161.h11v05.061.2024181211403.hdf")[2] |>
terra::project(terra::crs(hc)) |>
terra::crop(hc) |>
terra::extract(tract, fun = mean) |>
dplyr::mutate(evi = round(`250m 16 days EVI`*0.0001, 3)) |>
dplyr::pull(evi)

# parks
sf::st_layers(dsn = "data-raw/TPL.gdb/")
tract_sf <- codec::cincy_census_geo("tract", "2020") |> sf::st_transform(5072)

parks_sa <-
sf::st_read("data-raw/TPL.gdb/", layer = "ParkServiceAreas", quiet = TRUE) |>
sf::st_transform(sf::st_crs(tract_sf))

parks_green <-
sf::st_read("data-raw/TPL.gdb/", layer = "Greenspace", quiet = TRUE) |>
sf::st_transform(sf::st_crs(tract_sf))

get_area_pct <- function(tract, parks) {
area_denom <- sf::st_area(tract) |> as.numeric()

intersection <- sf::st_intersection(tract, parks)

if (nrow(intersection) < 1) {
return(0)
} else {
area_numer <-
intersection |>
sf::st_union() |>
sf::st_area() |>
as.numeric()

return(round(area_numer / area_denom * 100))
}
}

out$park_service_area <-
purrr::map_dbl(
1:nrow(tract_sf),
\(x) get_area_pct(tract_sf[x,], parks_sa)
)

out$park_greenspace <-
purrr::map_dbl(
1:nrow(tract_sf),
\(x) get_area_pct(tract_sf[x,], parks_green)
)

# city tree canopy
city_canopy <-
sf::st_read("data-raw//Cincinnati Canopy Inforamtion 2/CincinnatiCanopyInformation.gdb", layer = "Block_Group_Canopy_Change", quiet = TRUE) |>
sf::st_drop_geometry() |>
dplyr::select(
block_group_id_2010 = ID,
TreeCanopy_2020_Area
) |>
dplyr::mutate(census_tract_id_2010 = stringr::str_sub(block_group_id_2010, 1, 11)) |>
dplyr::group_by(census_tract_id_2010) |> #2010 tract/bgs
dplyr::summarize(TreeCanopy_2020_Area = sum(TreeCanopy_2020_Area)) |>
dplyr::left_join(codec::cincy_census_geo("tract", "2013"), by = c("census_tract_id_2010" = "geoid")) |>
sf::st_as_sf() |>
sf::st_transform(3735) |>
dplyr::mutate(
tract_area = sf::st_area(s2_geography),
city_treecanopy_2020 = as.numeric(round(TreeCanopy_2020_Area/tract_area*100))
) |>
sf::st_drop_geometry() |>
dplyr::select(census_tract_id_2010, city_treecanopy_2020)

out$city_treecanopy_2020 <-
dplyr::left_join(codec::cincy_census_geo("tract", "2013"), city_canopy, by = c("geoid" = "census_tract_id_2010")) |>
dplyr::rename(
census_tract_id_2010 = geoid,
geometry = s2_geography
) |>
sf::st_transform(5072) |>
cincy::interpolate(to = cincy::tract_tigris_2020, weights = "area") |>
sf::st_drop_geometry() |>
dplyr::pull(city_treecanopy_2020)

# dpkg
out_dpkg <-
out |>
dplyr::mutate(year = 2023) |>
as_codec_dpkg(
name = "green",
version = "0.1.0",
title = "Greenspace and Built Environment",
homepage = "https://github.com/geomarker-io/codec",
description = paste(readLines(fs::path_package("codec", "codec_data", "green", "README.md")), collapse = "\n")
)

dpkg::dpkg_gh_release(out_dpkg, draft = TRUE)
1 change: 1 addition & 0 deletions vignettes/articles/data.Rmd
@@ -18,4 +18,5 @@ Click the latest release badge for more information about the data package, incl
| American Community Survey Measures |[![latest github release for acs_measures dpkg](https://img.shields.io/github/v/release/geomarker-io/codec?sort=date&filter=acs_measures-*&display_name=tag&label=%5B%E2%98%B0%5D&labelColor=%238CB4C3&color=%23396175)](https://github.com/geomarker-io/codec/releases?q=acs_measures&expanded=false)|
| Average Annual Daily Truck and Total Traffic Counts |[![latest github release for traffic dpkg](https://img.shields.io/github/v/release/geomarker-io/codec?sort=date&filter=traffic-*&display_name=tag&label=%5B%E2%98%B0%5D&labelColor=%238CB4C3&color=%23396175)](https://github.com/geomarker-io/codec/releases?q=traffic&expanded=false)|
| Voter Participation Rates |[![latest github release for voter_participation dpkg](https://img.shields.io/github/v/release/geomarker-io/codec?sort=date&filter=voter_participation-*&display_name=tag&label=%5B%E2%98%B0%5D&labelColor=%238CB4C3&color=%23396175)](https://github.com/geomarker-io/codec/releases?q=voter_participation_rates&expanded=false)|
| Green |[![latest github release for green dpkg](https://img.shields.io/github/v/release/geomarker-io/codec?sort=date&filter=green-*&display_name=tag&label=%5B%E2%98%B0%5D&labelColor=%238CB4C3&color=%23396175)](https://github.com/geomarker-io/codec/releases?q=green&expanded=false)|
