Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!: correct nwis module GeoDataFrames crs to EPSG:4269 (NAD 83) #169

Merged
merged 1 commit into from
Oct 25, 2024

Conversation

aaraney
Copy link
Contributor

@aaraney aaraney commented Oct 25, 2024

BREAKING CHANGE: v1.0.10 is the only affected version. See footer for more detail.

The nwis site service (and the USGS more broadly) uses the NAD83 crs.

64a575d introduced a bug that hard coded the nwis module's crs to EPSG:4236 -- Hu Tzu Shan 1950 (https://epsg.io/4236). I guarantee 64a575d meant to hard code EPSG:4326 -- WGS84. It appears it was fat fingered and missed in #143's PR review (it happens to the best of us:)).

Bug reproduction:

import dataretrieval
from dataretrieval import nwis

assert dataretrieval.__version__ == "1.0.10"

df, metadata = nwis.get_info(sites="01013500")
assert df.crs.to_epsg() == 4236

In either case, EPSG:4326 is still technically the wrong crs. For example:

https://waterservices.usgs.gov/nwis/site?sites=01013500&siteOutput=Expanded&format=rdb

returns (truncated)

USGS 01013500 ... NAD83

for the field coord_datum_cd -- Latitude-longitude datum.

So, EPSG:4269 -- NAD83 is the correct crs. To my understanding the default crs used by the USGS is NAD83 (please check me on this). There are exceptions for example, the NLDI service provides GeoJSON data which the spec specified should be in EPSG:4326 -- WGS84.

There is room for discussion if NAD 83 vs WGS 84 really matters in the conterminous US however for Alaska errors in both the x and y direction can be upwards of 1 meter in either direction.

BREAKING CHANGE: v1.0.10 is the only affected version. The GeoDataFrames returned from the following functions now correctly have a crs of EPSG:4269. In v1.0.10 they mistakenly returned with a crs of EPSG:4236 -- Hu Tzu Shan 1950.

dataretrieval.nwis.get_qwdata
dataretrieval.nwis.get_discharge_measurements
dataretrieval.nwis.get_discharge_peaks
dataretrieval.nwis.get_gwlevels
dataretrieval.nwis.get_stats
dataretrieval.nwis.get_info
dataretrieval.nwis.get_pmcodes
dataretrieval.nwis.get_water_use
dataretrieval.nwis.get_ratings
dataretrieval.nwis.what_sites
dataretrieval.nwis.get_dv
dataretrieval.nwis.get_iv

The nwis site service (and the USGS more broadly) uses the NAD83 crs.

64a575d introduced a bug that hard coded the nwis module's  crs to
EPSG:4236 -- Hu Tzu Shan 1950 (https://epsg.io/4236). I guarantee
64a575d meant to hard code _EPSG:4326_ -- WGS84. It appears it was fat
fingered and missed in DOI-USGS#143's PR review (it happens to the best of us:)).

In either case, EPSG:4326 is still _technically_ the wrong crs. For example:

https://waterservices.usgs.gov/nwis/site?sites=01013500&siteOutput=Expanded&format=rdb

returns
```
USGS 01013500 ... NAD83
```
for the field `coord_datum_cd  -- Latitude-longitude datum`.
So EPSG:4269 -- NAD83 is the correct crs.

There is room for discussion if NAD83 vs WGS 84 really matters in the
conterminous US however for Alaska errors in both the x and y direction
can be upwards of 1 meter in either direction.

BREAKING CHANGE: v1.0.10 is the only affected version. The GeoDataFrames
returned from the following functions now correctly have a crs of
EPSG:4269.
dataretrieval.nwis.get_qwdata
dataretrieval.nwis.get_discharge_measurements
dataretrieval.nwis.get_discharge_peaks
dataretrieval.nwis.get_gwlevels
dataretrieval.nwis.get_stats
dataretrieval.nwis.get_info
dataretrieval.nwis.get_pmcodes
dataretrieval.nwis.get_water_use
dataretrieval.nwis.get_ratings
dataretrieval.nwis.what_sites
dataretrieval.nwis.get_dv
dataretrieval.nwis.get_iv
@aaraney
Copy link
Contributor Author

aaraney commented Oct 25, 2024

I think it is worth considering if dataretrieval v1.0.10 should be yanked from PyPI after presumedly v1.0.11 is released.

@aaraney aaraney changed the title fix!: correct site service crs to EPSG:4269 (NAD 83) fix!: correct nwis module GeoDataFrames crs to EPSG:4269 (NAD 83) Oct 25, 2024
@ehinman
Copy link
Collaborator

ehinman commented Oct 25, 2024

Hey @aaraney! Thank you for bringing this to our attention. I have verified that NAD83 is the recommended CRS for USGS products: https://www.usgs.gov/about/organization/science-support/office-science-quality-and-integrity/guidance-use-and
and verified that you have used the correct CRS code: https://epsg.io/4269

@thodson-usgs, OK to merge?

@thodson-usgs
Copy link
Collaborator

Hmm. it doesn't fix Hawaii, right? But I agree we should merge and cut a new release ASAP. On it.

@mikemahoney218-usgs
Copy link

Want to flag that it's not safe to assume a single CRS from Waterservices returns.

Here's a site that uses 4326:
https://waterservices.usgs.gov/nwis/site/?format=rdb&sites=483554104034801&siteOutput=expanded

Here's one using "OLDGUAM":
https://waterservices.usgs.gov/nwis/site?sites=073335134341070&siteOutput=Expanded&format=rdb

Copy link
Collaborator

@thodson-usgs thodson-usgs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might need to revisit this, but I agree this needs to be fixed ASAP.

@thodson-usgs thodson-usgs merged commit 3ba0c83 into DOI-USGS:main Oct 25, 2024
11 checks passed
@aaraney aaraney deleted the fix-nwis-modules-crs branch October 25, 2024 17:30
@lstanish-usgs
Copy link
Collaborator

Agreed, we don't want to assume that all sites use the same CRS if they do not. Thank you for raising this issue @aaraney

@aaraney
Copy link
Contributor Author

aaraney commented Oct 25, 2024

Certainly, @lstanish-usgs! Thanks for acting on this so quickly, @ehinman & @thodson-usgs!

@thodson-usgs
Copy link
Collaborator

thodson-usgs commented Oct 25, 2024

Agreed, we don't want to assume that all sites use the same CRS if they do not. Thank you for raising this issue @aaraney

Hate to say it, but the only fix would be to reproject everything back to WGS84, right?

At least the information in the dataframes is correct, and users in Guam etc. would probably catch this in their sleep.

@aaraney
Copy link
Contributor Author

aaraney commented Oct 25, 2024

Great find, @mikemahoney218-usgs. A quick search led me to this seemingly related issue Kevin-M-Smith/nwisnfie#18. Following one of the links there I found a table of coord_datum_cd values (https://help.waterdata.usgs.gov/code/coord_datum_cd_query?fmt=html) via https://help.waterdata.usgs.gov/codes-and-parameters/codes#SI.

Gw Ref Cd Gw Ref Ds
NAD27 North American Datum of 1927
NAD83 North American Datum of 1983
OLDAK Old Alaska (Mainland) and Aleutian Island Datum
OLDAK Old Alaska (Mainland) and Aleutian Islands Datum
OLDGUAM Old Guam Datum
OLDHI Old Hawaiian Datum
OLDPR Old Puerto Rico and Virgin Island Datum
OLDPR Old Puerto Rico and Virgin Islands Datum
OLDSAMOA Old American Samoa Datum
PUERTORICO Puerto Rico & Virgin Islands
WGS72 World Datum 1972
WGS84 World Datum 1984

@aaraney
Copy link
Contributor Author

aaraney commented Oct 25, 2024

Agreed, we don't want to assume that all sites use the same CRS if they do not. Thank you for raising this issue @aaraney

Hate to say it, but the only fix would be to reproject everything back to WGS84, right?

At least the information in the dataframes is correct, and users in Guam etc. would probably catch this in their sleep.

@thodson-usgs to get it right 100% of the time, I think you would need to groupby the site identifier then set the crs of each group to its actual crs (e.g. OLDGUAM) then transform each group to a common crs (e.g. WGS84) and recombine.

Maybe the easiest thing to do in the short term is warn if something other than NAD83 is present in the dec_coord_datum_cd column.

@ehinman
Copy link
Collaborator

ehinman commented Oct 25, 2024

Thanks for the discussion, everyone. Added an issue here: #170

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants