From 103724b490bd92f6a7f0ee24982950ccade82553 Mon Sep 17 00:00:00 2001 From: Pabitra Dash Date: Fri, 4 Oct 2024 10:09:52 -0400 Subject: [PATCH] [#157] Example notebook for NLDI data access (#158) * [#157] new notebook with examples for retrieving data from NLDI * [#157] fixing corrupted notebook * [#157] deleting empty cell * [#157] adding examples with no multi-index * [#157] reformating code cells * [#157] updates to nldi examples notebook by Jeff Horsburgh * Update demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb Co-authored-by: Elise Hinman <121896266+ehinman@users.noreply.github.com> * Update demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb Co-authored-by: Elise Hinman <121896266+ehinman@users.noreply.github.com> * Update demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb Co-authored-by: Elise Hinman <121896266+ehinman@users.noreply.github.com> --------- Co-authored-by: Timothy Hodson <34148978+thodson-usgs@users.noreply.github.com> Co-authored-by: Elise Hinman <121896266+ehinman@users.noreply.github.com> --- ...S_dataretrieval_DailyValues_Examples.ipynb | 19 +- ...retrieval_GroundwaterLevels_Examples.ipynb | 17 + .../USGS_dataretrieval_NLDI_Examples.ipynb | 618 ++++++++++++++++++ .../USGS_dataretrieval_Peaks_Examples.ipynb | 23 +- ...USGS_dataretrieval_SiteInfo_Examples.ipynb | 2 +- ...dataretrieval_SiteInventory_Examples.ipynb | 2 +- ...GS_dataretrieval_Statistics_Examples.ipynb | 7 - ...GS_dataretrieval_UnitValues_Examples.ipynb | 17 + ..._dataretrieval_WaterSamples_Examples.ipynb | 39 ++ 9 files changed, 727 insertions(+), 17 deletions(-) create mode 100644 demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb diff --git a/demos/hydroshare/USGS_dataretrieval_DailyValues_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_DailyValues_Examples.ipynb index e12e6eef..1c3cf92e 100644 --- a/demos/hydroshare/USGS_dataretrieval_DailyValues_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_DailyValues_Examples.ipynb @@ -286,6 +286,23 @@ "display(dailyMultiSites[0])" ] }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "dailyMultiSites = nwis.get_dv(sites=[\"01491000\", \"01645000\"], parameterCd=[\"00010\", \"00060\"],\n", + " start=\"2012-01-01\", end=\"2012-06-30\", statCd=[\"00001\",\"00003\"],\n", + " multi_index=False)\n", + "display(dailyMultiSites[0])" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -330,4 +347,4 @@ }, "nbformat": 4, "nbformat_minor": 1 -} \ No newline at end of file +} diff --git a/demos/hydroshare/USGS_dataretrieval_GroundwaterLevels_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_GroundwaterLevels_Examples.ipynb index a0afd9f9..5c31853b 100644 --- a/demos/hydroshare/USGS_dataretrieval_GroundwaterLevels_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_GroundwaterLevels_Examples.ipynb @@ -286,6 +286,23 @@ "display(data2[0])" ] }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "site_ids = [\"434400121275801\", \"375907091432201\"]\n", + "data2 = nwis.get_gwlevels(sites=site_ids, multi_index=False)\n", + "print(\"Retrieved \" + str(len(data2[0])) + \" data values.\")\n", + "display(data2[0])" + ] + }, { "cell_type": "markdown", "metadata": { diff --git a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb new file mode 100644 index 00000000..c627e452 --- /dev/null +++ b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb @@ -0,0 +1,618 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "94cf2fc11d917e1b", + "metadata": { + "tags": [] + }, + "source": [ + "# USGS dataretrieval Python Package NLDI Data Access Examples\n", + "\n", + "This notebook provides examples of using the Python dataretrieval package to retrieve data from the United States Geological Survey (USGS) Hydro Network-Linked Data Index (NLDI). The dataretrieval package provides a collection of functions to get data from the USGS Hydro Network-Linked Data Index (NLDI). For more information about NLDI visit the [USGS website](https://labs.waterdata.usgs.gov/docs/nldi/about-nldi/index.html) describing NLDI or [this blog post](https://waterdata.usgs.gov/blog/nldi-intro/) that covers basic features of the NLDI.\n", + "\n", + "From the [NLDI API documentation](https://labs.waterdata.usgs.gov/api/nldi/swagger-ui/index.html): The NLDI is a search service that takes a watershed outlet identifier as a starting point, a navigation mode to perform, and the type of data desired in response to the request. It can provide geospatial representations of the navigation or linked data sources found along the navigation. It also has the ability to return landscape characteristics for the catchment the watershed outlet is contained in or the total upstream basin." + ] + }, + { + "cell_type": "markdown", + "id": "8695bd7a7b335650", + "metadata": {}, + "source": [ + "### Install the Package\n", + "\n", + "Use the following code to install the package if it doesn't exist already within your Jupyter Python environment. Note the `nldi` option in the `dataretrieval` package installation. The default `dataretrieval` package installation does not support NLDI data access. You must run the dataretrieval install with the `nldi` option to ensure that you have the necessary capabilities. If you are running this notebook in the CUAHSI JuypyterHub server linked to HydroShare, you will want to run the following pip install command. The base `dataretrieval` package is already installed in the CUAHSI JupyterHub Python environment, but it does not include the NLDI option." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2cb985f4a60ce046", + "metadata": {}, + "outputs": [], + "source": [ + "!pip install dataretrieval[nldi]" + ] + }, + { + "cell_type": "markdown", + "id": "ef27dd9de0b05a9a", + "metadata": {}, + "source": [ + "Load the package so that you can use its functions in this notebook." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "aa0f8aad72102b29", + "metadata": {}, + "outputs": [], + "source": [ + "from dataretrieval import nldi\n", + "from IPython.display import display\n", + "import folium" + ] + }, + { + "cell_type": "markdown", + "id": "213e4c0d0b983a19", + "metadata": { + "tags": [] + }, + "source": [ + "***\n", + "\n", + "## Usage Examples\n", + "\n", + "The dataretrieval package provides a number of functions to get data from the USGS NLDI. The following sections provide examples of how each of the available functions can be used\n", + "\n", + "* `get_basin()`: function used to get the upstream basin boundary for any feature indexed by NLDI.\n", + "* `get_flowlines()`: function used to get upstream or downstream flowline data from NLDI.\n", + "* `get_features()`: function used to retrieve information about features from any source indexed by NLDI.\n", + "* `search()`: general function used to retrieve data from NLDI. Can be used in place of the functions listed above.\n", + "\n", + "***\n", + "\n", + "### Examples for the `get_basin()` function:\n", + "\n", + "The following examples show how to use the `get_basin()` function from the dataretrieval package to retrieve the upstream basin boundary (the contributing watershed for a feature) from the USGS NLDI associated with any feature indexed by NLDI. \n", + "\n", + "The following arguments are supported:\n", + "\n", + "* **feature_source** (string): The name of the NLDI feature source.\n", + "* **feature_id** (string): The identifier of the NLDI feature.\n", + "* **simplified** (boolean): If True, the data will be returned with simplified polygons. If False, the data will be returned as a single polygon (default is False).\n", + "* **split_catchment** (boolean): If True, the data will be returned with split catchment polygons. If False, the data will be returned as a single polygon (default is False) NOTE: Setting this to True may result in an error due to a known issue with NLDI API.\n", + "* **as_json** (boolean): If True, the data will be returned as a python dictionary. If False, the data will be returned as a geopandas dataframe (default is False).\n" + ] + }, + { + "cell_type": "markdown", + "id": "9900c519345f9d2f", + "metadata": {}, + "source": [ + "#### Example 1: Get the upstream basin for a single feature from a single source.\n", + "In this example, we will retrieve the boundary for the upstream basin for a single monitoring site feature. For the monitoring site feature, we will use a USGS water quality monitoring station. The feature source is the Water Quality Portal (WQP), and the identifier of the feature is the USGS water quality monitoring site identifer prepended with the agency code for USGS (USGS-01031500). You can use the identifiers for any monitoring station indexed by the NLDI to retrieve its upstream watershed boundary." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8db53f3d4d004e65", + "metadata": {}, + "outputs": [], + "source": [ + "# set the parameters needed to retrieve data\n", + "feat_source = 'WQP'\n", + "feat_id = 'USGS-01031500'" + ] + }, + { + "cell_type": "markdown", + "id": "c8595b1e706a8468", + "metadata": {}, + "source": [ + "Now we can call the `get_basin` function to get the coordinates of the polygon making up the basin boundary associated with this monitoring site - i.e., the watershed area upstream of the given monitoring site. The result will be returned as a geopandas dataframe unless the `as_json` argument is used and set to True." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d8d0d847d8c171b6", + "metadata": {}, + "outputs": [], + "source": [ + "basin_gdf = nldi.get_basin(feature_source=feat_source, feature_id=feat_id)\n", + "display(basin_gdf)" + ] + }, + { + "cell_type": "markdown", + "id": "af599a9f632930d", + "metadata": {}, + "source": [ + "If you want to get the basin boundary coordinate data in GeoJSON format, you can use the `as_json` argument in the function call (as_json=True)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "340793f67b33ff39", + "metadata": {}, + "outputs": [], + "source": [ + "basin_json_data = nldi.get_basin(feature_source=feat_source, feature_id=feat_id, as_json=True)\n", + "print(basin_json_data)" + ] + }, + { + "cell_type": "markdown", + "id": "c077de87-a826-4d93-b96a-bd65d98706f5", + "metadata": {}, + "source": [ + "Make a quick map of the selected monitoring station and the upstream boundary returned by `get_basin()`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9ff150da-b55f-4b19-97a5-75d5aa4d63f4", + "metadata": {}, + "outputs": [], + "source": [ + "# Get the feature associated with the monitoring site \n", + "# More examples of how to use the get_features() function are given below\n", + "site_gdf = nldi.get_features(feature_source=feat_source, feature_id=feat_id)\n", + "\n", + "# Set the Coordinate Reference System (CRS) for the GeoDataFrames \n", + "# containing the basin boundary coordinates and the monitoring site\n", + "# epsg='4326' for WGS84\n", + "basin_gdf.set_crs(epsg='4326', inplace=True)\n", + "site_gdf.set_crs(epsg='4326', inplace=True)\n", + "\n", + "# Create a base map using folium\n", + "m = folium.Map(location=[site_gdf.geometry.x[0], site_gdf.geometry.y[0]], zoom_start=10)\n", + "\n", + "# Add the selected monitoring location and basin features to the map\n", + "folium.GeoJson(site_gdf, name='Monitoring Location').add_to(m)\n", + "folium.GeoJson(basin_gdf, name='Basin Boundary', color='red').add_to(m)\n", + "\n", + "# Zoom the map to the bounds of the data\n", + "bounds = m.get_bounds()\n", + "m.fit_bounds(bounds)\n", + "\n", + "# Add layer control to toggle layers\n", + "folium.LayerControl().add_to(m)\n", + "\n", + "# Display the map\n", + "m" + ] + }, + { + "cell_type": "markdown", + "id": "23a84052f0711d2", + "metadata": {}, + "source": [ + "***\n", + "\n", + "### Examples for the `get_flowlines()` function:\n", + "\n", + "The following examples show how to use the `get_flowlines()` function from the dataretrieval package to get flowline data from the USGS NLDI. Flowlines can be traced upstream or downstream from any indexed feature (e.g., a USGS monitoring location), and the result can include mainstem only or tributaries. Flowlines are returned for the specified navigation in WGS84 latitude/longitude coordinats in GeoJSON format.\n", + "\n", + "The following arguments are supported:\n", + "\n", + "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD'). 'UM' = Upstream on the mainstem. 'DM' = Downstream on the mainstem. 'UT' = Upstream with tributaries. 'DD' = Downstream with diversions.\n", + "* **feature_source** (string): The name of the NLDI feature source.\n", + "* **feature_id** (string): The identifier of the NLDI feature.\n", + "* **comid** (integer): COMID (required if feature_resource is not specified).\n", + "* **distance** (integer): Distance in kilometers (default is 5). This distance parameter dictates the distance to navigate.\n", + "* **as_json** (boolean): If True, the data will be returned as a python dictionary. If False, the data will be returned as a geopandas dataframe (default is False)." + ] + }, + { + "cell_type": "markdown", + "id": "3dc19d7dd78e3173", + "metadata": {}, + "source": [ + "#### Example 1: Get the flowline data using feature_source and feature_id\n", + "Get the flowline data tracing upstream on the mainstem from a USGS water quality monitoring site with the result returned as a geopandas dataframe. In this example, we will trace upstream including tributaries and include the distance argument with a distance that is sufficiently large to retrieve all of the flowlines upstream of the monitoring station." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "404457b0b8ea283c", + "metadata": {}, + "outputs": [], + "source": [ + "feat_source = 'WQP'\n", + "feat_id = 'USGS-01031500'\n", + "\n", + "flowlines_gdf = nldi.get_flowlines(navigation_mode='UT', feature_source=feat_source, feature_id=feat_id, distance=100)\n", + "display(flowlines_gdf)" + ] + }, + { + "cell_type": "markdown", + "id": "8e21e235eb64f446", + "metadata": {}, + "source": [ + "Get the same flowline data with the result returned as GeoJSON (as_json=True)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c1d916a742e0e986", + "metadata": {}, + "outputs": [], + "source": [ + "flowlines_json_data = nldi.get_flowlines(navigation_mode='UT', feature_source=feat_source, feature_id=feat_id, as_json=True)\n", + "print(flowlines_json_data)" + ] + }, + { + "cell_type": "markdown", + "id": "62c634be-74e1-4476-86f1-b7bd8fdbbd76", + "metadata": {}, + "source": [ + "Add the retrieved flowline data to the map for visualization of what is returned. To change the distance traced upstream on the flowlines, change the value of the distance argument in the `get_flowlines()` function call and set the distance upstream you want to trace. You can also change the navigation_mode argument to include or exclude tributaries." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0e9b0a6f-faef-40a4-ae34-a75e210bf5d2", + "metadata": {}, + "outputs": [], + "source": [ + "# Get the feature associated with the monitoring site \n", + "# More examples of how to use the get_features() function are given below\n", + "site_gdf = nldi.get_features(feature_source=feat_source, feature_id=feat_id)\n", + "\n", + "# Set the Coordinate Reference System (CRS) for the GeoDataFrames \n", + "# containing the basin boundary coordinates and the monitoring site\n", + "# epsg='4326' for WGS84\n", + "site_gdf.set_crs(epsg='4326', inplace=True)\n", + "flowlines_gdf.set_crs(epsg='4326', inplace=True)\n", + "\n", + "# Create a base map using folium\n", + "m = folium.Map(location=[site_gdf.geometry.x[0], site_gdf.geometry.y[0]], zoom_start=10)\n", + "\n", + "# Add the selected monitoring location and basin features to the map\n", + "folium.GeoJson(site_gdf, name='Monitoring Location').add_to(m)\n", + "folium.GeoJson(basin_gdf, name='Basin Boundary', color='red').add_to(m)\n", + "folium.GeoJson(flowlines_gdf, name='Flowlines', color='blue').add_to(m)\n", + "\n", + "# Zoom the map to the bounds of the data\n", + "bounds = m.get_bounds()\n", + "m.fit_bounds(bounds)\n", + "\n", + "# Add layer control to toggle layers\n", + "folium.LayerControl().add_to(m)\n", + "\n", + "# Display the map\n", + "m" + ] + }, + { + "cell_type": "markdown", + "id": "42259375160429ab", + "metadata": {}, + "source": [ + "#### Example 2: Get flowline data using a NHDPlus COMID\n", + "In some cases, you may wish to get flowline data associated with a NHDPlus COMID rather than flowlines associated with a monitoring station. The following shows how to get flowline data for a COMID as a geopandas dataframe." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c014b708c08984e2", + "metadata": {}, + "outputs": [], + "source": [ + "gdf = nldi.get_flowlines(navigation_mode='UM', comid=13294314)\n", + "display(gdf)" + ] + }, + { + "cell_type": "markdown", + "id": "49856c0e97950d5d", + "metadata": {}, + "source": [ + "To get the same flowline data as GeoJSON (as_json=True) instead of as a geopandas dataframe, do the following." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b39d360a47ba170f", + "metadata": {}, + "outputs": [], + "source": [ + "flowlines_json_data = nldi.get_flowlines(navigation_mode='UM', comid=13294314, as_json=True)\n", + "print(flowlines_json_data)" + ] + }, + { + "cell_type": "markdown", + "id": "b27151f75e00f649", + "metadata": {}, + "source": [ + "***\n", + "\n", + "### Examples for the `get_features()` function:\n", + "\n", + "The following examples show how to use the `get_features()` function from the dataretrieval package to get features data from the USGS NLDI. The `get_features()` function returns all features found along the specified navigation as points in WGS84 latitude/longitude endoced as GeoJSON.\n", + "\n", + "The following arguments are supported:\n", + "\n", + "* **data_source** (string): The name of the NLDI data source.\n", + "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD'). 'UM' = Upstream on the mainstem. 'DM' = Downstream on the mainstem. 'UT' = Upstream with tributaries. 'DD' = Downstream with diversions.\n", + "* **feature_source** (string): The name of the NLDI feature source.\n", + "* **feature_id** (string): The identifier of the NLDI feature (required if feature_source is specified).\n", + "* **comid** (integer): COMID (required if feature_source is not specified).\n", + "* **distance** (integer): Distance in kilometers (default is 50). This distance parameter dictates the distance to navigate.\n", + "* **lat** (float): Latitude (required if feature for a specific location is specified).\n", + "* **long** (float): Longitude (required if feature for a specific location is specified).\n", + "* **as_json** (boolean): If True, the data will be returned as a python dictionary. If False, the data will be returned as a geopandas dataframe (default is False)." + ] + }, + { + "cell_type": "markdown", + "id": "b24a6b1f49ed5f7d", + "metadata": {}, + "source": [ + "#### Example 1: Get all indexed features along a specified navigation path\n", + "Get all of the indexed features from a particular data source using a navigation path that traces upstream along the mainstem (UM) and uses as an origin for the trace a feature from a given feature source (e.g., a monitoring station from the Water Quality Portal). You can get any indexed features from any of the included data sources. Example data sources and the codes you need to retrieve features from those sources include:\n", + "* \"census2020-nhdpv2\" - 2020 Census Block - NHDPlusV2 Catchment Intersections\n", + "* \"epa_nrsa\" - EPA National Rivers and Streams Assessment\n", + "* \"gfv11_pois\" - USGS Geospatial Fabric V1.1 Points of Interest\n", + "* \"huc12pp\" - HUC12 Pour Points NHDPlusV2\n", + "* \"npdes\" - NPDES Facilities that Discharge to Water\n", + "* \"nwisgw\" - NWIS Groundwater Sites\n", + "* \"nwissite\" - NWIS Surface Water Sites\n", + "* \"WQP\" - Water Quality Portal\n", + "* \"comid\" - NHDPlus comid\n", + "\n", + "There are a few others - for a complete list, consult the getDataSources endpoint of the NLDI API. For this example, we will trace upstream along the mainstem and find any NWIS surface water sites located along the mainstem of the river." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "492b5bedfb71a478", + "metadata": {}, + "outputs": [], + "source": [ + "feat_source = 'WQP'\n", + "feat_id = 'USGS-01031500'\n", + "features_gdf = nldi.get_features(data_source='nwissite', navigation_mode='UM', feature_source=feat_source, feature_id=feat_id)\n", + "display(features_gdf)" + ] + }, + { + "cell_type": "markdown", + "id": "ee85219d-4950-4350-8a2e-68f38fe0c096", + "metadata": {}, + "source": [ + "Add the returned features to the map." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "935739e5-e438-40fe-b8e1-7dc123dbe878", + "metadata": {}, + "outputs": [], + "source": [ + "# Get the feature associated with the monitoring site \n", + "# More examples of how to use the get_features() function are given below\n", + "site_gdf = nldi.get_features(feature_source=feat_source, feature_id=feat_id)\n", + "\n", + "# Set the Coordinate Reference System (CRS) for the GeoDataFrames \n", + "# containing the basin boundary coordinates and the monitoring site\n", + "# epsg='4326' for WGS84\n", + "site_gdf.set_crs(epsg='4326', inplace=True)\n", + "features_gdf.set_crs(epsg='4326', inplace=True)\n", + "\n", + "# Create a base map using folium\n", + "m = folium.Map(location=[site_gdf.geometry.x[0], site_gdf.geometry.y[0]], zoom_start=10)\n", + "\n", + "# Add the selected monitoring location and basin features to the map\n", + "folium.GeoJson(site_gdf, name='Monitoring Location').add_to(m)\n", + "folium.GeoJson(basin_gdf, name='Basin Boundary', color='red').add_to(m)\n", + "folium.GeoJson(flowlines_gdf, name='Flowlines', color='blue').add_to(m)\n", + "folium.GeoJson(features_gdf, name='Features', color='green').add_to(m)\n", + "\n", + "# Zoom the map to the bounds of the data\n", + "bounds = m.get_bounds()\n", + "m.fit_bounds(bounds)\n", + "\n", + "# Add layer control to toggle layers\n", + "folium.LayerControl().add_to(m)\n", + "\n", + "# Display the map\n", + "m" + ] + }, + { + "cell_type": "markdown", + "id": "2a61ce386ef17c8a", + "metadata": {}, + "source": [ + "Rather than using a water quality monitoring station as the orgin, you can also use a NHDPlus COMID as the origin for the trace. The code below does the same thing as the code above except it uses a COMID as the origin for the navigation. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fe7bee5ba6e4f419", + "metadata": {}, + "outputs": [], + "source": [ + "gdf = nldi.get_features(data_source='nwissite', navigation_mode='UM', comid=13294314)\n", + "display(gdf)" + ] + }, + { + "cell_type": "markdown", + "id": "c7418ac7d155af6c", + "metadata": {}, + "source": [ + "#### Example 2: Get information about indexed features\n", + "Sometimes you may want to get information about indexed features without using any sort of navigation path. In this case, you can use `get_features()` without specifying a navigation_mode. The following code returns the feature information for a water quality monitoring station from the Water Quality Portal." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bbde823aba2b82ba", + "metadata": {}, + "outputs": [], + "source": [ + "gdf = nldi.get_features(feature_source='WQP', feature_id='USGS-01031500')\n", + "display(gdf)" + ] + }, + { + "cell_type": "markdown", + "id": "d72767444885dc2d", + "metadata": {}, + "source": [ + "#### Example 3: Get information about indexed features for a specific location\n", + "Get information about indexed features for a specific location by passing latitude and longitude coordinates into the `get_features()` function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e6769b9885ef0edb", + "metadata": {}, + "outputs": [], + "source": [ + "gdf = nldi.get_features(lat=43.073051, long=-89.401230)\n", + "display(gdf)" + ] + }, + { + "cell_type": "markdown", + "id": "4283a91f6b12446d", + "metadata": {}, + "source": [ + "***\n", + "\n", + "### Examples for the `search()` function:\n", + "\n", + "The following examples show how to use the `search()` function from the dataretrieval package to get data (basins, flowlines, and features) from the USGS NLDI. You can use the `search()` function instead of the `get_basin()`, `get_flowlines()`, and `get_features()` functions described above. The search function returns data as a python dictionary. \n", + "\n", + "The following arguments are supported:\n", + "\n", + "* **feature_source** (string): The name of the NLDI feature source.\n", + "* **feature_id** (string): The identifier of the NLDI feature (required if feature_resource is specified).\n", + "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD'). 'UM' = Upstream on the mainstem. 'DM' = Downstream on the mainstem. 'UT' = Upstream with tributaries. 'DD' = Downstream with diversions.\n", + "* **data_source** (string): The name of the NLDI data source.\n", + "* **find** (string): The specific data type to search for. Allowed values are 'basin', 'flowlines', and 'feature' (default is 'features').\n", + "* **comid** (integer): NHDPlus COMID (required if feature_source is not specified).\n", + "* **lat** (float): Latitude (required if feature for a specific location is specified).\n", + "* **long** (float): Longitude (required if feature for a specific location is specified).\n", + "* **distance** (integer): Distance in kilometers (default is 50). This distance parameter dictates the distance to navigate." + ] + }, + { + "cell_type": "markdown", + "id": "9fbfebefedf5d5c2", + "metadata": {}, + "source": [ + "#### Example 1: Get the upstream basin for an indexed water quality monitoring station\n", + "Instead of using `get_basin()`, the `search()` function can be used to retrieve the upstream contributing area for a water quality station from the Water Quality Portal." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9fe88e0664f629e8", + "metadata": {}, + "outputs": [], + "source": [ + "# set the parameters needed to retrieve data\n", + "feat_source = 'WQP'\n", + "feat_id = 'USGS-01031500'" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d7422c075998921c", + "metadata": {}, + "outputs": [], + "source": [ + "basin_data = nldi.search(feature_source=feat_source, feature_id=feat_id, find=\"basin\")\n", + "print(basin_data)" + ] + }, + { + "cell_type": "markdown", + "id": "bc4ef96efc59550a", + "metadata": {}, + "source": [ + "#### Example 2: Get flowlines data for a specified feature source\n", + "Instead of using `get_flowlines()`, the `search()` function can be used to retrieve the flowlines traced upstream or downstream from a feature. In this case, we can get the flowlines upstream along the mainstem for the water quality station in the last example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e247afc5b85a226c", + "metadata": {}, + "outputs": [], + "source": [ + "flowlines_data = nldi.search(navigation_mode='UM', feature_source=feat_source, feature_id=feat_id, find='flowlines')\n", + "print(flowlines_data)" + ] + }, + { + "cell_type": "markdown", + "id": "7e17c9af5d643323", + "metadata": {}, + "source": [ + "#### Example 3: Get all features along a specified navigation path\n", + "Likewise, instead of using `get_features()`, the `search()` function can be used to retrieve all of the indexed features from a particular data source along a particular navigation path - in this case tracing upstream along the mainstem from the same water quality station in the previous examples." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a40613fc4fedc416", + "metadata": {}, + "outputs": [], + "source": [ + "features_data = nldi.search(data_source='nwissite', navigation_mode='UM', feature_source=feat_source,\n", + " feature_id=feat_id, find='features')\n", + "print(features_data)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.7" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/demos/hydroshare/USGS_dataretrieval_Peaks_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_Peaks_Examples.ipynb index 3dd77e66..236eff05 100644 --- a/demos/hydroshare/USGS_dataretrieval_Peaks_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_Peaks_Examples.ipynb @@ -183,6 +183,22 @@ "print(\"The query URL used to retrieve the data from NWIS was: \" + peak_data[1].url)" ] }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "site_ids = ['01594440', '040851325']\n", + "peak_data = nwis.get_discharge_peaks(site_ids, multi_index=False)\n", + "print(\"Retrieved \" + str(len(peak_data[0])) + \" data values.\")" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -239,13 +255,6 @@ "data4 = nwis.get_discharge_peaks(stations, start='1953-01-01', end='1960-01-01')\n", "display(data4[0])" ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { diff --git a/demos/hydroshare/USGS_dataretrieval_SiteInfo_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_SiteInfo_Examples.ipynb index c9acc633..51007f9b 100644 --- a/demos/hydroshare/USGS_dataretrieval_SiteInfo_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_SiteInfo_Examples.ipynb @@ -98,7 +98,7 @@ "* **siteOutput** (string 'basic' or 'expanded'): Indicates the richness of metadata you want for site attributes. Note that for visually oriented formats like Google Map format, this argument has no meaning. Note: for performance reasons, siteOutput='expanded' cannot be used if seriesCatalogOutput=true or with any values for outputDataTypeCd.\n", "* **seriesCatalogOutput** (boolean): A switch that provides detailed period of record information for certain output formats. The period of record indicates date ranges for a certain kind of information about a site, for example the start and end dates for a site's daily mean streamflow.\n", "\n", - "For additional parameter options see https://waterservices.usgs.gov/docs/site-service/site-service-details/ + "For additional parameter options see https://waterservices.usgs.gov/docs/site-service/site-service-details" ] }, { diff --git a/demos/hydroshare/USGS_dataretrieval_SiteInventory_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_SiteInventory_Examples.ipynb index 70193ddd..d8a2ceb1 100644 --- a/demos/hydroshare/USGS_dataretrieval_SiteInventory_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_SiteInventory_Examples.ipynb @@ -96,7 +96,7 @@ "* **siteOutput** (string 'basic' or 'expanded'): Indicates the richness of metadata you want for site attributes. Note that for visually oriented formats like Google Map format, this argument has no meaning. Note: for performance reasons, siteOutput='expanded' cannot be used if seriesCatalogOutput=true or with any values for outputDataTypeCd.\n", "* **seriesCatalogOutput** (boolean): A switch that provides detailed period of record information for certain output formats. The period of record indicates date ranges for a certain kind of information about a site, for example the start and end dates for a site's daily mean streamflow.\n", "\n", - "For additional parameter options see https://waterservices.usgs.gov/docs/site-service/site-service-details/ + "For additional parameter options see https://waterservices.usgs.gov/docs/site-service/site-service-details" ] }, { diff --git a/demos/hydroshare/USGS_dataretrieval_Statistics_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_Statistics_Examples.ipynb index e386a301..f67f1510 100644 --- a/demos/hydroshare/USGS_dataretrieval_Statistics_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_Statistics_Examples.ipynb @@ -286,13 +286,6 @@ " startDt=\"2000\", endDt=\"2007\")\n", "display(x3[0])" ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { diff --git a/demos/hydroshare/USGS_dataretrieval_UnitValues_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_UnitValues_Examples.ipynb index c38402fd..c24fc587 100644 --- a/demos/hydroshare/USGS_dataretrieval_UnitValues_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_UnitValues_Examples.ipynb @@ -366,6 +366,23 @@ "print('Retrieved ' + str(len(discharge_multisite[0])) + ' data values.')\n", "display(discharge_multisite[0])" ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "discharge_multisite = nwis.get_iv(sites=['04024430', '04024000'], parameterCd=parameterCode,\n", + " start='2013-10-01', end='2013-10-01', multi_index=False)\n", + "print('Retrieved ' + str(len(discharge_multisite[0])) + ' data values.')\n", + "display(discharge_multisite[0])" + ] } ], "metadata": { diff --git a/demos/hydroshare/USGS_dataretrieval_WaterSamples_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_WaterSamples_Examples.ipynb index 0bb8b1f7..44a9f3b3 100644 --- a/demos/hydroshare/USGS_dataretrieval_WaterSamples_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_WaterSamples_Examples.ipynb @@ -225,6 +225,24 @@ "display(wq_multi_site[0])" ] }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "site_ids = ['04024430', '04024000']\n", + "parameter_code = '00065'\n", + "wq_multi_site = nwis.get_qwdata(sites=site_ids, parameterCd=parameter_code, multi_index=False)\n", + "print('Retrieved data for ' + str(len(wq_multi_site[0])) + ' samples.')\n", + "display(wq_multi_site[0])" + ] + }, { "cell_type": "markdown", "metadata": { @@ -260,6 +278,27 @@ "display(wq_data2[0])\n" ] }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "site_ids = ['04024430', '04024000']\n", + "parameterCd = ['34247', '30234', '32104', '34220']\n", + "startDate = '2012-01-01'\n", + "endDate = ''\n", + "wq_data2 = nwis.get_qwdata(sites=site_ids, parameterCd=parameterCd,\n", + " start=startDate, end=endDate, multi_index=False)\n", + "print('Retrieved data for ' + str(len(wq_multi_site[0])) + ' samples.')\n", + "display(wq_data2[0])" + ] + }, { "cell_type": "markdown", "metadata": {},