Generate NcML that standardizes and aggregates datasets using YAML input

USGS-CMG/yaml2ncml

yaml2ncml

NcML aggregation from YAML specifications.

yaml2ncml is a command-line tool that facilitates the creation of NcML aggregation files for THREDDS servers.

Install it using pip or conda.

conda install -c conda-forge yaml2ncml

The user must create a YAML config file (see the example below) and run:

yaml2ncml roms.yaml

# or save the output

yaml2ncml roms.yaml --output roms_aggregation.ncml

roms.yaml

dataset:
    id: "USGS_COAWST_MVCO_CBLAST_Ripples_SWAN_40m"

    title: "USGS-CMG-COAWST Model: CBLAST2007 Ripples with SWAN-40m res"

    summary: "Simulation of hydrodynamics and bottom stress south of Marthas Vineyard, MA using the COAWST modeling system.  These results are from the 40m inner nest of a four-level nested simulation."

    project:
        - CMG_Portal
        - Sandy_Portal

    creator:
        email: [email protected]
        name: Neil Ganju
        url: http://water.usgs.gov/fluxes

    publisher:
        email: [email protected]
        name: Tarandeep Kalra
        url: http://www.usgs.gov

    contributor:
        role: advisor
        email: [email protected]
        name: Rich Signell
        url: http://profile.usgs.gov/rsignell


    license: "The data may be used and redistributed for free but is not intended for legal use, since it may contain inaccuracies. Neither the data Contributor, nor the United States Government, nor any of their employees or contractors, makes any warranty, express or implied, including warranties of merchantability and fitness for a particular purpose, or assumes any legal liability for the accuracy, completeness, or usefulness, of this information."

    references:
        - http://www.whoi.edu/science/AOPE/dept/CBLASTmain.html
        - http://water.usgs.gov/fluxes/mvco.html
        - doi:10.1029/2011JC007035

    acknowledgements:
        - USGS-CMGP
        - NSF

variables:
    include:
        - temp
        - salt

    exclude:
        - ubar
        - vbar

aggregation:
    time_var: ocean_time
    dir: Output
    sample_file: test_nc4_0001.nc
    pattern: .*test_nc4_[0-9]{4}\.nc$
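Before generating the NcML, it can help to confirm that the aggregation pattern actually matches the sample file. The sketch below is not part of yaml2ncml: it copies the values from roms.yaml above into constants, and `matches_pattern` is a hypothetical helper. It assumes the pattern is applied to the whole file path (which the leading `.*` suggests) and uses Python's `re` module, whose syntax may differ from the regex engine on the server for more exotic patterns.

```python
import re
from pathlib import PurePosixPath

# Values copied by hand from the aggregation section of roms.yaml.
AGG_DIR = "Output"
SAMPLE_FILE = "test_nc4_0001.nc"
PATTERN = r".*test_nc4_[0-9]{4}\.nc$"


def matches_pattern(pattern: str, path: str) -> bool:
    """Hypothetical helper: True if the entire path matches the regex."""
    return re.fullmatch(pattern, path) is not None


# Build the path of the sample file relative to the NcML location.
sample_path = str(PurePosixPath(AGG_DIR) / SAMPLE_FILE)

# The sample file should match; a file with only three digits should not.
print(sample_path, matches_pattern(PATTERN, sample_path))
print(matches_pattern(PATTERN, "Output/test_nc4_001.nc"))
```

If the first check fails, the pattern or the sample file name in the YAML file is probably mistyped.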

Notes on the YAML file:

  1. The aggregation dir: is the directory that holds the data (e.g. NetCDF files), given relative to the directory where the NcML file will be located. In the example above, the NetCDF files are in a subdirectory called "Output". If the NetCDF files are in the same directory as the NcML file, specify dir: '.'.
  2. To include every variable in the aggregation (none excluded), specify:
variables:
    include:
        - All

    exclude:
        - None
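One way to read the include/exclude lists is sketched below. This is an illustration only, not yaml2ncml's actual code: `select_variables` is a hypothetical helper, and the rule that excludes are applied after includes is an assumption. Note that a YAML loader such as PyYAML reads `- None` as the string "None", not as a null value, so the sketch compares against strings.

```python
def select_variables(available, include, exclude):
    """Hypothetical helper (not yaml2ncml's code) that picks variables.

    "All" in include keeps every available variable; "None" in exclude
    removes nothing. Both sentinels are plain strings.
    """
    if include == ["All"]:
        selected = list(available)
    else:
        selected = [name for name in available if name in include]

    if exclude != ["None"]:
        selected = [name for name in selected if name not in exclude]
    return selected


available = ["temp", "salt", "ubar", "vbar"]

# The "everything" form from note 2.
print(select_variables(available, ["All"], ["None"]))

# The form used in roms.yaml.
print(select_variables(available, ["temp", "salt"], ["ubar", "vbar"]))
```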

Development

virtualenv yaml2ncml
cd yaml2ncml
source bin/activate
git clone https://github.com/USGS-CMG/yaml2ncml.git

Running tests

# via distutils
python setup.py test
# manually
cd yaml2ncml/tests && py.test

Code Conventions

yaml2ncml code conventions follow PEP8 (https://www.python.org/dev/peps/pep-0008).

# manually
flake8 --max-line-length=100 <file.py>

Issues

Issues are managed at https://github.com/USGS-CMG/yaml2ncml/issues