neonutilities submission #213

Open
15 of 32 tasks
cklunch opened this issue Sep 25, 2024 · 16 comments

Comments

@cklunch

cklunch commented Sep 25, 2024

Submitting Author: Claire Lunch (@cklunch)
All current maintainers: (@cklunch, @bhass-neon, @znickerson8)
Package Name: neonutilities
One-Line Description of Package: neonutilities is a package for accessing and wrangling data generated and published by the National Ecological Observatory Network.
Repository Link: https://github.com/NEONScience/NEON-utilities-python
Version submitted: v1.0.1
EiC: @cmarmo
Editor: @JuliMillan
Reviewer 1: @ethanwhite
Reviewer 2: @benjamindonnachie
Archive: TBD
JOSS DOI: TBD
Version accepted: TBD
Date accepted (month/day/year): TBD


Code of Conduct & Commitment to Maintain Package

Description

The neonutilities Python package provides utilities for discovering, downloading, and working with data files published by the National Ecological Observatory Network (NEON). NEON data files can be downloaded from the NEON Data Portal or API. The neonutilities package includes wrapper functions for the API and functions to reformat and stack NEON tabular data for analysis. This is a Python-native adaptation of the heavily used neonUtilities R package.

Scope

  • Please indicate which category or categories.
    Check out our package scope page to learn more about our
    scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):

    • Data retrieval
    • Data extraction
    • Data processing/munging
    • Data deposition
    • Data validation and testing
    • Data visualization (see footnote 1)
    • Workflow automation
    • Citation management and bibliometrics
    • Scientific software wrappers
    • Database interoperability

Domain Specific

  • Geospatial
  • Education

Community Partnerships

If your package is associated with an
existing community please check below:

  • For all submissions, explain how and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):

    • Who is the target audience and what are scientific applications of this package?

The target audience is scientists doing research using NEON data. The package enables programmatic workflows for data downloading, and provides a standardized way to merge the product-site-month data files NEON publishes, making them analysis-ready.
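
As a rough illustration of what merging product-site-month files involves, the sketch below stacks a few per-site, per-month CSV tables into one list of records using only the standard library. The file contents, the column names, and the `stack_tables` helper are all hypothetical and are not taken from neonutilities; in practice the package's own stacking functions are the ones to use.

```python
import csv
import io

# Hypothetical per-site, per-month files (contents invented for illustration).
MONTHLY_FILES = {
    ("SITE_A", "2021-05"): "plotID,value\nA_001,1.5\nA_002,2.0\n",
    ("SITE_A", "2021-06"): "plotID,value\nA_001,1.7\n",
    ("SITE_B", "2021-05"): "plotID,value\nB_001,0.9\n",
}


def stack_tables(files):
    """Combine per-site-month CSV tables into one list of row dicts.

    Each row is tagged with the site and month it came from, so the
    provenance of every record survives the merge.
    """
    stacked = []
    for (site, month), text in sorted(files.items()):
        for row in csv.DictReader(io.StringIO(text)):
            row["siteID"] = site
            row["month"] = month
            stacked.append(row)
    return stacked


rows = stack_tables(MONTHLY_FILES)
print(len(rows), "rows from", len(MONTHLY_FILES), "files")
```

Tagging each row with its source site and month is the design choice that matters here: once the tables are stacked, a single analysis can group or filter by either field.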

  • Are there other Python packages that accomplish the same thing? If so, how does yours differ?

There is an incomplete package on PyPI here that was started in 2020 by a student at a coding camp. It doesn't appear to have been finished, and is not maintained. Some NEON users have developed their own code to cover some of the functionality of neonutilities, but as far as I know none of them have shared it broadly.

  • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • uses an OSI approved license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a tutorial with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration set up, such as GitHub Actions, CircleCI, and/or others.

Publication Options

JOSS Checks
  • The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
  • The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
  • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
  • The package is deposited in a long-term repository with the DOI:

Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

  • Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Confirm each of the following by checking the box.

  • I have read the author guide.
  • I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.

Please fill out our survey

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The editor template can be found here.

The review template can be found here.

Footnotes

  1. Please fill out a pre-submission inquiry before submitting a data visualization package.

@cmarmo
Member

cmarmo commented Sep 30, 2024

Editor in Chief checks

Hi @cklunch! Thank you for submitting your package for pyOpenSci review.
Below are the basic checks that your package needs to pass to begin our review.
If some of these are missing, we will ask you to work on them before the review process begins.

Please check our Python packaging guide for more information on the elements below.

  • Installation The package can be installed from a community repository such as PyPI (preferred), and/or a community channel on conda (e.g. conda-forge, bioconda).
    • The package imports properly into a standard Python environment import package.
  • Fit The package meets criteria for fit and overlap.
  • Documentation The package has sufficient online documentation to allow us to evaluate package function and scope without installing the package. This includes:
    • User-facing documentation that overviews how to install and start using the package.
    • Short tutorials that help a user understand how to use the package and what it can do for them.
    • API documentation (documentation for your code's functions, classes, methods and attributes): this includes clearly written docstrings with variables defined using a standard docstring format.
  • Core GitHub repository Files
    • README The package has a README.md file with clear explanation of what the package does, instructions on how to install it, and a link to development instructions.
    • Contributing File The package has a CONTRIBUTING.md file that details how to install and contribute to the package.
    • Code of Conduct The package has a CODE_OF_CONDUCT.md file.
    • License The package has an OSI approved license.
      NOTE: We prefer that you have development instructions in your documentation too.
  • Issue Submission Documentation All of the information is filled out in the YAML header of the issue (located at the top of the issue template).
  • Automated tests Package has a testing suite and is tested via a Continuous Integration service.
  • Repository The repository link resolves correctly.
  • Package overlap The package doesn't entirely overlap with the functionality of other packages that have already been submitted to pyOpenSci.
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly.
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

  • Initial onboarding survey was filled out
    We appreciate each maintainer of the package filling out this survey individually. 🙌
    Thank you authors in advance for setting aside five to ten minutes to do this. It truly helps our organization. 🙌


Editor comments

Here are my comments to clarify why I haven't ticked all the check-boxes.

  • Documentation: I found a lot of relevant information on the https://www.neonscience.org/ site; however, there is no standalone documentation for the package, even though a docs directory is present in the repo with an empty index.md. I strongly recommend adding some documentation pages to structure access to the information. There is no need to rewrite what is already on https://www.neonscience.org/; linking and structuring is enough, following for example the structure suggested in the pre-review checks. Also, it is not clear to me what is part of the NEON Data API and what is part of the Python implementation.
  • The README does not contain documentation about how to contribute. It is true that the CONTRIBUTING file specifies that external contributions will not be considered for now, so perhaps a link to CONTRIBUTING would be enough.
  • As opening issues is the only way to contribute for now, it would be nice to have issue templates for different kinds of contributions (bug reports, new feature suggestions, documentation, ...).

I think the fluidity of the review process will benefit from those improvements.

Thank you so much for your understanding!

@lwasser
Member

lwasser commented Sep 30, 2024

hey there @cklunch @bhass-neon @znickerson8 👋🏻 it's nice to see you here on GitHub! I have a question for you about your contributing doc which says:

The neonutilities package is currently not accepting external contributions. If you have a suggestion for a fix or enhancement to the package, please create an Issue in this repository, or contact us using the NEON Contact Us page.

We do consider opening issues to report bugs and to request useful features to be contributions. They are a different type than a PR of course. What is the intent of "we are not accepting contributions" in this case? We are having some discussions around our pyOpenSci policies, maintainer responsiveness to users and open source processes in general related to this package. Any input from you would be super helpful! thank you!!

@cklunch
Author

cklunch commented Oct 1, 2024

@cmarmo @lwasser Thanks for the feedback!

For documentation, the intent was for the tutorials that are linked in the readme to provide the documentation users need to understand how to use the package. The Download and Explore tutorial provides instructions for the most commonly used functions, along with some context and common follow-up data wrangling, and the neonUtilities tutorial provides a function index. If this isn't what you're looking for, or if different content is needed, do you have an example of documentation that would be appropriate?

For contributing, of course bug reports and requests are always welcome! We've found with the R package that it isn't realistic for external folks to contribute code directly. The package is so specific to the NEON publication system, it's hard for people who aren't deeply familiar with that system to write generic code for it. It's almost always easier for us to incorporate requested changes ourselves than to work with external PRs. But if there's a better way to express that in the documentation, I'm very happy to.

I'll get issue templates set up in the repo, and wait to update documentation based on what we decide in this discussion. Thanks!

@lwasser
Member

lwasser commented Oct 3, 2024

hey there @cklunch

Have a look at our packaging guide here. Typically, python packages have a documentation "website". You might also have a look at some of the structures for other packages in our ecosystem.

tutorials are great, but it's also important to document the code base for easier future maintenance, help people get started with installing the package, etc. Please have a look at those resources and let us know if you have any questions!

@cmarmo
Member

cmarmo commented Oct 3, 2024

For documentation, the intent was for the tutorials that are linked in the readme to provide the documentation users need to understand how to use the package. The Download and Explore tutorial provides instructions for the most commonly used functions, along with some context and common follow-up data wrangling, and the neonUtilities tutorial provides a function index. If this isn't what you're looking for, or if different content is needed, do you have an example of documentation that would be appropriate?

Indeed, that's why I said that a lot of information is already there. The way it is structured, though, is not easy to browse for someone new to the NEON ecosystem.
It would also make the review easier to have a reference documentation page (like your index.md), rendered by GitHub Pages or similar, with a table of contents and the essential information: Getting started, tutorials, API, and then the links to the related pages already available on the NEON website.
Does that sound reasonable?

@cklunch
Author

cklunch commented Oct 4, 2024

@lwasser @cmarmo Thanks for the clarification. I'll see about creating a dedicated page.

@cklunch
Author

cklunch commented Dec 4, 2024

@lwasser @cmarmo Changes to neonutilities have been pushed, including creating issue templates, clarifying in the CONTRIBUTING file, and creating a ReadTheDocs page. Thank you for your guidance!

https://github.com/NEONScience/NEON-utilities-python

@cmarmo
Member

cmarmo commented Dec 7, 2024

Thank you for your work @cklunch !
I have checked the related items in my initial comment.... just a nitpick ... at the end of the get started page the link to the tutorials is not linked....

I'm going to start looking for an editor.

@lwasser lwasser moved this from pre-review-checks to seeking-editor in peer-review-status Dec 7, 2024
@cklunch
Author

cklunch commented Dec 9, 2024

Thanks @cmarmo ! I've added a link to the Tutorials reference you pointed out.

@cmarmo cmarmo removed their assignment Feb 5, 2025
@lwasser
Member

lwasser commented Feb 12, 2025

hey @cklunch my apologies for the delay! i am putting out a call for editors today on social (bluesky, fosstodon and LinkedIn). we will need to onboard someone new. Please bear with us. The state of the world is hitting us all I think so things are moving more slowly now. I hope you and everyone at NEON are doing well.

@cklunch
Author

cklunch commented Feb 13, 2025

@lwasser No problem, thanks for the update!

@lwasser lwasser moved this from seeking-editor to under-review in peer-review-status Feb 28, 2025
@JuliMillan

Hi @cklunch !
I wanted to let you know that I'll be serving as editor for NEONutilities.
We are currently looking for reviewers and will let you know as soon as that is settled.
Thank you for your patience.

@JuliMillan

Editor response to review:


Editor comments

👋 Hi @ethanwhite and @benjamindonnachie! Thank you for volunteering to review for pyOpenSci! It's great to have you both here.

Please fill out our pre-review survey

Before beginning your review, please fill out our pre-review survey. This helps us improve all aspects of our review and better understand our community. No personal data will be shared from this survey - it will only be used in an aggregated format by our Executive Director to improve our processes and programs.

  • reviewer 1 survey completed.
  • reviewer 2 survey completed.

The following resources will help you complete your review:

  1. Here is the reviewers guide. This guide contains all of the steps and information needed to complete your review.
  2. Here is the review template that you will need to fill out and submit here as a comment, once your review is complete.

Please get in touch with any questions or concerns! Your review is due: April 4 2025

Reviewers: @ethanwhite, @benjamindonnachie
Due date: April 4 2025

@ethanwhite

ethanwhite commented Mar 26, 2025

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README.
  • Installation instructions: for the development version of the package and any non-standard dependencies in README.
  • Vignette(s) demonstrating major functionality that runs successfully locally.
  • Function Documentation: for all user-facing functions.
  • Examples for all user-facing functions.
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING.
  • Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements
The package meets the readme requirements below:

  • Package has a README.md file in the root directory.

The README should include, from top to bottom:

  • The package name
  • Badges for:
    • Continuous integration and test coverage,
    • Docs building (if you have a documentation website),
    • A repostatus.org badge,
    • Python versions supported,
    • Current package version (on PyPI / Conda).

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that a badge for pyOpenSci peer-review will be provided upon acceptance.)

  • Short description of package goals.
  • Package installation instructions
  • Any additional setup required to use the package (authentication tokens, etc.)
  • Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
    • Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
  • Link to your documentation website.
  • If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
  • Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Package structure should follow general community best-practices. In general please consider whether:

  • Package documentation is clear and easy to find and use.
  • The need for the package is clear
  • All functions have documentation and associated examples for use
  • The package is easy to install

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests:
    • All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
    • Tests cover essential functions of the package and a reasonable range of inputs and conditions.
  • Continuous Integration: Has continuous integration set up (we suggest using GitHub Actions, but any CI platform is acceptable for review)
  • Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
    A few notable highlights to look at:
    • Package supports modern versions of Python and not End of life versions.
    • Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

Package not submitted to JOSS

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:


Review Comments

The package is valuable, well documented, and easy to use, and I commend the maintainers for further improving it since I gave them a friendly review a while back.
Having this functionality available in Python is a major contribution to the field.

There are a few checkboxes that still need to be addressed (plus a couple of things for the maintainers to optionally consider):

  • None of the recommended badges are present in the README.md file.
  • README lacks citation information
  • The CONTRIBUTING.md file indicates that the package isn't taking external PRs and to open an issue if you want something. I can see why NEON might need to manage expectations/contributions, but I wonder if a more welcoming version might be something like "If you are interested in contributing changes to the codebase, please open an issue for discussion prior to submitting a pull request." This would let the maintainers moderate contributions without closing the door as firmly.
  • There is currently no docs url in the pyproject.toml file. I've opened a PR that addresses this along with some other minor docs issues.
  • In the portion of the 'Download and Explore NEON Data' tutorial that involves manually downloading data, I would personally find it more straightforward if the code assumed the zip file had been downloaded to the working directory, but this is likely a matter of personal preference and so just something for the maintainers to consider.
  • In the 'Explore isotope data by species' section of the 'Download and Explore NEON Data' tutorial, the plt.show() command is missing to display the plot.
  • The flake8 checks included in the CI flag lots of minor cleanup issues, most of which could be fixed automatically with a formatter such as autopep8 or black.

I have submitted a PR that makes a handful of small fixes to documentation and metadata aspects of the package:
NEONScience/NEON-utilities-python#16

@benjamindonnachie

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README.
  • Installation instructions: for the development version of the package and any non-standard dependencies in README.
  • Vignette(s) demonstrating major functionality that runs successfully locally.
  • Function Documentation: for all user-facing functions.
  • Examples for all user-facing functions.
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING.
  • Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements
The package meets the readme requirements below:

  • Package has a README.md file in the root directory.

The README should include, from top to bottom:

  • The package name
  • Badges for:
    • Continuous integration and test coverage,
    • Docs building (if you have a documentation website),
    • A repostatus.org badge,
    • Python versions supported,
    • Current package version (on PyPI / Conda).

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that a badge for pyOpenSci peer-review will be provided upon acceptance.)

  • Short description of package goals.
  • Package installation instructions
  • Any additional setup required to use the package (authentication tokens, etc.)
  • Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
    • Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
  • Link to your documentation website.
  • If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
  • Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Package structure should follow general community best-practices. In general please consider whether:

  • Package documentation is clear and easy to find and use.
  • The need for the package is clear
  • All functions have documentation and associated examples for use
  • The package is easy to install

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests:
    • All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
    • Tests cover essential functions of the package and a reasonable range of inputs and conditions.
  • Continuous Integration: Has continuous integration set up (we suggest using GitHub Actions, but any CI platform is acceptable for review)
  • Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
    A few notable highlights to look at:
    • Package supports modern versions of Python and not End of life versions.
    • Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:


Review Comments

I concur with @ethanwhite's comments above. Overall, I'm impressed by the level of comments in the code and found it easy to follow.

  • There's lots of great input validation before sending to the API. However, if the API ever changes, I wonder if this might be difficult to maintain. I appreciate you may wish to minimise calls to the service, but it might be less of a maintenance overhead to return errors from the API instead.

  • Console output has the potential to be confusing when the package is used as part of a bigger program. Rather than "Finding available files", "Stacking data files", "Downloading files", etc., would a slightly more descriptive message such as "Finding available NEON files" (or similar) be more informative for users?

  • I submitted a PR for a minor typo in the example file ("Typo in nu_example.py", NEONScience/NEON-utilities-python#13) together with a suggestion for a potentially more maintainable version of convert_byte_size ("Potential revision to optimise convert_byte_size", NEONScience/NEON-utilities-python#14).

  • requirements.txt does not list any version numbers. This could be an opportunity to specify minimum versions of major libraries ("requirements.txt does not include version numbers", NEONScience/NEON-utilities-python#15).

  • CONTRIBUTING.md states that external pull requests are not accepted and asks contributors to create an issue instead. As a result, it is not entirely clear on what basis contributions are accepted.

  • I'd be interested to learn more about the CSV files under resources - is the content static? If not, is there an opportunity to retrieve the latest version using the API instead?

  • test_aop_download.py under tests references https://github.com/NEONScience/nu-python-testing, but this returns a 404 error.

  • validate_year in aop_download.py uses a regex to ensure that it is within the range 2010 - 2099. However, I think that data is only available from January 2012 onwards? It may be better to revise this function or, alternatively, allow the API to complete the validation.
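
To make the validate_year suggestion concrete, here is a minimal sketch of a numeric range check that could replace a regex. The `check_year` name, the `FIRST_DATA_YEAR` constant, and the `today` parameter (included only to keep the example deterministic) are hypothetical and do not describe the package's actual code; the 2012 lower bound is taken from the tentative remark above and would need confirming against NEON's data availability.

```python
from datetime import date

# Assumed first year with data; suggested in the review but not confirmed.
FIRST_DATA_YEAR = 2012


def check_year(year, first_year=FIRST_DATA_YEAR, today=None):
    """Return `year` as an int, or raise ValueError if it is not usable.

    A usable year is four ASCII digits and lies between `first_year`
    and the current calendar year, inclusive.
    """
    current_year = (today or date.today()).year
    text = str(year)
    if not (text.isascii() and text.isdigit() and len(text) == 4):
        raise ValueError(f"{year!r} is not a 4-digit year (expected YYYY).")
    value = int(text)
    if not first_year <= value <= current_year:
        raise ValueError(
            f"Year {value} is outside the supported range "
            f"{first_year}-{current_year}."
        )
    return value
```

Comparing integers keeps the bounds readable and lets the upper limit track the calendar, so nothing has to be edited as time passes. Delegating the check to the API, as also suggested above, would remove the need for the constant altogether.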

The above are minor recommendations. The documentation is comprehensive and it's been a joy to review your code.
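
One possible way to address the console-output point raised in this review is to route messages through Python's standard `logging` module under a package-specific logger name, so a host program can tell where each message came from and can silence it. The sketch below is only an illustration of that pattern: the logger name `neon_example` and the `find_files` function are hypothetical and are not part of neonutilities.

```python
import logging

# Hypothetical logger name; a library would normally use its own package name.
logger = logging.getLogger("neon_example")
# Stay silent unless the host application configures logging itself.
logger.addHandler(logging.NullHandler())


def find_files(product, site):
    """Pretend to look up files; only the logging pattern matters here."""
    logger.info("Finding available NEON files for %s at %s", product, site)
    return []


if __name__ == "__main__":
    # A host program opts in to the messages and chooses how they look.
    logging.basicConfig(level=logging.INFO, format="%(name)s: %(message)s")
    find_files("EXAMPLE_PRODUCT", "EXAMPLE_SITE")
```

With this arrangement a larger program can hide the messages with `logging.getLogger("neon_example").setLevel(logging.WARNING)` without touching the library.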

@cklunch
Author

cklunch commented Mar 26, 2025

@ethanwhite @benjamindonnachie Thank you both so much for your reviews! We will work on incorporating your suggestions.

Projects
Status: under-review

6 participants