Contributions of any kind are greatly appreciated!
The Issue Tracker is the best place to post any feature ideas, requests and bug reports. This way, everyone has the opportunity to keep informed of changes and join the discussion on future plans.
If you are able to contribute changes yourself, just fork the source code on GitHub, make changes and file a pull request. All contributions are welcome, no matter how big or small.
The following are especially welcome:
- New document readers for patent formats and the website HTML of scientific publishers.
- Improvements to NLP components: tokenization, tagging and entity recognition.
- Parsers for extracting new compound properties.
- New or improved documentation of existing features.
1. Fork the ChemDataExtractor repository on GitHub, then clone your fork to your local machine:
    git clone https://github.com/<your-username>/ChemDataExtractor.git
2. Install the development requirements:
    cd ChemDataExtractor
    pip install -r requirements/development.txt
3. Create a new branch for your changes:
    git checkout -b <name-for-branch>
4. Make your changes or additions. Ideally, add some tests and ensure they pass by running:
    pytest
   The output should show all tests passing.
5. Commit your changes and push to your fork on GitHub:
    git add .
    git commit -m "<description-of-changes>"
    git push origin <name-for-branch>
6. Submit a pull request. Then we can discuss your changes and merge them into the main ChemDataExtractor repository.
- Follow the PEP8 style guide.
- Include docstrings as described in PEP257.
- Try to include tests that cover your changes.
- Try to write good commit messages.
- Read the GitHub help page on Using pull requests.
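As a rough illustration of the style and testing tips above, the sketch below shows a small function with a PEP257-style docstring and a matching pytest-style test. The function `normalize_whitespace` and its test are hypothetical, invented for this example; they are not part of the ChemDataExtractor API.

```python
import re


def normalize_whitespace(text):
    """Collapse runs of whitespace into single spaces and strip the ends.

    Hypothetical helper used only to illustrate docstring and test style;
    it is not part of ChemDataExtractor.

    :param str text: The input string.
    :returns: The normalized string.
    :rtype: str
    """
    return re.sub(r'\s+', ' ', text).strip()


def test_normalize_whitespace():
    """Check that internal and surrounding whitespace is normalized."""
    assert normalize_whitespace('  melting \n point\t') == 'melting point'
    assert normalize_whitespace('') == ''
```

By default, pytest collects functions whose names start with `test_` from files named `test_*.py`, so a test like this would normally be picked up when running `pytest` from the repository root.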