Showing 34 changed files with 2,208 additions and 2,683 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,175 +1,20 @@ | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
# C extensions | ||
*.so | ||
|
||
# vim swap files | ||
*.swp | ||
# C++ files | ||
*.o | ||
*.d | ||
|
||
# Distribution / packaging | ||
.Python | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
wheels/ | ||
share/python-wheels/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
MANIFEST | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.nox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*.cover | ||
*.py,cover | ||
.hypothesis/ | ||
.pytest_cache/ | ||
cover/ | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
local_settings.py | ||
db.sqlite3 | ||
db.sqlite3-journal | ||
|
||
# Flask stuff: | ||
instance/ | ||
.webassets-cache | ||
|
||
# Scrapy stuff: | ||
.scrapy | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
.pybuilder/ | ||
target/ | ||
|
||
# Jupyter Notebook | ||
.ipynb_checkpoints | ||
|
||
# IPython | ||
profile_default/ | ||
ipython_config.py | ||
|
||
# pyenv | ||
# For a library or package, you might want to ignore these files since the code is | ||
# intended to run in multiple environments; otherwise, check them in: | ||
# .python-version | ||
|
||
# pipenv | ||
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. | ||
# However, in case of collaboration, if having platform-specific dependencies or dependencies | ||
# having no cross-platform support, pipenv may install dependencies that don't work, or not | ||
# install all needed dependencies. | ||
#Pipfile.lock | ||
|
||
# poetry | ||
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. | ||
# This is especially recommended for binary packages to ensure reproducibility, and is more | ||
# commonly ignored for libraries. | ||
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control | ||
#poetry.lock | ||
|
||
# PEP 582; used by e.g. github.com/David-OConnor/pyflow | ||
__pypackages__/ | ||
|
||
# Celery stuff | ||
celerybeat-schedule | ||
celerybeat.pid | ||
|
||
# SageMath parsed files | ||
*.sage.py | ||
|
||
# Environments | ||
.env | ||
.venv | ||
env/ | ||
venv/ | ||
ENV/ | ||
env.bak/ | ||
venv.bak/ | ||
|
||
# Spyder project settings | ||
.spyderproject | ||
.spyproject | ||
|
||
# Rope project settings | ||
.ropeproject | ||
|
||
# mkdocs documentation | ||
/site | ||
|
||
# mypy | ||
.mypy_cache/ | ||
.dmypy.json | ||
dmypy.json | ||
|
||
# Pyre type checker | ||
.pyre/ | ||
|
||
# pytype static type analyzer | ||
.pytype/ | ||
|
||
# Cython debug symbols | ||
cython_debug/ | ||
|
||
# PyCharm | ||
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can | ||
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore | ||
# and can be added to the global gitignore or merged into this file. For a more nuclear | ||
# option (not recommended) you can uncomment the following to ignore the entire idea folder. | ||
#.idea/ | ||
|
||
# ctags | ||
tags | ||
|
||
data | ||
examples | ||
umap | ||
nnd | ||
# Debug files | ||
Makefile | ||
trash.* | ||
callgrind.out.* | ||
gmon.out | ||
|
||
# cpp files | ||
*.o | ||
*.d | ||
|
||
# ignore this file | ||
.gitignore | ||
|
||
# | ||
trash.* | ||
nnd |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
# Nearest Neighbor Descent (nndescent) | ||
|
||
Nearest Neighbor Descent (nndescent) is a C++ implementation of the pynndescent library, originally written by Leland McInnes, which performs approximate nearest neighbor search. The goal of this algorithm is to construct a k-nearest neighbor graph quickly and accurately. | ||
|
||
## Background | ||
|
||
The theoretical background of NND is based on the following paper: | ||
- Dong, Wei, Moses Charikar, and Kai Li. "Efficient k-nearest neighbor graph construction for generic similarity measures." Proceedings of the 20th International Conference on World Wide Web. 2011. | ||
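
The core idea of the paper above ("a neighbor of a neighbor is likely to be a neighbor") can be illustrated with a minimal pure-Python sketch. This is not the code in this repository: the function names `nn_descent`, `try_insert`, and `sq_dist` are hypothetical, the distance is assumed to be squared Euclidean, and the sampling, flagging, and parallelism that make the real algorithm fast are omitted.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def try_insert(neighbors, d, j):
    """Put (d, j) into a sorted k-list if it improves it; return 1 if changed."""
    if d >= neighbors[-1][0] or any(idx == j for _, idx in neighbors):
        return 0
    neighbors[-1] = (d, j)  # replace the current worst neighbor
    neighbors.sort()
    return 1


def nn_descent(data, k, max_iters=10, seed=0):
    """Return, for each point, a sorted list of k (distance, index) pairs.

    Requires len(data) > k.
    """
    rng = random.Random(seed)
    n = len(data)
    # Start from a random graph: k random neighbors per point.
    graph = []
    for i in range(n):
        others = rng.sample([j for j in range(n) if j != i], k)
        graph.append(sorted((sq_dist(data[i], data[j]), j) for j in others))

    for _ in range(max_iters):
        # Undirected neighborhoods: forward plus reverse neighbors.
        around = [{j for _, j in graph[i]} for i in range(n)]
        for i in range(n):
            for _, j in graph[i]:
                around[j].add(i)

        updates = 0
        for v in range(n):
            members = sorted(around[v])
            # Local join: any two neighbors of v are candidates for each other.
            for pos, a in enumerate(members):
                for b in members[pos + 1:]:
                    d = sq_dist(data[a], data[b])
                    updates += try_insert(graph[a], d, b)
                    updates += try_insert(graph[b], d, a)
        if updates == 0:  # no list changed, so the graph has converged
            break
    return graph
```

Each iteration can only replace a neighbor with a closer one, so the lists never get worse than the random starting graph. How close the result comes to the exact k-nearest neighbor graph depends on the data and on `k`.
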
|
||
In addition, the algorithm utilizes random projection trees for initializing the nearest neighbor graph, based on the following paper: | ||
- Dasgupta, Sanjoy, and Yoav Freund. "Random projection trees and low dimensional manifolds." Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing. 2008. | ||
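
As a rough illustration of how a random projection tree can seed the neighbor graph, the sketch below recursively splits the data with random hyperplanes until each leaf is small; points sharing a leaf then serve as initial neighbor candidates. The splitting rule (a hyperplane equidistant from two randomly chosen points) is an assumption chosen for simplicity, not necessarily the rule this repository uses, and `rp_tree_leaves` is a hypothetical name.

```python
import random


def rp_tree_leaves(data, leaf_size=4, seed=0):
    """Partition point indices into leaves by recursive random hyperplane splits.

    Requires leaf_size >= 1.
    """
    rng = random.Random(seed)
    leaves = []

    def split(indices):
        if len(indices) <= leaf_size:
            leaves.append(indices)
            return
        # Hyperplane equidistant from two randomly chosen points p and q.
        p, q = rng.sample(indices, 2)
        normal = [a - b for a, b in zip(data[p], data[q])]
        offset = sum(n * (a + b) / 2 for n, a, b in zip(normal, data[p], data[q]))
        left, right = [], []
        for i in indices:
            side = sum(n * x for n, x in zip(normal, data[i])) - offset
            (left if side > 0 else right).append(i)
        if not left or not right:
            # Degenerate split (e.g. duplicate points): fall back to halving.
            mid = len(indices) // 2
            left, right = indices[:mid], indices[mid:]
        split(left)
        split(right)

    split(list(range(len(data))))
    return leaves
```

Nearby points tend to fall on the same side of a random hyperplane, so leaves give a cheap, better-than-random starting graph for the descent step.
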
|
||
## Features | ||
|
||
- C++ implementation utilizing OpenMP for efficient computation | ||
- Support for dense matrices | ||
- Implementation of a subset of distance functions | ||
|
||
## Installation | ||
|
||
1. Clone the repository: | ||
|
||
```sh | ||
git clone https://github.com/brj0/nndescent.git | ||
cd nndescent | ||
``` | ||
|
||
2. Build the project: | ||
|
||
```sh | ||
pip install . | ||
``` | ||
|
||
3. Run the examples in `tests`. To build the test data, first run `make_test_data.py`. | ||
|
||
## Performance | ||
|
||
On my computer, the training phase of nndescent takes roughly 1-10% less time than pynndescent. The search query phase takes about 75% less time on fmnist and mnist, and about 99% less on faces. Below is the output obtained from running `tests/benchmark.py`; each `ratio` column is the C++ time divided by the Python time: | ||
|
||
### Benchmark test pynndescent vs nndescent | ||
Data set | py train [ms] | c train [ms] | ratio | py vs c match | py test [ms] | c test [ms] | ratio | py accuracy | c accuracy | ||
----------|---------------|--------------|-------|---------------|--------------|-------------|-------|-------------|----------- | ||
faces | 191.8 | 190.0 | 0.991 | 1.000 | 1631.6 | 20.5 | 0.013 | 1.000 | 0.999 | ||
fmnist | 13587.5 | 12935.1 | 0.952 | 0.997 | 6751.2 | 1757.2 | 0.260 | 0.978 | 0.978 | ||
mnist | 14187.2 | 12712.9 | 0.896 | 0.997 | 6664.2 | 1665.1 | 0.250 | 0.969 | 0.968 | ||
|
||
These timings exclude pynndescent's compilation time and its long numba loading time at import; nndescent incurs neither. | ||
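
To relate the `ratio` columns to a percentage, the short script below recomputes the time reduction from the timings in the table. The helper name `reduction` is illustrative and not part of the repository.

```python
# (py train, c train, py test, c test) in milliseconds, copied from the table.
timings = {
    "faces": (191.8, 190.0, 1631.6, 20.5),
    "fmnist": (13587.5, 12935.1, 6751.2, 1757.2),
    "mnist": (14187.2, 12712.9, 6664.2, 1665.1),
}


def reduction(py_ms, c_ms):
    """Percentage of the Python time saved by the C++ implementation."""
    return 100.0 * (1.0 - c_ms / py_ms)


for name, (py_train, c_train, py_test, c_test) in timings.items():
    print(
        f"{name:7s} train: {reduction(py_train, c_train):5.1f}% less time, "
        f"query: {reduction(py_test, c_test):5.1f}% less time"
    )
```
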
|
||
## Usage | ||
|
||
Please refer to the examples provided in the repository for instructions on how to use the NND library in your projects. | ||
|
||
## Contributing | ||
|
||
Contributions are welcome! If you have any bug reports, feature requests, or suggestions, please open an issue or submit a pull request. | ||
|
||
## License | ||
|
||
This project is licensed under the [BSD-2-Clause license](LICENSE). | ||
|
||
## Acknowledgements | ||
|
||
This implementation is based on the original pynndescent library by Leland McInnes. I would like to express my gratitude for his work. | ||
|
||
For more information, visit the [pynndescent GitHub repository](https://github.com/lmcinnes/pynndescent). | ||
|