NeutronBench is an evaluation framework for GNN training systems, built on NeutronStar.

🔧 Install

Dependencies

  • cmake (>=3.14.2).
  • mpich (>=3.3.3) for inter-process communication.
  • libnuma for NUMA-aware memory allocation.
  • cub for GPU-based graph propagation.
  • libtorch (>=1.7) with GPU support for neural network computation.
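
For reference, on Ubuntu the system dependencies can typically be installed as below. The package names are assumptions and may differ on other distributions; cub is bundled with CUDA Toolkit 11 and later, and a CUDA-enabled libtorch can be downloaded from pytorch.org (pick the build matching your CUDA version):

# Package names follow Ubuntu conventions and may vary by distribution.
sudo apt-get update
sudo apt-get install -y cmake mpich libnuma-dev

# cub ships with CUDA Toolkit >= 11; for older toolkits, fetch it from NVIDIA:
# git clone https://github.com/NVIDIA/cub.git

# Download and unpack a CUDA-enabled libtorch (example version; adjust to your setup).
wget -O libtorch.zip "https://download.pytorch.org/libtorch/cu118/libtorch-cxx11-abi-shared-with-deps-2.0.1%2Bcu118.zip"
unzip libtorch.zip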

Building

First clone the repository and initialize the submodule:

git clone https://github.com/iDC-NEU/NeutronBench.git
cd NeutronBench
git submodule update --init --recursive

# or clone with submodules in a single command
git clone --recurse-submodules https://github.com/iDC-NEU/NeutronBench.git

To build:

mkdir build && cd build
cmake ..
make -j 10
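
If CMake cannot locate libtorch, the usual fix (assuming the build uses find_package(Torch)) is to point CMAKE_PREFIX_PATH at the unpacked distribution; /path/to/libtorch below is a placeholder:

cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
make -j 10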

To run:

# Example run (you need to prepare a dataset first; see the Datasets section below).
./run_nts.sh 1 ./cfgs/gcn_sample_demo.cfg 
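
Given the mpich dependency above, the first argument presumably selects the number of processes; this is an assumption, so check run_nts.sh for the exact usage:

# Assumed usage: ./run_nts.sh <num-processes> <config-file>
./run_nts.sh 2 ./cfgs/gcn_sample_demo.cfg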

📁 Datasets

The datasets used in our evaluation:

| Dataset      | Nodes   | Edges   | #Features | #Labels | Hidden dim |
|--------------|---------|---------|-----------|---------|------------|
| Reddit       | 232.96K | 114.85M | 602       | 41      | 128        |
| OGB-Arxiv    | 169.34K | 2.48M   | 128       | 40      | 128        |
| OGB-Products | 2.45M   | 126.17M | 100       | 47      | 128        |
| OGB-Papers   | 111.06M | 1.6B    | 128       | 172     | 128        |
| Amazon       | 1.57M   | 264.34M | 200       | 107     | 128        |
| LiveJournal  | 4.85M   | 90.55M  | 600       | 60      | 128        |
| Lj-large     | 7.49M   | 232.1M  | 600       | 60      | 128        |
| Lj-links     | 5.2M    | 205.25M | 600       | 60      | 128        |
| Enwiki-links | 13.59M  | 1.37B   | 600       | 60      | 128        |

We provide a Python script to generate the data files:

# create a Python environment
conda create -n neutronbench python=3.9 -y
conda activate neutronbench

# install Python dependencies
pip install -r ./data/requirements.txt

# process the dataset
python ./data/generate_nts_dataset.py --dataset ogbn-arxiv
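
The same script should cover the other OGB datasets in the table above; the identifier below follows OGB's naming and is an assumption, so check the script's accepted --dataset values:

# ogbn-products is the OGB name for OGB-Products (assumed to be accepted here)
python ./data/generate_nts_dataset.py --dataset ogbn-products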

For graph datasets that lack ground-truth attributes, we randomly generate features and labels, and split the data into training (65%), validation (25%), and testing (10%) sets.

We provide a Google Drive link for downloading the Amazon, LiveJournal, Lj-large, Lj-links, and Enwiki-links datasets.

🚀 Experiments

Data partitioning experiments

# partitioning
python ./exp/exp-partition/exp-partition.py

Batch preparation experiments

# batch size
python ./exp/exp-batch-size/exp-batch-size.py

# sample rate
python ./exp/exp-sample-rate/sample-rate.py

Data transferring experiments

# data partitioning
python ./exp/exp-partition/exp-partition.py

# batch size
python ./exp/exp-batch-size/exp-batch-size.py

# different optimizations
python ./exp/exp-diff-optim/exp-diff-optim.py

# hybrid transfer
python ./exp/exp-hybrid-trans/exp-hybrid-trans.py

# pipeline
python ./exp/exp-diff-optim/exp-diff-pipe.py

# GPU cache
python ./exp/exp-gpu-cache/exp-gpu-cache.py
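
To reproduce everything in one pass, a minimal driver that simply runs the scripts listed above in order (paths copied verbatim from this section):

# Run every experiment script in sequence; stop on the first failure.
set -e
for script in \
    ./exp/exp-partition/exp-partition.py \
    ./exp/exp-batch-size/exp-batch-size.py \
    ./exp/exp-sample-rate/sample-rate.py \
    ./exp/exp-diff-optim/exp-diff-optim.py \
    ./exp/exp-hybrid-trans/exp-hybrid-trans.py \
    ./exp/exp-diff-optim/exp-diff-pipe.py \
    ./exp/exp-gpu-cache/exp-gpu-cache.py
do
    python "$script"
done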

📜 Reference

If you find NeutronBench useful or relevant to your research, please cite our paper as follows:

@article{yuan2024comprehensive,
  author       = {Hao Yuan and Yajiong Liu and Yanfeng Zhang and Xin Ai and Qiange Wang and Chaoyi Chen and Yu Gu and Ge Yu},
  title        = {Comprehensive Evaluation of GNN Training Systems: A Data Management Perspective},
  journal      = {Proc. VLDB Endow.},
  volume       = {17},
  number       = {6},
  pages        = {1241--1254},
  year         = {2024},
  url          = {https://www.vldb.org/pvldb/vol17/p1241-yuan.pdf},
}

📬 Contact

For any questions or feedback, feel free to contact Hao Yuan or open an issue in this repository.