TritonBench features two distinct channels: TritonBench-G and TritonBench-T, each with its own evaluation framework. For detailed information, refer to the paper *TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators*.
- TritonBench-G offers two versions of Alpaca-format instructions:
  - Simple instruction: `TritonBench_G_simp_alpac_v1.json`
  - Complex instruction: `TritonBench_G_comp_alpac_v1.json`
- It also includes executable folders (`TritonBench_G_v1`) and associated statistics (`TritonBench_G_v1.json`).
- TritonBench-T offers two versions of Alpaca-format instructions:
  - Simple instruction: `TritonBench_T_simp_alpac_v1.json`
  - Complex instruction: `TritonBench_T_comp_alpac_v1.json`
- It also includes executable folders (`TritonBench_T_v1`) and associated statistics (`TritonBench_T_v1.json`).
- Additionally, there are two sets of filtered GitHub data:
  - `train_crawl.json` (4024 entries) – de-duplicated using BERT score similarity.
  - `train_synth.json` (4133 entries) – data synthesized using Jiuci.
- The combined 8k dataset can be used for RAG (Retrieval-Augmented Generation); a minimal retrieval sketch follows this list.
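For instance, here is a minimal sketch of using the combined data as a retrieval corpus. The Alpaca-style field names (`instruction`, `input`, `output`) and the toy lexical scorer are assumptions for illustration, not the paper's actual RAG setup:

```python
import json

# Merge the two filtered GitHub sets into one ~8k-entry corpus.
# NOTE: the Alpaca-style field names ("instruction", "output") are an
# assumption; check the actual schema in the files before relying on them.
corpus = []
for path in ("train_crawl.json", "train_synth.json"):
    with open(path, encoding="utf-8") as f:
        corpus.extend(json.load(f))

def retrieve(query: str, k: int = 3) -> list[dict]:
    """Toy lexical retriever: rank entries by token overlap with the query.

    A production RAG setup would use an embedding model instead.
    """
    q_tokens = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda e: len(q_tokens & set(str(e.get("instruction", "")).lower().split())),
        reverse=True,
    )
    return scored[:k]

# Retrieved operators can be prepended to the model prompt as few-shot context.
examples = retrieve("fused softmax Triton kernel")
```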
We also provide the output results from all major models used in the paper.
Environment requirements:

```
triton == 3.1.0
torch >= 2.5.1
```
- After installation, update the `py_interpreter` paths in `eval_G` and `eval_T` (a placeholder example follows).
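As a hypothetical illustration of that edit (the variable name comes from this README; the path is a placeholder, and the exact location within the scripts may differ):

```python
# Inside the eval_G / eval_T evaluation scripts: point py_interpreter at the
# Python binary of the environment where triton/torch are installed.
# (Placeholder path; substitute your own.)
py_interpreter = "/path/to/your/env/bin/python"
```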
- Code Similarity Evaluation: First, use CodeBLEU to evaluate code similarity. For detailed instructions, refer to `../readme_4similarity.md`; an illustrative CodeBLEU call is sketched below.
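  As an illustration only, the open-source `codebleu` package can compute the metric as sketched below; the package choice, its API, and the file paths are assumptions, and `readme_4similarity.md` remains the authoritative guide:

  ```python
  # pip install codebleu  (third-party package; an assumption, not this repo's tooling)
  from codebleu import calc_codebleu

  # Placeholder paths for illustration.
  reference = open("reference_op.py", encoding="utf-8").read()   # ground-truth operator
  prediction = open("generated_op.py", encoding="utf-8").read()  # model-generated code

  # Returns a dict with the overall "codebleu" score and its sub-components.
  result = calc_codebleu([reference], [prediction], lang="python")
  print(result["codebleu"])
  ```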
- Call Accuracy:
  - Run `0_call_acc.py` with the following command:
    ```
    python 0_call_acc.py --source source/path/or/folder --target target/path/or/folder --GPUs [0,1,2,3]
    ```
  - Multiple GPUs can accelerate the execution.
- Execution Accuracy:
  - Run `1_exe_acc.py` with:
    ```
    python 1_exe_acc.py --folder root/of/multiple/folders/or/folder --GPUs [0,1,2,3]
    ```
- Efficiency:
  - First run the correctly executable operators and get the performance:
    ```
    cd performance_metrics/perf_G
    python run_bench/write_file.py --input_folder_path /folder/of/pyfiles --results_path /folder/of/output/results
    python run_bench/multiprocess_gpu_run.py
    ```
  - Finally, run `2_efficiency.py` to evaluate the performance:
    ```
    cd EVAL/eval_G
    python 2_efficiency.py --gen_folder /folder/of/output/results
    ```
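For reference, the speedup assessed here is conceptually the reference operator's latency divided by the generated operator's latency. A minimal sketch of that definition (illustrative only; `2_efficiency.py` implements the actual evaluation):

```python
def speedup(ref_latency_ms: float, gen_latency_ms: float) -> float:
    """Speedup of the generated operator over the reference one.

    Values above 1.0 mean the generated kernel runs faster. This mirrors
    the metric's definition, not the exact logic of 2_efficiency.py.
    """
    return ref_latency_ms / gen_latency_ms

print(speedup(ref_latency_ms=1.8, gen_latency_ms=1.2))  # 1.5x
```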
For TritonBench-T, there is no code similarity evaluation. Only call accuracy, execution accuracy, and speedup are assessed. The process is similar:
- Run `0_call_acc.py` as above:
  ```
  python 0_call_acc.py --source source/path/or/folder --target target/path/or/folder --GPUs [0,1,2,3]
  ```
- Run `1_exe_acc.py` with the appropriate folders and GPUs:
  ```
  python 1_exe_acc.py --folder root/of/multiple/folders/or/folder --GPUs [0,1,2,3]
  ```
- Get the performance and evaluate:
  - First run the correctly executable operators and get the performance:
    ```
    cd performance_metrics/perf_T
    python run_bench/write_file.py --input_folder_path /folder/of/pyfiles --results_path /folder/of/output/results
    python run_bench/multiprocess_gpu_run.py
    ```
  - Finally, run `2_efficiency.py` to evaluate the performance:
    ```
    cd EVAL/eval_T
    python 2_efficiency.py --gen_folder /folder/of/output/results
    ```
Note: Ensure that the accuracy and efficiency evaluations are performed sequentially; the efficiency step measures only the operators that executed correctly in the accuracy step.
We have published our dataset on Hugging Face.
If you have any questions, feel free to reach out to us at:
✉️ Email: [[email protected]]