QTSumm: Query-Focused Summarization over Tabular Data

The code and data for EMNLP 2023 paper QTSumm: Query-Focused Summarization over Tabular Data.

Prepare Environment

We officially support python 3.8. You could use following commands to install the required packages

pip install -r requirements.txt

QTSumm Dataset

We released the dataset at https://huggingface.co/datasets/yale-nlp/QTSumm using HuggingFace's features of gated repositories to protect the data from automatic crawling.

Finetuning Experiments

We have released following fintuned models in Huggingface Hub.

We have prepared bash scripts for both finetuning and inference in the finetuning/scripts folder. Please ensure that you adjust and set CUDA_VISIBLE_DEVICES and batch size accordingly in finetuning/yamls/*.yaml. The current setting works for an 8x A6000-48G cluster with 1024G memory.

LLM Zero-/Few-Shot Experiments

We have prepared bash scripts for LLM (i.e., open-source models in HuggingFace and OpenAI models) inference in the llm_inference/scripts folder. We used the framework of vLLM for open-source LLM inference. Please ensure that you adjust and set CUDA_VISIBLE_DEVICES accordingly the bash scripts. The current setting works for an 8x A6000-48G cluster with 1024G memory.

Evaluation

We currently support following evaluation metrics:

ROUGE
BLEU
METEOR
BERTScore
TAPAS-Acc
AutoACU

Once all the inference results are ready in the folder xxx, you could use the following command to evaluate the results:

export PYTHONPATH=`pwd`; python evaluation/evaluate.py --prediction_dir xxx --dataset_name yale-nlp/QTSumm --split_name test

Model Outputs

To facilitate the reproducibility of our results by other researchers, we have released the output files of both finetuned models and LLMs in outputs folder.

Contact

For any issues or questions, kindly email us at: Yilun Zhao ([email protected]).

Citation

@misc{zhao2023qtsumm,
      title={QTSUMM: Query-Focused Summarization over Tabular Data}, 
      author={Yilun Zhao and Zhenting Qi and Linyong Nan and Boyu Mi and Yixin Liu and Weijin Zou and Simeng Han and Xiangru Tang and Yumo Xu and Arman Cohan and Dragomir Radev},
      year={2023},
      eprint={2305.14303},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
evaluation		evaluation
figures		figures
finetuning		finetuning
llm_inference		llm_inference
outputs		outputs
utils		utils
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

QTSumm: Query-Focused Summarization over Tabular Data

Prepare Environment

QTSumm Dataset

Finetuning Experiments

LLM Zero-/Few-Shot Experiments

Evaluation

Model Outputs

Contact

Citation

About

Releases

Packages

Contributors 2

Languages

yale-nlp/QTSumm

Folders and files

Latest commit

History

Repository files navigation

QTSumm: Query-Focused Summarization over Tabular Data

Prepare Environment

QTSumm Dataset

Finetuning Experiments

LLM Zero-/Few-Shot Experiments

Evaluation

Model Outputs

Contact

Citation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages