Models and inference scripts for the paper: Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023.
We release the offline ST model used for FBK's participation in the Simultaneous Speech Translation task: model folder.
Please install SimulEval v1.1.0 (commit 3c19e1c) to run the evaluation.
Set the parameters as described in the AlignAtt README and run the following command:
simuleval \
--agent-class examples.speech_to_text.simultaneous_translation.agents.v1_1.simul_offline_alignatt.AlignAttSTAgent \
--source ${SRC_LIST_OF_AUDIO} \
--target ${TGT_FILE} \
--data-bin ${DATA_ROOT} \
--config config_simul.yaml \
--model-path ${ST_SAVE_DIR}/avg7.pt --prefix-size 1 --prefix-token "nomt" \
--extract-attn-from-layer 3 --frame-num ${FRAMES} \
--source-segment-size 1000 \
--device cuda:0 \
--quality-metrics BLEU --latency-metrics LAAL AL ATD --computation-aware \
--output ${OUT_DIR}
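Because the command depends on several shell variables, it can help to confirm that they are all set before launching a long evaluation. The following is a minimal sketch of such a check, not part of the released scripts: the function name `check_simul_vars` is hypothetical, and the variable names are the ones that appear in the commands in this README.

```shell
# Hypothetical helper (not part of the released scripts): report which of the
# variables referenced by the simuleval commands are unset or empty.
# Extra variable names (e.g. FRAMES or ALPHA) can be passed as arguments.
check_simul_vars() {
  missing=""
  for name in SRC_LIST_OF_AUDIO TGT_FILE DATA_ROOT ST_SAVE_DIR OUT_DIR "$@"; do
    # Indirect lookup of the variable called $name; empty if it is unset.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "unset or empty variables:$missing" >&2
    return 1
  fi
  return 0
}
```

Calling `check_simul_vars FRAMES` before the AlignAtt command, or `check_simul_vars ALPHA` before the EDAtt one, returns a non-zero status and lists the missing names on standard error if anything is unset.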
Set the parameters as described in the EDAtt README and run the following command:
simuleval \
--agent-class examples.speech_to_text.simultaneous_translation.agents.v1_1.simul_offline_edatt.EDAttSTAgent \
--source ${SRC_LIST_OF_AUDIO} \
--target ${TGT_FILE} \
--data-bin ${DATA_ROOT} \
--config config_simul.yaml \
--model-path ${ST_SAVE_DIR}/avg7.pt --prefix-size 1 --prefix-token "nomt" \
--extract-attn-from-layer 3 --frame-num 2 --attn-threshold ${ALPHA} \
--source-segment-size 1000 \
--device cuda:0 \
--quality-metrics BLEU --latency-metrics LAAL AL ATD --computation-aware \
--output ${OUT_DIR}
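If several values of `ALPHA` are to be compared, each run needs its own output directory so that results are not overwritten. The sketch below shows one way to organize such a sweep. It is an illustration under stated assumptions: `sweep_alpha` and `print_run` are hypothetical helpers, the `alpha_<value>` directory layout is an arbitrary choice, and the launcher passed in is expected to wrap the `simuleval` call above.

```shell
# Hypothetical wrapper: call a launcher once per threshold value, giving each
# run its own sub-directory of OUT_DIR.
# Usage: sweep_alpha <launcher> <alpha> [<alpha> ...]
# The launcher receives two arguments: the threshold and the output directory.
sweep_alpha() {
  run_cmd=$1
  shift
  for alpha in "$@"; do
    out_dir="${OUT_DIR:?OUT_DIR must be set}/alpha_${alpha}"
    mkdir -p "$out_dir"
    "$run_cmd" "$alpha" "$out_dir" || return 1
  done
}

# Dry-run launcher: prints the settings of each run instead of executing it.
# A real launcher would invoke simuleval with --attn-threshold "$1" --output "$2".
print_run() {
  echo "alpha=$1 output=$2"
}
```

Running `sweep_alpha print_run` followed by the threshold values of interest prints one line per planned run, which is a cheap way to check the directory layout before replacing `print_run` with a function that launches the actual evaluation.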
We release the Automatic Subtitling models used for FBK's participation in the Automatic Subtitling task:
For usage instructions, please refer to the Direct Speech Translation for Automatic Subtitling README.
@inproceedings{papi-etal-2023-direct,
title = "Direct Models for Simultaneous Translation and Automatic Subtitling: {FBK}@{IWSLT}2023",
author = "Papi, Sara and
Gaido, Marco and
Negri, Matteo",
booktitle = "Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)",
month = jul,
year = "2023",
address = "Toronto, Canada (in-person and online)",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.iwslt-1.11",
doi = "10.18653/v1/2023.iwslt-1.11",
pages = "159--168",
}