
The repository contains the corpus data set required by the IMLIP profiling task.


Ming-360/IMLIP-2022-Image-Caption-evaluation-task


IMLIP-2022-Image-Caption-evaluation-task


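Dataset introduction

This dataset was produced by translating the Flickr8k English dataset and then manually proofreading the translations.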

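File structure

Uyghur corpus

  • Uighur_train.xlsx: Uyghur corpus for the training set
  • Uighur_val.xlsx: Uyghur corpus for the validation set

Mongolian corpus

  • Mongol_train.xlsx: Mongolian corpus for the training set
  • Mongol_val.xlsx: Mongolian corpus for the validation set

Tibetan corpus

  • Tibetan_train.xlsx: Tibetan corpus for the training set
  • Tibetan_val.xlsx: Tibetan corpus for the validation set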
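
The file layout above can be sketched as a small lookup helper. This is only an illustration: the function name `corpus_files`, the language labels used as keys, and the assumption that all six files sit in one directory are hypothetical and not part of the repository.

```python
from pathlib import Path

# File-name prefixes exactly as they appear in the repository listing.
# The dictionary keys are illustrative labels, not identifiers from the data.
CORPUS_PREFIXES = {
    "Uyghur": "Uighur",
    "Mongolian": "Mongol",
    "Tibetan": "Tibetan",
}
SPLITS = ("train", "val")


def corpus_files(root="."):
    """Map (language, split) to the expected .xlsx path under root.

    Assumes all corpus files live directly in root; this only builds
    paths and does not check that the files exist.
    """
    root = Path(root)
    return {
        (language, split): root / f"{prefix}_{split}.xlsx"
        for language, prefix in CORPUS_PREFIXES.items()
        for split in SPLITS
    }
```

For example, `corpus_files("data")[("Mongolian", "val")]` yields the path `data/Mongol_val.xlsx`, which can then be passed to whatever spreadsheet reader is in use.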

Data field description

  • Data format in each txt file
    • file name#corpus type#code# description text
    • Corpus type
      • uyc: Uyghur
      • mnc: Mongolian
      • tic: Tibetan
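
A record in the `#`-separated format described above could be split into its four fields roughly as follows. This is a minimal sketch, not the task's official loader: the names `parse_record` and `CaptionRecord` are hypothetical, the third field is treated as an opaque string because its exact meaning is not specified here, and the sample line in the usage note is made up for illustration.

```python
from typing import NamedTuple

# Corpus-type tags listed in the README and the language each one denotes.
LANGUAGE_BY_TAG = {"uyc": "Uyghur", "mnc": "Mongolian", "tic": "Tibetan"}


class CaptionRecord(NamedTuple):
    file_name: str    # image file name
    corpus_type: str  # one of uyc, mnc, tic
    code: str         # third field, kept as an opaque string (assumption)
    caption: str      # description text


def parse_record(line: str) -> CaptionRecord:
    """Split one 'file name#corpus type#code# description text' record."""
    # Split on the first three '#' only, so a '#' inside the caption survives.
    parts = line.rstrip("\n").split("#", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 '#'-separated fields, got {len(parts)}")
    file_name, corpus_type, code, caption = (part.strip() for part in parts)
    if corpus_type not in LANGUAGE_BY_TAG:
        raise ValueError(f"unknown corpus type: {corpus_type!r}")
    return CaptionRecord(file_name, corpus_type, code, caption)
```

With a made-up line such as `"example.jpg#tic#0# sample caption"`, `parse_record` returns a record whose `corpus_type` is `"tic"` and whose `caption` is `"sample caption"`; `LANGUAGE_BY_TAG[record.corpus_type]` then gives `"Tibetan"`.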

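How to obtain the Flickr8k images and the corresponding Chinese and English caption datasets

For the English dataset, see the link. Extraction code: s4be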
