Hi all, thanks to the authors for the awesome work.
We found that some samples in the dataset contain one image for the question and several images for the corresponding choices. However, the paper does not provide details on how visual features are processed in this case. After discussing with other researchers, our guess is that only one image is used to generate the vision features. Is that right?
Also, according to the official website, ScienceQA contains 10332 samples that have an image in the question, but detr.npy contains 11208 entries. Are the remaining entries generated from the images in the choices?
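To make the mismatch concrete (11208 - 10332 = 876 extra entries), here is a minimal sketch of how the two counts could be compared. It assumes the ScienceQA metadata is a JSON object keyed by question ID whose entries carry an `image` field that is empty when the question has no image, and that the first axis of detr.npy indexes samples. Both file paths and both helper names are hypothetical, not taken from the repository.

```python
import json
from pathlib import Path


def count_image_questions(problems):
    """Count problems whose question has an image (non-empty "image" field).

    Assumes each value in `problems` is a dict; a missing or null "image"
    field is treated as "no question image".
    """
    return sum(1 for p in problems.values() if p.get("image"))


def compare_counts(problems, num_feature_rows):
    """Report how many feature rows exceed the number of image questions."""
    with_image = count_image_questions(problems)
    return {
        "with_image": with_image,
        "feature_rows": num_feature_rows,
        "difference": num_feature_rows - with_image,
    }


if __name__ == "__main__":
    # Hypothetical locations; adjust to wherever the files actually live.
    problems_path = Path("data/scienceqa/problems.json")
    features_path = Path("vision_features/detr.npy")
    if problems_path.exists() and features_path.exists():
        import numpy as np  # only needed when the real files are present

        problems = json.loads(problems_path.read_text())
        features = np.load(features_path)
        # Assumption: the first axis of the array indexes samples.
        print(compare_counts(problems, features.shape[0]))
```

If the reported difference is nonzero, the extra rows must come from somewhere other than question images, which is exactly what we would like the authors to clarify.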
------------------ Original Message ------------------
From: "amazon-science/mm-cot" ***@***.***>;
Sent: Saturday, March 25, 2023, 11:40 AM
Subject: Re: [amazon-science/mm-cot] Vision feature of questions that contains more than one image (Issue #46)
Hi, you guys should look carefully at their documentation. The link is here: https://drive.google.com/file/d/13B0hc_F_45-UlqPLKSgRz-ALtFQ8kIJr/view?usp=share_link
You can download it by running `pip install gdown` and then `gdown 13B0hc_F_45-UlqPLKSgRz-ALtFQ8kIJr`.