I'm wondering what the value range of arousal, valence, and dominance is. As far as I know, the model output is a logits vector of size 3 representing those features, and its values appear to lie in the range [0, 1]. I see that you use the MSP-Conversation Corpus for fine-tuning. But when I looked at the MSP-Conversation Corpus paper (paperlink), they mentioned that
"Notice that the values of the traces are in the range between -100 and 100. The figure shows that extreme values are uncommon. Most of the annotations are concentrated between -40 to 40 for valence, -20 to 50 for arousal, and -20 to 40 for dominance"
Do you normalize those features, or do something similar?
Yes, databases tend to use different scales for arousal/valence/dominance, for example 0..5.
We normalize all scales to 0..1 for training. During inference, most of the values returned by the model fall in this range, but you can occasionally get values outside of it.
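For illustration, here is a minimal sketch of what such a normalization could look like, assuming a plain linear (min-max) rescaling from each database's annotation scale to 0..1. The exact method used for training is not stated in this thread, and the function names `normalize` and `clip_unit` are hypothetical helpers, not part of the model's code.

```python
def normalize(value, low, high):
    """Linearly map a label from its original scale [low, high] to [0, 1].

    Assumption: plain min-max rescaling; the thread only says that
    all scales are normalized to 0..1.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    return (value - low) / (high - low)


def clip_unit(value):
    """Optionally clamp a model prediction to [0, 1] at inference time."""
    return min(1.0, max(0.0, value))


# A trace value of 0 on a -100..100 scale maps to the midpoint.
print(normalize(0, -100, 100))  # 0.5
# The same midpoint on a 0..5 scale.
print(normalize(2.5, 0, 5))     # 0.5
# A prediction that lands outside the range can be clamped.
print(clip_unit(1.03))          # 1.0
```

Whether to clamp out-of-range predictions depends on the downstream use. Leaving them unclamped preserves the raw model output.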