Inference #13
What are the display names for the labels? I used #12 (comment)
if __name__ == "__main__":
    import sys
    ckpt_path = sys.argv[1]
    m = InferenceAudioSetStrong(ckpt_path)
Add m.eval() here to disable dropout during inference.
Hi Feiteng, thank you for your interest, and sorry for the mess in this part. I forgot to use the sorted labels in the label generation code. You could try this file with the provided checkpoint; we will fix the committed code later.
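To illustrate why the label order matters, here is a minimal sketch of how an output index maps to a different name depending on whether the label list is sorted. The label names and their order are hypothetical and are not taken from this repository's label files.

```python
# Hypothetical label names, in the order they might first appear in an
# annotation file (an assumption for illustration only).
labels_in_file_order = ["Speech", "Dog", "Alarm", "Music"]

# Index-to-name mapping built without sorting.
index_unsorted = dict(enumerate(labels_in_file_order))

# Index-to-name mapping built from the sorted label list.
index_sorted = dict(enumerate(sorted(labels_in_file_order)))

# The same predicted class index resolves to different display names.
predicted_index = 0
print(index_unsorted[predicted_index])  # Speech
print(index_sorted[predicted_index])    # Alarm
```

If the checkpoint was trained with one ordering and inference uses the other, every prediction is reported under the wrong display name even though the model output itself is unchanged.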
Update the audioset-strong inference labels.
Would you mind explaining the reason why chunk_len in line 63 of this code was chosen to be 1001 instead of 1000?
That is because the logmel function in torchaudio pads the input sequence, which results in 1001 rather than 1000 frames from 10 seconds of audio.
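The frame count can be checked with a little arithmetic. This is a sketch under assumptions not stated in the thread: a 16 kHz sample rate, a 160-sample (10 ms) hop, a 400-sample window, and centered padding of n_fft // 2 samples on each side, as torch.stft does with center=True.

```python
def num_frames_centered(num_samples: int, hop_length: int) -> int:
    """Frame count when the signal is padded by n_fft // 2 on both sides."""
    return 1 + num_samples // hop_length


def num_frames_unpadded(num_samples: int, n_fft: int, hop_length: int) -> int:
    """Frame count when no padding is applied."""
    return 1 + (num_samples - n_fft) // hop_length


sample_rate = 16_000  # assumed sample rate in Hz
hop_length = 160      # assumed hop of 10 ms
n_fft = 400           # assumed window of 25 ms
num_samples = 10 * sample_rate  # 10 seconds of audio

frames_centered = num_frames_centered(num_samples, hop_length)
frames_unpadded = num_frames_unpadded(num_samples, n_fft, hop_length)

print(frames_centered)  # 1001
print(frames_unpadded)  # 998
```

With centered padding, the first frame is centered on sample 0 and the last on sample 160000, so 160000 / 160 hops give 1001 frames, which matches a chunk_len of 1001.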
Implement inference code for downstream models like audioset and audioset_strong.
fixes #12