Setup feluda to match for exact matches on video and audio #21
Comments
End of Week Deliverables after Status Check:
Status of Audio Fingerprinting
We have a working operator that finds the fingerprint of a given audio file using signal processing. Limitations and TODO:
Audio Embeddings using PANN (Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition)
[Article Link] [GitHub]
Given an audio file, this method produces a 2048-dimensional vector using PANNs. A PANN is a CNN that is pre-trained on a large number of audio files. PANNs have been used for audio tagging and sound event detection. They have also been fine-tuned for several audio pattern recognition tasks, where they have outperformed several state-of-the-art systems.

Embeddings for vector audio search
Audio embeddings are often generated using spectrograms or other audio signal features. In audio signal processing for machine learning, extracting features from spectrograms is a crucial step. Spectrograms are visual representations of the frequency content of audio signals over time. The identified features in this context encompass three specific types:
Indexing and Searching Audio Vectors in Elasticsearch
All the audio files have to be of the […] I index and search for this vector using the curl commands listed below.

Step 1 - Create an index called "audio" with specific mappings. The "audio-embedding" field is a 2048-dimensional dense vector indexed with cosine similarity, and it is excluded from the stored _source.

curl -X PUT "es:9200/audio" -H 'Content-Type: application/json' -d '{"mappings": {"_source": {"excludes": ["audio-embedding"]},"properties": {"audio-embedding": {"type": "dense_vector","dims": 2048,"index": true,"similarity": "cosine"},"path": {"type": "text","fields": {"keyword": {"type": "keyword","ignore_above": 256}}},"timestamp": {"type": "date"},"title": {"type": "text"},"genre": {"type": "text"}}}}'

Step 2 - List all the indices and check that the audio index was created.

curl -X GET "http://es:9200/_cat/indices?v"

Step 3 - Store a vector in the audio index. The vector below is shortened to five values for readability; because the mapping declares 2048 dimensions, a real request must carry all 2048 values.

curl -X POST "es:9200/audio/_doc" -H 'Content-Type: application/json' -d '{"audio-embedding": [0.0, 0.0, 0.029310517013072968, 0.02595067210495472, 0.023528538644313812], "path": "path1", "timestamp": "2024-02-07T12:00:00", "title": "title1", "genre": "genre1"}'

Step 4 - Search for the indexed vector. We use cosine similarity to score every document against the query vector (again shortened here to five values).

curl -X GET "es:9200/audio/_search" -H 'Content-Type: application/json' -d '{"query": {"script_score": {"query": {"match_all": {}}, "script": {"source": "cosineSimilarity(params.query_vector, '"'"'audio-embedding'"'"') + 1.0", "params": {"query_vector": [0.0, 0.0, 0.029310517013072968, 0.02595067210495472, 0.023528538644313812]}}}}}'

The pull request for these operators: tattle-made/feluda#59
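To make the Step 4 scoring concrete, here is a minimal Python sketch of what `cosineSimilarity(...) + 1.0` computes, plus a helper that builds the search body and checks the vector length against the 2048 dimensions declared in the mapping. The function names (`cosine_similarity`, `script_score`, `build_search_body`) and the zero-padded sample embedding are assumptions made for illustration; they are not part of the feluda operator or the linked pull request. The `+ 1.0` offset is there because Elasticsearch does not accept negative scores from `script_score`, so the result is shifted into the range 0 to 2.

```python
import json
import math

EMBEDDING_DIMS = 2048  # matches "dims" in the Step 1 index mapping


def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def script_score(query_vector, doc_vector):
    """Mirror of the Step 4 script: cosine similarity shifted by 1.0."""
    return cosine_similarity(query_vector, doc_vector) + 1.0


def build_search_body(query_vector, dims=EMBEDDING_DIMS):
    """Build the Step 4 JSON body, rejecting vectors of the wrong length."""
    if len(query_vector) != dims:
        raise ValueError(f"expected {dims} values, got {len(query_vector)}")
    return {
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": "cosineSimilarity(params.query_vector, 'audio-embedding') + 1.0",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical placeholder: the five sample values from Step 3,
    # zero-padded to 2048 so the length matches the mapping.
    sample = [0.0, 0.0, 0.029310517013072968, 0.02595067210495472, 0.023528538644313812]
    embedding = sample + [0.0] * (EMBEDDING_DIMS - len(sample))
    # An identical query and document vector should score close to 2.0.
    print(script_score(embedding, embedding))
    # The serialized body can be passed to curl with -d or to an HTTP client.
    print(len(json.dumps(build_search_body(embedding))))
```

Building the body in code rather than by hand avoids the shell-quoting workaround around 'audio-embedding' in the curl command and catches a wrong-length vector before the request is sent.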
Overview
Acceptance Criteria