- Meta Reality Labs
- New York
- https://nateanl.github.io
Stars
Unified automatic quality assessment for speech, music, and sound.
The first Large Audio Language Model that enables native in-depth thinking, trained on large-scale audio Chain-of-Thought data.
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
MoBA: Mixture of Block Attention for Long-Context LLMs
Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
An Open-source Streaming High-fidelity Neural Audio Codec
Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in PyTorch
Audio Codec Speech processing Universal PERformance Benchmark
Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end-to-end problems using Llama mode…
A multi-voice TTS system trained with an emphasis on quality
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable… (see the MusicGen usage sketch after this list)
Implementation of Meta-Voicebox: The first generative AI model for speech to generalize across tasks with state-of-the-art performance.
ImageBind: One Embedding Space to Bind Them All
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Python loaders for many Real Room Impulse Response databases
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
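As a quick illustration of the Audiocraft entry above, here is a minimal text-to-music sketch using MusicGen. It assumes the `audiocraft` package is installed and that the `facebook/musicgen-small` checkpoint is available; the prompt text and output file names are made up for the example.

```python
# Minimal MusicGen sketch (assumes `pip install audiocraft`); the prompt and
# output names below are illustrative, not taken from the list above.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Load the small pretrained checkpoint and generate ~8 seconds of audio.
model = MusicGen.get_pretrained("facebook/musicgen-small")
model.set_generation_params(duration=8)

descriptions = ["lo-fi hip hop beat with mellow piano"]  # example prompt
wav = model.generate(descriptions)  # tensor of shape [batch, channels, samples]

for idx, one_wav in enumerate(wav):
    # Writes output_0.wav (etc.) with loudness normalization.
    audio_write(f"output_{idx}", one_wav.cpu(), model.sample_rate, strategy="loudness")
```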