This sample repo serves as a proof of concept (PoC) for conversational AI apps with Python on the server side. It is built on the giant shoulders of aiortc 🙏, so the audio is streamed over WebRTC.
Once the connection is established, the user speaks in the browser and the audio gets recorded on the backend. As the response, I've simply queued 3 audio clips, which play back to back. This is close to how real conversational apps built on LLMs would behave: since an LLM streams its result token by token, a partial response can be sent out as soon as it hits a punctuation mark or similar boundary.
- Example
  - Oh, I understand! I think it's best that you talk to my manager, who would have better know-how of this.
  - The ☝️ AI response would be sent in three parts:
    - Oh, I understand!
    - I think it's best that you talk to my manager,
    - who would have better know-how of this.
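The splitting above can be sketched roughly as follows. This is a simplified illustration, not the repo's actual code; `chunk_at_punctuation` and its `min_clause_len` knob are hypothetical names. The idea is to flush at sentence-ending punctuation always, and at commas only once a reasonably long clause has accumulated (so a lone "Oh," is not sent on its own).

```python
import re

# Sketch (not the repo's actual code): split a stream of LLM tokens into
# speakable chunks at punctuation boundaries, so audio synthesis can start
# before the full response has arrived. `min_clause_len` is an assumed
# tuning knob that holds back very short clause fragments.
def chunk_at_punctuation(tokens, min_clause_len=20):
    buffer = ""
    for token in tokens:
        buffer += token
        end_of_sentence = re.search(r"[.!?]$", buffer.rstrip())
        end_of_clause = re.search(r"[,;]$", buffer.rstrip())
        if end_of_sentence or (end_of_clause and len(buffer) >= min_clause_len):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():  # flush whatever trails the last punctuation mark
        yield buffer.strip()

tokens = ["Oh,", " I", " understand!", " I", " think", " it's", " best",
          " that", " you", " talk", " to", " my", " manager,", " who",
          " would", " have", " better", " know-how", " of", " this."]
chunks = list(chunk_at_punctuation(tokens))
# chunks → ["Oh, I understand!",
#           "I think it's best that you talk to my manager,",
#           "who would have better know-how of this."]
```

Each flushed chunk would then be handed to TTS and queued for playback while the rest of the LLM response is still streaming in.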
To install and run, use the following steps (tested on macOS):
- `python3.12 -m venv venv`
- `source venv/bin/activate`
- `pip install -r requirements.txt`
- `python server_<aiohttp|fastapi>.py`
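Once the server is running, the response path boils down to a playback queue: clips are pushed in order and drained back to back. The real repo writes audio frames to an aiortc WebRTC track; below is a simplified stand-in using an `asyncio.Queue`, with hypothetical file names, just to show the back-to-back queuing.

```python
import asyncio

# Simplified stand-in for the response playback path: the real app pushes
# audio frames into an aiortc track; here an asyncio.Queue drains queued
# "clips" back to back, the way the three canned responses are played.
async def playback_worker(queue):
    played = []
    while True:
        clip = await queue.get()
        if clip is None:        # sentinel: no more clips to play
            break
        # In the real app this would stream the clip's frames over WebRTC.
        played.append(clip)
        await asyncio.sleep(0)  # yield control, as real playback would
    return played

async def main():
    queue = asyncio.Queue()
    # Hypothetical clip names standing in for the three queued responses.
    for clip in ["part1.wav", "part2.wav", "part3.wav"]:
        queue.put_nowait(clip)
    queue.put_nowait(None)      # signal end of response
    return await playback_worker(queue)

asyncio.run(main())
# → plays the three clips in order: part1.wav, part2.wav, part3.wav
```

Because the worker only pulls the next clip after the previous one finishes, partial responses arriving later (e.g. from a streaming LLM) can simply be appended to the queue.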
In a web browser:
- Go to localhost:8080
- Click the green power button
- After a moment, the WebRTC status shows 'connected'
- Click the audio button, speak, and then click stop