Banana Client is a web application for AI-assisted video content creation, offering text generation, speech synthesis, video processing, and subtitle generation. It is a work-in-progress local WebUI client for PC, intended for use with corresponding WhisperX and LLM servers.
Note: This application has only been tested on Windows operating systems.
- Text generation using AI models
- Text-to-speech conversion
- Video transcription and processing
- Subtitle generation and embedding
- Task management for batch processing
- Live2D integration for character animation
- YouTube video downloading and processing
- Image analysis for thumbnail generation
- Clone the repository:
    git clone https://github.com/lobsterchan27/banana-client.git
    cd banana-client
- Install dependencies:
    npm install
- Set up environment variables: Create a .env file in the project root and add the following:
    HOSTNAME=127.0.0.1
    PORT=8128
    API_KEY=your_api_key_here
- Install yt-dlp: The application automatically downloads the yt-dlp binary on first run. Alternatively, you can manually place it in the bin directory.
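
To illustrate how the .env values from the setup steps might be consumed, here is a minimal sketch in Node.js. It is an assumption, not the code in server.js: `parseEnv` and `loadConfig` are hypothetical helper names, and the real application may load its settings differently (for example through a package such as dotenv).

```javascript
// Hypothetical sketch: parse KEY=VALUE lines like those in the .env file.
// parseEnv and loadConfig are illustrative names, not part of Banana Client.
function parseEnv(text) {
  const values = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comments.
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue; // Ignore lines without a KEY=VALUE shape.
    const key = line.slice(0, eq).trim();
    values[key] = line.slice(eq + 1).trim();
  }
  return values;
}

// Apply defaults (assumed to match the README values) and validate the port.
function loadConfig(env) {
  const port = Number.parseInt(env.PORT ?? '8128', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    hostname: env.HOSTNAME ?? '127.0.0.1',
    port,
    apiKey: env.API_KEY ?? '',
  };
}

const sample = 'HOSTNAME=127.0.0.1\nPORT=8128\nAPI_KEY=your_api_key_here';
const config = loadConfig(parseEnv(sample));
console.log(`http://${config.hostname}:${config.port}`);
```

Validating the port up front means a typo in .env fails at startup with a clear message rather than when the server tries to bind.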
- Start the server:
    node server.js
- Open a web browser and navigate to http://localhost:8128 (or the port you specified in the .env file).
- Use the web interface to:
  - Generate text using AI models
  - Convert text to speech
  - Download and process YouTube videos
  - Generate subtitles for videos
  - Manage tasks for batch processing
- API servers can be configured in the web interface under the "API Servers" dropdown.
- Adjust text generation parameters using the sliders in the "Settings" dropdown.
- Customize prompt settings in the "Prompt Settings" dropdown.
- Implement chunk determination:
  - Run a pass over the video using a scene change threshold with a minimum interval variable.
  - Create an array containing timestamps of the overall activity level.
  - Use this information along with timestamped transcriptions to determine how to trigger responses from the LLM.
- Add support for outputting video in different languages.
- Improve integration with WhisperX server for transcription.
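
The planned chunk determination is not implemented yet, so the following is only a rough sketch of one way it could work. Everything here is an assumption: the function names `findChunkBoundaries` and `groupTranscript`, the input shapes (`{ time, score }` per sampled frame and `{ start, text }` per transcript segment), and the sample values are all hypothetical, and the scores are assumed to come from some external scene-change detector.

```javascript
// Hypothetical sketch of the planned chunk determination; not project code.
// frames: [{ time: seconds, score: 0..1 }] from an assumed scene-change detector.
function findChunkBoundaries(frames, threshold, minInterval) {
  const boundaries = [];
  let last = -Infinity;
  for (const { time, score } of frames) {
    // Accept a scene change only if it is strong enough and far enough
    // from the previously accepted boundary.
    if (score >= threshold && time - last >= minInterval) {
      boundaries.push(time);
      last = time;
    }
  }
  return boundaries;
}

// Group timestamped transcript segments into chunks delimited by boundaries.
// Each non-empty chunk could then be used to trigger one LLM response.
function groupTranscript(boundaries, segments) {
  const edges = [0, ...boundaries.filter((t) => t > 0), Infinity];
  const chunks = [];
  for (let i = 0; i < edges.length - 1; i++) {
    const text = segments
      .filter((s) => s.start >= edges[i] && s.start < edges[i + 1])
      .map((s) => s.text)
      .join(' ');
    if (text !== '') chunks.push({ start: edges[i], end: edges[i + 1], text });
  }
  return chunks;
}

const frames = [
  { time: 1, score: 0.1 },
  { time: 4, score: 0.6 },
  { time: 5, score: 0.9 }, // Too close to the boundary at 4s, so skipped.
  { time: 12, score: 0.7 },
];
const segments = [
  { start: 0.5, text: 'Hello' },
  { start: 4.2, text: 'new scene' },
  { start: 9, text: 'still here' },
  { start: 13, text: 'ending' },
];

const boundaries = findChunkBoundaries(frames, 0.5, 5);
const chunks = groupTranscript(boundaries, segments);
console.log(boundaries); // [ 4, 12 ]
console.log(chunks.map((c) => c.text)); // [ 'Hello', 'new scene still here', 'ending' ]
```

The minimum interval keeps rapid cuts from producing many tiny chunks, which would otherwise trigger the LLM more often than the transcript warrants.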
- This application has only been tested on Windows. Compatibility with other operating systems is not guaranteed.
- Some features may require additional setup or external dependencies. Please refer to the documentation for specific requirements.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE.md file for details.