Sumarize-Content-Text-To-Speech

Text-to-Speech Summarization Project

This project processes PDF documents in three steps: it extracts their text, summarizes it with OpenAI's GPT-3.5 Turbo model, and converts the summary into an audio file with Google Text-to-Speech (gTTS). It is built to run in Google Colab, where the notebook automates text extraction, summarization, and audio playback, so a long document can be reviewed quickly or listened to instead of read.

Features

PDF Text Extraction: Easily extract text content from uploaded PDF files.
Text Summarization: Generate concise summaries using OpenAI's GPT-3.5 Turbo.
Text-to-Speech Conversion: Convert text summaries to audio using gTTS.
Audio Playback: Automatically generate and play audio files for the summarized text.

Technologies Used

OpenAI GPT-3.5 Turbo: For high-quality text summarization.
LangChain: Handles text splitting and chain operations.
Google Text-to-Speech (gTTS): Converts text into speech audio.
Python Libraries:
- pdfx for extracting text from PDFs.
- tiktoken for text tokenization.
- IPython for audio playback in the notebook.
Google Colab: Hosts the notebook and provides file upload and secret storage.

Installation

Follow these steps to set up and run the project:

Clone the Repository

git clone https://github.com/mmanikandan281/Sumarize-Content-Text-To-Speech
cd Sumarize-Content-Text-To-Speech

Install Dependencies

Install the required Python libraries:

pip install openai pdfx langchain tiktoken langchain-openai gtts ipython

Set Up API Keys
- Obtain your OpenAI API key from OpenAI.
- Store the key as a Colab secret and read it with the userdata module, replacing 'secretName' with the name you gave the secret:
```
from google.colab import userdata
OPENAI_API_KEY = userdata.get('secretName')
```
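
The userdata module is only available inside Colab, so a fallback can help when running the exported script elsewhere. The following is a minimal sketch under stated assumptions, not part of the project: `load_openai_key` is a hypothetical helper, and the default name `OPENAI_API_KEY` is illustrative.

```python
import os


def load_openai_key(secret_name="OPENAI_API_KEY"):
    """Return the OpenAI key from Colab secrets, else from the environment.

    `load_openai_key` and the default secret name are illustrative
    assumptions, not names used by the project.
    """
    try:
        # This import only succeeds inside Google Colab.
        from google.colab import userdata
    except ImportError:
        # Outside Colab, fall back to an environment variable.
        key = os.environ.get(secret_name)
    else:
        key = userdata.get(secret_name)
    if not key:
        raise RuntimeError(f"No API key found under the name {secret_name!r}")
    return key
```

Setting `os.environ["OPENAI_API_KEY"] = load_openai_key()` afterwards makes the key visible to libraries that read it from the environment, as langchain-openai does by default.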

Access the Colab Notebook

https://colab.research.google.com/drive/19Y96L8z7wmO7GaLSPd6h3n6iLBkrJT_P?usp=drive_link

Usage

Upload a PDF File

Upload your PDF using the files module in Google Colab:
```
from google.colab import files
uploaded = files.upload()
```
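
`files.upload()` returns a dictionary that maps each uploaded file name to its contents as bytes. The sketch below assumes you want to pick the first uploaded PDF instead of hard-coding a name; `first_pdf_name` is a hypothetical helper and `example_upload` is stand-in data.

```python
def first_pdf_name(uploaded):
    """Return the name of the first uploaded file ending in .pdf.

    `uploaded` is expected to look like the dict returned by
    google.colab.files.upload(): {filename: file_bytes}.
    """
    for name in uploaded:
        if name.lower().endswith(".pdf"):
            return name
    raise ValueError("No PDF file found among the uploaded files")


# Stand-in for the dict returned by files.upload() (hypothetical data).
example_upload = {"notes.txt": b"hello", "example1.pdf": b"%PDF-1.4"}
pdf_name = first_pdf_name(example_upload)
print(pdf_name)
```

The returned name can then be passed to `pdfx.PDFx` in the extraction step.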

Extract Text from the PDF

Use pdfx to extract text content:

import pdfx
pdf = pdfx.PDFx("example1.pdf")
pdf_content = pdf.get_text()
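
Text extracted from a PDF often carries hard line breaks and words hyphenated across lines. A small cleanup pass before summarization may help; this is a sketch, not something the project is known to do, and `clean_pdf_text` is a hypothetical helper.

```python
import re


def clean_pdf_text(raw_text):
    """Tidy text extracted from a PDF before summarizing it."""
    # Re-join words hyphenated across a line break: "summa-\nrize" -> "summarize".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw_text)
    # Collapse every run of whitespace (spaces, tabs, newlines) into one space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


sample = "Text summa-\nrization  saves\n\ntime.\n"
print(clean_pdf_text(sample))
```

Note that the hyphen rule also merges genuine compounds that happen to break at a line end (for example "text-" followed by "to" on the next line), so it trades a little accuracy for cleaner input.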

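Summarize the Text

Run the summarization chain on the split documents. Here chain is the LangChain summarization chain and docs are the text chunks produced by the splitter; see the notebook for how both are built: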
```
summary = chain.invoke(docs)
```
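
The README does not show how `chain` and `docs` are built. The dependency-free sketch below only illustrates the general split-then-summarize (map-reduce) idea that such a chain can follow; it is not the project's LangChain code. `summarize_fn` is a stand-in for a call to the model, and chunks are measured in words here rather than in tokens counted with tiktoken.

```python
def split_into_chunks(text, chunk_words=300, overlap_words=30):
    """Split text into word-based chunks that overlap slightly."""
    if overlap_words >= chunk_words:
        raise ValueError("overlap_words must be smaller than chunk_words")
    words = text.split()
    step = chunk_words - overlap_words
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        # Stop once a chunk has reached the end of the text.
        if start + chunk_words >= len(words):
            break
    return chunks


def map_reduce_summary(text, summarize_fn, chunk_words=300, overlap_words=30):
    """Summarize each chunk, then summarize the joined partial summaries."""
    chunks = split_into_chunks(text, chunk_words, overlap_words)
    if not chunks:
        return ""
    partial = [summarize_fn(chunk) for chunk in chunks]
    if len(partial) == 1:
        return partial[0]
    return summarize_fn("\n".join(partial))


def first_three_words(chunk):
    """Toy stand-in for a model call: keep the first three words."""
    return " ".join(chunk.split()[:3])


print(map_reduce_summary("a b c d e f g h i j", first_three_words,
                         chunk_words=4, overlap_words=1))
```

The overlap keeps a sentence that straddles a chunk boundary visible to both neighbouring chunks, at the cost of summarizing a few words twice.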

Convert Summary to Speech

Generate and play the audio file:

from gtts import gTTS
from IPython.display import Audio

# summary_text must be a plain string holding the summary text
tts = gTTS(text=summary_text, lang='en')
tts.save('summary.mp3')
Audio("summary.mp3", autoplay=True)
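
The summarization step stores its result in `summary`, while the speech step reads `summary_text`. A minimal sketch of bridging the two, assuming the chain returns either a plain string or a dictionary with an `output_text` key (an assumption about the chain's output format); `summary_to_text` and `save_speech` are hypothetical helpers.

```python
def summary_to_text(result):
    """Return the summary as a plain string.

    Assumes `result` is either a string or a dict with an "output_text"
    key; the key name is an assumption about the chain's output format.
    """
    if isinstance(result, dict):
        result = result.get("output_text", "")
    text = str(result).strip()
    if not text:
        raise ValueError("The summary is empty, so there is nothing to speak")
    return text


def save_speech(result, path="summary.mp3", lang="en"):
    """Convert a summary to speech with gTTS and save it as an MP3 file."""
    # Imported here so summary_to_text can be used without gtts installed.
    from gtts import gTTS

    gTTS(text=summary_to_text(result), lang=lang).save(path)
    return path
```

`save_speech(summary)` needs network access, because gTTS sends the text to Google's text-to-speech service; the returned path can be passed to `Audio(..., autoplay=True)`.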

Example Output

Input: A sample PDF (example1.pdf).
Output:
- Summarized text displayed in Colab.
- Audio file (summary.mp3) for text-to-speech playback.

Contact

For inquiries or feedback, reach out to:

Name: Manikandan M
Email: [email protected]

Enjoy using the Text-to-Speech Summarization Project💌!
