This project implements a sentiment analysis model for Nepali text specifically focused on beauty product reviews. It uses a transformer-based architecture to analyze sentiments in customer feedback about beauty products.
The model classifies the sentiment of Nepali-language beauty product reviews. Restricting the scope to the beauty and cosmetics domain is intended to make sentiment analysis more accurate for this kind of text.
- Sentiment analysis of Nepali text for beauty product reviews
- Transformer-based architecture for advanced natural language processing
- Custom data preprocessing pipeline for Nepali language
- main.py: The main script to run the entire pipeline
- config.yaml: Configuration file for model and training parameters
- dataset_initializer.py: Initializes and prepares the beauty product review dataset
- dataset_preparer.py: Prepares the dataset for training
- dataset_preprocessor.py: Preprocesses the dataset, handling beauty-specific terminology
- batch_iterator.py: Handles batch creation for training
- trainer.py: Contains the training loop and logic
- encoder.py: Defines the encoder model architecture
- embedding_layers.py: Implements custom embedding layers
- transformer_block.py: Defines the transformer block
- helpers.py: Utility functions for the project
- predict.py: Script for making predictions on new beauty product reviews
- demo.py: Demonstration script for the model
- deploy.py: Script for deploying the model (if applicable)
- Clone the repository:

      git clone https://github.com/yourusername/nepali-beauty-sentiment-analysis.git
      cd nepali-beauty-sentiment-analysis

- Install the required packages:

      pip install -r requirements.txt
- Configure the model and training parameters in config.yaml.

- Run the main script to train the model:

      python main.py

- For predictions on new beauty product reviews, use:

      python predict.py

- To run a demonstration:

      python demo.py
The model uses a transformer-based architecture optimized for Nepali beauty product reviews:
- Encoder with multiple transformer blocks
- Custom embedding layers (defined in embedding_layers.py)
- Transformer blocks (defined in transformer_block.py)
- Tailored to capture nuances in beauty product terminology and expressions
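
The contents of encoder.py and transformer_block.py are not reproduced here, so the following is only a minimal sketch of the scaled dot-product self-attention step that transformer blocks are generally built around. It uses plain Python lists instead of a deep learning framework, and every name in it (`self_attention`, `w_q`, `w_k`, `w_v`) is hypothetical rather than taken from the project.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x is a (seq_len x d_model) matrix of token embeddings; w_q, w_k and w_v
    are (d_model x d_k) projection matrices (hypothetical parameters).
    Returns the attended outputs and the attention weights.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    # Each score compares one query row with one key row.
    scores = [
        [sum(qi * ki for qi, ki in zip(q_row, k_row)) / scale for k_row in k]
        for q_row in q
    ]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights


# Toy input: 3 tokens with 4-dimensional embeddings and random projections.
rng = random.Random(0)


def random_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


tokens = random_matrix(3, 4)
outputs, attention = self_attention(
    tokens, random_matrix(4, 4), random_matrix(4, 4), random_matrix(4, 4)
)
```

A full transformer block would typically add multiple heads, residual connections, layer normalization and a feed-forward sublayer around this step; those are left out to keep the sketch short.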
The project uses a dataset of Nepali beauty product reviews. The data preprocessing pipeline includes:
- Initialization (dataset_initializer.py)
- Preparation (dataset_preparer.py)
- Preprocessing (dataset_preprocessor.py)
These steps ensure that beauty-specific terms and expressions are properly handled.
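
As an illustration of what a preprocessing step for Nepali text might involve, here is a small sketch. It is not the logic of dataset_preprocessor.py, which is not shown in this README; the function name `preprocess` and the specific choices (NFC normalization, dropping the danda, whitespace tokenization, keeping Latin-script brand names) are assumptions made for the example.

```python
import re
import unicodedata

# Characters to keep (assumption for this sketch):
#   - Devanagari letters, vowel signs and digits, excluding the danda (U+0964)
#     and double danda (U+0965) sentence terminators
#   - zero-width non-joiner/joiner, which appear inside some Nepali conjuncts
#   - ASCII letters and digits, so Latin-script brand names survive
_DROP = re.compile(r"[^\u0900-\u0963\u0966-\u097F\u200C\u200Da-zA-Z0-9]+")


def preprocess(text):
    """Normalize a review and split it into tokens."""
    # NFC gives a canonical composed form for equivalent code point sequences.
    text = unicodedata.normalize("NFC", text)
    # Replace punctuation, dandas and other symbols with spaces.
    text = _DROP.sub(" ", text)
    # Lowercasing only affects the Latin-script tokens.
    return text.lower().split()


tokens = preprocess("Nivea क्रिम राम्रो छ।!!")
```

The explicit Unicode range is used instead of `\w` because Python's built-in `re` module does not treat Devanagari vowel signs as word characters, so a `\w+` tokenizer would split words in the middle.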
Training is managed by trainer.py, which utilizes batch_iterator.py for efficient data handling during the training process. The model is trained to recognize sentiments specific to beauty product reviews.
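
To make the batching idea concrete, the sketch below shows one common way to group tokenized reviews into padded mini-batches. It is an assumption about the general technique, not a copy of batch_iterator.py; the function `iterate_batches`, its parameters and the `(token_ids, label)` example format are all hypothetical.

```python
import random


def iterate_batches(examples, batch_size, pad_id=0, shuffle=True, seed=None):
    """Yield (inputs, labels) mini-batches from (token_ids, label) pairs.

    Sequences in a batch are right-padded with pad_id to the length of the
    longest sequence in that batch.
    """
    order = list(range(len(examples)))
    if shuffle:
        # A seeded generator keeps the shuffle reproducible across runs.
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        batch = [examples[i] for i in order[start:start + batch_size]]
        max_len = max(len(ids) for ids, _ in batch)
        inputs = [ids + [pad_id] * (max_len - len(ids)) for ids, _ in batch]
        labels = [label for _, label in batch]
        yield inputs, labels


# Toy data: token id lists paired with a sentiment label (1 = positive).
examples = [([1, 2, 3], 1), ([4], 0), ([5, 6], 1)]
batches = list(iterate_batches(examples, batch_size=2, shuffle=False))
```

Padding per batch rather than to a global maximum length keeps short reviews from being padded more than necessary; a training loop would usually also pass a mask so that padded positions are ignored.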