This project implements a stacked ensemble classifier for credit approval prediction. The model combines several machine learning algorithms (Gradient Boosting, Random Forest, AdaBoost, and a multi-layer perceptron neural network) as base models and feeds their predictions to a Gradient Boosting meta-model that makes the final approval decision.
- Advanced data preprocessing with KNN imputation
- Automated feature engineering including:
  - Interaction features
  - Polynomial features (squared and cubic terms)
- Stacked ensemble architecture using:
  - Gradient Boosting Classifier
  - Random Forest Classifier
  - AdaBoost Classifier
  - Multi-layer Perceptron Classifier
- Feature selection using Gradient Boosting
- Comprehensive model evaluation metrics
pandas
numpy
scikit-learn
jupyter
CREDIT-CARD-APPROVAL/
├── credit/ # Virtual environment directory
├── credit_approval/
│   └── crx.data # Dataset file
├── credit_approval_model.ipynb # Main notebook with model implementation
├── README.md
└── requirements.txt
- Clone the repository:
git clone https://github.com/yourusername/credit-card-approval.git
cd credit-card-approval
- Create and activate a virtual environment:
python -m venv credit
source credit/bin/activate # On Windows, use: credit\Scripts\activate
- Install required packages:
pip install -r requirements.txt
- Ensure the dataset is in the correct directory (credit_approval/crx.data)
- Run the Jupyter notebook:
jupyter notebook credit_approval_model.ipynb
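Before running the notebook, it can help to confirm that the dataset parses as expected. The following is a minimal sketch, not the notebook's own loading code. It assumes the UCI file layout (no header row, `?` marking missing values, `+`/`-` as class symbols). The column names `A1`..`A15` and `label`, the helper `load_credit_data`, the mapping of `+` to class 1, and the two sample rows are illustrative assumptions.

```python
import io

import pandas as pd

# Hypothetical column names: the raw file has no header row.
COLUMNS = [f"A{i}" for i in range(1, 16)] + ["label"]


def load_credit_data(source):
    """Read a crx.data-style CSV, treating '?' as a missing value."""
    df = pd.read_csv(source, header=None, names=COLUMNS, na_values="?")
    # Assumption: '+' is mapped to class 1 and '-' to class 0.
    df["label"] = df["label"].map({"+": 1, "-": 0})
    return df


# Two made-up rows in the same format, used instead of the real file.
sample = io.StringIO(
    "b,30.83,0,u,g,w,v,1.25,t,t,01,f,g,00202,0,+\n"
    "a,?,4.46,u,g,q,h,3.04,t,t,06,f,g,00043,560,-\n"
)
df = load_credit_data(sample)
```

To use the real data, pass the path instead of the in-memory sample, for example `load_credit_data("credit_approval/crx.data")`.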
The current implementation achieves:
- Accuracy: 89.86%
- Precision:
  - Class 0: 0.86
  - Class 1: 0.95
- Recall:
  - Class 0: 0.96
  - Class 1: 0.84
- F1-Score:
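The F1-score is the harmonic mean of precision and recall, so per-class values can be estimated from the figures listed above. This is a small sketch, not the notebook's evaluation code. The helper name `f1_score_from` is hypothetical, and because the inputs are rounded to two decimals the results are approximate.

```python
def f1_score_from(precision, recall):
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per class, copied from the rounded figures above.
reported = {0: (0.86, 0.96), 1: (0.95, 0.84)}

f1_by_class = {cls: f1_score_from(p, r) for cls, (p, r) in reported.items()}

for cls, f1 in f1_by_class.items():
    print(f"Class {cls}: F1 is approximately {f1:.2f}")
```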
- Handles missing values using KNN imputation for numerical features
- Mode imputation for categorical features
- StandardScaler for feature scaling
- Automated categorical encoding
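The preprocessing steps above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the notebook's actual code. The toy data, the column names, `n_neighbors=2`, and the choice of one-hot encoding for the categorical step are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy frame with hypothetical column names and a few missing values.
X = pd.DataFrame({
    "age": [30.8, np.nan, 24.5, 27.8, 20.2],
    "debt": [0.0, 4.46, 0.5, np.nan, 5.6],
    "employed": ["t", "f", np.nan, "t", "t"],
})
numeric_cols = ["age", "debt"]
categorical_cols = ["employed"]

# Numerical features: KNN imputation, then standardization.
numeric_pipe = Pipeline([
    ("impute", KNNImputer(n_neighbors=2)),
    ("scale", StandardScaler()),
])

# Categorical features: mode imputation, then one-hot encoding (assumed).
categorical_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer(
    [
        ("num", numeric_pipe, numeric_cols),
        ("cat", categorical_pipe, categorical_cols),
    ],
    sparse_threshold=0.0,  # always return a dense array
)

X_ready = preprocess.fit_transform(X)
```

Wrapping the steps in a `ColumnTransformer` keeps imputation and scaling statistics tied to the training data, which avoids leaking information from a held-out split.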
- Creates interaction features between numerical columns
- Generates polynomial features (squared and cubic terms)
- Implements feature selection using Gradient Boosting
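As a rough sketch of the feature engineering steps above: pairwise products give the interaction features, powers give the polynomial terms, and a fitted Gradient Boosting model ranks the result. The helper `add_engineered_features`, the synthetic data, the column names, and the `"median"` importance threshold are assumptions for illustration. The notebook may select features differently.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel


def add_engineered_features(df, numeric_cols):
    """Append pairwise interaction, squared, and cubic columns."""
    out = df.copy()
    for a, b in combinations(numeric_cols, 2):
        out[f"{a}_x_{b}"] = df[a] * df[b]
    for col in numeric_cols:
        out[f"{col}_sq"] = df[col] ** 2
        out[f"{col}_cube"] = df[col] ** 3
    return out


# Synthetic data whose label depends on the f1 * f2 interaction.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 3)), columns=["f1", "f2", "f3"])
y = (X["f1"] * X["f2"] > 0).astype(int)

X_eng = add_engineered_features(X, ["f1", "f2", "f3"])

# Keep features whose importance is at least the median importance.
selector = SelectFromModel(
    GradientBoostingClassifier(n_estimators=100, random_state=0),
    threshold="median",
)
X_sel = selector.fit_transform(X_eng, y)
kept = X_eng.columns[selector.get_support()].tolist()
```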
- Base Models:
  - Gradient Boosting Classifier (100 estimators)
  - Random Forest Classifier (100 estimators)
  - AdaBoost Classifier (100 estimators)
  - Neural Network (two hidden layers with 100 and 50 units)
- Meta Model:
  - Gradient Boosting Classifier (50 estimators)
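The architecture above could be expressed with scikit-learn's `StackingClassifier`, as in the sketch below. Using `StackingClassifier` is an assumption, since the notebook may build the stack by hand. The synthetic data, `cv=3`, `max_iter=500`, and the random seeds are also illustrative choices. Only the estimator counts and hidden layer sizes come from the list above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the preprocessed credit data.
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

base_models = [
    ("gb", GradientBoostingClassifier(n_estimators=100, random_state=42)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
    ("ada", AdaBoostClassifier(n_estimators=100, random_state=42)),
    ("mlp", MLPClassifier(hidden_layer_sizes=(100, 50), max_iter=500,
                          random_state=42)),
]

# The meta-model is trained on out-of-fold predictions of the base models.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=GradientBoostingClassifier(n_estimators=50,
                                               random_state=42),
    cv=3,
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
```

Training the meta-model on cross-validated predictions, rather than on predictions for data the base models have already seen, reduces the risk that it simply learns to trust an overfitted base model.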
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Dataset source: UCI Machine Learning Repository
- This implementation was inspired by various ensemble learning techniques in machine learning literature