Welcome to my personal collection of hands-on data science projects! This repository showcases my journey exploring and mastering various data science concepts, tools, and techniques. 🌟 Stay tuned as I continue to expand this repository with more exciting projects!
Note
Have a look at this repo from my Machine Learning course, which contains even more projects written in R.
- 📄 Notebook
- 🛠️ Technologies: Pandas, PyTorch.
- 🧪 Summary: A classic classification problem using the Iris dataset to practice data manipulation, visualization, and building simple neural networks.
- 📄 Notebook
- 🛠️ Technologies: Scikit-learn, TensorFlow.
- 🧪 Summary: Predicting the likelihood of diabetes using machine learning models, focusing on data preprocessing and model evaluation.
- 📄 Notebook
- 🛠️ Technologies: AutoKeras, Scikit-learn.
- 🧪 Summary: An automated approach to classifying breast cancer cases. The project leverages AutoKeras to search for well-performing deep learning models with minimal manual tuning.
- 📄 Notebook
- 🛠️ Technologies: PySpark, Pandas.
- 🧪 Summary: Processes the Wine dataset with Apache Spark, covering data cleaning, exploration, and custom pandas UDFs for additional transformations.
- 📄 Notebook
- 🛠️ Technologies: Dask, Scikit-learn.
- 🧪 Summary: Uses Dask, an alternative to Pandas, to manipulate dataframes with parallel computing.
- 💻 Repository
- 🛠️ Technologies: Pandas, Matplotlib, FPDF, openpyxl, Streamlit.
- 🧪 Summary: A self-made tool for generating PDF reports from data files locally.
- 📋 Carprice report and Titanic report
- 🛠️ Technologies: Power BI.
- 🧪 Summary: My first two Power BI dashboards, which taught me the basics of visualizing and manipulating data.