With the increasing demand for Edge AI and ML in practice, this repo aims to bring a high level of automation to the MLOps lifecycle. It helps set up a Level-4 MLOps infrastructure and solves many real-world problems along the way.
It comes with best practices in software design and architecture for ML, such as testing, CI/CD, Kubernetes, GitHub workflows, and Terraform + AWS services.
Read more about MLOps maturity levels here.
Currently, it serves the following objectives:
- Image Dataset: create a dataset from the CSV and store the images in per-label directories under `dataset`. The directory structure will look like the folder tree shown further below, and each directory will contain ~1000 images.
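The CSV schema is not spelled out in this section, so the following is only a minimal sketch of the split, assuming a hypothetical `labels.csv` with `filename` and `label` columns:

```python
import shutil
from pathlib import Path

import pandas as pd

# Hypothetical inputs: the real CSV layout and image location are not shown here.
CSV_PATH = Path("labels.csv")      # assumed columns: filename, label
SRC_DIR = Path("raw_images")       # assumed source image directory
DST_DIR = Path("data/dataset")     # per-label output directories

df = pd.read_csv(CSV_PATH)
for row in df.itertuples():
    label_dir = DST_DIR / str(row.label)
    label_dir.mkdir(parents=True, exist_ok=True)
    # Copy each image into its label's directory (~1000 images per label).
    shutil.copy(SRC_DIR / row.filename, label_dir / row.filename)
```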
- Classification MLOps Pipeline: the pipeline automatically retrains a new classification model as soon as there are *significant changes* in the dataset. A *significant change* here means that more than 200 files of a label have changed (see the sketch below).
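The actual change detection lives in the `watcher/` service; purely to illustrate the threshold rule, a sketch of counting changed files per label might look like this (the snapshot format and names are assumptions):

```python
from pathlib import Path

SIGNIFICANT_CHANGES = 200  # retrain once a label exceeds this many changed files

def count_changes_per_label(previous: dict[str, set[str]],
                            dataset_dir: Path) -> dict[str, int]:
    """Compare the current per-label file listing against a saved snapshot."""
    changes = {}
    for label_dir in dataset_dir.iterdir():
        if label_dir.is_dir():
            current = {p.name for p in label_dir.glob("*.jpg")}
            old = previous.get(label_dir.name, set())
            # Files that were added or removed both count as changes.
            changes[label_dir.name] = len(current ^ old)
    return changes

def should_retrain(changes: dict[str, int]) -> bool:
    return any(n > SIGNIFICANT_CHANGES for n in changes.values())
```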
- `pipeline/ec2/*.jpg`: replicating the complete infrastructure on AWS EC2 using docker-compose.
  - Images show the EC2 setup, the security group, setting up the code, and the complete config view.
  - Lastly, test.py gets the prediction.
  - Check the images here.
- `pipeline/webserver/serverless/*.jpg`: showcases the deployment of the inference server on serverless compute.
  - Images show API Gateway, the security group, setting up ECR, and using Lambda to serve the container.
  - It shows the complete AWS config view.
  - Lastly, test.py gets the prediction.
  - Check the images here.
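The Lambda container's handler isn't reproduced in this section; as a rough sketch under assumed names, a handler behind an API Gateway proxy integration typically looks like:

```python
import json

from model import load_model  # hypothetical helper baked into the container image

model = load_model()  # loaded once per container and reused across invocations

def lambda_handler(event, context):
    """Entry point invoked by API Gateway (proxy integration)."""
    body = json.loads(event["body"])
    prediction = model.predict(body["url"])  # the payload field is an assumption
    return {
        "statusCode": 200,
        "body": json.dumps({"prediction": prediction}),
    }
```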
- Kubernetes (in progress): `pipeline/webserver/manifests/` showcases the deployment of the inference server in a Kubernetes cluster.
  - Images show the Service and Deployment stack, the Config and Env Map, and port-forwarding to `9696`.
  - It shows the complete Kubernetes config view.
  - Lastly, test.py gets the prediction.
  - Check the images here.
The repo also ships the following infrastructure features:
- CI/CD Infrastructure
  - GitHub job flags
  - pre-commit hooks
  - tox
  - pylint
- Integration with AWS Cloud
  - AWS Serverless: API Gateway + Lambda
  - AWS EC2: docker-compose
- GitHub issue templates
- Kubernetes (in progress ...)
Known issues and TODOs:
- isort is failing
- optimise req.txt
- support multiple Python versions in tox
- add Terraform as the base IaC
- make a video
Planned future work:
- Multiple input data sources via Kafka, MQTT, and AWS Kinesis
- Data validation
- Model validation
- Deployment validation
- Async task handling in Redis
- Robust CI/CD pipeline with continuous training and continuous deployment
- Model serving on different hardware: Raspberry Pi, Android, JavaScript, ONNX, TFLite
- Auto-generated documentation hosted at *.github.io
- More status badges, e.g. code coverage, Python version, maintainability, code style, deploy
Tested with:
- Docker version 20.10.17, build 100c701
- docker-compose version 1.29.2, build 5becea4c
- Ubuntu 22.04.1 LTS (jammy)
- `pipeline/`: all the Docker services live inside this folder.
- `pipeline/data`: represents the volumes shared between containers.
Run the latest version of the code with Docker and Docker Compose:

```
docker-compose up -d
```
Folder structure:

```
.
├── data                    # volumes shared among all services
│   ├── dataset             # dataset for the initial base model
│   ├── mlflow              # mlruns and .db file, MLflow tracking URI
│   ├── monitored_dataset   # watched to track any changes in the dataset
│   ├── prefect             # Prefect flow URI
│   ├── saved-model
│   └── state               # manages the state of the docker services
├── docker-compose.yml
├── mlflow/                 # builds the mlflow-server image
├── prefect/                # builds the prefect-server image
├── README.md
├── test.py                 # a Python script to trigger the webserver and get output
├── watcher/                # builds the watcher image
└── webserver/              # builds the prediction-server image

84 directories, 168 files
```
By default, the stack exposes the following ports (`<port>: <service>`):
- 5000: MLflow Webserver
- 4200: Prefect Web Server
- 9696: Prediction Server
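`test.py` drives the prediction server on port 9696; a minimal equivalent could look like this (the route and payload shape are assumptions, so check test.py for the exact contract):

```python
import requests

# Assumed endpoint and payload; the real contract lives in test.py.
URL = "http://localhost:9696/predict"
payload = {"url": "https://example.com/some-image.jpg"}

response = requests.post(URL, json=payload, timeout=30)
print(response.json())
```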
Releases:
- 0.0.2: Automation in coding style (current)
- 0.0.1: Initial release
This project is licensed under the Saurav Solanki License; see the LICENSE.md file for details.