ERCLabCrawler-backend (Java SpringBoot application)

This repository contains the code for the ERCLab crawler back-end (Java SpringBoot, Spark, Kafka, MongoDB, and HBase).

  • The Kafka broker implemented in the Python ERCLabCrawler client sends the extracted data to the ERCLabCrawler backend.
  • A Kafka consumer in the backend receives the data from the client and applies data cleaning and ML operations simultaneously using Spark and SparkML.
  • Finally, the backend stores the cleaned data in a MongoDB or HBase database.
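
The flow above (receive, clean, store) can be sketched in plain Java with no external dependencies. This is only an illustration and not the repository's code: a `BlockingQueue` stands in for the Kafka topic, an in-memory list stands in for MongoDB or HBase, the cleaning rule is an assumption, and the Spark/SparkML step is left out. All class and method names (`PipelineSketch`, `RecordStore`, `InMemoryStore`, `clean`, `drain`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical names throughout; none of them come from the repository.
class PipelineSketch {

    /** Stand-in for the storage layer (MongoDB or HBase in the real backend). */
    interface RecordStore {
        void save(String record);
    }

    /** In-memory store used only for illustration. */
    static final class InMemoryStore implements RecordStore {
        final List<String> records = new ArrayList<>();

        @Override
        public void save(String record) {
            records.add(record);
        }
    }

    /** Assumed cleaning rule: trim, collapse runs of whitespace, drop empty records. */
    static Optional<String> clean(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String cleaned = raw.trim().replaceAll("\\s+", " ");
        return cleaned.isEmpty() ? Optional.empty() : Optional.of(cleaned);
    }

    /**
     * Drains the queue (a stand-in for a Kafka topic), cleans each message,
     * and saves the records that survive cleaning. Returns the number saved.
     */
    static int drain(BlockingQueue<String> topic, RecordStore store) {
        int stored = 0;
        String message;
        // poll() returns null once the queue is empty, which ends the loop.
        while ((message = topic.poll()) != null) {
            Optional<String> cleaned = clean(message);
            if (cleaned.isPresent()) {
                store.save(cleaned.get());
                stored++;
            }
        }
        return stored;
    }

    public static void main(String[] args) {
        BlockingQueue<String> topic = new LinkedBlockingQueue<>();
        topic.add("  crawled   page text ");
        topic.add("   "); // whitespace only, dropped by clean()
        InMemoryStore store = new InMemoryStore();
        int stored = drain(topic, store);
        System.out.println(stored + " record(s) stored: " + store.records);
    }
}
```

Keeping storage behind a small interface mirrors the "MongoDB or HBase" choice in the list above: under this assumption, either database could be plugged in without touching the cleaning step.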