Clustering & NLP

Brazilian Laws analysis with TF-IDF and K-Means

This repository contains a few NLP and clustering analyses of a dataset of ~6,400 Brazilian ordinary laws. The source code is in a Jupyter Notebook.

See also the Medium article related to this repository.
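
For readers new to the approach, the TF-IDF plus K-Means idea named in the title can be sketched without any dependencies. This is a minimal illustration, not the notebook's code: the four short documents are invented toy texts (not taken from the dataset), the function names `tfidf` and `kmeans` are hypothetical, the IDF uses a common smoothed form, and the centroids are seeded with the first `k` vectors only to keep the result deterministic. In practice a library such as scikit-learn (`TfidfVectorizer`, `KMeans`) would normally do this work.

```python
import math
from collections import Counter

# Invented toy documents (NOT from the repository's dataset).
docs = [
    "imposto renda tributos",
    "universidade ensino superior",
    "imposto tributos alíquota",
    "ensino universidade pública",
]

def tfidf(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) using whitespace tokens."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(tok for toks in tokenised for tok in set(toks))
    # Smoothed inverse document frequency (assumed variant).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenised:
        tf = Counter(toks)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

vocab, vectors = tfidf(docs)
labels = kmeans(vectors, k=2)
print(labels)  # [0, 1, 0, 1]
```

The two tax-related toy texts end up in one cluster and the two education-related texts in the other, because documents that share heavily weighted terms sit close together in TF-IDF space.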

Main contents:

  • A ready-to-use PT-BR dataset in the 📂 data folder

  • Feature Extraction with TF-IDF and Clustering with K-Means

  • TF-IDF visualizations for a better understanding of the data

  • Creation of informative/visual plots like this: