dbs

to_csv.py This file compiles the text documents into a single CSV file of the format (File Name, Contents) for easier manipulation.
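A minimal sketch of what such a compilation step could look like, using only the standard library. The function name `compile_to_csv`, the `*.txt` pattern, and the UTF-8 encoding are assumptions made for illustration; the actual to_csv.py may differ.

```python
import csv
from pathlib import Path


def compile_to_csv(src_dir, out_path, pattern="*.txt"):
    """Write one (File Name, Contents) row per matching text file in src_dir."""
    files = sorted(Path(src_dir).glob(pattern))
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["File Name", "Contents"])  # header row
        for path in files:
            # Undecodable bytes are replaced rather than aborting the whole run.
            contents = path.read_text(encoding="utf-8", errors="replace")
            writer.writerow([path.name, contents])
    return len(files)  # number of documents written
```

The csv module quotes fields that contain commas or newlines, so multi-line documents stay within a single row.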

prior_to_model.py Preprocessing steps applied before modelling: punctuation removal, lowercasing, stop-word removal, common-word removal, lemmatization, and bag-of-words (BOW) encoding. The model is K-Nearest Neighbour (2). K-Nearest Neighbour was chosen because the output classes are defined in advance (employment and amendment), which suits a supervised learning technique.
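The pipeline above can be sketched roughly as follows, using only the standard library. This is an illustration under stated assumptions, not the code in prior_to_model.py: the stop-word list is a tiny placeholder, lemmatization and common-word removal are left out, cosine similarity is an assumed distance measure, and `k=3` is chosen only to avoid tied votes.

```python
import math
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real run would use a fuller one).
STOPWORDS = {"a", "an", "the", "of", "to", "and", "is", "in", "this", "for", "by", "on"}


def preprocess(text):
    """Strip punctuation, lowercase, and drop stop words (no lemmatization here)."""
    cleaned = text.translate(str.maketrans("", "", string.punctuation)).lower()
    return [tok for tok in cleaned.split() if tok not in STOPWORDS]


def bag_of_words(text):
    """Map each remaining token to its count in the document."""
    return Counter(preprocess(text))


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[tok] for tok, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def knn_predict(train, text, k=3):
    """Majority label among the k training documents most similar to text."""
    query = bag_of_words(text)
    ranked = sorted(train, key=lambda pair: cosine(bag_of_words(pair[0]), query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Here `train` is a list of (text, label) pairs, with labels such as "employment" and "amendment".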

extract.py
TF-IDF scores the relative importance of each word in each document, so it is used here to extract the informative parts of the documents. It is applied after the same preprocessing steps used for the classifier.
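To make the TF-IDF idea concrete, here is a small sketch in plain Python. It assumes the documents are already tokenized by the preprocessing steps, and it uses the basic weighting tf × log(N / df); the function names `tf_idf` and `top_terms` are hypothetical, and extract.py may use a library implementation with a different weighting.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(weights, n=3):
    """The n highest-weighted terms of a single document."""
    return sorted(weights, key=weights.get, reverse=True)[:n]
```

A term that occurs in every document gets weight zero under this formula, which is why terms shared across the whole corpus drop out of the extracted keywords.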

P.S: I haven't completed the assignment to the full extent
