arunajit/dbs-

dbs

to_csv.py This file compiles the text documents into a single CSV file of format (File Name, Contents) for easier manipulation.
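As a rough illustration of the compilation step described above, here is a minimal sketch using only the standard library. The function name `compile_to_csv`, the `*.txt` pattern, and the UTF-8 encoding are assumptions made for the example; the actual to_csv.py may differ.

```python
import csv
from pathlib import Path


def compile_to_csv(input_dir, output_csv, pattern="*.txt"):
    """Write one (File Name, Contents) row per text file; return the row count.

    The name, the glob pattern, and the encoding are illustrative assumptions.
    """
    rows = []
    for path in sorted(Path(input_dir).glob(pattern)):
        # errors="replace" keeps the run going if a file is badly encoded
        contents = path.read_text(encoding="utf-8", errors="replace")
        rows.append((path.name, contents))

    # newline="" lets the csv module control line endings and quoting
    with open(output_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["File Name", "Contents"])
        writer.writerows(rows)
    return len(rows)
```

Calling `compile_to_csv("docs", "documents.csv")` would then produce a single file that can be loaded in one step for the later preprocessing; both paths are placeholders.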

prior_to_model.py Preprocessing before modelling using K Nearest Neighbour (2). Preprocessing steps: punctuation removal, lowercasing, stopword removal, common-word removal, lemmatization, bag of words (BOW). Chose K-Nearest Neighbour as the classifier. Note that it is a supervised learning technique rather than an unsupervised one: it needs labelled examples, and here the output is defined by employment and amendment.
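The preprocessing and nearest-neighbour steps above can be sketched in plain Python as follows. This is not the code in prior_to_model.py: the stopword list, the function names, the cosine similarity measure, and the default `k` are all assumptions for illustration, and lemmatization is left out because it needs an external library.

```python
import math
import string
from collections import Counter

# Tiny illustrative stopword list (assumption; the repository's list is not shown)
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "this", "for"}


def preprocess(text, common_words=frozenset()):
    """Strip punctuation, lowercase, drop stopwords and common words, return BOW counts."""
    cleaned = text.translate(str.maketrans("", "", string.punctuation)).lower()
    tokens = [t for t in cleaned.split()
              if t not in STOPWORDS and t not in common_words]
    # Lemmatization is omitted here; a library lemmatizer would be applied to tokens.
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0


def knn_predict(train, query_text, k=3):
    """train is a list of (text, label) pairs; majority vote among the k most similar."""
    query = preprocess(query_text)
    ranked = sorted(train,
                    key=lambda item: cosine(preprocess(item[0]), query),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Because the vote is taken over labelled training documents, the sketch makes the supervised nature of the method explicit: every entry in `train` must carry a label such as "employment" or "amendment".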

extract.py
TF-IDF scores the relative importance of each word in each document, so it is used here to extract the informative parts of the documents. It is applied after the same preprocessing steps that are used for the classifier.
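To make the TF-IDF idea above concrete, here is a small sketch that weights each term by tf × log(N / df), the plain textbook variant. The function names are hypothetical, and extract.py may well use a library implementation with a smoothed formula instead.

```python
import math
from collections import Counter


def tfidf(docs):
    """docs is a list of token lists; returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears
    doc_freq = Counter(term for tokens in docs for term in set(tokens))
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(doc_weights, n=3):
    """Highest-weighted terms of one document, ties broken alphabetically."""
    ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]
```

A term that occurs in every document gets weight zero under this formula, which is why words shared by all documents drop out and the terms specific to one document rise to the top.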

P.S. I haven't completed the assignment in full.
