One-hot encoding converts categorical variables into numerical ones. Each category of a variable becomes its own column, which holds 1 if the row's value belongs to that category and 0 otherwise.
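To make the idea concrete, here is a minimal hand-written sketch in plain Python. The color values are hypothetical toy data, not taken from the course dataset, and the column order (alphabetical) is just a convenient choice for the illustration.

```python
# Hypothetical toy data: one categorical variable with three categories.
values = ["red", "green", "blue", "green"]

# One column per distinct category, in alphabetical order.
categories = sorted(set(values))

# Each row gets a 1 in the column of its own category and 0 elsewhere.
encoded = [[1 if v == c else 0 for c in categories] for v in values]

print(categories)
for row in encoded:
    print(row)
```

Every row contains exactly one 1, which is where the name "one-hot" comes from.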
Classes, functions, and methods:
df[x].to_dict(orient='records')
- converts the x series to a list of dictionaries, one per row.
DictVectorizer().fit_transform(x)
- DictVectorizer is a Scikit-Learn class that converts the x dictionaries into a sparse matrix, which performs the one-hot encoding. Numerical variables are left unchanged.
DictVectorizer().get_feature_names_out()
- returns the names of the columns in the sparse matrix (in scikit-learn versions before 1.0, use get_feature_names() instead).
The entire code of this project is available in this Jupyter notebook.
The notes are written by the community. If you see an error here, please create a PR with a fix.