Malware_detection

In these experiments, I used static analysis to extract features from PE files. For model selection, I chose several commonly used models: GaussianNB, MLP, Linear Regression, Decision Tree, and Gradient Boosting. I designed three data preprocessing methods for training the models. The first retains all of the extracted features. The second applies feature selection to the data set and filters out some features before training. The third uses an AutoEncoder to encode the original features into their latent representations. Conclusions are drawn by comparing the performance of each model under each preprocessing method. Decision Tree achieved the highest accuracy, at 99.79%.
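To make the static-analysis step concrete, here is a minimal sketch of extracting a few features from the raw bytes of a PE file without executing it. The README does not list the features actually used, so the feature set below (file size, byte entropy, and a handful of COFF header fields) and the names `byte_entropy` and `extract_pe_features` are assumptions chosen for illustration only.

```python
import math
import struct
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def extract_pe_features(data: bytes) -> dict:
    """Return a small, illustrative feature dict for one file's raw bytes."""
    features = {
        "file_size": len(data),
        "entropy": byte_entropy(data),
        "is_pe": False,
        "machine": 0,
        "num_sections": 0,
        "timestamp": 0,
        "characteristics": 0,
    }
    # A PE file starts with the DOS header magic "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        return features
    # The 4-byte little-endian field at 0x3C (e_lfanew) points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if pe_offset + 24 > len(data) or data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return features
    # The 20-byte COFF file header follows the 4-byte signature.
    (machine, num_sections, timestamp,
     _sym_ptr, _sym_count, _opt_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    features.update(
        is_pe=True,
        machine=machine,
        num_sections=num_sections,
        timestamp=timestamp,
        characteristics=characteristics,
    )
    return features


def make_synthetic_header(num_sections: int = 3) -> bytes:
    """Build a tiny fake PE header (not a runnable executable) for demonstration."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # e_lfanew -> offset 0x40
    coff = struct.pack("<HHIIIHH", 0x14C, num_sections, 0, 0, 0, 0xE0, 0x0102)
    return bytes(dos) + b"PE\x00\x00" + coff


if __name__ == "__main__":
    print(extract_pe_features(make_synthetic_header()))
```

In practice, each file's feature dict would become one row of the training matrix. The three preprocessing methods described above would then be applied to that matrix before fitting the models.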
