dotnet add package Soenneker.Utils.String.CosineSimilarity
Imagine you have two sentences or documents. Cosine similarity helps you figure out how similar they are by looking at the words they share. Here's why it's handy:
Cosine similarity is easy to understand. It's a number between 0 and 1 that represents how similar two documents are. The closer to 1, the more similar they are.
Whether a text is long or short doesn't throw off cosine similarity. It cares more about the words and their relationships than the total number of words.
It looks at which words appear and in what proportion, not just at raw counts. So, even if one document has a lot more words than another, they can still be considered similar if they share important terms.
When you're dealing with lots of documents or a ton of text, cosine similarity is efficient. It only needs a dot product and two vector lengths per comparison, making it a practical choice for large datasets.
var text1 = "This is a test";
var text2 = "This is another test";
double result = CosineSimilarityStringUtil.CalculateSimilarityPercentage(text1, text2); // 75
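To see where a score like 75 can come from, here is a minimal sketch of cosine similarity over word counts. It is not the package's implementation: `CosineSketch` and its methods are hypothetical names, and the lowercasing and space-based word splitting are assumptions made for illustration. Each text becomes a vector of word counts, and the score is the dot product of the two vectors divided by the product of their lengths.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of the package.
public static class CosineSketch
{
    // Counts how often each lowercased, space-separated word occurs (assumed tokenization).
    private static Dictionary<string, int> CountWords(string text)
    {
        var counts = new Dictionary<string, int>();

        foreach (var word in text.ToLowerInvariant().Split(' ', StringSplitOptions.RemoveEmptyEntries))
        {
            counts.TryGetValue(word, out var seen); // seen is 0 when the word is new
            counts[word] = seen + 1;
        }

        return counts;
    }

    // Returns the cosine of the angle between the two word-count vectors (0 to 1).
    public static double Similarity(string a, string b)
    {
        var countsA = CountWords(a);
        var countsB = CountWords(b);

        // Dot product: only words present in both texts contribute.
        double dot = 0;
        foreach (var pair in countsA)
        {
            if (countsB.TryGetValue(pair.Key, out var other))
                dot += (double)pair.Value * other;
        }

        // Vector lengths (Euclidean norms) of each count vector.
        double normA = Math.Sqrt(countsA.Values.Sum(c => (double)c * c));
        double normB = Math.Sqrt(countsB.Values.Sum(c => (double)c * c));

        // An empty text has no direction, so treat it as not similar to anything.
        if (normA == 0 || normB == 0)
            return 0;

        return dot / (normA * normB);
    }
}
```

Under these assumptions, `CosineSketch.Similarity("This is a test", "This is another test")` gives 0.75: three shared words out of four in each text, so 3 / (2 × 2). Repeating a text, as in `Similarity("This is a test This is a test", "This is a test")`, still gives 1, which is the length independence described above.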