Soenneker.Utils.String.CosineSimilarity

A utility library for comparing strings via Cosine Similarity

Installation

dotnet add package Soenneker.Utils.String.CosineSimilarity

Why?

Imagine you have two sentences or documents. Cosine similarity tells you how similar they are by comparing the words they share. Here's why it's handy:

Easy to Understand:

The score is a single number between 0 and 1 that represents how similar two documents are: the closer to 1, the more similar they are. The method shown under Usage reports it as a percentage, so 0.75 becomes 75.
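For reference, here is the standard definition, assuming each text is turned into a vector of word counts. That representation is an assumption for illustration; the library's internal representation is not documented here. $a_i$ and $b_i$ are the counts of word $i$ in the first and second text.

```latex
\[
\operatorname{sim}(A, B)
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2}\,\sqrt{\sum_i b_i^2}}
\]
```

Because word counts are never negative, the result stays between 0 (no shared words) and 1 (identical word proportions).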

Not Bothered by Length:

Whether a text is long or short doesn't throw off cosine similarity. It compares the relative proportions of words rather than the total number of words.

Meaning, Not Just Frequency:

It looks at which terms two documents share, not just how often words show up. So, even if one document has a lot more words than another, they can still be considered similar if they share important terms.

Efficient for Big Tasks:

When you're dealing with lots of documents or a ton of text, cosine similarity is efficient. Each comparison needs only a dot product and two vector lengths, making it a practical choice for large datasets.

Usage

var text1 = "This is a test";
var text2 = "This is another test";

double result = CosineSimilarityStringUtil.CalculateSimilarityPercentage(text1, text2); // 75
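To show where the 75 comes from, here is a minimal sketch of word-count cosine similarity. It is not the library's implementation: the class `CosineSketch`, its method names, and the tokenization (lower-casing and splitting on whitespace) are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the library's actual implementation.
public static class CosineSketch
{
    // Counts how often each lower-cased, whitespace-separated word occurs.
    private static Dictionary<string, int> CountWords(string text)
    {
        var counts = new Dictionary<string, int>();
        char[] separators = { ' ', '\t', '\r', '\n' };

        foreach (string word in text.ToLowerInvariant()
                     .Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            counts.TryGetValue(word, out int current); // current is 0 when the word is new
            counts[word] = current + 1;
        }

        return counts;
    }

    // Returns the cosine similarity of the two word-count vectors, scaled to 0..100.
    public static double SimilarityPercentage(string first, string second)
    {
        Dictionary<string, int> a = CountWords(first);
        Dictionary<string, int> b = CountWords(second);

        // Dot product: only words present in both texts contribute.
        double dot = 0;
        foreach (KeyValuePair<string, int> pair in a)
        {
            if (b.TryGetValue(pair.Key, out int other))
                dot += pair.Value * other;
        }

        // Squared vector lengths.
        double squaredA = 0;
        foreach (int count in a.Values) squaredA += count * count;

        double squaredB = 0;
        foreach (int count in b.Values) squaredB += count * count;

        // A text with no words has no direction; treat it as 0% similar.
        if (squaredA == 0 || squaredB == 0) return 0;

        return dot / Math.Sqrt(squaredA * squaredB) * 100;
    }
}
```

For "This is a test" and "This is another test", three of the four words in each text match, so the dot product is 3 and each vector has length 2, giving 3 / (2 × 2) = 0.75, or 75. Under the same assumptions, `CosineSketch.SimilarityPercentage("a b", "a b a b")` returns 100, which illustrates why text length alone does not change the score.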
