Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Enhance Trie with New Features #617

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

asmit27rai
Copy link

@asmit27rai asmit27rai commented Mar 10, 2025

Enhance Trie with New Features

FIXES: #616

Description

This PR adds multiple new features to the Trie class, making it more versatile and functional. The following methods have been introduced:

  1. Count of Words (count_words()) - Returns the total number of words stored in the Trie.
  2. Longest Common Prefix (longest_common_prefix()) - Finds the longest common prefix among all words in the Trie.
  3. Autocomplete (autocomplete(prefix)) - Provides a list of words that match a given prefix.
  4. Bulk Insert (bulk_insert(words)) - Inserts multiple words into the Trie in a single operation.
  5. Clear Trie (clear()) - Removes all words from the Trie, resetting it.
  6. Check if Trie is Empty (is_empty()) - Returns True if the Trie is empty, otherwise False.
  7. Find All Words (find_all_words()) - Retrieves all words currently stored in the Trie.
  8. Find the Shortest Unique Prefix (shortest_unique_prefix(word)) - Determines the shortest unique prefix for a given word.
  9. Check if Any Word Starts with a Given Prefix (starts_with(prefix)) - Returns True if any word in the Trie starts with the given prefix.
  10. Find the Longest Word (longest_word()) - Finds and returns the longest word in the Trie.

Why This is Needed

The Trie previously supported only basic operations like insertion, deletion, and searching. These new features significantly improve its usability by adding useful functionalities for searching, autocompletion, and prefix analysis.

How These Features Were Implemented

  • A count variable was introduced to track the number of inserted words.
  • Recursive and iterative approaches were used to implement prefix and word retrieval functionalities.
  • Helper functions were added for efficient traversal and search optimizations.

How to Test

trie = Trie()
trie.bulk_insert(["apple", "app", "apricot", "bat", "ball"])

assert trie.count_words() == 5
assert trie.longest_common_prefix() == "a"
assert trie.autocomplete("ap") == ["apple", "app", "apricot"]
assert trie.find_all_words() == ["apple", "app", "apricot", "bat", "ball"]
assert trie.starts_with("ba") == True
assert trie.longest_word() == "apricot"
assert trie.shortest_unique_prefix("apple") == "appl"
assert trie.is_empty() == False

trie.clear()
assert trie.is_empty() == True

- Added `count_words()` to count the total number of words in the Trie.
- Implemented `longest_common_prefix()` to find the longest common prefix among all words.
- Added `autocomplete(prefix)` to provide autocomplete suggestions.
- Implemented `bulk_insert(words)` to insert multiple words at once.
- Added `clear()` method to remove all words from the Trie.
- Implemented `is_empty()` to check if the Trie is empty.
- Added `find_all_words()` to retrieve all stored words.
- Implemented `shortest_unique_prefix(word)` to find the shortest unique prefix of a word.
- Added `starts_with(prefix)` to check if any word starts with a given prefix.
- Implemented `longest_word()` to find the longest word in the Trie.
Copy link

codecov bot commented Mar 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.373%. Comparing base (f4c1677) to head (e49db80).

Additional details and impacted files
@@              Coverage Diff              @@
##              main      #617       +/-   ##
=============================================
+ Coverage   97.345%   97.373%   +0.027%     
=============================================
  Files           36        36               
  Lines         4445      4492       +47     
=============================================
+ Hits          4327      4374       +47     
  Misses         118       118               
Files with missing lines Coverage Δ
pydatastructs/strings/trie.py 100.000% <100.000%> (ø)

Impacted file tree graph

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@czgdp1807 czgdp1807 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many new methods added here. Would recommend to fork out multiple PRs from this one so that its easy to review.

if node.is_terminal:
strings.append(prefix)
for child in node._children:
_collect(node.get_child(child), prefix + child, strings)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A stack implementation is better than recursive one. Please revert this change.

"""
return self.strings_with_prefix(prefix)

def bulk_insert(self, words: list) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend to support iterators in insert method only.


None
"""
self.root = TrieNode()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well nothing is removed here. Just the root is reset.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New Features To The Trie Class
2 participants