Bitap Algorithm for Exact String Matching #599

Open: asmit27rai wants to merge 5 commits into main

@asmit27rai asmit27rai commented Mar 6, 2025

Implement Bitap Algorithm for Exact String Matching

Fixes: #589

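This PR introduces the Bitap algorithm (also known as the Shift-Or algorithm) for exact string matching. Bitap encodes the set of pattern prefixes that currently match as bits in a single integer and updates that integer with one shift and one OR per text character, so it is most efficient when the pattern is short enough to fit in a machine word (here, at most 64 characters).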
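
For readers unfamiliar with Shift-Or, the following is a minimal sketch of the technique, not the code in this PR. The name `bitap_search` comes from the description; the signature, the return convention (index of the first match, or -1), and the choice to raise `ValueError` for an empty pattern are assumptions made for illustration.

```python
def bitap_search(text: str, pattern: str) -> int:
    """Return the index of the first exact match of pattern in text, or -1.

    Sketch only: return convention and empty-pattern handling are assumed.
    """
    m = len(pattern)
    if m == 0:
        # Assumption: an empty pattern is treated as unsupported.
        raise ValueError("pattern must not be empty")
    if m > 64:
        raise ValueError("patterns longer than 64 characters are not supported")

    all_ones = (1 << m) - 1

    # masks[c] has bit i cleared when pattern[i] == c.
    # Shift-Or uses 0 to mean "matches" so that the shift brings in a fresh 0.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, all_ones) & ~(1 << i)

    # Bit i of state is 0 when pattern[:i + 1] matches the text ending at
    # the current position. Initially nothing matches.
    state = all_ones
    for j, ch in enumerate(text):
        state = ((state << 1) | masks.get(ch, all_ones)) & all_ones
        if state & (1 << (m - 1)) == 0:
            # The whole pattern matches, ending at index j.
            return j - m + 1
    return -1
```

Python integers are arbitrary precision, so the 64-character limit in this sketch is a policy check rather than a hardware constraint; it mirrors the limit stated in the description.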

Key Changes

  1. Implementation of Bitap Algorithm:

    • Added the bitap_search function to perform exact string matching.
    • The algorithm uses bitwise operations to track potential matches efficiently.
    • Handles edge cases such as empty strings and patterns longer than 64 characters (raises a ValueError for unsupported cases).
  2. Test Cases:

    • Added comprehensive test cases to validate the correctness of the Bitap algorithm.
    • Test cases include:
      • Exact matches.
      • Partial matches.
      • Edge cases (empty text, empty pattern, and patterns longer than 64 characters).
  3. Code Improvements:

    • Improved readability and documentation of the Bitap algorithm.
    • Fixed minor issues in the initial implementation.
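
The test categories listed above could be exercised with a small table-driven check along these lines. This is a hedged sketch, not the PR's test suite: `check_search` and `reference_search` are hypothetical names, the expected values assume a "first index or -1" return convention, and the assumption that both an empty pattern and an over-long pattern raise `ValueError` is not confirmed by the description. `reference_search` is only a stand-in built on `str.find` so the block runs by itself; in practice the function under test would be passed instead.

```python
from typing import Callable


def reference_search(text: str, pattern: str) -> int:
    # Hypothetical stand-in with the assumed contract of the function under test.
    if not pattern or len(pattern) > 64:
        raise ValueError("unsupported pattern length")
    return text.find(pattern)


def check_search(search: Callable[[str, str], int]) -> int:
    """Run the cases against `search`; return the number of checks passed."""
    cases = [
        ("hello world", "world", 6),  # exact match at the end of the text
        ("hello world", "hello", 0),  # exact match at the start
        ("abcabd", "abd", 3),         # "ab" at index 0 is only a partial match
        ("abc", "abd", -1),           # partial match only, so no result
        ("", "a", -1),                # empty text
    ]
    for text, pattern, expected in cases:
        got = search(text, pattern)
        assert got == expected, f"{text!r}, {pattern!r}: got {got}, want {expected}"

    # Assumed unsupported inputs: empty pattern and pattern over 64 characters.
    unsupported = ["", "a" * 65]
    for bad in unsupported:
        try:
            search("some text", bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for pattern of length {len(bad)}")

    return len(cases) + len(unsupported)


passed = check_search(reference_search)
print(f"{passed} checks passed")
```

Passing the search function as a parameter keeps the cases independent of any one implementation, so the same table can be reused if the implementation changes.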

- Added the Bitap (Shift-Or) algorithm for efficient exact string matching.
- Included comprehensive test cases to validate the implementation.
- Fixed minor issues in the Bitap algorithm logic and improved readability.
- Ensured the algorithm handles edge cases such as empty strings and patterns longer than 64 characters.
Development

Successfully merging this pull request may close these issues.

feat: Implement Bitap Algorithm for approximate string matching