feature: initial PSR-6 cache implementation #61

Open
wants to merge 25 commits into
base: master

chr15k
Contributor

@chr15k chr15k commented Jan 10, 2025

Description

This pull request introduces a caching system based on the PSR-6 contract to improve performance.

Note

The initial implementation introduces a file system caching adapter only.

Motivation and context

After testing and profiling this package, I found a bottleneck in my application when repeatedly calling aspell->check(). Although this cache feature does not speed up the first run, the completion time of subsequent calls is significantly improved (~16 seconds to ~1.5 seconds). The integration approach was to cache the output of the Aspell process.

How has this been tested?

make tests-dox
make phpstan
make phpcs

Cache implementation

<?php

use PhpSpellcheck\Cache\FileCache;

// Method 1: Using constructor
$cache = new FileCache(
    namespace: 'myapp',        // Cache namespace
    defaultLifetime: 3600,     // 1 hour default TTL
    directory: null            // Optional custom directory
);

// Method 2: Using static creator
$cache = FileCache::create('myapp', 3600);


// Core Methods


// Store item
$item = $cache->getItem('cache-key');
$item->set($value);
$cache->save($item);

// Retrieve item
$item = $cache->getItem('cache-key');
if ($item->isHit()) {
    $value = $item->get();
}

// Delete item
$cache->deleteItem('cache-key');

// Clear all items
$cache->clear();

// Deferred storage
$cache->saveDeferred($item);
$cache->commit();

Cache spellchecker usage

<?php
use PhpSpellcheck\Cache\FileCache;
use PhpSpellcheck\Spellchecker\Aspell;
use PhpSpellcheck\Spellchecker\CacheableSpellchecker;

// Create cache instance
$cache = FileCache::create(
    namespace: 'Aspell',
    defaultLifetime: 3600
);

// Create cached spellchecker
$spellchecker = new CacheableSpellchecker(
    cache: $cache,
    spellchecker: Aspell::create()
);

// Check spelling - first time hits spellchecker
$misspellings = $spellchecker->check('Hello wurld');

// Second check - returns cached results
$misspellings = $spellchecker->check('Hello wurld');

Checklist:

Go over all the following points before making your PR:

  • I have read the CONTRIBUTING document.
  • My pull request addresses exactly one patch/feature.
  • I have created a branch for this patch/feature.
  • I have added tests to cover my changes.
  • If my change requires a change to the documentation, I have updated it accordingly.

If you're unsure about any of these, don't hesitate to ask. We're here to help!

@chr15k
Contributor Author

chr15k commented Jan 10, 2025

Hi @tigitz I've put together an initial approach to caching Aspell output. Let me know your thoughts and if this is the right direction to go here in terms of caching. Thanks!

As an aside, phpcs seems to behave differently locally than on CI, where it reports native_function_invocation failures. I have no idea why, since I assume both use the same config file.

Owner

@tigitz tigitz left a comment

Thank you for the initial implementation.

After some thought, I suggest using CacheableSpellchecker(SpellcheckerInterface, CacheInterface) to flexibly decorate any compliant spellchecker. Additionally, relying on PSR's CacheInterface would support a broader standard than Symfony's contracts.

I'll fix the CI issues on master in the meantime, so don't worry about them.

@tigitz
Owner

tigitz commented Jan 11, 2025

Hey @chr15k, I've updated the master branch to the latest PHP version and dependencies, which introduced some changes. Could you please rebase before going further on this?

@chr15k
Contributor Author

chr15k commented Jan 11, 2025

> Hey @chr15k , I've updated the master branch to be up to date with latest php version and deps, it introduced some changes, could you please rebase before going further on this please.

@tigitz Done - thanks for your feedback, I'll switch out of draft status when I've finished the updated PSR implementation.

@chr15k chr15k changed the title feature: initial cache implementation feature: initial PSR-16 cache implementation Jan 11, 2025
@chr15k
Contributor Author

chr15k commented Jan 12, 2025

> Thank you for the initial implementation.
>
> After some thoughts, I suggest using CacheableSpellchecker(SpellcheckerInterface, CacheInterface) to flexibly decorate any compliant Spellchecker. Additionally, using PSR's CacheInterface would better support broader standards than Symfony's contracts.
>
> I'll fix the CI issues on master meanwhile, don't worry about it

@tigitz I was wondering if you might have any suggestions for handling generators when caching. In this case, I’ve implemented SpellcheckerInterface for CacheableSpellchecker, which requires the check method to return iterable<PhpSpellcheck\MisspellingInterface>.

Since we're working with a generator, I’ve been caching the results by converting it to an array with iterator_to_array($misspellings). This works fine, but when fetching from the cache, I need to convert it back to the appropriate format to satisfy the contract.

I was originally considering caching the raw output from Aspell directly, as that seems more efficient. However, I was trying to avoid modifying the underlying spellchecker implementation.

Do you have any thoughts or suggestions for a cleaner or more efficient approach?
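
One way to reconcile a generator-returning check() with a cache that needs concrete values is to materialize the generator before storing it, then replay the stored array with `yield from` on a hit. The sketch below is a minimal, self-contained illustration and not this PR's code: `misspellingsIn()`, `fakeCheck()`, and `ArrayBackedCheck` (with its in-memory `$store`) are hypothetical stand-ins for the wrapped spellchecker and a PSR-6 pool. It passes `false` as the second argument to `iterator_to_array()` so results are re-indexed; with the default key preservation, a generator that delegates through `yield from` can emit repeated integer keys and silently overwrite entries.

```php
<?php

declare(strict_types=1);

/** Hypothetical per-line checker: lazily yields the "misspelled" words of one line. */
function misspellingsIn(string $line): \Generator
{
    foreach (explode(' ', $line) as $word) {
        if ($word === 'helo' || $word === 'wurld') {
            yield $word; // auto-keys restart at 0 for every new generator
        }
    }
}

/** Hypothetical spellchecker: delegates to one inner generator per line. */
function fakeCheck(string $text): \Generator
{
    foreach (explode("\n", $text) as $line) {
        yield from misspellingsIn($line); // inner keys are preserved, so keys can repeat
    }
}

/** Hypothetical caching wrapper; an array stands in for the cache pool. */
final class ArrayBackedCheck
{
    /** @var array<string, list<string>> */
    private array $store = [];

    public int $misses = 0;

    public function check(string $text): \Generator
    {
        $key = md5($text);

        if (array_key_exists($key, $this->store)) {
            yield from $this->store[$key]; // replay the cached array lazily
            return;
        }

        ++$this->misses;
        // false = do not preserve keys, so repeated generator keys cannot collide.
        $this->store[$key] = iterator_to_array(fakeCheck($text), false);

        yield from $this->store[$key];
    }
}

$checker = new ArrayBackedCheck();
$text = "helo there\nbig wurld";

$first = iterator_to_array($checker->check($text), false);  // computed by fakeCheck()
$second = iterator_to_array($checker->check($text), false); // served from $store
```

Because check() is itself a generator, nothing runs (not even the cache lookup) until the caller starts iterating, so callers that never consume the result never populate the cache.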

@tigitz
Owner

tigitz commented Jan 12, 2025

@chr15k Good point, my first idea would be this:

<?php

namespace PhpSpellcheck\Spellchecker;

use Psr\Cache\CacheItemPoolInterface;

class CacheableSpellchecker implements SpellcheckerInterface
{
    public function __construct(
        private readonly CacheItemPoolInterface $cache,
        private readonly SpellcheckerInterface $spellchecker
    ) {
    }

    public function check(
        string $text,
        array $languages = [],
        array $context = []
    ): iterable {
        $cacheKey = md5(serialize([$this->spellchecker, $text, $languages, $context]));

        $cacheItem = $this->cache->getItem($cacheKey);

        if ($cacheItem->isHit()) {
            yield from $cacheItem->get();
            return;
        }

        $misspellings = iterator_to_array($this->spellchecker->check($text, $languages, $context));
        $this->cache->save($cacheItem->set($misspellings));

        yield from $misspellings;
    }

    public function getSupportedLanguages(): iterable
    {
        $cacheKey = md5(serialize([$this->spellchecker]));

        $cacheItem = $this->cache->getItem($cacheKey);

        if ($cacheItem->isHit()) {
            yield from $cacheItem->get();
            return;
        }

        $languages = iterator_to_array($this->spellchecker->getSupportedLanguages());
        $this->cache->save($cacheItem->set($languages));

        yield from $languages;
    }
}

It's using "psr/cache": "^3.0" and we could implement a FileCache like this:

<?php

namespace PhpSpellcheck\Cache;

use Psr\Cache\CacheItemInterface;
use Psr\Cache\CacheItemPoolInterface;
use RuntimeException;

class FileCache implements CacheItemPoolInterface
{
    private string $cacheFile;
    private array $cache = [];
    private array $deferred = [];

    public function __construct(?string $cacheFile = null)
    {
        $this->cacheFile = $cacheFile ?? $this->getDefaultCacheFile();
        $this->loadCache();
    }

    public function getItem(string $key): CacheItemInterface
    {
        return new FileCacheItem($key, $this->cache[$key] ?? null, isset($this->cache[$key]));
    }

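    public function getItems(array $keys = []): iterable
    {
        // PSR-6 expects the returned collection to be keyed by cache key.
        $items = [];
        foreach ($keys as $key) {
            $items[$key] = $this->getItem($key);
        }

        return $items;
    }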

    public function hasItem(string $key): bool
    {
        return isset($this->cache[$key]);
    }

    public function clear(): bool
    {
        $this->cache = [];
        $this->deferred = [];
        return $this->saveCache();
    }

    public function deleteItem(string $key): bool
    {
        unset($this->cache[$key]);
        return $this->saveCache();
    }

    public function deleteItems(array $keys): bool
    {
        foreach ($keys as $key) {
            unset($this->cache[$key]);
        }
        return $this->saveCache();
    }

    public function save(CacheItemInterface $item): bool
    {
        $this->cache[$item->getKey()] = $item->get();
        return $this->saveCache();
    }

    public function saveDeferred(CacheItemInterface $item): bool
    {
        $this->deferred[$item->getKey()] = $item->get();
        return true;
    }

    public function commit(): bool
    {
        foreach ($this->deferred as $key => $value) {
            $this->cache[$key] = $value;
        }
        $this->deferred = [];
        return $this->saveCache();
    }

    private function loadCache(): void
    {
        if (!file_exists($this->cacheFile)) {
            $this->cache = [];
            return;
        }

        $data = include $this->cacheFile;
        if (!is_array($data)) {
            throw new RuntimeException('Invalid cache file format');
        }

        $this->cache = $data;
    }

    private function saveCache(): bool
    {
        $dir = dirname($this->cacheFile);
        if (!is_dir($dir) && !mkdir($dir, 0777, true) && !is_dir($dir)) {
            throw new RuntimeException(sprintf('Directory "%s" could not be created', $dir));
        }

        $tmpFile = $this->cacheFile . '.tmp';
        $content = '<?php return ' . var_export($this->cache, true) . ';';

        if (file_put_contents($tmpFile, $content) === false) {
            return false;
        }

        if (!rename($tmpFile, $this->cacheFile)) {
            unlink($tmpFile);
            return false;
        }

        return true;
    }

    private function getDefaultCacheFile(): string
    {
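        // Assumption: default to the system temp directory; a bare '/...' path
        // would point at the filesystem root, which is usually not writable.
        return sys_get_temp_dir() . '/.php-spellchecker.cache';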
    }
}
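
The saveCache() approach above writes the cache with var_export() and reads it back with include. That round-trips arrays and scalars, but var_export() renders objects as `ClassName::__set_state(...)` calls, so reloading only works for classes that define a static `__set_state()` method (whether the library's misspelling class does is not stated in this thread). A minimal sketch of one alternative, assuming the cached values are serializable: store serialize() output and keep the same write-to-temp-then-rename() step. `Word`, `writeAtomically()`, and `readCache()` are hypothetical names used only for this illustration.

```php
<?php

declare(strict_types=1);

/** Hypothetical value object standing in for a cached misspelling. */
final class Word
{
    public function __construct(
        public string $text,
        public int $offset
    ) {
    }
}

/** Write to a sibling temp file, then rename it over the target in one step. */
function writeAtomically(string $file, array $data): bool
{
    // A random suffix keeps concurrent writers from sharing one temp file.
    $tmp = $file . '.' . bin2hex(random_bytes(4)) . '.tmp';

    if (file_put_contents($tmp, serialize($data), LOCK_EX) === false) {
        return false;
    }

    if (!rename($tmp, $file)) {
        @unlink($tmp);
        return false;
    }

    return true;
}

/** Read the cache back, treating a missing or malformed file as empty. */
function readCache(string $file): array
{
    if (!is_file($file)) {
        return [];
    }

    $raw = file_get_contents($file);
    if ($raw === false) {
        return [];
    }

    // Only allow the classes the cache is expected to contain.
    $data = unserialize($raw, ['allowed_classes' => [Word::class]]);

    return is_array($data) ? $data : [];
}

$file = sys_get_temp_dir() . '/spellcheck-sketch-' . bin2hex(random_bytes(4)) . '.cache';

$written = writeAtomically($file, ['some-key' => [new Word('wurld', 6)]]);
$loaded = readCache($file);

unlink($file); // clean up the demo file
```

The trade-off is that a serialized file is not picked up by OPcache the way an included PHP file is, so this favors correctness with object values over raw read speed.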

@chr15k chr15k changed the title feature: initial PSR-16 cache implementation feature: initial PSR-6 cache implementation Jan 15, 2025
@chr15k chr15k marked this pull request as ready for review January 15, 2025 10:03