This is a benchmark of Postgres FTS versus other solutions:
To run the tests, please ensure you have the following installed on your machine:
- `gunzip` (part of the `gzip` software distribution)
- Docker
- NodeJS (`node` and `npm`)
- `pnpm` (i.e. `npm install -g pnpm`)
- `sqlite`
To set up testing data and run the full benchmark with all FTS engines:
make # equivalent to `make setup run-all`
To run only a single benchmark (in this case, with Postgres FTS):
FTS_ENGINE=pg make setup run
(`FTS_ENGINE = 'pg' | 'meilisearch' | 'typesense' | 'opensearch' | 'sqlite-disk'`)
To only install dependencies:
make setup
The benchmark in this repository uses a public domain movie dataset:
- On HuggingFace, in particular the following columns:
  - `homepage`
  - `title`
  - `original_title`
  - `overview`
  - `production_companies`
  - `spoken_languages`
  - `tagline`

Data is processed from CSV into newline delimited JSON (see `movies.ndjson.json.gz`).
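As a rough illustration of the CSV to newline delimited JSON step, the sketch below keeps only the columns listed above and writes one JSON object per line. It assumes the CSV rows have already been parsed into plain objects by some CSV library; the function names `pickColumns` and `toNdjson` are hypothetical and are not taken from this repository.

```javascript
// Columns named in the list above; the helper names below are hypothetical.
const COLUMNS = [
  "homepage",
  "title",
  "original_title",
  "overview",
  "production_companies",
  "spoken_languages",
  "tagline",
];

// Keep only the listed columns from one parsed CSV row (a plain object).
function pickColumns(row, columns = COLUMNS) {
  const doc = {};
  for (const name of columns) {
    doc[name] = row[name] ?? null; // columns missing from the row become null
  }
  return doc;
}

// Serialize rows as newline delimited JSON: one JSON object per line.
function toNdjson(rows, columns = COLUMNS) {
  return rows
    .map((row) => JSON.stringify(pickColumns(row, columns)) + "\n")
    .join("");
}
```

Calling `toNdjson(rows)` on an array of parsed rows returns a string that can be written to a file such as the one referenced by `DATA_MOVIES_NDJSON_PATH`.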
ENV Variable | Default | Example | Description |
---|---|---|---|
`FTS_ENGINE` | N/A | `pg` | The FTS engine to use |
`DEBUG` | N/A | `true` | Enable debug mode |
`TIMING` | N/A | `true` | Enable timing information display |
`DATA_MOVIES_CSV_ZIPPED_PATH` | `./movies.csv.gz` | `/path/to/movies.csv.gz` | Path to the movie data set |
`DATA_MOVIES_CSV_PATH` | `./movies.csv` | `/path/to/movies.csv` | Path to the movie data set, uncompressed |
`DATA_MOVIES_NDJSON_PATH` | `./movies.ndjson.json` | `/path/to/movies.ndjson.json` | Path to the newline delimited JSON data for movies |
`SEARCH_PHRASES_NDJSON_PATH` | `./search-phrases.ndjson.json` | `/path/to/search-phrases.ndjson.json` | Path to search phrases to use as newline delimited JSON |
Some variables are used per-run and are normally set by more ergonomic top-level `Makefile` targets:
ENV Variable | Default | Example | Description |
---|---|---|---|
`INPUT_CSV_PATH` | `$(DATA_MOVIES_CSV_ZIPPED_PATH)` | `/path/to/movies2.csv.gz` | Path to compressed CSV (normally unzipped by Makefile target) |
`OP` | N/A | `ingest` | Operation to perform |
`SQLITE_DISK_DB_PATH` | `./fts-sqlite-disk-db.sqlite` | `:memory:` | SQLite DB path |
`PG_URL` | `postgres://$(PG_USER):$(PG_PASSWORD)@$(PG_HOST):$(PG_PORT)/$(PG_DB)` | `postgres://localhost` | Postgres DB path |
`TYPESENSE_HOST` | `localhost` | `typesense.domain.tld` | Hostname for Typesense server |
`TYPESENSE_PORT` | `8108` | `8109` | Port for Typesense server |
`TYPESENSE_API_KEY` | `badtypesenseapikey` | `tttttttttttttttt` | API key for Typesense server |
`MEILI_HOST` | `localhost` | `meili.domain.tld` | Hostname for MeiliSearch server |
`MEILI_PORT` | `7700` | `7701` | Port for MeiliSearch |
`MEILI_URL` | `http://$(MEILI_HOST):$(MEILI_PORT)` | `https://meili.domain.tld` | Full URL to use when accessing MeiliSearch |
`MEILI_API_KEY` | `$(MEILI_MASTER_KEY)` | `xxxxxxxxxxxxxxxxxxx` | MeiliSearch API key |
`OPENSEARCH_PROTOCOL` | `http` | `https` | Protocol to use when accessing OpenSearch service |
`OPENSEARCH_HOST` | `localhost` | `opensearch.domain.tld` | Host for OpenSearch server |
`OPENSEARCH_PORT` | `9200` | `9201` | Port for OpenSearch server |
`OPENSEARCH_AUTH_USERNAME` | `admin` | `admin` | Admin username for OpenSearch server |
`OPENSEARCH_AUTH_PASSWORD` | `admin` | `hunter2` | Admin password for OpenSearch server |
See `Makefile` for the code and other variables that might be excluded here.
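To make the default handling in the tables above concrete, here is a minimal sketch of how a script might resolve a few of these variables, falling back to the documented defaults. The function `resolveConfig` and the derived `opensearchUrl` field are hypothetical; in this repository the defaults are set by the `Makefile`, so this only mirrors the documented values.

```javascript
// Resolve a subset of the documented variables with their documented defaults.
// `resolveConfig` and `opensearchUrl` are illustrative names, not part of the repo.
function resolveConfig(env = process.env) {
  const meiliHost = env.MEILI_HOST || "localhost";
  const meiliPort = env.MEILI_PORT || "7700";
  const osProtocol = env.OPENSEARCH_PROTOCOL || "http";
  const osHost = env.OPENSEARCH_HOST || "localhost";
  const osPort = env.OPENSEARCH_PORT || "9200";

  return {
    ftsEngine: env.FTS_ENGINE, // no default (N/A): undefined when unset
    // An explicit MEILI_URL wins; otherwise compose it from host and port.
    meiliUrl: env.MEILI_URL || `http://${meiliHost}:${meiliPort}`,
    typesense: {
      host: env.TYPESENSE_HOST || "localhost",
      port: Number(env.TYPESENSE_PORT || 8108),
    },
    // Assumption: a single URL composed from the three OpenSearch variables.
    opensearchUrl: `${osProtocol}://${osHost}:${osPort}`,
  };
}
```

For example, `resolveConfig({ MEILI_PORT: "7701" })` composes the MeiliSearch URL from the default host and the overridden port.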
A single benchmark can be run with the following command:
FTS_ENGINE=<engine> make setup run
Options for `FTS_ENGINE`: `pg`, `meilisearch`, `typesense`, `opensearch`, `sqlite-disk`.
To run the ingest & query tests with Postgres:
TIMING=true FTS_ENGINE=pg make run
If an error occurs during setup, consider tearing down the existing `FTS_ENGINE`:
FTS_ENGINE=pg make engine-stop
To control the setup/teardown of a single backing service, use the `engine-start` and `engine-stop` top-level targets.
For example, if you wanted to start MeiliSearch and poke around on the instance:
FTS_ENGINE=meilisearch make engine-start
After this command returns, you should have an instance of MeiliSearch running with a stable name (`fts-$(FTS_ENGINE)`):
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4d7c0efdf5cf getmeili/meilisearch:v0.28.1 "tini -- /bin/meilis…" 7 seconds ago Up 6 seconds 127.0.0.1:7700->7700/tcp fts-meili
To stop the service:
FTS_ENGINE=meilisearch make engine-stop
Each solution ingests data differently, and the code for each can be found under `src/driver/<engine>.mjs`. For example, `src/driver/pg.mjs` contains the code that ingests documents into Postgres.
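The per-engine layout can be pictured as a lookup from the `FTS_ENGINE` value to a driver module path. The sketch below is an assumption-heavy illustration: it assumes the `<engine>` placeholder equals the `FTS_ENGINE` value and that every driver uses the `.mjs` extension seen in the `pg` example, neither of which is confirmed here for the other engines. The helper `driverPathFor` is hypothetical.

```javascript
// Engine names as listed for FTS_ENGINE earlier in this README.
const ENGINES = ["pg", "meilisearch", "typesense", "opensearch", "sqlite-disk"];

// Hypothetical helper: validate the engine name and build its driver path.
// Assumes drivers are named after the FTS_ENGINE value with a .mjs extension.
function driverPathFor(engine) {
  if (!ENGINES.includes(engine)) {
    throw new Error(
      `Unknown FTS_ENGINE "${engine}", expected one of: ${ENGINES.join(", ")}`
    );
  }
  return `src/driver/${engine}.mjs`;
}
```

Validating the name before building the path means a typo in `FTS_ENGINE` fails with a clear message instead of a module-not-found error.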
Queries to be performed in the test are specified as newline delimited JSON and stored in `search-phrases.ndjson.json`. This file is read by the automation and related scripts.
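Reading a newline delimited JSON file such as the search phrases file amounts to parsing each non-empty line as its own JSON value. The sketch below shows that idea on a string. The function name `parseNdjson` is hypothetical, and the shape of each search phrase record is not specified here, so the records are returned as-is.

```javascript
// Parse newline delimited JSON text into an array of records.
// Hypothetical helper; the structure of each record is not assumed.
function parseNdjson(text) {
  const records = [];
  text.split(/\r?\n/).forEach((line, index) => {
    if (line.trim() === "") return; // tolerate blank lines and a trailing newline
    try {
      records.push(JSON.parse(line));
    } catch (err) {
      // Report the 1-based line number to make bad input easy to locate.
      throw new Error(`Invalid JSON on line ${index + 1}: ${err.message}`);
    }
  });
  return records;
}
```

In a script, the text would typically come from reading the file at `SEARCH_PHRASES_NDJSON_PATH` before passing it to `parseNdjson`.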
To clear all the data between runs:
sudo make clean # sudo is likely needed to clear docker container data folders