Commit
Adds vector search and automatic embeddings generation documentation (#…
…569)
1 parent
9d89b9e
commit 28f885e
Showing 11 changed files with 14,406 additions and 0 deletions.
66 changes: 66 additions & 0 deletions
packages/docs/cloud/orama-ai/automatic-embeddings-generation.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
--- | ||
outline: deep | ||
--- | ||
|
||
# Automatic embeddings generation | ||
|
||
With Orama Cloud, you can automatically generate embeddings from your data at deployment. | ||
|
||
This guide walks you through the process of creating embeddings using OpenAI. We plan to support additional models in the near future. | ||
|
||
## What are text embeddings? | ||
|
||
Text embeddings are numerical representations of text that allow computers to understand the meaning and relationships between words, enabling applications like semantic search, machine translation, and sentiment analysis. | ||
|
||
In recent months, embeddings have gained popularity as they form the foundation of semantic search, which is crucial in developing generative AI experiences like ChatGPT, Google Bard, and others. | ||
|
||
Orama is a hybrid database capable of storing various types of data. It specializes in search, and its support for vector search enables semantic search across large sets of embeddings, which are represented as vectors. | ||
|
||
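To make the idea of "embeddings represented as vectors" more concrete, here is a minimal sketch of how semantic closeness can be measured with cosine similarity. The three-dimensional vectors are made up for illustration (real embedding models produce far larger vectors), and `cosineSimilarity` is a hypothetical helper, not part of Orama's API or a description of how Orama computes similarity internally.

```typescript
// Toy "embeddings" with invented values, for illustration only.
const embeddings = {
  cat: [0.9, 0.1, 0.0],
  kitten: [0.85, 0.2, 0.05],
  car: [0.1, 0.9, 0.3]
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Values closer to 1 mean the vectors point in a more similar direction.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

const catVsKitten = cosineSimilarity(embeddings.cat, embeddings.kitten)
const catVsCar = cosineSimilarity(embeddings.cat, embeddings.car)

// With these toy vectors, the related pair scores higher than the unrelated one.
console.log(catVsKitten > catVsCar)
```

In this sketch, "cat" and "kitten" end up closer to each other than "cat" and "car", which is the property semantic search relies on.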
For a deeper understanding of text embeddings, see the [word embeddings guide](https://www.tensorflow.org/text/guide/word_embeddings) on the TensorFlow website. | ||
|
||
## Automatic embeddings generation with Orama Cloud | ||
|
||
Creating text embeddings from a given text or set of texts can be complex. Orama Cloud simplifies this process by generating the embeddings automatically each time you deploy a new index, which makes it easy to run fast semantic searches over your data. | ||
|
||
### Connecting to OpenAI | ||
|
||
::: info | ||
To use this feature, you will need an OpenAI account and an OpenAI API Key. | ||
We will support more providers soon. | ||
::: | ||
|
||
Before you can start generating embeddings, you need to connect to OpenAI. This requires adding an OpenAI API Key to Orama Cloud. | ||
|
||
We will encrypt this API Key and store it securely. For safety reasons, we recommend creating a new API Key specifically for Orama Cloud from the OpenAI dashboard. | ||
|
||
As soon as you have your OpenAI API Key ready, you can add it to Orama Cloud by going to [https://cloud.oramasearch.com/developer-tools](https://cloud.oramasearch.com/developer-tools), and selecting "OpenAI API key" from the left menu. | ||
|
||
![Adding OpenAI API Key to Orama Cloud](/cloud/guides/automatic-embeddings-generation/automatic-embeddings-generation.png) | ||
|
||
After adding your API key, you won't be able to view it again due to security measures. You can always delete it, though this is discouraged because all operations dependent on vector search will cease to function. Alternatively, you can replace it with a new key. | ||
|
||
![Your OpenAI API Key](/cloud/guides/automatic-embeddings-generation/open-ai-api-key.png) | ||
|
||
### Creating the embeddings | ||
|
||
You can now create a new index by going to [https://cloud.oramasearch.com/indexes/new](https://cloud.oramasearch.com/indexes/new). For this guide, we will use a simple JSON file as a data source. | ||
|
||
You can download the same JSON file here: [download dataset](/cloud/guides/example-datasets/games.json). | ||
|
||
Once you have your dataset ready, you can create a new Index: | ||
|
||
![Creating a new index on Orama Cloud](/cloud/guides/automatic-embeddings-generation/new-index.png) | ||
|
||
After uploading the file, you can enable "AI Search". This feature scans the `"string"` properties in your schema and automatically selects them for embeddings generation. You can select or deselect properties as needed. | ||
|
||
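As a rough illustration of the selection step described above, the sketch below picks every `"string"` property out of a schema. The schema shape, the property names, and the `selectEmbeddingProperties` helper are all assumptions made for this example; the actual selection happens in the Orama Cloud dashboard, not in your code.

```typescript
// Hypothetical schema shape: property name -> type name.
type Schema = Record<string, 'string' | 'number' | 'boolean'>

// Returns the names of all properties typed as "string",
// i.e. the ones that would be preselected for embeddings in this sketch.
function selectEmbeddingProperties(schema: Schema): string[] {
  return Object.keys(schema).filter((key) => schema[key] === 'string')
}

// Invented example schema; not necessarily the schema of the sample dataset.
const exampleSchema: Schema = {
  title: 'string',
  description: 'string',
  price: 'number',
  inStock: 'boolean'
}

const selected = selectEmbeddingProperties(exampleSchema)
console.log(selected)
```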
![Enabling Orama AI on Orama Cloud](/cloud/guides/automatic-embeddings-generation/enable-orama-ai.png) | ||
|
||
Once you have deployed your index containing embeddings, you can locate it in the indexes list where the "AI Search" column is marked with the icon of the chosen model, such as OpenAI. | ||
|
||
Congratulations! You've just deployed your first automatically generated embeddings on Orama Cloud! | ||
|
||
## Querying the embeddings | ||
|
||
Now that you have your embeddings distributed on Orama Cloud, you can use the [official JavaScript client](/cloud/integrating-orama-cloud/javascript-sdk) to perform vector search on them. | ||
|
||
Read more about performing vector search on Orama Cloud [here](/cloud/performing-search/vector-search). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
--- | ||
outline: deep | ||
--- | ||
|
||
# Performing full-text search on Orama Cloud | ||
|
||
After deploying your index, Orama will distribute it to over 300 global points of presence across more than 100 countries worldwide. This will guarantee the lowest possible latency for any search query, at any scale. | ||
|
||
At the time of this writing, you can execute search queries using our official JavaScript SDK. | ||
|
||
This SDK manages connection, cache, telemetry, and type-safety for all your search operations. It is the official method for communicating with Orama Cloud. | ||
|
||
::: tip Installing the Orama SDK | ||
You can find the guide on installing the SDK [here](/cloud/integrating-orama-cloud/javascript-sdk). | ||
::: | ||
|
||
Make sure you have the Orama SDK installed to start performing full-text search at the edge! | ||
|
||
## Full-text search with Orama Cloud | ||
|
||
Once you have your SDK installed, you can use it in every JavaScript runtime (browser, server, mobile apps, etc.). | ||
|
||
The client exposes a simple `search` method that can be used to query the index: | ||
|
||
```typescript copy
import { OramaClient } from '@oramacloud/client'

const client = new OramaClient({
  endpoint: '',
  api_key: ''
})

const results = await client.search({
  term: 'red shoes',
  where: {
    price: {
      gt: 99.99
    }
  }
})
```
|
||
The `search` method exposes the same interface you're used to from OSS Orama. That means you can perform full-text queries, group, sort, and filter results, run preflight requests, adjust the threshold, paginate, and more. |
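
As an example of pagination, the sketch below builds search parameters for a given page. It assumes the `limit` and `offset` parameter names used by OSS Orama's search API; the `pageParams` helper and the `PagedSearchParams` type are hypothetical and exist only for this illustration.

```typescript
// Hypothetical parameter shape, assuming OSS Orama's `limit` and `offset` names.
interface PagedSearchParams {
  term: string
  limit: number
  offset: number
}

// Converts a 1-based page number into limit/offset search parameters.
function pageParams(term: string, page: number, pageSize = 10): PagedSearchParams {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError('page must be a positive integer')
  }
  return { term, limit: pageSize, offset: (page - 1) * pageSize }
}

const secondPage = pageParams('red shoes', 2, 20)
console.log(secondPage) // { term: 'red shoes', limit: 20, offset: 20 }
```

If the client accepts these parameters as described above, the resulting object could be passed to `client.search` directly.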
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
--- | ||
outline: deep | ||
--- | ||
|
||
# Performing vector search on Orama Cloud | ||
|
||
After deploying your index, Orama will distribute it to over 300 global points of presence across more than 100 countries worldwide. This will guarantee the lowest possible latency for any search query, at any scale. | ||
|
||
At the time of this writing, you can execute search queries using our official JavaScript SDK. | ||
|
||
This SDK manages connection, cache, telemetry, and type-safety for all your search operations. It is the official method for communicating with Orama Cloud. | ||
|
||
::: tip Installing the Orama SDK | ||
You can find the guide on installing the SDK [here](/cloud/integrating-orama-cloud/javascript-sdk). | ||
::: | ||
|
||
Make sure you have the Orama SDK installed to start performing vector search at the edge! | ||
|
||
## Performing vector search | ||
|
||
::: info | ||
The following guide assumes that you have an **OpenAI API key** set on Orama Cloud, as it is needed to transform text into embeddings at search time. | ||
::: | ||
|
||
To perform a vector search with Orama Cloud, you need to populate an Orama index with vectors. Currently, this can be achieved through [automatic embeddings generation](/cloud/orama-ai/automatic-embeddings-generation) during the deployment process. | ||
|
||
Once you have at least one index containing vectors, you can perform vector search by using the `searchVector` function: | ||
|
||
```ts
import { OramaClient } from '@oramacloud/client'

const client = new OramaClient({
  endpoint: '',
  api_key: ''
})

const vectorSearchResults = await client.searchVector({
  term: 'Super Mario videogame',
  threshold: 0.8, // Minimum similarity, between 0 and 1. Default is 0.8 (80% similar).
  limit: 5 // How many results to return. Default is 10.
})
```
|
||
Orama will automatically convert your search term, for instance, `"Super Mario videogame"`, into an embedding using your OpenAI API Key. It will then search through your vectors and ultimately return the full documents in their original format. |
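
To clarify how `threshold` and `limit` interact, here is a minimal local sketch of the filtering step. It is not Orama's implementation: the `ScoredDocument` type and `applyThresholdAndLimit` helper are hypothetical, the scores are invented, and treating the threshold as inclusive is an assumption. The defaults mirror the ones mentioned in the comments above (0.8 and 10).

```typescript
// Hypothetical result shape: a document id plus its similarity to the query (0 to 1).
interface ScoredDocument {
  id: string
  score: number
}

// Keeps documents at or above the threshold (inclusive by assumption),
// orders them from most to least similar, and returns at most `limit` of them.
function applyThresholdAndLimit(
  candidates: ScoredDocument[],
  threshold = 0.8,
  limit = 10
): ScoredDocument[] {
  return candidates
    .filter((doc) => doc.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
}

// Invented scores, for illustration only.
const candidates: ScoredDocument[] = [
  { id: 'a', score: 0.95 },
  { id: 'b', score: 0.62 },
  { id: 'c', score: 0.88 },
  { id: 'd', score: 0.81 }
]

const top = applyThresholdAndLimit(candidates, 0.8, 2)
console.log(top.map((doc) => doc.id)) // [ 'a', 'c' ]
```

A higher threshold returns fewer but more closely related documents, while `limit` only caps how many of the matching documents come back.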
Binary file added
BIN
+769 KB
...loud/guides/automatic-embeddings-generation/automatic-embeddings-generation.png
Binary file added
BIN
+972 KB
...es/docs/public/cloud/guides/automatic-embeddings-generation/enable-orama-ai.png
Binary file added
BIN
+781 KB
packages/docs/public/cloud/guides/automatic-embeddings-generation/indexes-list.png
Binary file added
BIN
+876 KB
packages/docs/public/cloud/guides/automatic-embeddings-generation/new-index.png
Binary file added
BIN
+775 KB
...es/docs/public/cloud/guides/automatic-embeddings-generation/open-ai-api-key.png