Skip to content

Embedding server is used to provide embedding service based on sentence transformers models and is fully aligned with OpenAI's embedding service. Embedding has been deployed on GPU server remotely to provide embedding service.

License

Notifications You must be signed in to change notification settings

linkedlist771/embedding-server-github

Repository files navigation

Embedding Server

Embedding server is used to provide embedding service based on sentence transformers models and is fully aligned with OpenAI's embedding service. Embedding has been deployed on GPU server remotely to provide embedding service.

Select your preferred language for further details on the embedding server:

🆕 New:

This project will be rebuild with Rust to provide a more efficient and faster service. Please check embedding-server-github-rust for details.

Installation

pip install -r requirements.txt

Download models

The embedding server is based on sentence transformers models. Here are the recommended models:

Take all-mpnet-base-v2 as an example:

mkdir embedding_models
git lfs install
export HF_ENDPOINT=https://hf-mirror.com

git clone $HF_ENDPOINT/sentence-transformers/all-mpnet-base-v2
mv all-mpnet-base-v2 embedding_models/

Deployment

  • Run server on GPU sever:
CUDA_VISIBLE_DEVICES=0 python embedding_server.py 
--models_dir_path embedding_models/ 
--use_gpu --port port --host host

for example:

CUDA_VISIBLE_DEVICES=0 python embedding_server.py 
--models_dir_path embedding_models/ 
--use_gpu --port 8000 --host 0.0.0.0

Usage

  • Using Openai's API to get embeddings:
from openai import OpenAI

if __name__ == "__main__":
    # assume you start the server at localhost:8848
    api_base = "http://localhost:8848/v1"

    client = OpenAI(base_url=api_base)
    text = "Hello, world!"
    # you can change the model to the actual model you use
    model = "text2vec-base-chinese"
    res = client.embeddings.create(input=[text], model=model).data[0].embedding
    print(res)
  • Curl
curl --location --request GET '127.0.0.1:8848/v1/get_collection_config' \
--header 'User-Agent: Apifox/1.0.0 (https://apifox.com)'

API reference

https://apifox.com/apidoc/shared-fb1805a7-e3e7-4fce-9b9e-3bd69c45e171/api-123096464

Contribution

PR is welcome, please follow the code style with format.sh.

ToDo

  • Support multiple instance to make work load balance.

About

Embedding server is used to provide embedding service based on sentence transformers models and is fully aligned with OpenAI's embedding service. Embedding has been deployed on GPU server remotely to provide embedding service.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published