vector-db-sink

This agent writes vector data to a vector database. LangStream currently supports Astra DB and Pinecone.

Both databases use the agent type "vector-db-sink" in a LangStream pipeline, but each requires different configuration values to map the incoming vector data to the database.

Astra DB example

The Astra DB vector database connection is defined in configuration.yaml:

configuration:
  resources:
    - type: "vector-database"
      name: "AstraDatasource"
      configuration:
        service: "astra"
        username: "${ secrets.astra.username }"
        password: "${ secrets.astra.password }"
        secureBundle: "${ secrets.astra.secureBundle }"

The "Write to Astra DB" pipeline reads embeddings from "input-topic", and its "Write to Cassandra" step writes them to the configured datasource "AstraDatasource":

name: "Write to Astra DB"
topics:
  - name: "input-topic"
    creation-mode: create-if-not-exists
pipeline:
  - name: "Write to Cassandra"
    type: "vector-db-sink"
    input: "input-topic"
    configuration:
      datasource: "AstraDatasource"
      table: "vsearch.products"
      mapping: "id=value.id,description=value.description,name=value.name"

Astra DB Topics

Input

Structured and unstructured text ?
Implicit topic ?
Templating ?

Output

None, it’s a sink.

Astra DB Configuration

Label	Type	Description
datasource	String	Name of the datasource defined in the resources section of configuration.yaml.
table	String	The table the vector data is written to, in the form `keyspace.table-name`.
mapping	String	Comma-separated list of `column=field` pairs that map fields of the input record to columns of the database table. For example, "id=value.id" writes the "id" field of the record value to the "id" column of the table.
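
To make the mapping format concrete, here is a minimal Python sketch of how a string such as "id=value.id,description=value.description,name=value.name" could be resolved against a record. It only illustrates the semantics described in the table and is not LangStream's implementation; the function name `apply_mapping` and the sample record are hypothetical.

```python
def apply_mapping(mapping: str, record: dict) -> dict:
    """Resolve each `column=path` pair of a mapping string against a record.

    Illustrative only: assumes plain dotted paths such as "value.id".
    """
    row = {}
    for pair in mapping.split(","):
        column, path = (part.strip() for part in pair.split("=", 1))
        current = record
        # Walk the dotted path one key at a time, e.g. "value" then "id".
        for key in path.split("."):
            current = current[key]
        row[column] = current
    return row


# Hypothetical input record; the field names follow the example mapping.
record = {"value": {"id": 1, "description": "A red bicycle", "name": "bike"}}
row = apply_mapping(
    "id=value.id,description=value.description,name=value.name", record
)
print(row)
```

Each key of the resulting dictionary corresponds to a column of the target table, here "id", "description", and "name".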

Pinecone Example

The "Write to Pinecone" pipeline step takes embeddings as input from "vectors-topic" and writes them to a Pinecone datasource.

The Pinecone vector database connection is defined in configuration.yaml:

configuration:
  resources:
    - type: "vector-database"
      name: "PineconeDatasource"
      configuration:
        service: "pinecone"
        api-key: "${secrets.pinecone.api-key}"
        environment: "${secrets.pinecone.environment}"
        index-name: "${secrets.pinecone.index-name}"
        project-name: "${secrets.pinecone.project-name}"
        server-side-timeout-sec: 10

The "Write to Pinecone" step takes embeddings as input from "vectors-topic" and writes them to the configured datasource "PineconeDatasource":

name: "Write to Pinecone DB"
topics:
  - name: "vectors-topic"
    creation-mode: create-if-not-exists
pipeline:
  - name: "Write to Pinecone"
    type: "vector-db-sink"
    input: "vectors-topic"
    configuration:
      datasource: "PineconeDatasource"
      vector.id: "value.id"
      vector.vector: "value.embeddings"
      vector.namespace: "value.namespace"
      vector.metadata.genre: "value.genre"

Pinecone Topics

Input

Structured and unstructured text ?
Implicit topic ?
Templating ?

Output

None, it’s a sink.

Pinecone Configuration

Label	Type	Description
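datasource	String	Name of the datasource defined in the resources section of configuration.yaml.
vector.id	String	Expression that selects the vector ID from the input record, for example "value.id".
vector.vector	String	Expression that selects the embedding values from the input record, for example "value.embeddings".
vector.namespace	String	Expression that selects the Pinecone namespace from the input record, for example "value.namespace".
vector.metadata.{metadataField}	String	Expression that selects the value stored in the metadata field {metadataField}. For example, vector.metadata.genre set to "value.genre" stores the "genre" field of the record value as the "genre" metadata.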
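
The `vector.*` settings can be read as a small field-to-path lookup. The Python sketch below shows one way the example configuration could turn an input record into a vector entry. It is an illustration under stated assumptions, not LangStream's or Pinecone's code: `resolve`, `build_vector`, the sample record, and the shape of the resulting dictionary are all hypothetical.

```python
def resolve(path: str, record: dict):
    """Follow a dotted path such as "value.id" through nested dictionaries."""
    current = record
    for key in path.split("."):
        current = current[key]
    return current


def build_vector(config: dict, record: dict) -> dict:
    """Collect the `vector.*` settings into one dictionary (illustrative shape)."""
    vector = {"metadata": {}}
    for key, path in config.items():
        if not key.startswith("vector."):
            continue  # skip non-mapping settings such as "datasource"
        field = key[len("vector."):]
        if field.startswith("metadata."):
            # "vector.metadata.genre" becomes metadata["genre"].
            vector["metadata"][field[len("metadata."):]] = resolve(path, record)
        else:
            # "id", "vector", or "namespace"
            vector[field] = resolve(path, record)
    return vector


# Same settings as the pipeline example above.
config = {
    "datasource": "PineconeDatasource",
    "vector.id": "value.id",
    "vector.vector": "value.embeddings",
    "vector.namespace": "value.namespace",
    "vector.metadata.genre": "value.genre",
}
# Hypothetical input record.
record = {
    "value": {
        "id": "doc-1",
        "embeddings": [0.1, 0.2, 0.3],
        "namespace": "books",
        "genre": "fiction",
    }
}
vector = build_vector(config, record)
print(vector)
```

The "datasource" setting is skipped because it names the connection rather than a record field.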
