CrateDBVectorSearch does not throw error when dimensions of docs to add and vector table do not align #24

Open
1 of 14 tasks
andnig opened this issue Nov 28, 2023 · 0 comments

andnig commented Nov 28, 2023

System Info

latest cratedb branch, langchain 0.0.339rc1

Who can help?

@Amot

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

  1. create an embedding table with 1024 dimensions
CREATE TABLE IF NOT EXISTS "repro"."embedding" (
   "collection_id" TEXT,
   "embedding" FLOAT_VECTOR(1024),
   "document" TEXT,
   "cmetadata" OBJECT(DYNAMIC),
   "custom_id" TEXT,
   "uuid" TEXT NOT NULL,
   PRIMARY KEY ("uuid")
)
  2. use the CrateDBVectorSearch interface to add documents with 1536 dimensions to this embedding table
from langchain.schema import Document
from langchain.vectorstores.cratedb import CrateDBVectorSearch
from langchain.embeddings.openai import OpenAIEmbeddings

doc = Document(page_content="this is such a nice text")
doc1 = Document(page_content="this is such a nice text")
vector_store = CrateDBVectorSearch.from_documents(
    [doc, doc1],
    OpenAIEmbeddings(api_key="<your-api-key>"),
    collection_name="wow_such_nice",
    connection_string="crate://localhost:4200?schema=repro",
)

No exception is thrown, even though the OpenAI embeddings have 1536 dimensions and therefore cannot be inserted into the FLOAT_VECTOR(1024) column. It looks as if everything worked as expected.

IMPORTANT: You need at least two documents to reproduce this (see doc and doc1 in the list above). With only one document, the exception is thrown as expected.

(Note: I would not expect anyone to insert mismatched dimension sizes on purpose. However, this could happen by accident, so it would be better to notify the user than to swallow the exception.)
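
One possible explanation for the difference between one and many documents, which this issue does not confirm, is that multi-row inserts are sent as a bulk operation. CrateDB's bulk interface reports the outcome per row, with a rowcount of -2 marking a failed row, instead of failing the whole request. The sketch below shows how such per-row results could be turned back into an exception. The helper names and the shape of `results` are assumptions for illustration, not part of CrateDBVectorSearch.

```python
from typing import Dict, List, Sequence

# Assumption: a failed row in a bulk result is reported with rowcount -2.
FAILED_ROWCOUNT = -2


def failed_rows(results: Sequence[Dict[str, int]]) -> List[int]:
    """Return the positions of rows whose bulk result signals a failure."""
    return [
        index
        for index, result in enumerate(results)
        if result.get("rowcount") == FAILED_ROWCOUNT
    ]


def raise_on_failed_rows(results: Sequence[Dict[str, int]]) -> None:
    """Raise RuntimeError if any row of a bulk insert was rejected."""
    failed = failed_rows(results)
    if failed:
        raise RuntimeError(
            f"{len(failed)} of {len(results)} rows were not inserted "
            f"(row positions: {failed})"
        )
```

Calling `raise_on_failed_rows` on the per-row results right after a multi-row insert would surface a rejected row instead of letting it pass silently.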

Expected behavior

An error should be raised if the embeddings cannot be inserted. Additionally, the interface should behave the same for one document as for many.
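
As a minimal sketch of the requested behavior, a length check before the insert would fail in the same way for one document or many. The function name `check_embedding_dimensions` and the `expected_dim` parameter are hypothetical. How the vector store would learn the table's dimension (1024 in the reproduction) is left open here.

```python
from typing import List, Sequence


def check_embedding_dimensions(
    embeddings: Sequence[Sequence[float]], expected_dim: int
) -> None:
    """Raise ValueError if any embedding does not have `expected_dim` entries."""
    # Collect every offending vector so the message covers all documents at once.
    mismatched: List[str] = [
        f"document {index}: {len(vector)}"
        for index, vector in enumerate(embeddings)
        if len(vector) != expected_dim
    ]
    if mismatched:
        raise ValueError(
            f"Expected embeddings with {expected_dim} dimensions, got "
            + ", ".join(mismatched)
        )
```

With the values from the reproduction, `check_embedding_dimensions([[0.0] * 1536, [0.0] * 1536], 1024)` raises a `ValueError`, and so does the same call with a single 1536-dimensional vector.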

@amotl amotl mentioned this issue Oct 31, 2024