[BUG] I can't open chroma db #1089

Open
Sung-Jae-Seong opened this issue Feb 15, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@Sung-Jae-Seong

Bug Description
After running the evaluation, I attempted to retrieve the stored results from the vector database using Python, but no data was loaded.

To Reproduce

  1. evaluate
    export OPENAI_API_KEY=".."
    export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
    trial(1) : autorag evaluate --config config.yaml --qa_data_path drl_qa.parquet --corpus_data_path drl_corpus.parquet --project_dir ./project
    trial(2) : autorag evaluate --config config.yaml --qa_data_path drl_qa.parquet --corpus_data_path drl_corpus.parquet --project_dir ./project --skip_validation true
PydanticDeprecatedSince20: The `__fields__` attribute is deprecated, use `model_fields` 
instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration 
Guide at https://errors.pydantic.dev/2.9/migration/
  warnings.warn(
Generating embeddings:   0%|                                      | 0/7 [00:00<?, ?it/s]
[02/15/25 20:31:10] INFO     [_client.py:1038] >> HTTP Request: POST     _client.py:1038
                             https://api.openai.com/v1/embeddings                       
                             "HTTP/1.1 200 OK"                                          
Generating embeddings: 100%|##############################| 7/7 [00:01<00:00,  5.34it/s]
Generating embeddings: 100%|##############################| 7/7 [00:01<00:00,  5.33it/s]

Generating embeddings:   0%|                                      | 0/7 [00:00<?, ?it/s]
[02/15/25 20:31:11] INFO     [_client.py:1038] >> HTTP Request: POST     _client.py:1038
                             https://api.openai.com/v1/embeddings                       
                             "HTTP/1.1 200 OK"                                          
Generating embeddings: 100%|##############################| 7/7 [00:00<00:00, 14.41it/s]
Generating embeddings: 100%|##############################| 7/7 [00:00<00:00, 14.39it/s]

[02/15/25 20:31:13] INFO     [evaluator.py:218] >> Evaluation complete. evaluator.py:218
Ingesting VectorDB... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 1/1 0:00:01
Evaluating...         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 6/6 0:01:14
  2. load the DB data with Python
  3. no data is returned

Full Error log
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma(persist_directory="./autorag/project/resources/chroma",
                     embedding_function=OpenAIEmbeddings(), 
                     )

print(vectorstore.get())
docs = vectorstore.similarity_search("question....")
print(len(docs))

if docs:
    print(docs[0].page_content)
else:
    print("No documents were found.")

{'ids': [], 'embeddings': None, 'documents': [], 'uris': None, 'data': None, 'metadatas': [], 'included': [<IncludeEnum.documents: 'documents'>, <IncludeEnum.metadatas: 'metadatas'>]}
0
No documents were found.
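
One thing that may be worth ruling out first: the evaluate command was run with --project_dir ./project, while the loading code points at ./autorag/project/resources/chroma. Whether those are the same directory depends on the working directory the script runs from. As far as I know, Chroma creates a missing persist directory instead of raising an error, so a wrong path would look exactly like an empty database. The sketch below only inspects the filesystem; describe_persist_dir is a hypothetical helper name, and the second candidate path is an assumption derived from the --project_dir value.

```python
from pathlib import Path


def describe_persist_dir(persist_directory: str) -> dict:
    """Report what is on disk at a Chroma persist directory without opening it."""
    root = Path(persist_directory).resolve()
    sqlite_file = root / "chroma.sqlite3"
    return {
        "resolved_path": str(root),
        "directory_exists": root.is_dir(),
        "sqlite_exists": sqlite_file.is_file(),
        # Size in bytes; 0 when the file is absent.
        "sqlite_bytes": sqlite_file.stat().st_size if sqlite_file.is_file() else 0,
    }


if __name__ == "__main__":
    # First path is copied from the loading code; the second is an assumed
    # alternative based on the --project_dir ./project argument.
    for candidate in ("./autorag/project/resources/chroma", "./project/resources/chroma"):
        print(describe_persist_dir(candidate))
```

If the resolved path differs from the directory shown in the project tree, fixing persist_directory would be the first thing to try.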

Code where the bug happened
config.yaml

node_lines:
- node_line_name: retrieve_node_line
  nodes:
    - node_type: retrieval
      strategy:
        batch_size: 1
        metrics: [ retrieval_f1, retrieval_recall, retrieval_precision,
                   retrieval_ndcg, retrieval_map, retrieval_mrr ]
        speed_threshold: 10
      top_k: 5
      modules:
        - module_type: bm25
          bm25_tokenizer: [ porter_stemmer, space, gpt2 ]
        - module_type: vectordb
          vectordb: default
        - module_type: hybrid_rrf
          weight_range: (4,80)
        - module_type: hybrid_cc
          normalize_method: [ mm, tmm, z, dbsf ]
          weight_range: (0.0, 1.0)
          test_weight_size: 101
    - node_type: passage_augmenter
      strategy:
        metrics: [ retrieval_f1, retrieval_recall, retrieval_precision ]
        speed_threshold: 5
      top_k: 5
      embedding_model: openai
      modules:
        - module_type: pass_passage_augmenter
        - module_type: prev_next_augmenter
          mode: next
    - node_type: passage_reranker
      strategy:
        metrics: [ retrieval_f1, retrieval_recall, retrieval_precision ]
        speed_threshold: 10
      top_k: 3
      modules:
        - module_type: pass_reranker
        - module_type: upr
        - module_type: rankgpt
        - module_type: sentence_transformer_reranker
        - module_type: flag_embedding_reranker
        - module_type: openvino_reranker
        - module_type: flashrank_reranker

    - node_type: passage_filter
      strategy:
        metrics: [ retrieval_f1, retrieval_recall, retrieval_precision ]
        speed_threshold: 5
      modules:
        - module_type: pass_passage_filter
        - module_type: similarity_threshold_cutoff
          threshold: 0.85
        - module_type: similarity_percentile_cutoff
          percentile: 0.6
        - module_type: threshold_cutoff
          threshold: 0.85
        - module_type: percentile_cutoff
          percentile: 0.6

- node_line_name: post_retrieve_node_line  # Arbitrary node line name
  nodes:
    - node_type: prompt_maker
      strategy:
        metrics:
          - metric_name: bleu
          - metric_name: meteor
          - metric_name: rouge
          - metric_name: sem_score
            embedding_model: openai
        speed_threshold: 10
        generator_modules:
          - module_type: llama_index_llm
            llm: openai
            model: [gpt-4o-mini]
      modules:
        - module_type: fstring
          prompt:
            - "Answer to given questions with the following passage: {retrieved_contents} \n\n Question: {query} \n\n Answer:"
            - "There is a passages related to user question. Please response carefully to the following question. \n\n Passage: {retrieved_contents} \n\n Question: {query} \n\n Answer the question. Think step by step." # Zero-shot CoT prompt
            - "{retrieved_contents} \n\n Read the passage carefully, and answer this question. \n\n Question: {query} \n\n Answer the question. Be concise." # concise prompt
        - module_type: long_context_reorder
          prompt:
            - "Answer to given questions with the following passage: {retrieved_contents} \n\n Question: {query} \n\n Answer:"
            - "There is a passages related to user question. Please response carefully to the following question. \n\n Passage: {retrieved_contents} \n\n Question: {query} \n\n Answer the question. Think step by step." # Zero-shot CoT prompt
            - "{retrieved_contents} \n\n Read the passage carefully, and answer this question. \n\n Question: {query} \n\n Answer the question. Be concise." # concise prompt
    - node_type: generator
      strategy:
        metrics:
          - metric_name: rouge
          - embedding_model: openai
            metric_name: sem_score
          - metric_name: bert_score
        speed_threshold: 10
      modules:
        - module_type: llama_index_llm
          llm: [openai]
          model: [gpt-4o-mini]
          temperature: [0.5, 1.0]

quantization_config:
  bits: 4
  group_size: 128
  dataset: "c4"
  model_seqlen: 2048
  desc_act: False
  device: "cpu"

model_load:
  low_cpu_mem_usage: True
  torch_dtype: "auto"
  trust_remote_code: True

Desktop (please complete the following information):

  • OS: [ubuntu 22.04]
  • Python version [3.10.12]

Additional context
my project tree
📦project
┣ 📂0
┃ ┣ 📂post_retrieve_node_line
┃ ┃ ┣ 📂generator
┃ ┃ ┃ ┣ 📜0.parquet
┃ ┃ ┃ ┣ 📜1.parquet
┃ ┃ ┃ ┣ 📜best_0.parquet
┃ ┃ ┃ ┗ 📜summary.csv
┃ ┃ ┣ 📂prompt_maker
┃ ┃ ┃ ┣ 📜0.parquet
┃ ┃ ┃ ┣ 📜1.parquet
┃ ┃ ┃ ┣ 📜2.parquet
┃ ┃ ┃ ┣ 📜3.parquet
┃ ┃ ┃ ┣ 📜4.parquet
┃ ┃ ┃ ┣ 📜5.parquet
┃ ┃ ┃ ┣ 📜best_0.parquet
┃ ┃ ┃ ┗ 📜summary.csv
┃ ┃ ┗ 📜summary.csv
┃ ┣ 📂retrieve_node_line
┃ ┃ ┣ 📂passage_augmenter
┃ ┃ ┃ ┣ 📜0.parquet
┃ ┃ ┃ ┣ 📜1.parquet
┃ ┃ ┃ ┣ 📜best_0.parquet
┃ ┃ ┃ ┗ 📜summary.csv
┃ ┃ ┣ 📂passage_filter
┃ ┃ ┃ ┣ 📜0.parquet
┃ ┃ ┃ ┣ 📜1.parquet
┃ ┃ ┃ ┣ 📜2.parquet
┃ ┃ ┃ ┣ 📜3.parquet
┃ ┃ ┃ ┣ 📜4.parquet
┃ ┃ ┃ ┣ 📜best_3.parquet
┃ ┃ ┃ ┗ 📜summary.csv
┃ ┃ ┣ 📂passage_reranker
┃ ┃ ┃ ┣ 📜0.parquet
┃ ┃ ┃ ┣ 📜1.parquet
┃ ┃ ┃ ┣ 📜2.parquet
┃ ┃ ┃ ┣ 📜3.parquet
┃ ┃ ┃ ┣ 📜4.parquet
┃ ┃ ┃ ┣ 📜5.parquet
┃ ┃ ┃ ┣ 📜6.parquet
┃ ┃ ┃ ┣ 📜best_0.parquet
┃ ┃ ┃ ┗ 📜summary.csv
┃ ┃ ┣ 📂retrieval
┃ ┃ ┃ ┣ 📜0.parquet
┃ ┃ ┃ ┣ 📜1.parquet
┃ ┃ ┃ ┣ 📜2.parquet
┃ ┃ ┃ ┣ 📜3.parquet
┃ ┃ ┃ ┣ 📜4.parquet
┃ ┃ ┃ ┣ 📜5.parquet
┃ ┃ ┃ ┣ 📜6.parquet
┃ ┃ ┃ ┣ 📜7.parquet
┃ ┃ ┃ ┣ 📜8.parquet
┃ ┃ ┃ ┣ 📜best_5.parquet
┃ ┃ ┃ ┗ 📜summary.csv
┃ ┃ ┗ 📜summary.csv
┃ ┣ 📜config.yaml
┃ ┗ 📜summary.csv
┣ 📂data
┃ ┣ 📜corpus.parquet
┃ ┗ 📜qa.parquet
┣ 📂resources
┃ ┣ 📂chroma
┃ ┃ ┣ 📂0ec9cd05-0d96-4fc7-9a7a-a1abea6c5ce8
┃ ┃ ┣ 📂50f4b08f-74bf-4fdc-9097-edbe9649cda4
┃ ┃ ┃ ┣ 📜data_level0.bin
┃ ┃ ┃ ┣ 📜header.bin
┃ ┃ ┃ ┣ 📜length.bin
┃ ┃ ┃ ┗ 📜link_lists.bin
┃ ┃ ┗ 📜chroma.sqlite3
┃ ┣ 📜bm25_gpt2.pkl
┃ ┣ 📜bm25_porter_stemmer.pkl
┃ ┣ 📜bm25_space.pkl
┃ ┗ 📜vectordb.yaml
┗ 📜trial.json
I don't know whether this is caused by the evaluation, by config.yaml, or by my Python code. Can you help find the cause?
Note: I made the dataset myself (by hand). Could that cause the error?

Thank you for reading.

@Sung-Jae-Seong Sung-Jae-Seong added the bug Something isn't working label Feb 15, 2025
@vkehfdl1
Contributor

@Sung-Jae-Seong Hello!

Since from langchain_community.vectorstores import Chroma is deprecated, can you try using this instead when you load a Chroma vector DB?
If the evaluation finished without errors and there appears to be data in the directory, it might be a LangChain issue.

@Sung-Jae-Seong
Author

Thank you for your response.
I tried the new Chroma vector DB class, but that did not solve the problem.

I checked my chroma.sqlite3, and it was empty.
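
For anyone who wants to repeat that check, here is a minimal sketch that counts rows per table using only the standard library. It does not assume anything about Chroma's internal schema (which may differ between versions); table_row_counts is a hypothetical helper name, and the path in the usage comment is an assumption based on the project tree above.

```python
import sqlite3
from pathlib import Path


def table_row_counts(sqlite_path: str) -> dict:
    """Return {table_name: row_count} for every table in an SQLite file."""
    # Open read-only so the check cannot create or modify the file.
    uri = Path(sqlite_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Names come from sqlite_master; quote them so odd names still work.
            quoted = '"' + name.replace('"', '""') + '"'
            try:
                counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
            except sqlite3.OperationalError:
                # e.g. a virtual table whose module is not available in this build
                counts[name] = None
        return counts
    finally:
        conn.close()


# Usage (path assumed from the project tree):
# print(table_row_counts("./project/resources/chroma/chroma.sqlite3"))
```

If some tables do contain rows, the data was written and the loader may simply be reading a different collection: LangChain's Chroma wrapper uses the collection name "langchain" unless collection_name is passed, which may not match the name AutoRAG used.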

I also found this warning message during evaluation:

/home/rokey4090/.local/lib/python3.10/site-packages/autorag/nodes/retrieval/hybrid_cc.py:16: RuntimeWarning: invalid value encountered in divide
  norm_score = (arr - min_value) / (max_value - min_value)
/home/rokey4090/.local/lib/python3.10/site-packages/autorag/nodes/retrieval/hybrid_cc.py:16: RuntimeWarning: invalid value encountered in divide
  norm_score = (arr - min_value) / (max_value - min_value)
/home/rokey4090/.local/lib/python3.10/site-packages/autorag/nodes/retrieval/hybrid_cc.py:16: RuntimeWarning: invalid value encountered in divide
  norm_score = (arr - min_value) / (max_value - min_value)
...
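
The line in the warning is a min-max normalization. When every score in arr is identical, max_value - min_value is 0 and NumPy evaluates 0 / 0 to NaN, which is the usual source of "invalid value encountered in divide". That is only one plausible explanation here, not a confirmed one. The sketch below shows the degenerate case with plain Python lists and one possible guard; min_max_normalize and its fallback parameter are hypothetical and are not part of AutoRAG.

```python
import math


def min_max_normalize(scores: list, fallback: float = 0.0) -> list:
    """Scale scores to [0, 1]; return `fallback` for each item when all scores are equal."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    span = hi - lo
    if span == 0 or math.isnan(span):
        # Degenerate case: (x - lo) / (hi - lo) would be 0 / 0.
        return [fallback for _ in scores]
    return [(s - lo) / span for s in scores]


if __name__ == "__main__":
    print(min_max_normalize([1.0, 2.0, 3.0]))  # distinct scores normalize as usual
    print(min_max_normalize([0.7, 0.7, 0.7]))  # identical scores hit the guard
```

Returning a constant for the degenerate case is a design choice: it keeps the result free of NaN, at the cost of giving every passage the same normalized score.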

And this is my whole log, excluding the warning above:

[02/17/25 09:05:24] INFO     [config.py:54] >> PyTorch version 2.5.1 available.                                                                                              config.py:54
                    INFO     [_client.py:1038] >> HTTP Request: GET https://api.gradio.app/gradio-messaging/en "HTTP/1.1 200 OK"                                          _client.py:1038
[02/17/25 09:05:26] INFO     [evaluator.py:228] >> Embedding BM25 corpus...                                                                                              evaluator.py:228
                    INFO     [evaluator.py:248] >> BM25 corpus embedding complete.                                                                                       evaluator.py:248
                    INFO     [posthog.py:22] >> Anonymized telemetry enabled. See                     https://docs.trychroma.com/telemetry for more information.            posthog.py:22
[02/17/25 09:05:27] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1786
                    INFO     [evaluator.py:205] >> Running node line retrieve_node_line...                                                                               evaluator.py:205
                    INFO     [node.py:55] >> Running node retrieval...                                                                                                         node.py:55
                    INFO     [run.py:165] >> Running retrieval node - semantic retrieval module...                                                                             run.py:165
                    INFO     [base.py:18] >> Initialize retrieval node - VectorDB                                                                                              base.py:18
                    INFO     [posthog.py:22] >> Anonymized telemetry enabled. See                     https://docs.trychroma.com/telemetry for more information.            posthog.py:22
                    INFO     [base.py:31] >> Running retrieval node - VectorDB module...                                                                                       base.py:31
                    INFO     [_base_client.py:1672] >> Retrying request to /embeddings in 0.408050 seconds                                                           _base_client.py:1672
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1786
[02/17/25 09:05:28] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1786
                    INFO     [base.py:28] >> Deleting retrieval node - VectorDB module...                                                                                      base.py:28
                    INFO     [run.py:196] >> Running retrieval node - lexical retrieval module...                                                                              run.py:196
                    INFO     [base.py:18] >> Initialize retrieval node - BM25                                                                                                  base.py:18
                    INFO     [base.py:31] >> Running retrieval node - BM25 module...                                                                                           base.py:31
                    INFO     [base.py:28] >> Deleting retrieval node - BM25 module...                                                                                          base.py:28
                    INFO     [base.py:18] >> Initialize retrieval node - BM25                                                                                                  base.py:18
                    INFO     [base.py:31] >> Running retrieval node - BM25 module...                                                                                           base.py:31
                    INFO     [base.py:28] >> Deleting retrieval node - BM25 module...                                                                                          base.py:28
                    INFO     [base.py:18] >> Initialize retrieval node - BM25                                                                                                  base.py:18
                    INFO     [base.py:31] >> Running retrieval node - BM25 module...                                                                                           base.py:31
                    INFO     [base.py:28] >> Deleting retrieval node - BM25 module...                                                                                          base.py:28
                    INFO     [run.py:227] >> Running retrieval node - hybrid retrieval module...                                                                               run.py:227
                    INFO     [base.py:18] >> Initialize retrieval node - VectorDB                                                                                              base.py:18
                    INFO     [base.py:31] >> Running retrieval node - VectorDB module...                                                                                       base.py:31
                    INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
[02/17/25 09:05:29] INFO     [base.py:28] >> Deleting retrieval node - VectorDB module...                                                                                      base.py:28
                    INFO     [base.py:18] >> Initialize retrieval node - BM25                                                                                                  base.py:18
                    INFO     [base.py:31] >> Running retrieval node - BM25 module...                                                                                           base.py:31
                    INFO     [base.py:28] >> Deleting retrieval node - BM25 module...                                                                                          base.py:28
[02/17/25 09:05:34] INFO     [node.py:55] >> Running node passage_augmenter...                                                                                                 node.py:55
                    INFO     [base.py:21] >> Initialize passage augmenter node - PassPassageAugmenter module...                                                                base.py:21
                    INFO     [base.py:38] >> Running passage augmenter node - PassPassageAugmenter module...                                                                   base.py:38
                    INFO     [base.py:33] >> Initialize passage augmenter node - PassPassageAugmenter module...                                                                base.py:33
                    INFO     [base.py:21] >> Initialize passage augmenter node - PrevNextPassageAugmenter module...                                                            base.py:21
                    INFO     [base.py:38] >> Running passage augmenter node - PrevNextPassageAugmenter module...                                                               base.py:38
                    INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
[02/17/25 09:05:35] INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
[02/17/25 09:05:36] INFO     [base.py:33] >> Initialize passage augmenter node - PrevNextPassageAugmenter module...                                                            base.py:33
                    INFO     [node.py:55] >> Running node passage_reranker...                                                                                                  node.py:55
                    INFO     [base.py:16] >> Initialize passage reranker node - PassReranker module...                                                                         base.py:16
                    INFO     [base.py:26] >> Running passage reranker node - PassReranker module...                                                                            base.py:26
                    INFO     [base.py:21] >> Deleting passage reranker node - PassReranker module...                                                                           base.py:21
                    INFO     [base.py:16] >> Initialize passage reranker node - Upr module...                                                                                  base.py:16
[02/17/25 09:05:38] INFO     [base.py:26] >> Running passage reranker node - Upr module...                                                                                     base.py:26
[02/17/25 09:05:39] INFO     [base.py:21] >> Deleting passage reranker node - Upr module...                                                                                    base.py:21
                    INFO     [base.py:16] >> Initialize passage reranker node - RankGPT module...                                                                              base.py:16
                    INFO     [base.py:26] >> Running passage reranker node - RankGPT module...                                                                                 base.py:26
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:05:40] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [base.py:21] >> Deleting passage reranker node - RankGPT module...                                                                                base.py:21
                    INFO     [base.py:16] >> Initialize passage reranker node - SentenceTransformerReranker module...                                                          base.py:16
                    INFO     [base.py:26] >> Running passage reranker node - SentenceTransformerReranker module...                                                             base.py:26
                    INFO     [base.py:21] >> Deleting passage reranker node - SentenceTransformerReranker module...                                                            base.py:21
                    INFO     [base.py:16] >> Initialize passage reranker node - FlagEmbeddingReranker module...                                                                base.py:16
[02/17/25 09:05:41] INFO     [base.py:26] >> Running passage reranker node - FlagEmbeddingReranker module...                                                                   base.py:26
[02/17/25 09:05:43] INFO     [base.py:21] >> Deleting passage reranker node - FlagEmbeddingReranker module...                                                                  base.py:21
                    INFO     [base.py:16] >> Initialize passage reranker node - OpenVINOReranker module...                                                                     base.py:16
[02/17/25 09:05:58] INFO     [base.py:26] >> Running passage reranker node - OpenVINOReranker module...                                                                        base.py:26
[02/17/25 09:06:03] INFO     [base.py:21] >> Deleting passage reranker node - OpenVINOReranker module...                                                                       base.py:21
                    INFO     [base.py:16] >> Initialize passage reranker node - FlashRankReranker module...                                                                    base.py:16
                    INFO     [base.py:26] >> Running passage reranker node - FlashRankReranker module...                                                                       base.py:26
[02/17/25 09:06:04] INFO     [base.py:21] >> Deleting passage reranker node - FlashRankReranker module...                                                                      base.py:21
                    INFO     [node.py:55] >> Running node passage_filter...                                                                                                    node.py:55
                    INFO     [base.py:16] >> Initialize passage filter node - PassPassageFilter                                                                                base.py:16
                    INFO     [base.py:22] >> Running passage filter node - PassPassageFilter module...                                                                         base.py:22
                    INFO     [base.py:19] >> Prompt maker node - PassPassageFilter module is deleted.                                                                          base.py:19
                    INFO     [base.py:16] >> Initialize passage filter node - SimilarityThresholdCutoff                                                                        base.py:16
                    INFO     [base.py:22] >> Running passage filter node - SimilarityThresholdCutoff module...                                                                 base.py:22
                    INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
                    INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
[02/17/25 09:06:05] INFO     [base.py:19] >> Prompt maker node - SimilarityThresholdCutoff module is deleted.                                                                  base.py:19
                    INFO     [base.py:16] >> Initialize passage filter node - SimilarityPercentileCutoff                                                                       base.py:16
                    INFO     [base.py:22] >> Running passage filter node - SimilarityPercentileCutoff module...                                                                base.py:22
                    INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
[02/17/25 09:06:06] INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
                    INFO     [base.py:19] >> Prompt maker node - SimilarityPercentileCutoff module is deleted.                                                                 base.py:19
                    INFO     [base.py:16] >> Initialize passage filter node - ThresholdCutoff                                                                                  base.py:16
                    INFO     [base.py:22] >> Running passage filter node - ThresholdCutoff module...                                                                           base.py:22
                    INFO     [base.py:19] >> Prompt maker node - ThresholdCutoff module is deleted.                                                                            base.py:19
                    INFO     [base.py:16] >> Initialize passage filter node - PercentileCutoff                                                                                 base.py:16
                    INFO     [base.py:22] >> Running passage filter node - PercentileCutoff module...                                                                          base.py:22
                    INFO     [base.py:19] >> Prompt maker node - PercentileCutoff module is deleted.                                                                           base.py:19
                    INFO     [evaluator.py:205] >> Running node line post_retrieve_node_line...                                                                          evaluator.py:205
                    INFO     [node.py:55] >> Running node prompt_maker...                                                                                                      node.py:55
                    INFO     [base.py:15] >> Initialize prompt maker node - Fstring module...                                                                                  base.py:15
                    INFO     [base.py:23] >> Running prompt maker node - Fstring module...                                                                                     base.py:23
                    INFO     [base.py:20] >> Prompt maker node - Fstring module is deleted.                                                                                    base.py:20
                    INFO     [base.py:15] >> Initialize prompt maker node - Fstring module...                                                                                  base.py:15
                    INFO     [base.py:23] >> Running prompt maker node - Fstring module...                                                                                     base.py:23
                    INFO     [base.py:20] >> Prompt maker node - Fstring module is deleted.                                                                                    base.py:20
                    INFO     [base.py:15] >> Initialize prompt maker node - Fstring module...                                                                                  base.py:15
                    INFO     [base.py:23] >> Running prompt maker node - Fstring module...                                                                                     base.py:23
                    INFO     [base.py:20] >> Prompt maker node - Fstring module is deleted.                                                                                    base.py:20
                    INFO     [base.py:15] >> Initialize prompt maker node - LongContextReorder module...                                                                       base.py:15
                    INFO     [base.py:23] >> Running prompt maker node - LongContextReorder module...                                                                          base.py:23
                    INFO     [base.py:20] >> Prompt maker node - LongContextReorder module is deleted.                                                                         base.py:20
                    INFO     [base.py:15] >> Initialize prompt maker node - LongContextReorder module...                                                                       base.py:15
                    INFO     [base.py:23] >> Running prompt maker node - LongContextReorder module...                                                                          base.py:23
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [base.py:20] >> Prompt maker node - LongContextReorder module is deleted.                                                                         base.py:20
                    INFO     [base.py:15] >> Initialize prompt maker node - LongContextReorder module...                                                                       base.py:15
                    INFO     [base.py:23] >> Running prompt maker node - LongContextReorder module...                                                                          base.py:23
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [long_context_reorder.py:65] >> If you use a summarizer, the reorder will not proceed.                                            long_context_reorder.py:65
                    INFO     [base.py:20] >> Prompt maker node - LongContextReorder module is deleted.                                                                         base.py:20
                    INFO     [base.py:19] >> Initialize generator node - LlamaIndexLLM                                                                                         base.py:19
                    INFO     [base.py:26] >> Running generator node - LlamaIndexLLM module...                                                                                  base.py:26
[02/17/25 09:06:07] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:08] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:09] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:10] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:13] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:14] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:15] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:16] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:17] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:18] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:19] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:20] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [base.py:23] >> Deleting generator module - LlamaIndexLLM                                                                                         base.py:23
[02/17/25 09:06:24] INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
[02/17/25 09:06:25] INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
[02/17/25 09:06:26] INFO     [node.py:55] >> Running node generator...                                                                                                         node.py:55
                    INFO     [base.py:19] >> Initialize generator node - LlamaIndexLLM                                                                                         base.py:19
                    INFO     [base.py:26] >> Running generator node - LlamaIndexLLM module...                                                                                  base.py:26
[02/17/25 09:06:27] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:28] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:29] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [base.py:23] >> Deleting generator module - LlamaIndexLLM                                                                                         base.py:23
                    INFO     [base.py:19] >> Initialize generator node - LlamaIndexLLM                                                                                         base.py:19
                    INFO     [base.py:26] >> Running generator node - LlamaIndexLLM module...                                                                                  base.py:26
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:30] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
[02/17/25 09:06:31] INFO     [_client.py:1786] >> HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"                                         _client.py:1786
                    INFO     [base.py:23] >> Deleting generator module - LlamaIndexLLM                                                                                         base.py:23
                    INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
[02/17/25 09:06:32] INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
[02/17/25 09:06:35] INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
                    INFO     [_client.py:1038] >> HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"                                               _client.py:1038
[02/17/25 09:06:37] INFO     [evaluator.py:218] >> Evaluation complete.                                                                                                  evaluator.py:218
Ingesting VectorDB... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 1/1 0:00:01
Evaluating...         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 6/6 0:01:10

I suspect there may be a problem with my dataset.
Are commas (,) or other special characters not allowed in the dataset?

qid,query,retrieval_gt,generation_gt
q0,"M0609 로봇의 최대 가반 하중은 얼마인가?","doc_1","M0609 로봇의 최대 가반 하중은 6kg입니다."
q1,"M0609 로봇의 J4, J5, J6 관절의 허용 모멘트와 관성은 얼마인가?","doc_2","J4, J5, J6 관절의 허용 모멘트는 36 Nm이며, 허용 관성은 1.6 kg㎡입니다."
q2,M0609 로봇의 최대 작업 반경과 유효 작업 반경은 얼마인가?,"doc_3","최대 작업 반경은 1800mm이며, 유효 작업 반경은 1558mm입니다."
q3,M0609 로봇의 J3 관절의 가동각과 속도는 얼마인가?,"doc_4",m0609s joint specification
q4,M0609 로봇의 위치 반복 정밀도는 얼마인,"doc_5",M0609 로봇의 위치 반복 정밀도는 ±0.03 mm입니다.
q5,M0609 로봇의 J2 관절의 각도 제한과 속도 제한은 얼마인가?,"doc_6","J2 관절의 각도 제한은 -95° ~ 95°이며, 속도 제한은 0 ~ 150 °/s입니다."
doc_id,contents,metadata
doc_1," M0609 로봇의 최대 가반 하중은 6kg입니다", {'source': 'm0609_payload_info'}
doc_2,"M0609 모델의 각 관절(Joint)별 허용 관성을 정리하면 다음과 같습니다.

 J4 (4번 관절)  
   허용 모멘트: 36 Nm  
   허용 관성: 1.6 kg㎡  

 J5 (5번 관절)  
   허용 모멘트: 36 Nm  
   허용 관성: 1.6 kg㎡  

 J6 (6번 관절)  
   허용 모멘트: 36 Nm
   허용 관성: 1.6 kg㎡  

M0609의 J4, J5, J6 관절은 동일한 허용 모멘트와 관성을 가지며, 각각 최대 36 Nm의 모멘트와 1.6 kg㎡의 관성까지 견딜 수 있습니다.","{'source': 'm0609 j4, j5, j6 - inertia info'}"
doc_3," 최대 작업 반경은 1800mm이며 유효 작업 반경은 1558mm입니다",{'source': 'm0609s physical workspace'}
doc_4,"M0609의 **Joint(관절) 사양**은 다음과 같습니다.

### **1. 가동각 (운동 범위)**
- **J1**: ±360° (TP: ±360°)
- **J2**: ±360° (TP: ±95°)
- **J3**: ±150° (TP: ±125°)
- **J4**: ±360° (TP: ±360°)
- **J5**: ±360° (TP: ±135°)
- **J6**: ±360° (TP: ±360°)

### **2. 축별 최대 속도 (정격 가반 하중 운전 시)**
- **J1**: 150 °/s
- **J2**: 150 °/s
- **J3**: 180 °/s
- **J4**: 225 °/s
- **J5**: 225 °/s
- **J6**: 225 °/s

M0609는 높은 유연성과 속도를 제공하며, 특히 **J4~J6의 속도(225°/s)** 가 높아 빠른 움직임이 가능합니다.",{'source': 'm0609s joint specification'}
doc_5,"### **M0609 Robot Control Data**  

- **Number of Axes (축의 개수)**: 6  
- **Maximum TCP Speed (최대 TCP 속도)**: Over 1 m/s  
- **Position Repeatability (위치 반복 정밀도, ISO 9283)**: ±0.03 mm  
- **Vibration & Acceleration (진동 및 가속도)**:
  - 10 ≤ f < 57㎐ : 0.075 mm amplitude  
  - 57 ≤ f ≤ 150㎐ : 1G  
- **Shock Resistance (충격 내성)**:
  - Max Amplitude: 50㎨(5G)  
  - Time: 30㎳, Pulse: 3 of 3 (X, Y, Z)  

이 데이터는 **M0609의 로봇 제어 및 동작 성능과 관련된 정보**로, 정밀도, 속도, 진동 내성 등을 포함합니다.",{'source': 'm0609s basic specification'}
doc_6,"### **M0609 Safety Parameters**  

#### **1. Joint Angle Limits (관절 각도 제한)**  
- **J1**: -360° ~ 360° (Tolerance: ±3°)  
- **J2**: -95° ~ 95° (Tolerance: ±3°)  
- **J3**: -135° ~ 135° (Tolerance: ±3°)  
- **J4**: -360° ~ 360° (Tolerance: ±3°)  
- **J5**: -135° ~ 135° (Tolerance: ±3°)  
- **J6**: -360° ~ 360° (Tolerance: ±3°)  

#### **2. Joint Speed Limits (관절 속도 제한)**
- **J1**: 0 ~ 150 °/s (Default: 150 °/s, Tolerance: ±10 °/s)  
- **J2**: 0 ~ 150 °/s (Default: 150 °/s, Tolerance: ±10 °/s)  
- **J3**: 0 ~ 180 °/s (Default: 180 °/s, Tolerance: ±10 °/s)  
- **J4**: 0 ~ 225 °/s (Default: 225 °/s, Tolerance: ±10 °/s)  
- **J5**: 0 ~ 225 °/s (Default: 225 °/s, Tolerance: ±10 °/s)  
- **J6**: 0 ~ 225 °/s (Default: 225 °/s, Tolerance: ±10 °/s)  

#### **3. Robot/TCP Limits (로봇/TCP 제한)**
- **Force (N)**: 0 ~ 400 N (Default: 96 N)  
- **Power (W)**: 0 ~ 1600 W (Default: 300 W)  
- **Speed (mm/s)**: 0 ~ 7000 mm/s (Default: 2000 mm/s)  
- **Momentum (kgm/s)**: 0 ~ 75 kgm/s (Default: 38 kgm/s)  
- **Collision Detection Sensitivity (%)**: 1 ~ 100% (Default: 75%)  

#### **4. Safety I/O (안전 I/O)**
- **Speed Reduction Ratio (%)**: 1 ~ 100% (Default: 20%)  

M0609의 **안전 파라미터**는 **각도, 속도, 힘, 충돌 감지 민감도 및 속도 제한 비율**을 포함하여 **안전한 작업 환경을 유지하기 위한 설정**으로 구성됩니다.",{'source': 'm0609s safety parameter'}

@vkehfdl1
Contributor

@Sung-Jae-Seong
Can you open the parquet file in the retrieval folder?
You can find it at [your_project_dir] - [your_node_line_name] - retrieval - 0.parquet, or use any of the other parquet files there.
If the 'retrieved_content', 'retrieved_id', and 'retrieve_score' columns each contain a full list, retrieval and ingestion worked correctly.
Otherwise, something may have gone wrong.
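
A minimal sketch of that check with pandas, in case it helps. The function name `check_retrieval_result` is made up for illustration, and the default column names are copied from the comment above; the actual parquet file may spell them differently (for example in plural form), so print `df.columns` first and adjust. The file path in the trailing comment is a placeholder, not a real location.

```python
import pandas as pd


def _is_empty(cell):
    """Treat None, NaN and zero-length lists/arrays as empty."""
    try:
        return len(cell) == 0
    except TypeError:
        # Scalars such as None or NaN have no length.
        return True


def check_retrieval_result(
    df, columns=("retrieved_content", "retrieved_id", "retrieve_score")
):
    """Return a list of problems; an empty list means every column is populated.

    The default column names are an assumption taken from the thread.
    """
    problems = []
    for col in columns:
        if col not in df.columns:
            problems.append(f"missing column: {col}")
            continue
        empty_rows = [i for i, cell in enumerate(df[col]) if _is_empty(cell)]
        if empty_rows:
            problems.append(f"{col}: empty at rows {empty_rows}")
    return problems


# Usage (placeholder path, replace with your own project layout):
# df = pd.read_parquet("<your_project_dir>/<your_node_line_name>/retrieval/0.parquet")
# print(df.columns.tolist())
# print(check_retrieval_result(df))
```

If the returned list is empty, every row has non-empty values in all three columns, which is the "full list" case described above.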


Also, check that every doc_id in your corpus_df is unique. Duplicated doc_id values can cause problems in hybrid_cc like this one.
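
One way to check for duplicates, sketched with pandas. The helper name `find_duplicate_doc_ids` is hypothetical, and the file name in the usage comment is the one from the original `autorag evaluate` command; this assumes the corpus has a `doc_id` column as in the data pasted above.

```python
import pandas as pd


def find_duplicate_doc_ids(corpus_df):
    """Return every row whose doc_id appears more than once in the corpus."""
    # keep=False marks all occurrences of a repeated id, not only the later ones.
    mask = corpus_df["doc_id"].duplicated(keep=False)
    return corpus_df[mask]


# Usage:
# corpus_df = pd.read_parquet("drl_corpus.parquet")
# dupes = find_duplicate_doc_ids(corpus_df)
# print(dupes["doc_id"].tolist())  # an empty list means all doc_id values are unique
```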
