GitHub - C-anwoy/LLM2401-Assignment: Repository for assignment of the course ELL881/AIL821. Modified version of the original KG-RAG repository (https://github.com/BaranziniLab/KG_RAG)

Instructions for the Assignment

1. Set up the environment following Steps 1-4 in the original README.

2. Update your Google API key in gpt_config.env.

3. Replicate KG-RAG with gemini-1.5-flash via sh run_gemini.sh.

4. Evaluate the model via python data/assignment_results/evaluate_gemini.py

5. Implement three enhancement strategies in kg_rag/rag_based_generation/GPT/run_mcq_qa.py.

6. Evaluate these model variants by changing the model output path in the file data/assignment_results/evaluate_gemini.py and running it.
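
The comparison in step 6 comes down to scoring each variant's output file in the same way. The sketch below shows one way to compute multiple-choice accuracy. It is not the contents of evaluate_gemini.py: the column names `correct_answer` and `llm_answer`, the helper `mcq_accuracy`, and the sample rows are all hypothetical and only illustrate the idea.

```python
import csv
import io


def mcq_accuracy(rows, answer_key="correct_answer", output_key="llm_answer"):
    """Fraction of rows whose model answer equals the correct option.

    The column names are assumptions; adjust them to the real output file.
    Matching is exact after trimming whitespace and lowercasing.
    """
    rows = list(rows)
    if not rows:
        return 0.0
    hits = sum(
        1
        for row in rows
        if row[answer_key].strip().lower() == row[output_key].strip().lower()
    )
    return hits / len(rows)


# In-memory stand-in for a model output CSV (made-up rows, not real results).
sample_csv = io.StringIO(
    "correct_answer,llm_answer\n"
    "A,A\n"
    "B,C\n"
    "D,d\n"
)
accuracy = mcq_accuracy(csv.DictReader(sample_csv))
print(f"accuracy = {accuracy:.3f}")
```

To compare variants, the same function would be called once per output path, so that every variant is scored with identical matching rules.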

Original README

What is KG-RAG?

KG-RAG stands for Knowledge Graph-based Retrieval Augmented Generation.

Start by watching the video of KG-RAG

KG_RAG_schematics.mov

It is a task-agnostic framework that combines the explicit knowledge of a Knowledge Graph (KG) with the implicit knowledge of a Large Language Model (LLM). The work is described in an arXiv preprint (arXiv:2311.17330; see Citation).

Here, we use a massive biomedical KG called SPOKE as the provider of the biomedical context. SPOKE integrates over 40 biomedical knowledge repositories from diverse domains, each focusing on biomedical concepts such as genes, proteins, drugs, compounds, and diseases, along with their established connections. SPOKE consists of more than 27 million nodes of 21 different types and 53 million edges of 55 types [Ref].

The main feature of KG-RAG is that it extracts "prompt-aware context" from the SPOKE KG, which is defined as:

the minimal context sufficient to respond to the user prompt.

Hence, this framework empowers a general-purpose LLM by incorporating an optimized domain-specific 'prompt-aware context' from a biomedical KG.
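
One way to picture "prompt-aware context" is as a pruning step: out of all statements retrieved from the KG, keep only those that are similar enough to the user prompt. The sketch below illustrates that idea with a simple bag-of-words cosine similarity. This is an assumption made for illustration and is not taken from the KG-RAG code. The function names, the threshold values, and the example statements (two of which are placeholders, not SPOKE content) are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def prune_context(prompt, kg_statements, min_similarity=0.3, max_statements=2):
    """Keep at most max_statements statements scoring >= min_similarity."""
    prompt_vec = bag_of_words(prompt)
    scored = [
        (cosine_similarity(prompt_vec, bag_of_words(s)), s) for s in kg_statements
    ]
    kept = [pair for pair in scored if pair[0] >= min_similarity]
    kept.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    return [statement for _, statement in kept[:max_statements]]


# Hypothetical KG statements; "Disease X" and "Compound Z" are placeholders.
statements = [
    "Disease X is associated with gene Y",
    "Setmelanotide is approved for weight management in Bardet-Biedl syndrome",
    "Compound Z binds protein W",
]
prompt = "Which drug is approved for weight management in Bardet-Biedl syndrome?"
context = prune_context(prompt, statements)
print(context)
```

The threshold and the cap together express "minimal but sufficient": the threshold drops statements unrelated to the prompt, and the cap bounds how much text is passed on to the LLM.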

Example use case of KG-RAG

The following snippet shows news from the FDA website about the drug "setmelanotide", which the FDA approved for weight management in patients with Bardet-Biedl syndrome.

Ask GPT-4 about the above drug:

WITHOUT KG-RAG

Note: This example was run using KG-RAG v0.3.0. We are prompting GPT from the terminal, NOT from the ChatGPT browser interface. The temperature parameter is set to 0 for all analyses. Refer to this YAML file for the parameter settings.

bbsyndrome_without_kgrag.mov

WITH KG-RAG

Note: This example was run using KG-RAG v0.3.0. The temperature parameter is set to 0 for all analyses. Refer to this YAML file for the parameter settings.

bbsyndrome_with_kgrag.mov

As you can see, KG-RAG was able to give the correct information about the FDA-approved drug.

How to run KG-RAG

Note: At the moment, KG-RAG is specifically designed for running prompts related to diseases. We are actively working on improving its versatility.

Step 1: Clone the repo

Clone this repository. All biomedical data used in the paper are included in this repository, so you don't have to download them separately.

Step 2: Create a virtual environment

Note: Scripts in this repository were run using Python 3.10.9.

conda create -n kg_rag python=3.10.9
conda activate kg_rag
cd KG_RAG

Step 3: Install dependencies

pip install -r requirements.txt

Step 4: Run the setup script

Note: Make sure you are in the KG_RAG folder.

Running the setup script creates the disease vector database for KG-RAG.

python -m kg_rag.run_setup
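
Before running the setup script, it can help to confirm the two conditions the notes above mention: the Python version and the working directory. The following is a minimal sketch of such a check. The helper `preflight_problems` is hypothetical and is not part of the repository. It only looks for `requirements.txt` and the `kg_rag` package folder, both of which the steps above rely on.

```python
import sys
from pathlib import Path

EXPECTED_PYTHON = (3, 10)  # the README reports that scripts were run on 3.10.9


def preflight_problems(repo_dir, version=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    major, minor = tuple(version or sys.version_info[:2])[:2]
    repo_dir = Path(repo_dir)
    problems = []
    if (major, minor) != EXPECTED_PYTHON:
        problems.append(f"Python {major}.{minor} found; the README used 3.10.9")
    if not (repo_dir / "requirements.txt").is_file():
        problems.append("requirements.txt not found; are you in the KG_RAG folder?")
    if not (repo_dir / "kg_rag").is_dir():
        problems.append("kg_rag package folder not found")
    return problems


if __name__ == "__main__":
    # Check the current working directory and report anything suspicious.
    for problem in preflight_problems(Path.cwd()):
        print("WARNING:", problem)
```

A different minor Python version may still work; the check only flags a deviation from the version the README reports.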

Citation

@article{soman2023biomedical,
  title={Biomedical knowledge graph-enhanced prompt generation for large language models},
  author={Soman, Karthik and Rose, Peter W and Morris, John H and Akbas, Rabia E and Smith, Brett and Peetoom, Braian and Villouta-Reyes, Catalina and Cerono, Gabriel and Shi, Yongmei and Rizk-Jackson, Angela and others},
  journal={arXiv preprint arXiv:2311.17330},
  year={2023}
}
