Follow the startup instructions in the README.md file IF NOT ALREADY DONE!
NOTE: To copy and paste in the codespace, you may need to use keyboard commands - CTRL-C and CTRL-V. Chrome may work best for this.
Lab 1 - Working with Neural Networks
Purpose: In this lab, we’ll learn more about neural networks by seeing how one is coded and trained.
- In our repository, we have a set of Python programs to help us illustrate and work with concepts in the labs. These are mostly in the genai subdirectory. Go to the TERMINAL tab in the bottom part of your codespace and change into that directory.
cd genai
- For this lab, we have a simple neural net coded in Python. The file name is nn.py. Open the file either by clicking on genai/nn.py or by entering the command below in the codespace's terminal.
code nn.py
- Scroll down to around line 55. Notice the training_inputs data and the training_outputs data. Each row of the training_outputs is what we want the model to predict for the corresponding input row. As coded, the output for the sample inputs ends up being the same as the first element of the array. For inputs [0,0,1] we are trying to train the model to predict [0]. For the inputs [1,0,1], we are trying to train the model to predict [1], etc. The table below may help to explain.
Dataset | Values | Desired Prediction |
---|---|---|
1 | 0 0 1 | 0 |
2 | 1 1 1 | 1 |
3 | 1 0 1 | 1 |
4 | 0 1 1 | 0 |
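For reference, a network like the one in nn.py can be written in a few lines of NumPy. The sketch below is an assumption about its general shape (a single neuron with a sigmoid activation and gradient-style weight updates), not the exact code in the file:

```python
import numpy as np

# Same training data as in the table above
training_inputs = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
training_outputs = np.array([[0, 1, 1, 0]]).T

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

np.random.seed(1)
weights = 2 * np.random.random((3, 1)) - 1               # random starting weights in [-1, 1)

for _ in range(10000):
    output = sigmoid(np.dot(training_inputs, weights))    # forward pass
    error = training_outputs - output                     # how far off each prediction is
    # nudge each weight in proportion to the error and the sigmoid's slope
    weights += np.dot(training_inputs.T, error * output * (1 - output))

print("Weights after training:\n", weights)
print("Prediction for [1, 0, 0]:", sigmoid(np.dot(np.array([1, 0, 0]), weights))[0])
```

After training, the first weight ends up strongly positive, which is the behavior the steps below ask you to observe.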
- When we run the program, it will train the neural net to try to predict the outputs corresponding to the inputs. You will see the random starting weights and then the adjusted weights that make the model predict the output. You will then be prompted to put in your own input data. We'll look at that in the next step. For now, go ahead and run the program (command below) but don't put in any inputs yet. Just notice how the weights have been adjusted after the training process.
python nn.py
- What you should see is that the weights after training are now set in a way that makes it more likely that the result will match the expected output value. (The higher positive value for the first weight means that the model has looked at the training data and realized it should "weigh" the first input higher in its prediction.) To prove this out, you can enter your own input set - just use 1's and 0's for each input.
- After you put in your inputs, the neural net will process your input and because of the training, it should predict a result that is close to the first input value you entered (the one for Input one).
- Now, let's see what happens if we change the expected outputs to be different. In the editor for the nn.py file, find the line for the training_outputs. Modify the values in the array to be ([[0],[1],[0],[1]]). These are the values of the second element in each of the training data entries. After you're done, save your changes (File > Save, or the Ctrl-S/Cmd-S keyboard shortcut).
- Now, run the neural net again. This time when the weights after training are shown, you should see a bias for a higher weight for the second item.
python nn.py
- At the input prompts, just input any sequence of 0's and 1's as before.
- When the trained model then processes your inputs, you should see that it predicts a value that is close to 0 or 1 depending on what your second input was.
- (Optional) If you get done early and want more to do, feel free to try other combinations of training inputs and training outputs.
**[END OF LAB]**
Lab 2 - Experimenting with Tokenization
Purpose: In this lab, we'll see how different models do tokenization.
- In the same genai directory, we have a simple program that can load a model and print out tokens generated by it. The file name is tokenizer.py. You can view the file either by clicking on genai/tokenizer.py or by entering the command below in the codespace's terminal (assuming you're still in the genai directory).
code tokenizer.py
- This program can be run and passed a model to use for tokenization. To start, we'll be using a model named bert-base-uncased. Let's look at this model on huggingface.co. Go to https://huggingface.co/models and in the Models search area, type in bert-base-uncased. Select the entry for google-bert/bert-base-uncased.
- Once you click on the selection, you'll be on the model card tab for the model. Take a look at the model card for the model and then click on the Files and Versions and Community tabs to look at those pages.
- Now let's switch back to the codespace and, in the terminal, run the tokenizer program with the bert-base-uncased model. Enter the command below. This will download some of the files you saw on the Files tab for the model in HuggingFace.
python tokenizer.py bert-base-uncased
- After the program starts, you will be at a prompt to Enter text. Enter some text like the following to see how it will be tokenized.
This is sample text for tokenization and text for embeddings.
- After you enter this, you'll see the various subword tokens that were extracted from the text you entered. And you'll also see the ids for the tokens stored in the model that matched the subwords.
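For reference, the core of a program like tokenizer.py is only a few lines using the transformers AutoTokenizer; the sketch below is an assumption about its shape, not the exact code:

```python
import sys
from transformers import AutoTokenizer

# Load whichever tokenizer was named on the command line, e.g. bert-base-uncased
model_name = sys.argv[1] if len(sys.argv) > 1 else "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)

text = input("Enter text: ")
tokens = tokenizer.tokenize(text)               # subword tokens
ids = tokenizer.convert_tokens_to_ids(tokens)   # vocabulary ids for those tokens
print("Tokens:", tokens)
print("Token ids:", ids)
```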
- Next, you can try out some other models. Repeat the run-and-enter-text steps above for other tokenizers like the following. (You can use the same text string or different ones. Notice how the text is broken down differently depending on the model, and also note the different meta-characters.)
python tokenizer.py roberta-base
python tokenizer.py gpt2
python tokenizer.py xlnet-large-cased
- (Optional) If you finish early and want more to do, you can look up the models you just tried on huggingface.co/models.
**[END OF LAB]**
Lab 3 - Understanding embeddings, vectors and similarity measures
Purpose: In this lab, we'll see how tokens get mapped to vectors and how vectors can be compared.
- In the repository, we have a Python program that uses a Tokenizer and Model to create embeddings for three terms that you input. It then computes and displays the cosine similarity between each combination. Open the file to look at it by clicking on genai/vectors.py or by using the command below in the terminal.
code vectors.py
- Let's run the program. As we did for the tokenizer example, we'll pass in a model to use. We'll also pass in a second argument which is the number of dimensions from the vector for each term to show. Run the program with the command below. You can wait to enter terms until the next step.
python vectors.py bert-base-cased 5
- The command we just ran loads up the bert-base-cased model and tells it to show the first 5 dimensions of each vector for the terms we enter. The program will prompt you for three terms. Enter each one in turn. You can try two closely related words and one that is not closely related. For example:
- king
- queen
- duck
- Once you enter the terms, you'll see the first 5 dimensions for each term. And then you'll see the cosine similarity displayed between each possible pair. This is how similar each pair of words is. The two that are most similar should have a higher cosine similarity "score".
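To make that concrete, here is a minimal sketch of how per-word embeddings and cosine similarity can be computed with transformers and PyTorch (an assumption about the general approach; vectors.py may differ in details):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")

def embed(word):
    # Encode the word and take the hidden state of its first real token (after [CLS])
    inputs = tokenizer(word, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # shape: [1, num_tokens, 768]
    return hidden[0, 1]

king, queen, duck = embed("king"), embed("queen"), embed("duck")
cos = torch.nn.functional.cosine_similarity
print("king vs queen:", cos(king, queen, dim=0).item())
print("king vs duck: ", cos(king, duck, dim=0).item())
```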
- Each vector in the bert-base models has 768 dimensions. Let's run the program again and tell it to display all 768 dimensions for each of the three terms. Also, you can try another set of terms that are more closely related, like multiplication, division, addition.
python vectors.py bert-base-cased 768
- You should see that the cosine similarities for all pair combinations are not as far apart this time.
- As part of the output from the program, you'll also see the token id for each term. (It appears above the printout of the dimensions. If you don't want to scroll through all the dimensions, you can just run it again with a small number of dimensions like we did before.) If you're using the same model as you did in lab 2 for tokenization, the ids will be the same.
- You can actually see where these mappings are stored if you look at the model on Hugging Face. For instance, for the bert-base-cased model, you can go to https://huggingface.co and search for bert-base-cased. Select the entry for google-bert/bert-base-cased.
- On the page for the model, click on the Files and versions tab. Then find the file tokenizer.json and click on it. The file will be too large to display, so click the "check the raw version" link to see the actual content.
- You can search for the terms you entered previously with a Ctrl-F or Cmd-F and find the mapping between the term and the id. If you look for "##" you'll see mappings for parts of tokens like you may have seen in lab 2.
- If you want, you can try running the vectors.py program with a different model to see results from other models (such as the ones we used in lab 2) and with words that are very close, like embeddings, tokenization, subwords.
**[END OF LAB]**
Lab 4 - Working with transformer models
Purpose: In this lab, we’ll see how to interact with various models for different standard tasks.
- In our repository, we have several different Python programs that utilize transformer models for standard types of LLM tasks. One of them is a simple translation example. The file name is translation.py. Open the file either by clicking on genai/translation.py or by entering the command below in the codespace's terminal.
code translation.py
- Take a look at the file contents. Notice that we are pulling in a specific model ending with 'en-fr'. This is a clue that this model is trained for English to French translation. Let's find out more about it. In a browser, go to https://huggingface.co/models and search for the model name 'Helsinki-NLP/opus-mt-en-fr' (or you can just go to huggingface.co/Helsinki-NLP/opus-mt-en-fr).
- You can look around on the model card for more info about the model. Notice that it has links to an OPUS readme and also links to download its original weights, translation test sets, etc.
- When done looking around, go back to the repository and look at the rest of the translation.py file. What we are doing is loading the model, the tokenizer, and then taking a set of random texts and running them through the tokenizer and model to do the translation. Go ahead and execute the code in the terminal via the command below.
python translation.py
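For reference, the translation pattern in translation.py looks roughly like the sketch below, assuming the MarianMT classes from transformers are used (the example sentences here are illustrative, not the ones in the file):

```python
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-fr"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

texts = ["The weather is nice today.", "Where is the train station?"]
batch = tokenizer(texts, return_tensors="pt", padding=True)   # tokenize the English inputs
translated = model.generate(**batch)                          # generate French token ids
print(tokenizer.batch_decode(translated, skip_special_tokens=True))
```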
- There's also an example program for doing classification. The file name is classification.py. Open the file either by clicking on genai/classification.py or by entering the command below in the codespace's terminal.
code classification.py
- Take a look at the model this one uses, joeddav/xlm-roberta-large-xnli, on huggingface.co and read about it. When done, come back to the repo.
- This uses a HuggingFace pipeline to do the main work. Notice it also includes a list of categories as candidate_labels that it will use to try to classify the data. Go ahead and run it to see it in action. (This will take a while to download the model.) After it runs, you will see each topic, followed by the ratings for each category. The scores reflect how well the model thinks the topic fits a category. The highest score reflects which category the model thinks fits best.
python classification.py
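Under the hood, a zero-shot classification pipeline like the one classification.py uses boils down to something like this sketch (the topic and candidate_labels here are examples, not the ones in the file):

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="joeddav/xlm-roberta-large-xnli")

topic = "The stock market rallied after the central bank cut interest rates."
candidate_labels = ["economy", "sports", "technology", "politics"]   # example categories

result = classifier(topic, candidate_labels)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")   # highest score = best-fitting category
```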
- Finally, we have a program to do sentiment analysis. The file name is sentiment.py. Open the file either by clicking on genai/sentiment.py or by entering the command below in the codespace's terminal.
code sentiment.py
- Again, you can look at the model used by this one, distilbert-base-uncased-finetuned-sst-2-english, on Hugging Face.
- When ready, go ahead and run this one in a similar way and observe which texts it classified as positive and which as negative, along with the relative scores.
python sentiment.py
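A sentiment-analysis pipeline follows the same pattern; here is a minimal sketch with example texts (not the ones in sentiment.py):

```python
from transformers import pipeline

sentiment = pipeline("sentiment-analysis",
                     model="distilbert-base-uncased-finetuned-sst-2-english")

texts = ["I loved this workshop!", "The installation kept failing and I gave up."]
for text, result in zip(texts, sentiment(texts)):
    # Each result has a POSITIVE/NEGATIVE label and a confidence score
    print(text, "->", result["label"], round(result["score"], 3))
```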
- If you're done early, feel free to change the texts, the candidate_labels in the previous model, etc. and rerun the models to see the results.
**[END OF LAB]**
Lab 5 - Using Ollama to run models locally
Purpose: In this lab, we’ll start getting familiar with Ollama, a way to run models locally.
- We already have a script that can download and start Ollama and fetch some models we'll need in later labs. Take a look at the commands being done in the ../scripts/startOllama.sh file.
cat ../scripts/startOllama.sh
- Go ahead and run the script to get Ollama and start it running.
../scripts/startOllama.sh &
- Now let's find a model to use. Go to https://ollama.com and in the Search models box at the top, enter llava.
- Click on the first entry to go to the specific page about this model. Scroll down and scan the various information available about this model.
- Switch back to a terminal in your codespace. While it isn't strictly necessary to do as a separate step, first pull the model down with ollama. (This will take a few minutes.)
ollama pull llava
- Now you can run it with the command below.
ollama run llava
- Now you can query the model by inputting text at the >>>Send a message (/? for help) prompt. Since this is a multimodal model, you can ask it about an image too. Try the following prompt that references a smiley face file in the repo.
What's in this image? ../samples/smiley.jpg
(If you run into an error that the model can't find the image, try using the full path to the file as shown below.)
What's in this image? /workspaces/genai-dd/samples/smiley.jpg
- Now, let's try a call with the API. You can stop the current run with a Ctrl-D or switch to another terminal. Then put in the command below (or whatever simple prompt you want).
curl http://localhost:11434/api/generate -d '{
"model": "llava",
"prompt": "What causes wind?",
"stream": false
}'
- This will take a minute or so to run. You should see a single response object returned. You can try out some other prompts/queries if you want.
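If you'd rather call the API from Python than curl, the same request looks roughly like this (a sketch that assumes the requests package is available in the codespace):

```python
import requests

# Same request as the curl command above; Ollama listens on port 11434 by default
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llava", "prompt": "What causes wind?", "stream": False},
)
print(resp.json()["response"])   # the "response" field holds the generated text
```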
**[END OF LAB]**
Lab 6 - Working with Vector Databases
Purpose: In this lab, we’ll learn about how to use vector databases for storing supporting data and doing similarity searches.
- In our repository, we have a simple program built around a popular vector database called Chroma. The file name is vectordb.py. Open the file either by clicking on genai/vectordb.py or by entering the command below in the codespace's terminal.
code vectordb.py
- To avoid having to load a lot of data and documents, we've seeded the file with a set of sample data strings that we're loosely referring to as documents. These can be seen in the datadocs section of the file.
- Likewise, we've added metadata with categories for the data items. These can be seen in the categories section.
- Go ahead and run this program using the command shown below. This will take the document strings, create embeddings for them, store the vectors in the Chroma database, and then wait for us to enter a query.
python vectordb.py
- You can enter a query here about any topic and the vector database functionality will try to find the most similar matching data that it has. Since we've only given it a set of 10 strings to work from, the results may not be relevant or very good, but they represent the best similarity match the system could find for the query. Go ahead and enter a query. Some sample ones are shown below, but you can choose others if you want. Just remember it will only be able to choose from the data we gave it. The output will show the closest match from the doc strings and also the similarity and category.
Tell me about food.
Who is the most famous person?
How can I learn better?
- After you've entered and run your query, you can add another one or just type exit to stop.
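For reference, the core Chroma calls a program like vectordb.py makes look roughly like the sketch below (the documents, categories and ids here are illustrative, not the actual ones in the file):

```python
import chromadb

client = chromadb.Client()
collection = client.create_collection("docs")

# Chroma embeds the documents with its default embedding model unless one is supplied
collection.add(
    documents=["Pizza is a popular food around the world.",
               "Leonardo da Vinci painted the Mona Lisa."],
    metadatas=[{"category": "food"}, {"category": "art"}],
    ids=["doc1", "doc2"],
)

# Return the single closest match (compare with the n_results change in the next step)
results = collection.query(query_texts=["Tell me about food."], n_results=1)
print(results["documents"], results["distances"], results["metadatas"])
```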
- Now, let's update the number of results that are returned so we can query on multiple topics. In the file vectordb.py, change line 70 to say n_results=3, instead of n_results=1,. Make sure to save your changes afterwards.
- Run the program again with python vectordb.py. Now you can try more complex queries or try multiple queries (separated by commas).
- When done querying the data, if you have more time, you can try modifying or adding to the document strings in the file, then save your changes and run the program again with queries more in line with the data you provided. You can type in "exit" for the query to end the program.
- In preparation for the next lab, remove the mistral model from Ollama's cache and download the llama3.2 model.
ollama rm mistral
ollama pull llama3.2
**[END OF LAB]**
Lab 7 - Working with RAG implemented with vector databases
Purpose: In this lab, we’ll build on the use of vector databases to parse a PDF and allow us to include it in context for LLM queries.
- In our repository, we have a simple program built for doing basic RAG processing. The file name is rag.py. Open the file either by clicking on genai/rag.py or by entering the command below in the codespace's terminal.
code rag.py
- This program reads in a PDF, parses it into chunks, creates embeddings for the chunks and then stores them in a vector database. It then uses matching content from the vector database as additional context for the prompt to the LLM. There is an example PDF named data.pdf in the samples directory. It contains the same random document strings that were in some of the other programs. If interested, you can look at it in the GitHub repo at https://github.com/skillrepos/genai-dd/blob/main/samples/data.pdf.
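In outline, that read-parse-embed-store-then-query flow looks something like the sketch below (a hedged version using a LangChain-style toolchain similar to lab 8; rag.py's actual libraries and identifiers may differ):

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings.fastembed import FastEmbedEmbeddings
from langchain_community.llms import Ollama

# Read the PDF, split it into chunks, and store embeddings of the chunks in Chroma
pages = PyPDFLoader("../samples/data.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(pages)
db = Chroma.from_documents(chunks, FastEmbedEmbeddings())

# At query time, retrieve the most similar chunks and pass them as context to the LLM
question = "What does the document say about art and literature topics?"
context = "\n".join(doc.page_content for doc in db.similarity_search(question, k=3))
prompt = f"Answer the question based only on this context:\n{context}\n\nQuestion: {question}"
print(Ollama(model="llama3.2").invoke(prompt))
```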
- You can now run the program and pass in the ../samples/data.pdf file. This will read in the PDF, tokenize it and store it in the vector database. (Note: A different PDF file can be used, but it needs to be one that is primarily just text. The PDF parsing being used here isn't sophisticated enough to handle images, etc.)
python rag.py ../samples/data.pdf
- The program will be waiting for a query. Let's ask it about something that is only in the document. As a suggestion, you can try the one below.
What does the document say about art and literature topics?
- The response should include only conclusions based on the information in the document.
- Now, let's ask it a query for some extended information. For example, try the query below. Then hit enter.
Give me 5 facts about the Mona Lisa
- In the data.pdf file, there is one (and only one) fact about the Mona Lisa - an obscure one about no eyebrows. In the output, you will probably see only this fact, or this fact along with others derived from it or a note about the lack of other information.
- The reason the LLM couldn't add any other facts was due to the PROMPT_TEMPLATE we have in the file. Take a look at it starting around line 29. Note how it limits the LLM to only using the context that comes from our doc (line 51).
- To change this so the LLM can use our context and its own training, we need to change the PROMPT_TEMPLATE. Replace the existing PROMPT_TEMPLATE at lines 29-37 with the lines below.
PROMPT_TEMPLATE = """
Answer the {question} based on this context:
{context} and whatever other information you have.
Your response must include any relevant information from {context}.
Provide a detailed answer.
"""
- Save your changes. Type "exit" to end the current run and then run the updated code. Enter the same query "Give me 5 facts about the Mona Lisa". This time, the program will run for several minutes and then the LLM should return 5 "real" facts about the Mona Lisa, with the information from our document included. (If the answer isn't returned by the time the break is over, you can just leave it running and check back later.)
python rag.py ../samples/data.pdf
**[END OF LAB]**
Lab 8 - Working with Agents and Agentic RAG
Purpose: In this lab, we’ll see how to setup an agent using RAG with a tool.
- In this lab, we'll download a medical dataset, parse it into a vector database, and create an agent with a tool to help us get answers. First, let's take a look at the dataset we'll be using for our RAG context. We'll be using a medical Q&A dataset called keivalya/MedQuad-MedicalQnADataset. You can go to the page for it on HuggingFace.co and view or explore some of its data if you want. To get there, go to HuggingFace.co, search for "keivalya/MedQuad-MedicalQnADataset", and follow the links.
- Now, let's create the Python file that will pull the dataset, store it in the vector database and invoke an agent with the tool to use it as RAG. First, create a new file for the project.
code lab8.py
- Now, add the imports.
from datasets import load_dataset
from langchain_community.document_loaders import DataFrameLoader
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.llms import Ollama
from langchain_community.embeddings.fastembed import FastEmbedEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain.chains import RetrievalQA
from langchain.agents import Tool
from langchain.agents import create_react_agent
from langchain import hub
from langchain.agents import AgentExecutor
- Next, we pull and load the dataset.
data = load_dataset("keivalya/MedQuad-MedicalQnADataset", split='train')
data = data.to_pandas()          # convert to a pandas DataFrame
data = data[0:100]               # keep only the first 100 rows so the lab runs quickly
df_loader = DataFrameLoader(data, page_content_column="Answer")
df_document = df_loader.load()   # one document per row, using the Answer column as content
- Then, we split the text into chunks and load everything into our Chroma vector database.
text_splitter = CharacterTextSplitter(chunk_size=1250,
                                      separator="\n",
                                      chunk_overlap=100)
texts = text_splitter.split_documents(df_document)
# set some config variables for ChromaDB
CHROMA_DATA_PATH = "vdb_data/"
embeddings = FastEmbedEmbeddings()
# embed the chunks as vectors and load them into the database
db_chroma = Chroma.from_documents(texts, embeddings, persist_directory=CHROMA_DATA_PATH)
- Set up memory for the chat, and choose the LLM.
conversational_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=4,                   # number of messages stored in memory
    return_messages=True   # must return the messages in the response
)
llm = Ollama(model="llama3.2",temperature=0.0)
- Now, define the mechanism to use for the agent and retrieving data. ("qa" = question and answer)
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db_chroma.as_retriever()
)
- Define the tool itself (using the "qa" chain we just defined above as the tool's function).
# Define the list of tool objects to be used by LangChain.
tools = [
    Tool(
        name='Medical KB',
        func=qa.run,
        description=(
            'use this tool when answering medical knowledge queries to get '
            'more information about the topic'
        )
    )
]
- Create the agent using the hwchase17/react-chat prompt pulled from the LangChain hub.
prompt = hub.pull("hwchase17/react-chat")
agent = create_react_agent(
    tools=tools,
    llm=llm,
    prompt=prompt,
)
# Create an agent executor by passing in the agent and tools
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    memory=conversational_memory,
    max_iterations=30,
    max_execution_time=600,
    # early_stopping_method='generate',
    handle_parsing_errors=True
)
- Add the input processing loop.
while True:
    query = input("\nQuery: ")
    if query == "exit":
        break
    if query.strip() == "":
        continue
    agent_executor.invoke({"input": query})
- Now, save the file and run the code.
python lab8.py
- You can prompt it with queries related to the info in the dataset, like:
I have a patient that may have Botulism. How can I confirm the diagnosis?
- In our limited environment, this may take up to 10 minutes to return a final answer, but you will be able to see it going through the "reasoning" process and ultimately providing a response using the Medical KB tool.
**[END OF LAB]**
**THANKS!**