Master RAG with LangChain: A Practical Guide

Sakalya Mitra · 42 min read

Introduction

In the realm of Natural Language Processing (NLP), the quest for more human-like text comprehension and generation has led to innovative breakthroughs. One such advancement is the Retrieval-Augmented Generation (RAG) framework, which combines retrieval-based methods with generative models. By integrating retrieval mechanisms into the generation process, RAG allows models to access external knowledge sources, significantly enhancing the quality, coherence, and relevance of generated text. This makes RAG a versatile tool for various NLP tasks, including question answering, text summarization, and conversational agents.

Complementing RAG is LangChain, a framework that supplies the building blocks a RAG application needs: document loaders, text splitters, embedding and vector store integrations, retrievers, and prompt and chain abstractions. Together, RAG and LangChain make it much easier to build applications whose answers are grounded in your own data.

In this blog, we go slowly, step by step: we start from the very basics of what RAG is and work up to building our own RAG-based chatbot from scratch.

Understanding RAG

Retrieval-Augmented Generation (RAG) merges retrieval-based and generation-based methods. Unlike traditional models that rely solely on the knowledge captured in their parameters during training, RAG adds a retrieval layer that fetches relevant information from external sources at query time. This dual mechanism allows RAG to blend what the model already knows with new data, producing richer and more contextually accurate text. The retriever module sources pertinent information from databases or documents, while the generator combines this information with the model's internal knowledge to create coherent and contextually rich outputs.

RAG Architecture

Source: LangChain Official Documentation

RAG's versatility shines in tasks such as question answering, text summarization, and dialogue generation. It excels by using its retrieval capabilities to access factual information from vast corpora, ensuring responses are accurate and informative. By integrating external knowledge, RAG significantly enhances the relevance and accuracy of generated text, addressing issues of factual correctness and domain-specific knowledge. Its ability to seamlessly update or incorporate new sources makes RAG a highly adaptable tool, capable of evolving with changing contexts and requirements, offering a dynamic edge over traditional language models.
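To make the retriever-plus-generator idea concrete before we bring in any library, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the three sentences in KNOWLEDGE_BASE are made up, retrieve ranks passages by simple word overlap (real systems use embeddings), and generate only builds the prompt that a real LLM would receive instead of calling a model.

```python
import re

# Tiny in-memory "knowledge base" (made-up sentences, for illustration only).
KNOWLEDGE_BASE = [
    "RAG combines a retriever with a generator.",
    "The retriever fetches relevant passages from external sources.",
    "The generator writes an answer using the retrieved passages.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank passages by how many words they share with the query."""
    q = tokenize(query)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]


def generate(query: str, passages: list[str]) -> str:
    """Stand-in for an LLM call: builds the prompt a real model would receive."""
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {query}"


question = "What does the retriever fetch?"
top = retrieve(question)
prompt_text = generate(question, top)
print(prompt_text)
```

The important part is the order of operations: retrieve first, then hand the retrieved text to the generator together with the question. The rest of this blog replaces each toy piece with a real component.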

Introducing LangChain

LangChain is a robust framework designed to streamline the development of applications powered by large language models (LLMs). By offering a comprehensive structure, LangChain simplifies the process of integrating LLMs into various applications, whether it be for chatbots, automated content creation, or advanced data analysis tools. The framework focuses on modularity, allowing developers to utilize its components efficiently and effectively, which significantly reduces the complexity typically associated with LLM-based development. This modularity ensures that developers can build, customize, and extend their applications with ease, leveraging the full potential of cutting-edge language models.

One of the standout features of LangChain is its ability to abstract complex functionalities into manageable pieces. This not only accelerates the development process but also enhances the scalability and maintainability of applications. LangChain provides a variety of built-in tools and libraries that cater to different aspects of language model integration, making it a versatile choice for developers aiming to harness the power of LLMs. With its user-friendly design and powerful capabilities, LangChain is set to become a key player in the field of natural language processing and AI-driven application development. For more detailed information, visit the official LangChain website.

Understanding Chains in LangChain

First, let's look at how a LangChain chain is constructed and how it functions. A chain is basically a pipeline that processes an input using a specific combination of primitives. Intuitively, it can be thought of as a ‘step’ that performs a certain set of operations on an input and returns the result. A chain can be anything from a prompt-based pass through an LLM to a Python function applied to a piece of text.

So how do you build a chain? The LangChain Expression Language (LCEL) is here to make it easy for you.

LCEL offers a declarative way to create LangChain chains, allowing you to focus on the "what" rather than the "how" of your workflow. Think of it as legos for LangChain components – you simply snap them together using a clear and concise syntax.

While LangChain offers the power of custom code, LCEL provides several advantages:

  • Superfast Development: LCEL allows you to rapidly prototype and build chains without getting bogged down in complex code.

  • Advanced Features, Simplified: LCEL offers built-in support for powerful features like streaming, asynchronous execution, and parallel processing – all without writing a single line of extra code.

  • Seamless Integration: LCEL integrates perfectly with LangSmith and LangServe, making it easy to monitor, debug, and deploy your chains.
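To build some intuition for the "snap together" syntax, here is a toy sketch of how a pipe operator can chain steps in plain Python. This is not LangChain's implementation: the Step class and its invoke method are hypothetical names chosen to resemble the real thing, and the sketch has none of LCEL's streaming, async, or parallel features.

```python
from typing import Any, Callable


class Step:
    """Wraps a function so that steps can be chained with the | operator."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # (a | b) is a new step that runs a, then feeds its result to b.
        return Step(lambda x: other.func(self.func(x)))

    def invoke(self, x: Any) -> Any:
        return self.func(x)


strip = Step(str.strip)
upper = Step(str.upper)
exclaim = Step(lambda s: s + "!")

# Each | produces a bigger step; nothing runs until invoke is called.
chain = strip | upper | exclaim
result = chain.invoke("  hello lcel ")
print(result)
```

The design point to notice is that composing steps and running them are separate: the chain is declared once and can then be invoked on any input.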

Let's get our hands dirty and code our very first RAG Chain and understand how it works under the hood! We will build a simple RAG chain that uses knowledge from this website and answers your queries.

0. Installing Libraries

You need LangChain and its OpenAI integration installed. Run the following command to install them (beautifulsoup4 provides the bs4 module we import later):

!pip install -qU langchain langchain-openai langchain_chroma langchain_community langchainhub beautifulsoup4

1. Setting Up LangChain and OpenAI

Before proceeding with the coding part, make sure you have generated an OpenAI API key.

import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

2. Building the Chain

We’ll start by importing all the libraries that we’ll be using in this example.

import bs4
from langchain import hub
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

That's a lot of imports, right? Let's look at what each one is used for:

  • BeautifulSoup (bs4): This library serves as our web scraping workhorse. It parses HTML content retrieved from websites, allowing us to navigate and extract the specific data we need for the RAG chain.

  • LangChain Hub (hub): A central repository of shared, pre-built prompts. In this example we use it to pull a ready-made RAG prompt instead of writing our own.

  • Chroma (from langchain_chroma): Chroma is an open-source vector database, and langchain_chroma is its LangChain integration. It stores the embeddings of our text chunks and lets us search them by similarity, which is what the retrieval step of our RAG chain relies on.

  • WebBaseLoader (from langchain_community.document_loaders): This document loader fetches web pages from the URLs we give it and uses BeautifulSoup to parse them into LangChain documents.

  • StrOutputParser (from langchain_core.output_parsers): As the name suggests, this class converts the final output of our chain (a chat message) into a plain, human-readable string.

  • RunnablePassthrough (from langchain_core.runnables): This acts as a utility component, allowing us to pass data along the chain without any modifications. It's helpful for maintaining the flow of information between different parts of the chain.

  • OpenAIEmbeddings (from langchain_openai): This class wraps OpenAI's embedding models. In our RAG chain, it converts each text chunk (and later each query) into a vector so that relevant chunks can be found by similarity. The text generation itself is handled by ChatOpenAI.

  • RecursiveCharacterTextSplitter (from langchain_text_splitters): This class splits text into smaller chunks. Smaller chunks are easier to embed and retrieve, and they keep the context we send to the LLM within its input limit.

3. Setup our Knowledge Base

The most crucial step in RAG is setting up our knowledge base which will act as a source from which our agent will retrieve information and utilize it to answer our queries.

# Load, chunk and index the contents of the blog.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2017-06-21-overview/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())

We first fetch the content from our target website and focus on specific parts of the blog posts like titles, headers, and the main content.

In the second stage, we split the extracted text into smaller chunks for better handling. Consecutive chunks share some text so that meaning is preserved across chunk boundaries; the size of that shared portion is the chunk_overlap. Finally, we embed each chunk with a pre-trained embedding model from OpenAI and store the resulting vectors in the Chroma database.
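If chunk_size and chunk_overlap still feel abstract, the following toy splitter may help. It is a simplified sketch, not how RecursiveCharacterTextSplitter works internally: the real splitter tries to break on separators such as paragraphs and sentences, while this hypothetical split_with_overlap function just slides a fixed-size character window over the text.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size character windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


chunks = split_with_overlap("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Notice that the last three characters of each chunk reappear at the start of the next one. That repeated tail is what keeps a sentence from being cut off without any context on the other side.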

4. Retrieve and Generate

Now that we are ready with our knowledge base, we will now retrieve relevant information based on our query and generate a proper answer through our agent.

# Retrieve and generate using the relevant snippets of the blog.
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")

The retriever is initialized from the vector store, a database optimized for fast similarity search over vectors (numerical representations of text). The predefined prompt is pulled from the LangChain Hub, a repository of prompts designed for various tasks. The pulled prompt will be used to structure the input for the language model.
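What does "retrieving by similarity" actually mean? Here is a rough sketch under loud assumptions: the embed function below uses plain word counts as a stand-in for real embeddings (OpenAI's embeddings are dense float vectors produced by a model), and top_k is a hypothetical helper, not a Chroma API. The cosine similarity formula itself is the standard one.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Real embeddings are dense vectors from a model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity: dot product divided by the product of the vector lengths."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, texts: list[str], k: int = 2) -> list[str]:
    """Return the k texts whose vectors are most similar to the query vector."""
    q = embed(query)
    return sorted(texts, key=lambda t: cosine(q, embed(t)), reverse=True)[:k]


chunks = [
    "convolutional networks process images",
    "recurrent networks process sequences",
    "gradient descent minimizes a loss",
]
best = top_k("how do convolutional networks process images", chunks, k=1)
print(best)
```

A real vector store does the same ranking, only with learned embeddings and an index that avoids comparing the query against every stored vector one by one.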

💡 You can browse the various other prompts available on the LangChain Hub.

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

We define this function to take a list of documents and concatenate their content. The content of each document is joined with double newlines (\n\n), formatting them into a single string suitable for input into the language model.
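Here is a quick way to see what format_docs returns without touching the vector store. FakeDoc is a hypothetical stand-in that exposes only the page_content attribute the function reads; real LangChain documents also carry metadata.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in with only the attribute that format_docs reads."""
    page_content: str


def format_docs(docs):
    # Same function as in the blog: join the chunk texts with blank lines.
    return "\n\n".join(doc.page_content for doc in docs)


formatted = format_docs([FakeDoc("First chunk."), FakeDoc("Second chunk.")])
print(formatted)
```

The blank line between chunks gives the language model a visual boundary, so it can tell where one retrieved snippet ends and the next begins.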

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

Now comes the interesting part: using LCEL. We combine all the individual steps into a single sequential chain in which they run one after another. This is the power of LCEL.

Components:

  • retriever | format_docs: Combines the retriever with the format_docs function to retrieve and format documents based on a query.

  • "question": RunnablePassthrough(): Passes the input question through as it is, without any modification.

  • prompt: The prompt pulled earlier is used to structure the input.

  • llm: The large language model generates a response based on the formatted documents and the prompt.

  • StrOutputParser(): Parses the output from the language model into a usable string format.

The chain processes an input question by first retrieving relevant documents, formatting them, and then using the prompt to generate a coherent response from the language model. The final output is parsed into a string for easy use.
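The first step of the chain, the dictionary, is the part that tends to confuse people, so here is a small sketch of the idea in plain Python. The names fake_retriever, passthrough and run_parallel are hypothetical stand-ins; the sketch only mimics the behaviour described above, where every entry of the dictionary receives the same input question.

```python
def fake_retriever(question: str) -> list[str]:
    # Hypothetical stand-in: a real retriever would query the vector store.
    return [f"chunk about: {question}", "another related chunk"]


def join_chunks(chunks: list[str]) -> str:
    return "\n\n".join(chunks)


def passthrough(value):
    # Returns its input unchanged, like the passthrough entry for "question".
    return value


def run_parallel(branches: dict, value) -> dict:
    """Call every branch with the same input and collect results under the branch's key."""
    return {key: branch(value) for key, branch in branches.items()}


branches = {
    "context": lambda q: join_chunks(fake_retriever(q)),
    "question": passthrough,
}
prompt_inputs = run_parallel(branches, "What is a CNN?")
print(prompt_inputs)
```

The resulting dictionary has exactly the two keys the prompt needs, context and question, which is why it can be piped straight into the prompt.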

Let's now ask our query and see the magic happen!

rag_chain.invoke("What is Convolutional Neural Networks?")

OUTPUT

Convolutional Neural Networks (CNN) are a type of feed-forward artificial 
neural networks inspired by the organization of the visual cortex system. 
They use convolutional layers with fixed small matrices called kernels to 
process images for tasks like edge detection, blurring, and sharpening 
efficiently. The connectivity pattern between neurons in CNNs mimics the 
flow of visual information in the human brain, allowing for effective object 
recognition.

Woohoo! You have your first chain working and generating answers to your queries. Isn't that fascinating?

Let's not stop our excitement and explore how we can use our own documents as knowledge base and retrieve information from them using the power of LangChain.

Imagine you have been given many documents by your school or college, along with a list of questions to answer after reading all of them as your summer vacation assignment. Well, one way is to sit there for your entire vacation and get it done.

But you have superpowers, right? With LangChain you can simply build a knowledge base from the documents and get your questions answered within a few minutes.

Then why wait? Let's go ahead and build it so that you can be done with your summer homework!

Leveraging Custom Documents for Powerful Information Retrieval

In my 12th grade, I always hated Inorganic Chemistry because it was all rote memorization and no practical work. I would definitely not have done the homework and read through hundreds of pages of inorganic theory!

So what's the solution? Simple! Build my own knowledge base from the documents and get all the questions answered utilizing LangChain.

0. Installing required libraries

Now the first step is always to install necessary libraries that will help us in the process.

As we are dealing with our own PDFs, we will need a library that reads the content of a PDF and helps us store it in the knowledge base. LangChain provides Document Loaders for exactly this task.

!pip install pypdf
from langchain_community.document_loaders import PyPDFLoader

In this example we will be using PyPDFLoader. But LangChain provides many more document loaders, which you can choose from depending on your file type and data. You can read about all of them here.

Now once we have the necessary things imported, let's split, index and store our document.

1. Creating our Knowledge Base

import os

from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Loading our data (PyPDFLoader returns one document per PDF page)
file_path = 'd and f block.pdf'
loader = PyPDFLoader(file_path)
documents = loader.load()

# Splitting the documents into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
docs = text_splitter.split_documents(documents)

# Creating the embedding model for embedding our text data
model_name = 'text-embedding-ada-002'

embeddings = OpenAIEmbeddings(
    model=model_name,
    # Reuse the API key we stored in the environment earlier
    openai_api_key=os.environ["OPENAI_API_KEY"],
)

# Storing our data into vector database
vectorstore = Chroma.from_documents(documents=docs, embedding=embeddings)

I hope the above steps are all clear, as we have done exactly the same thing as in our previous experiment. We have only changed the input: instead of a web page, we now take a PDF document and extract its data using a document loader.

Great!!!

Now comes the part where we create the chain and query it.

2. Retrieve and Generate

retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

The chain and the prompt are exactly the same as before. But you can definitely try some different prompts and play around to see how the output changes.

rag_chain.invoke("What is d block?")

OUTPUT

The d-block elements are those in the middle of the periodic table, belonging to groups 3 to 12. They have a general electronic configuration of (n – 1)d1-10 ns1-2. Transition elements are defined as having incompletely filled d orbitals in their ground state or oxidation states. Zinc, cadmium, mercury are not transition metals due to completely filled d orbitals.

Yay! I got my answer. That's amazing, isn't it? With so little effort you can now get answers to all the questions and get your homework done with ease.

💡
What if you are also asked to return the sources from where you have read about the topic and answered the questions? 😭

Well worry not! LangChain has got you covered even in this situation. You can access the source of the documents retrieved from the vector database based on which the answer is being generated by the rag chain.

LangChain can run runnable components in parallel, which lets you fetch the sources alongside the answer. This is easily achieved with the RunnableParallel class of LangChain.

In the first step of Retrieve and Generate, we declared a retriever, right? That retriever fetches the relevant chunks from the document, and those chunks are then used as the context. So using that retriever, we can easily access the source documents.

from langchain_core.runnables import RunnableParallel

rag_chain = (
    RunnablePassthrough.assign(context=(lambda x: format_docs(x["context"])))
    | prompt
    | llm
    | StrOutputParser()
)

# Use the retriever we created from the vector store above
rag_chain_with_source = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
).assign(answer=rag_chain)

rag_chain_with_source.invoke("What is d-block?")

  • RunnablePassthrough.assign(context=(lambda x: format_docs(x["context"]))): This passes the incoming dictionary through and replaces its "context" value (the list of retrieved documents) with the formatted string that the prompt expects. The original documents are not lost: they remain under "context" in the final output of rag_chain_with_source.

  • | prompt: This represents a predefined prompt that we fetched from the hub.

  • | llm: This refers to the large language model. It will consider the formatted context (the documents) and the prompt to generate a response.

  • | StrOutputParser(): This part ensures the final output from the LLM is parsed as a string, making it easier to work with and interpret.

The next part passes the chain we declared through a RunnableParallel component, so that we can access the sources as well as generate the response.

  • RunnableParallel({...}): This creates parallel branches. Both entries within the curly braces receive the same input and are executed simultaneously.

  • "context": retriever - This uses the retriever we defined earlier to find relevant chunks in your documents and stores them as the value of the key "context".

  • "question": RunnablePassthrough() - This ensures the user's question is passed through this branch without modification.

  • .assign(answer=rag_chain): This runs the rag_chain defined above on the dictionary produced by the parallel step and stores its output under the key "answer". The final result therefore contains the retrieved documents, the question, and the generated answer.
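The behaviour of assign is easier to see in a toy version. The sketch below is plain Python written for illustration, not LangChain code: assign and fake_answer_chain are hypothetical names, and the "documents" are just short strings.

```python
def assign(state: dict, **new_fields) -> dict:
    """Return a copy of state with extra keys, each computed from the current state."""
    return {**state, **{name: fn(state) for name, fn in new_fields.items()}}


def fake_answer_chain(state: dict) -> str:
    # Hypothetical stand-in for the prompt, LLM and output parser steps.
    return f"answer built from {len(state['context'])} documents"


state = {"context": ["doc A", "doc B"], "question": "What is d-block?"}
result = assign(state, answer=fake_answer_chain)
print(result)
```

Because the new key is added next to the existing ones instead of replacing them, the retrieved documents are still there when the answer comes back. That is what makes it possible to show sources alongside the answer.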

OUTPUT

{'context': [Document(page_content='Portal for CBSE Notes, Test Papers, Sample Papers, Tips and Tricks \nCBSE Class-12 Chemistry Quick Revision Notes \nChapter-08: The D and F-Block Elements  \n \n \n• The d -Block elements:  \na) The elements lying in the middle of periodic table belonging to groups 3 to 12 are \nknown as d – block elements.  \nb) Their general electronic configuration is (n – 1)d1-10 ns1-2 where (n – 1) stands for \npenultimate (last but one) shell.  \n• Transition element:  \na) A transition element is defined as the one which has incompletely filled d orbitals in \nits ground state or in any one of its oxidation states.  \nb) Zinc, cadmium, mercury are not regarded as transition metals due to completely \nfilled d – orbital.  \n•\nThe f-Block elements:  \nThe elements constituting the f -block are those in which the 4 f and 5 f orbitals are \nprogressively filled in the latter two long periods.  \n•\nLanthanoids:  \nThe 14 elements immediately following lanthanum, i.e., Cerium (58) to Lutetium (71)', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='c) 5d – transition series. It consists of elements with atomic number 57(La), 72(Hf) to \n80(Hg) having incomplete 5d orbitals. It is called third transition series.  \nd) 6d – transition series. It consists of elements with atomic number 89(Ac), 104(Rf) to \n112(Uub) having incomplete 6d orbitals. It is called fourth transition series.  \n•\nGeneral Characteristics of transition elements:  \na) Metallic character:  \nAll transition elements are metallic in nature, i.e. they have strong metallic bonds. \nThis is because of presence of unpaired electrons. This gives rise to properties like \nhigh density, high enthalpies of atomization, and high melting and boiling points.  \nb) Atomic radii:', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='•\nLanthanoids:  \nThe 14 elements immediately following lanthanum, i.e., Cerium (58) to Lutetium (71) \nare called lanthanoids. They belong to first inner transition series. Lanthanum (57) has \nsimilar properties. Therefore, it is studied along with lanthanoids.  \n•\nActinoids:  \nThe 14 elements immediately following actinium (89), with atomic numbers 90 \n(Thorium) to 103 (Lawrencium) are called actinoids. They belong to second inner \ntransition series. Actinium (89) has similar properties. Therefore, it is studied along with \nactinoids.  \n•\nFour transition series:  \na) 3d – transition series. The transition elements with atomic number 21(Sc) to 30(Zn) and \nhaving incomplete 3d orbitals is called the first transition series.  \nb) 4d – transition series. It consists of elements with atomic number 39(Y) to 48 (Cd) and \nhaving incomplete 4d orbitals. It is called second transition series.  \nc) 5d – transition series. It consists of elements with atomic number 57(La), 72(Hf) to', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='transition elements are almost similar to those of the third row of transition \nelements.  \nd) Ionisation enthalpy:  \nThere is slight and irregular variation in ionization energies of transition metals due \nto irregular variation of atomic size. The I.E. of 5d transition series is higher than 3d \nand 4d transition series because of Lanthanoid Contraction.  \ne) Oxidation state:  \nTransition metals show variable oxidation states due to tendency of (n-1)d as well as \nns electrons to take part in bond formation.  \nf)\n Magnetic properties:  \nMost of transition metals are paramagnetic in nature due to presence of unpaired \nelectrons. It increase s from Sc to Cr and then decreases because number of unpaired \nand then decrease because number of unpaired electrons increases from Sc to Cr and \nthen decreases.  \ng) Catalytic properties:  \nMost of transition metals are used as catalyst because of (i) presence of incomplete', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 1, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''})],
 'question': 'What is d-block?',
 'answer': 'The d-block elements are those lying in the middle of the periodic table belonging to groups 3 to 12 with a general electronic configuration of (n – 1)d1-10 ns1-2. Transition elements are defined as having incompletely filled d orbitals in their ground state or oxidation states. Lanthanoids and actinoids are inner transition elements that follow lanthanum and actinium, respectively.'}

As you can see from the output, the chain returns the exact document chunks that were used as context for generating the answer to your question. The 'question' and 'answer' are also returned separately as key-value pairs.
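Since the raw output is quite long, you will probably want to boil it down to a short list of sources. Here is one possible sketch. FakeDoc and list_sources are hypothetical names, and the page texts are placeholders; the only things taken from the real output above are the 'source' and 'page' metadata keys and the file path.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Hypothetical stand-in with the two attributes seen in the output above."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def list_sources(result: dict) -> list[tuple[str, int]]:
    """Return unique (source, page) pairs in order of first appearance."""
    seen = []
    for doc in result["context"]:
        key = (doc.metadata.get("source", "unknown"), doc.metadata.get("page", -1))
        if key not in seen:
            seen.append(key)
    return seen


pdf = "/content/d and f block.pdf"
result = {
    "context": [
        FakeDoc("placeholder text 1", {"source": pdf, "page": 0}),
        FakeDoc("placeholder text 2", {"source": pdf, "page": 0}),
        FakeDoc("placeholder text 3", {"source": pdf, "page": 1}),
    ],
    "question": "What is d-block?",
    "answer": "placeholder answer",
}
sources = list_sources(result)
print(sources)
```

With the real chain you would pass the dictionary returned by rag_chain_with_source.invoke(...) instead of the hand-built result used here.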

Hurray! Now you have got everything for completing your homework and we have also learnt how to use custom documents for building RAG and how to access the source of documents used in the RAG Chain.

But here comes another question: have we really finished building our RAG-based chatbot? Are we missing some key element? Can you guess it?

If your guess was Memory then you are absolutely correct. We have built a RAG Agent that can retrieve relevant information from the knowledge base and help generate answers to our queries. But it lacks the key element of memory. Without memory it is not possible to have the kind of conversation that we expect from a chatbot.

Suppose you want to know what the previous question you asked was. Without memory, the chatbot cannot access past conversations, because each new query is processed with no record of the earlier ones. So adding that element is crucial for completing our chatbot. Let's get it done then!

Providing our Chatbot Superpowers: Conversational Memory

Conversational memory is how chatbots can respond to our queries in a chat-like manner. It enables a coherent conversation, and without it, every query would be treated as an entirely independent input without considering past interactions.

Memory allows an "agent" to remember previous interactions with the user. By default, agents are stateless, meaning each incoming query is processed independently of other interactions. The only thing that exists for a stateless agent is the current input, nothing else.
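The difference between a stateless and a stateful bot fits in a few lines of plain Python. ToyChatBot below is a hypothetical class made up for this illustration; it has no language model behind it and simply keeps a list of (role, text) pairs.

```python
class ToyChatBot:
    """Keeps (role, text) pairs so later turns can look back at earlier ones."""

    def __init__(self):
        self.history: list[tuple[str, str]] = []

    def ask(self, question: str) -> str:
        if question == "What was my previous question?":
            previous = [text for role, text in self.history if role == "user"]
            reply = previous[-1] if previous else "I have no record of an earlier question."
        else:
            reply = f"(answer to: {question})"
        # Store both sides of the turn so that future questions can refer to them.
        self.history.append(("user", question))
        self.history.append(("ai", reply))
        return reply


bot = ToyChatBot()
bot.ask("What is d block?")
recalled = bot.ask("What was my previous question?")

# A brand-new bot has an empty history, which is what "stateless" feels like.
fresh = ToyChatBot().ask("What was my previous question?")
print(recalled)
print(fresh)
```

The bot with history can answer the follow-up; the fresh one cannot. Everything we add in this section is a more capable version of that history list.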

from langchain_community.chat_message_histories import ChatMessageHistory
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

We have imported a lot of things here. We will understand the usage of each one of them as we go on and add memory to our chatbot.

Instead of the prompt we pulled from the hub, we will write a new prompt that accepts documents as context. We'll use a LangChain helper function called create_stuff_documents_chain to "stuff" all the input documents into the prompt; it also takes care of formatting. We will use the ChatPromptTemplate.from_messages method to format the message input we want to send to the model, including a MessagesPlaceholder where chat history messages will be directly injected.

chat = ChatOpenAI(model="gpt-3.5-turbo")

question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user's questions based on the below context:\n\n{context}",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

document_chain = create_stuff_documents_chain(chat, question_answering_prompt)

Basically, we create a custom prompt chain that takes into account both the context documents and the conversation so far, and then generates the answer. This is a better approach because we explicitly provide the chain with the source documents as well as the previous messages. This is what gives our chatbot its memory.

But now the main part is how to store the entire conversation history. Well we have ChatMessageHistory in LangChain that takes care of this functionality.

from langchain_community.chat_message_histories import ChatMessageHistory

chat_history = ChatMessageHistory()

chat_history.add_user_message("What is d block?")

document_chain.invoke(
    {
        "messages": chat_history.messages,
        "context": docs,
    }
)

So chat_history, our instance of ChatMessageHistory, has methods like add_user_message for adding the message the user asks, as well as add_ai_message for storing the LLM's response in the history. It's that simple.

Now if you look carefully at how we invoke the document chain, the variables should look familiar. messages fills the MessagesPlaceholder we created in the prompt, and context fills the {context} placeholder that holds our source docs.

So our chain uses both of these inputs to generate an answer to the user's query:

The d-block elements refer to the elements in the middle of the periodic table, specifically belonging to groups 3 to 12. They have a general electronic configuration of (n – 1)d1-10 ns1-2, where (n – 1) stands for the penultimate (last but one) shell. These elements are also known as transition elements due to their incompletely filled d orbitals in their ground state or in any one of their oxidation states.

Now we will integrate our retriever into the chain. Our retriever should retrieve information relevant to the last message we pass in from the user, so we extract it and use that as input to fetch relevant docs, which we add to the current chain as context. We pass context plus the previous messages into our document chain to generate a final answer.

We also use the RunnablePassthrough.assign() method to pass intermediate steps through at each invocation.

from typing import Dict

from langchain_core.runnables import RunnablePassthrough


def parse_retriever_input(params: Dict):
    return params["messages"][-1].content


retrieval_chain = RunnablePassthrough.assign(
    context=parse_retriever_input | retriever,
).assign(
    answer=document_chain,
)
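To check what parse_retriever_input hands to the retriever, you can try it on a few stand-in messages. FakeMessage is a hypothetical class with only the content attribute the function reads, and the conversation itself is made up for the example.

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in exposing only the content attribute."""
    content: str


def parse_retriever_input(params: dict) -> str:
    # Same logic as in the blog: use the text of the most recent message.
    return params["messages"][-1].content


params = {
    "messages": [
        FakeMessage("What is d block?"),
        FakeMessage("It covers groups 3 to 12 of the periodic table."),
        FakeMessage("Which of them are not transition metals?"),
    ]
}
query = parse_retriever_input(params)
print(query)
```

Only the latest message is used for the document search, while the full list of messages still reaches the document chain through the messages key.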

Now let's invoke the chain with our chat history and look at the full response.

response = retrieval_chain.invoke(
    {
        "messages": chat_history.messages,
    }
)
response

OUTPUT

{'messages': [HumanMessage(content='What is d block?')],
 'context': [Document(page_content='Portal for CBSE Notes, Test Papers, Sample Papers, Tips and Tricks \nCBSE Class-12 Chemistry Quick Revision Notes \nChapter-08: The D and F-Block Elements  \n \n \n• The d -Block elements:  \na) The elements lying in the middle of periodic table belonging to groups 3 to 12 are \nknown as d – block elements.  \nb) Their general electronic configuration is (n – 1)d1-10 ns1-2 where (n – 1) stands for \npenultimate (last but one) shell.  \n• Transition element:  \na) A transition element is defined as the one which has incompletely filled d orbitals in \nits ground state or in any one of its oxidation states.  \nb) Zinc, cadmium, mercury are not regarded as transition metals due to completely \nfilled d – orbital.  \n•\nThe f-Block elements:  \nThe elements constituting the f -block are those in which the 4 f and 5 f orbitals are \nprogressively filled in the latter two long periods.  \n•\nLanthanoids:  \nThe 14 elements immediately following lanthanum, i.e., Cerium (58) to Lutetium (71)', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='c) 5d – transition series. It consists of elements with atomic number 57(La), 72(Hf) to \n80(Hg) having incomplete 5d orbitals. It is called third transition series.  \nd) 6d – transition series. It consists of elements with atomic number 89(Ac), 104(Rf) to \n112(Uub) having incomplete 6d orbitals. It is called fourth transition series.  \n•\nGeneral Characteristics of transition elements:  \na) Metallic character:  \nAll transition elements are metallic in nature, i.e. they have strong metallic bonds. \nThis is because of presence of unpaired electrons. This gives rise to properties like \nhigh density, high enthalpies of atomization, and high melting and boiling points.  \nb) Atomic radii:', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='•\nLanthanoids:  \nThe 14 elements immediately following lanthanum, i.e., Cerium (58) to Lutetium (71) \nare called lanthanoids. They belong to first inner transition series. Lanthanum (57) has \nsimilar properties. Therefore, it is studied along with lanthanoids.  \n•\nActinoids:  \nThe 14 elements immediately following actinium (89), with atomic numbers 90 \n(Thorium) to 103 (Lawrencium) are called actinoids. They belong to second inner \ntransition series. Actinium (89) has similar properties. Therefore, it is studied along with \nactinoids.  \n•\nFour transition series:  \na) 3d – transition series. The transition elements with atomic number 21(Sc) to 30(Zn) and \nhaving incomplete 3d orbitals is called the first transition series.  \nb) 4d – transition series. It consists of elements with atomic number 39(Y) to 48 (Cd) and \nhaving incomplete 4d orbitals. It is called second transition series.  \nc) 5d – transition series. It consists of elements with atomic number 57(La), 72(Hf) to', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='transition elements are almost similar to those of the third row of transition \nelements.  \nd) Ionisation enthalpy:  \nThere is slight and irregular variation in ionization energies of transition metals due \nto irregular variation of atomic size. The I.E. of 5d transition series is higher than 3d \nand 4d transition series because of Lanthanoid Contraction.  \ne) Oxidation state:  \nTransition metals show variable oxidation states due to tendency of (n-1)d as well as \nns electrons to take part in bond formation.  \nf)\n Magnetic properties:  \nMost of transition metals are paramagnetic in nature due to presence of unpaired \nelectrons. It increase s from Sc to Cr and then decreases because number of unpaired \nand then decrease because number of unpaired electrons increases from Sc to Cr and \nthen decreases.  \ng) Catalytic properties:  \nMost of transition metals are used as catalyst because of (i) presence of incomplete', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 1, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''})],
 'answer': 'The d-block elements are the elements lying in the middle of the periodic table, belonging to groups 3 to 12. They are also known as transition elements. Their general electronic configuration is (n – 1)d1-10 ns1-2, where (n – 1) stands for the penultimate (last but one) shell. Transition elements are defined as ones that have incompletely filled d orbitals in their ground state or in any one of their oxidation states. This characteristic gives rise to their unique properties and behavior.'}

As you can see, the response contains everything: the original messages, the retrieved context, and the generated answer.

Let's add the answer to the history as the AI response:

chat_history.add_ai_message(response["answer"])

Now the chat history holds both the question and the answer, so the chatbot knows what we have been discussing. Let's try a follow-up question that refers to the topic only vaguely.

chat_history.add_user_message("tell me more about that!")

retrieval_chain.invoke(
    {
        "messages": chat_history.messages,
    },
)

OUTPUT

{'messages': [HumanMessage(content='What is d block?'),
  AIMessage(content='The d-block elements are the elements lying in the middle of the periodic table, belonging to groups 3 to 12. They are also known as transition elements. Their general electronic configuration is (n – 1)d1-10 ns1-2, where (n – 1) stands for the penultimate (last but one) shell. Transition elements are defined as ones that have incompletely filled d orbitals in their ground state or in any one of their oxidation states. This characteristic gives rise to their unique properties and behavior.'),
  HumanMessage(content='tell me more about that!')],
 'context': [Document(page_content='However, simple perceptron neurons that linearly combine the current input element and the last unit state may easily lose the long-term dependencies. For example, we start a sentence with “Alice is working at …” and later after a whole paragraph, we want to start the next sentence with “She” or “He” correctly. If the model forgets the character’s name “Alice”, we can never know. To resolve the issue, researchers created a special neuron with a much more complicated internal structure for memorizing long-term context, named “Long-short term memory (LSTM)” cell. It is smart enough to learn for how long it should memorize the old information, when to forget, when to make use of the new data, and how to combine the old memory with new input. This introduction is so well written that I recommend everyone with interest in LSTM to read it. It has been officially promoted in the Tensorflow documentation ;-)', metadata={'source': 'https://lilianweng.github.io/posts/2017-06-21-overview/'}),
  Document(page_content='Meanwhile, many companies are spending resources on pushing the edges of AI applications, that indeed have the potential to change or even revolutionize how we are gonna live. Familiar examples include self-driving cars, chatbots, home assistant devices and many others. One of the secret receipts behind the progress we have had in recent years is deep learning.\nWhy Does Deep Learning Work Now?#\nDeep learning models, in simple words, are large and deep artificial neural nets. A neural network (“NN”) can be well presented in a directed acyclic graph: the input layer takes in signal vectors; one or multiple hidden layers process the outputs of the previous layer. The initial concept of a neural network can be traced back to more than half a century ago. But why does it work now? Why do people start talking about them all of a sudden?', metadata={'source': 'https://lilianweng.github.io/posts/2017-06-21-overview/'}),
  Document(page_content='(The post was originated from my talk for WiMLDS x Fintech meetup hosted by Affirm.)\nI believe many of you have watched or heard of the games between AlphaGo and professional Go player Lee Sedol in 2016. Lee has the highest rank of nine dan and many world championships. No doubt, he is one of the best Go players in the world, but he lost by 1-4 in this series versus AlphaGo. Before this, Go was considered to be an intractable game for computers to master, as its simple rules lay out an exponential number of variations in the board positions, many more than what in Chess. This event surely highlighted 2016 as a big year for AI. Because of AlphaGo, much attention has been attracted to the progress of AI.', metadata={'source': 'https://lilianweng.github.io/posts/2017-06-21-overview/'}),
  Document(page_content='Fig. 1. A three-layer artificial neural network. (Image source: http://cs231n.github.io/convolutional-networks/#conv)\nThe reason is surprisingly simple:\n\nWe have a lot more data.\nWe have much powerful computers.', metadata={'source': 'https://lilianweng.github.io/posts/2017-06-21-overview/'})],
 'answer': 'Certainly! Transition elements, or d-block elements, have some distinctive characteristics due to the presence of incompletely filled d orbitals. Some of these properties include variable oxidation states, the formation of colored compounds, complex formation, catalytic activity, and magnetic behavior. \n\nTheir variable oxidation states allow them to form a wide variety of compounds with different stoichiometries and properties. The formation of colored compounds is often due to the d-d electronic transitions within the d orbitals. \n\nTransition elements also have the ability to form coordination complexes due to their ability to accept and donate electrons, leading to the formation of complex ions.\n\nAdditionally, many transition metals exhibit catalytic activity due to their ability to undergo redox reactions. This property makes them important in industrial processes and biological systems.\n\nFurthermore, some transition elements are magnetic, which is attributed to the presence of unpaired electrons in their d orbitals.\n\nOverall, the d-block elements exhibit a wide range of properties and play crucial roles in various industrial, biological, and environmental processes.'}

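Woowww!! Isn't that awesome? The chatbot understood that "that" referred to d-block elements and answered correctly. You can also see the entire conversation history in the response, so the follow-up makes sense in context. Notice, though, that the documents retrieved for this vague follow-up are unrelated to d-block elements, so the answer appears to lean on the chat history rather than on the retrieved context. We will come back to this in the Refine Query walkthrough.

To continue the conversation, add the answer from this chain to the history and keep chatting with your personal chatbot.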
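
The pattern described above (add the user message, invoke the chain, store the answer) can be sketched as a small helper. This is a plain-Python illustration, not LangChain code: SimpleHistory and fake_chain are hypothetical stand-ins for ChatMessageHistory and retrieval_chain, so the example runs without any API key.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


class SimpleHistory:
    """Hypothetical stand-in for ChatMessageHistory: an ordered list of messages."""

    def __init__(self) -> None:
        self.messages: List[Message] = []

    def add_user_message(self, content: str) -> None:
        self.messages.append(("human", content))

    def add_ai_message(self, content: str) -> None:
        self.messages.append(("ai", content))


def fake_chain(inputs: Dict[str, List[Message]]) -> Dict[str, str]:
    # Hypothetical stand-in for retrieval_chain.invoke():
    # it only reports how much history it received.
    return {"answer": f"reply after seeing {len(inputs['messages'])} message(s)"}


def chat_turn(history: SimpleHistory, user_text: str) -> str:
    """Run one conversational turn and keep the history up to date."""
    history.add_user_message(user_text)
    response = fake_chain({"messages": history.messages})
    history.add_ai_message(response["answer"])
    return response["answer"]


history = SimpleHistory()
first = chat_turn(history, "What is d block?")
second = chat_turn(history, "tell me more about that!")
print(first)   # reply after seeing 1 message(s)
print(second)  # reply after seeing 3 message(s)
```

The second turn sees three messages because the first question and its answer are already stored, which is what lets the real chain resolve a vague follow-up.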

If you want to learn more and go deeper into the other memory types, you can follow this comprehensive blog on Langchain Memory with LLMs for Advanced Conversational AI and Chatbots.

We have come a long way. You now have your own personal RAG chatbot, built with the superpowers of LangChain.

Before we conclude, let's try one last thing: an alternative to the memory method we just learnt. 😊

The next part is a short introduction to how user queries can be improved before they are passed to the chains.

💡
A very important principle of LLMs is, the better you are at querying, the better answer you receive!

Walkthrough on Refine Query

Our retrieval chain is capable of answering questions about d-block elements, but there's a problem - chatbots interact with users conversationally, and therefore have to deal with follow-up questions.

The chain in its current form will struggle with this. Consider a follow-up to our original question such as "tell me more about that!". If we invoke our retriever with a query like that directly, the documents we get back are not relevant to what we actually meant.

retriever.invoke("Tell me more!")

OUTPUT

[Document(page_content='•\nLanthanoids:  \nThe 14 elements immediately following lanthanum, i.e., Cerium (58) to Lutetium (71) \nare called lanthanoids. They belong to first inner transition series. Lanthanum (57) has \nsimilar properties. Therefore, it is studied along with lanthanoids.  \n•\nActinoids:  \nThe 14 elements immediately following actinium (89), with atomic numbers 90 \n(Thorium) to 103 (Lawrencium) are called actinoids. They belong to second inner \ntransition series. Actinium (89) has similar properties. Therefore, it is studied along with \nactinoids.  \n•\nFour transition series:  \na) 3d – transition series. The transition elements with atomic number 21(Sc) to 30(Zn) and \nhaving incomplete 3d orbitals is called the first transition series.  \nb) 4d – transition series. It consists of elements with atomic number 39(Y) to 48 (Cd) and \nhaving incomplete 4d orbitals. It is called second transition series.  \nc) 5d – transition series. It consists of elements with atomic number 57(La), 72(Hf) to', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
 Document(page_content='transition elements are almost similar to those of the third row of transition \nelements.  \nd) Ionisation enthalpy:  \nThere is slight and irregular variation in ionization energies of transition metals due \nto irregular variation of atomic size. The I.E. of 5d transition series is higher than 3d \nand 4d transition series because of Lanthanoid Contraction.  \ne) Oxidation state:  \nTransition metals show variable oxidation states due to tendency of (n-1)d as well as \nns electrons to take part in bond formation.  \nf)\n Magnetic properties:  \nMost of transition metals are paramagnetic in nature due to presence of unpaired \nelectrons. It increase s from Sc to Cr and then decreases because number of unpaired \nand then decrease because number of unpaired electrons increases from Sc to Cr and \nthen decreases.  \ng) Catalytic properties:  \nMost of transition metals are used as catalyst because of (i) presence of incomplete', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 1, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''})]

This is because the retriever has no innate concept of state and only pulls the documents most similar to the query it is given. To solve this, we can use an LLM to transform the follow-up into a standalone query that contains no references to earlier parts of the conversation.
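
The statelessness described above can be illustrated with a toy retriever. This sketch ranks documents by word overlap instead of embedding similarity, an assumption made purely to keep the example dependency-free; DOCS, tokens and toy_retrieve are hypothetical names and not part of LangChain.

```python
import re
from typing import List, Set

# Two tiny stand-in documents (not the tutorial's real corpus).
DOCS = [
    "The d-block elements lie in groups 3 to 12 of the periodic table.",
    "Deep learning works now because we have more data and more powerful computers.",
]


def tokens(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_retrieve(query: str, k: int = 1) -> List[str]:
    """Rank documents by the number of words they share with the query.

    Nothing is remembered between calls: only `query` influences the result.
    """
    query_tokens = tokens(query)
    ranked = sorted(DOCS, key=lambda doc: len(tokens(doc) & query_tokens), reverse=True)
    return ranked[:k]


toy_retrieve("What is d block?")                     # an earlier turn; leaves no trace
vague = toy_retrieve("Tell me more about that!")      # follow-up sent as-is
standalone = toy_retrieve("d block elements properties")  # rewritten follow-up

print(vague)
print(standalone)
```

The vague follow-up matches the deep learning document only because both contain the word "more", while the rewritten query lands on the d-block document. A real vector-store retriever scores similarity differently, but it is stateless in the same way.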

from langchain_core.messages import AIMessage, HumanMessage
chat = ChatOpenAI(model="gpt-3.5-turbo-1106")

query_transform_prompt = ChatPromptTemplate.from_messages(
    [
        MessagesPlaceholder(variable_name="messages"),
        (
            "user",
            "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation. Only respond with the query, nothing else.",
        ),
    ]
)

query_transformation_chain = query_transform_prompt | chat

query_transformation_chain.invoke(
    {
        "messages": [
            HumanMessage(content="What is d block?"),
            AIMessage(
                content="The d-block elements are those in groups 3 to 12 of the periodic table with electronic configuration (n – 1)d1-10 ns1-2. Transition elements have incompletely filled d orbitals in their ground state or oxidation states. The f-block elements consist of those with 4f and 5f orbitals progressively filled in the latter long periods."
            ),
            HumanMessage(content="Tell me more about that!"),
        ],
    }
)

What we did here was pass the previous question and response as context along with the follow-up, so the LLM can work out what "that" refers to and produce a standalone search query:

AIMessage(content='"d-block elements properties and characteristics"')
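
It can help to see what the model actually receives from query_transform_prompt. The sketch below mimics the layout in plain Python, under the assumption that messages can be represented as simple (role, content) tuples; build_query_transform_messages is a hypothetical helper, not a LangChain function. The conversation history fills the placeholder, and the instruction is appended as the final user message.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)

INSTRUCTION = (
    "Given the above conversation, generate a search query to look up in order "
    "to get information relevant to the conversation. "
    "Only respond with the query, nothing else."
)


def build_query_transform_messages(history: List[Message]) -> List[Message]:
    """Mimic the prompt layout: the full history first, then the instruction."""
    return list(history) + [("user", INSTRUCTION)]


history = [
    ("human", "What is d block?"),
    ("ai", "The d-block elements are those in groups 3 to 12 of the periodic table."),
    ("human", "Tell me more about that!"),
]

prompt_messages = build_query_transform_messages(history)
for role, content in prompt_messages:
    print(f"{role}: {content[:60]}")
```

Because the instruction comes last, the model reads the whole conversation before being asked for a single search query.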

Let's add this to our retrieval chain so that it can answer follow-up questions:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableBranch

query_transforming_retriever_chain = RunnableBranch(
    (
        lambda x: len(x.get("messages", [])) == 1,
        # If only one message, then we just pass that message's content to retriever
        (lambda x: x["messages"][-1].content) | retriever,
    ),
    # If there are multiple messages, we pass the inputs to the LLM chain to transform the query, then pass it to the retriever
    query_transform_prompt | chat | StrOutputParser() | retriever,
).with_config(run_name="chat_retriever_chain")

Let's break down what is happening inside this code snippet:

  • RunnableBranch: It creates a branch in the processing chain based on a condition.

  • The condition checks whether the "messages" list in the input (x) contains exactly one element (len(x.get("messages", [])) == 1); a missing key is treated as an empty list.

Single Message Scenario:

  • If there's only one message, it assumes a simple retrieval case.

  • It extracts the content of the single message using lambda x: x["messages"][-1].content.

  • This content is then directly piped (|) to the retriever for information retrieval.

Multiple Message Scenario:

  • If there are multiple messages (implying a chat conversation), a more elaborate process is used.

  • The input is passed through query_transform_prompt | chat | StrOutputParser(). The prompt (query_transform_prompt) and the chat model rewrite the conversation into a standalone search query, and StrOutputParser() converts the model's message into a plain string.

  • This transformed query is then piped to the retriever for information retrieval based on the transformed chat history.
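
The two scenarios above can be sketched in plain Python. This is an illustration of the routing logic only, not a reimplementation of RunnableBranch: make_branching_retriever, fake_rewrite and fake_retrieve are hypothetical stand-ins for the branch, the LLM query transformation and the retriever, and messages are simplified to plain strings.

```python
from typing import Callable, Dict, List


def make_branching_retriever(
    rewrite_query: Callable[[List[str]], str],
    retrieve: Callable[[str], List[str]],
) -> Callable[[Dict[str, List[str]]], List[str]]:
    """Build a function that routes inputs the way the branch above does."""

    def run(inputs: Dict[str, List[str]]) -> List[str]:
        messages = inputs.get("messages", [])
        if len(messages) == 1:
            # Single message: use its content as the search query directly.
            return retrieve(messages[-1])
        # Otherwise: rewrite the conversation into a standalone query first.
        return retrieve(rewrite_query(messages))

    return run


calls: List[str] = []  # records when the (expensive) rewrite step runs


def fake_rewrite(messages: List[str]) -> str:
    calls.append("rewrite")
    return "d-block elements properties and characteristics"


def fake_retrieve(query: str) -> List[str]:
    return [f"docs for: {query}"]


chain = make_branching_retriever(fake_rewrite, fake_retrieve)

single = chain({"messages": ["What is d block?"]})
follow_up = chain(
    {"messages": ["What is d block?", "The d-block elements are ...", "Tell me more about that!"]}
)

print(single)
print(follow_up)
print(calls)
```

The rewrite step runs only for the multi-message input, so a first question does not pay for an extra LLM call.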

Then, we can use this query transformation chain to make our retrieval chain better able to handle such follow-up questions:

SYSTEM_TEMPLATE = """
Answer the user's questions based on the below context. 
If the context doesn't contain any relevant information to the question, don't make something up and just say "I don't know":

<context>
{context}
</context>
"""

question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            SYSTEM_TEMPLATE,
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

document_chain = create_stuff_documents_chain(chat, question_answering_prompt)

conversational_retrieval_chain = RunnablePassthrough.assign(
    context=query_transforming_retriever_chain,
).assign(
    answer=document_chain,
)

As with the document_chain we built earlier, we stuff the retrieved documents into the prompt with create_stuff_documents_chain. The difference is in retrieval: the retriever now receives the transformed query instead of the raw last message, as it did in our previous examples.

conversational_retrieval_chain.invoke(
    {
        "messages": [
            HumanMessage(content="Can d block elements differ from f block?"),
        ]
    }
)

Invoking the chain this way asks our first question, so it falls under the single-message scenario. Because it is not a follow-up, the query is passed to the retriever as it is.

{'messages': [HumanMessage(content='Can d block elements differ from f block?')],
 'context': [Document(page_content='Portal for CBSE Notes, Test Papers, Sample Papers, Tips and Tricks \nCBSE Class-12 Chemistry Quick Revision Notes \nChapter-08: The D and F-Block Elements  \n \n \n• The d -Block elements:  \na) The elements lying in the middle of periodic table belonging to groups 3 to 12 are \nknown as d – block elements.  \nb) Their general electronic configuration is (n – 1)d1-10 ns1-2 where (n – 1) stands for \npenultimate (last but one) shell.  \n• Transition element:  \na) A transition element is defined as the one which has incompletely filled d orbitals in \nits ground state or in any one of its oxidation states.  \nb) Zinc, cadmium, mercury are not regarded as transition metals due to completely \nfilled d – orbital.  \n•\nThe f-Block elements:  \nThe elements constituting the f -block are those in which the 4 f and 5 f orbitals are \nprogressively filled in the latter two long periods.  \n•\nLanthanoids:  \nThe 14 elements immediately following lanthanum, i.e., Cerium (58) to Lutetium (71)', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='Portal for CBSE Notes, Test Papers, Sample Papers, Tips and Tricks \nCBSE Class-12 Chemistry Quick Revision Notes \nChapter-08: The D and F-Block Elements  \n \n \n• The d -Block elements:  \na) The elements lying in the middle of periodic table belonging to groups 3 to 12 are \nknown as d – block elements.  \nb) Their general electronic configuration is (n – 1)d1-10 ns1-2 where (n – 1) stands for \npenultimate (last but one) shell.  \n• Transition element:  \na) A transition element is defined as the one which has incompletely filled d orbitals in \nits ground state or in any one of its oxidation states.  \nb) Zinc, cadmium, mercury are not regarded as transition metals due to completely \nfilled d – orbital.  \n•\nThe f-Block elements:  \nThe elements constituting the f -block are those in which the 4 f and 5 f orbitals are \nprogressively filled in the latter two long periods.  \n•\nLanthanoids:  \nThe 14 elements immediately following lanthanum, i.e., Cerium (58) to Lutetium (71)', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='transition elements are almost similar to those of the third row of transition \nelements.  \nd) Ionisation enthalpy:  \nThere is slight and irregular variation in ionization energies of transition metals due \nto irregular variation of atomic size. The I.E. of 5d transition series is higher than 3d \nand 4d transition series because of Lanthanoid Contraction.  \ne) Oxidation state:  \nTransition metals show variable oxidation states due to tendency of (n-1)d as well as \nns electrons to take part in bond formation.  \nf)\n Magnetic properties:  \nMost of transition metals are paramagnetic in nature due to presence of unpaired \nelectrons. It increase s from Sc to Cr and then decreases because number of unpaired \nand then decrease because number of unpaired electrons increases from Sc to Cr and \nthen decreases.  \ng) Catalytic properties:  \nMost of transition metals are used as catalyst because of (i) presence of incomplete', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 1, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='transition elements are almost similar to those of the third row of transition \nelements.  \nd) Ionisation enthalpy:  \nThere is slight and irregular variation in ionization energies of transition metals due \nto irregular variation of atomic size. The I.E. of 5d transition series is higher than 3d \nand 4d transition series because of Lanthanoid Contraction.  \ne) Oxidation state:  \nTransition metals show variable oxidation states due to tendency of (n-1)d as well as \nns electrons to take part in bond formation.  \nf)\n Magnetic properties:  \nMost of transition metals are paramagnetic in nature due to presence of unpaired \nelectrons. It increase s from Sc to Cr and then decreases because number of unpaired \nand then decrease because number of unpaired electrons increases from Sc to Cr and \nthen decreases.  \ng) Catalytic properties:  \nMost of transition metals are used as catalyst because of (i) presence of incomplete', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 1, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''})],
 'answer': 'Yes, d-block elements differ from f-block elements in terms of their electronic configurations and the orbitals they fill. The d-block elements have incompletely filled d orbitals in their ground state or in any one of their oxidation states, while the f-block elements have progressively filled 4f and 5f orbitals in the latter two long periods.'}

Now, to ask a follow-up question, we need to pass the earlier user query and the AI response along with the new question in the next invoke call:

conversational_retrieval_chain.invoke(
    {
        "messages": [
            HumanMessage(content="Can d block elements differ from f block?"),
            AIMessage(
                content="Yes, d-block elements differ from f-block elements in terms of their electronic configurations and the orbitals they fill. The d-block elements have incompletely filled d orbitals in their ground state or in any one of their oxidation states, while the f-block elements have progressively filled 4f and 5f orbitals in the latter two long periods."
            ),
            HumanMessage(content="Tell me more about their difference!"),
        ],
    }
)

If you look carefully, the "messages" list now has 3 elements, so the query transformation branch is taken. The input goes through the sequence query_transform_prompt | chat | StrOutputParser() | retriever, and the answer is then generated from the retrieved sources.

{'messages': [HumanMessage(content='Can d block elements differ from f block?'),
  AIMessage(content='Yes, d-block elements differ from f-block elements in terms of their electronic configurations and the orbitals they fill. The d-block elements have incompletely filled d orbitals in their ground state or in any one of their oxidation states, while the f-block elements have progressively filled 4f and 5f orbitals in the latter two long periods.'),
  HumanMessage(content='Tell me more about their difference!')],
 'context': [Document(page_content='Portal for CBSE Notes, Test Papers, Sample Papers, Tips and Tricks \nCBSE Class-12 Chemistry Quick Revision Notes \nChapter-08: The D and F-Block Elements  \n \n \n• The d -Block elements:  \na) The elements lying in the middle of periodic table belonging to groups 3 to 12 are \nknown as d – block elements.  \nb) Their general electronic configuration is (n – 1)d1-10 ns1-2 where (n – 1) stands for \npenultimate (last but one) shell.  \n• Transition element:  \na) A transition element is defined as the one which has incompletely filled d orbitals in \nits ground state or in any one of its oxidation states.  \nb) Zinc, cadmium, mercury are not regarded as transition metals due to completely \nfilled d – orbital.  \n•\nThe f-Block elements:  \nThe elements constituting the f -block are those in which the 4 f and 5 f orbitals are \nprogressively filled in the latter two long periods.  \n•\nLanthanoids:  \nThe 14 elements immediately following lanthanum, i.e., Cerium (58) to Lutetium (71)', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='Portal for CBSE Notes, Test Papers, Sample Papers, Tips and Tricks \nCBSE Class-12 Chemistry Quick Revision Notes \nChapter-08: The D and F-Block Elements  \n \n \n• The d -Block elements:  \na) The elements lying in the middle of periodic table belonging to groups 3 to 12 are \nknown as d – block elements.  \nb) Their general electronic configuration is (n – 1)d1-10 ns1-2 where (n – 1) stands for \npenultimate (last but one) shell.  \n• Transition element:  \na) A transition element is defined as the one which has incompletely filled d orbitals in \nits ground state or in any one of its oxidation states.  \nb) Zinc, cadmium, mercury are not regarded as transition metals due to completely \nfilled d – orbital.  \n•\nThe f-Block elements:  \nThe elements constituting the f -block are those in which the 4 f and 5 f orbitals are \nprogressively filled in the latter two long periods.  \n•\nLanthanoids:  \nThe 14 elements immediately following lanthanum, i.e., Cerium (58) to Lutetium (71)', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='c) 5d – transition series. It consists of elements with atomic number 57(La), 72(Hf) to \n80(Hg) having incomplete 5d orbitals. It is called third transition series.  \nd) 6d – transition series. It consists of elements with atomic number 89(Ac), 104(Rf) to \n112(Uub) having incomplete 6d orbitals. It is called fourth transition series.  \n•\nGeneral Characteristics of transition elements:  \na) Metallic character:  \nAll transition elements are metallic in nature, i.e. they have strong metallic bonds. \nThis is because of presence of unpaired electrons. This gives rise to properties like \nhigh density, high enthalpies of atomization, and high melting and boiling points.  \nb) Atomic radii:', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''}),
  Document(page_content='c) 5d – transition series. It consists of elements with atomic number 57(La), 72(Hf) to \n80(Hg) having incomplete 5d orbitals. It is called third transition series.  \nd) 6d – transition series. It consists of elements with atomic number 89(Ac), 104(Rf) to \n112(Uub) having incomplete 6d orbitals. It is called fourth transition series.  \n•\nGeneral Characteristics of transition elements:  \na) Metallic character:  \nAll transition elements are metallic in nature, i.e. they have strong metallic bonds. \nThis is because of presence of unpaired electrons. This gives rise to properties like \nhigh density, high enthalpies of atomization, and high melting and boiling points.  \nb) Atomic radii:', metadata={'author': 'Elpis', 'creationDate': "D:20141218110508+05'30'", 'creator': 'PDFCreator Version 1.5.0(Foxit Advanced PDF Editor)', 'file_path': '/content/d and f block.pdf', 'format': 'PDF 1.4', 'keywords': '', 'modDate': 'D:20171202153403', 'page': 0, 'producer': 'GPL Ghostscript 9.05', 'source': '/content/d and f block.pdf', 'subject': '', 'title': '12_chemistry_notes_ch08_the_dblock_f-block_elements', 'total_pages': 4, 'trapped': ''})],
 'answer': 'The d-block elements are known as transition elements and have general electronic configurations of (n – 1)d1-10 ns1-2, where (n – 1) stands for the penultimate (last but one) shell. These elements typically belong to groups 3 to 12 in the periodic table. On the other hand, the f-block elements consist of the lanthanoids and actinoids, where the 4f and 5f orbitals are progressively filled in the latter two long periods. Lanthanoids are the 14 elements immediately following lanthanum, from Cerium (58) to Lutetium (71), while actinoids are the 14 elements from actinium (89) to lawrencium (103). Additionally, the f-block elements also have 5d and 6d transition series with incomplete 5d and 6d orbitals, respectively.'}

So the chatbot is able to generate a response to the follow-up question. This works because we rewrote the query into a refined, self-contained form, so the retriever fetches the relevant context and the model uses it to generate the answer.

Without using ChatMessageHistory, we were still able to build a conversational chatbot by relying on query refinement alone.

Both techniques for conversational behavior are useful, depending on the use case. For example, if you are building a chatbot to answer simple, mostly unrelated questions, context from the previous step is enough, and maintaining a full memory would add overhead without much benefit.

But if you are running a chatbot in production and at scale, follow-up questions are common, and a user should have the flexibility to refer to any part of the conversation in their next query. In those cases, maintaining and passing the entire conversation history is both practical and more accurate.
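To make the trade-off concrete, here is a minimal sketch in plain Python. It does not use LangChain; the helper names `previous_step_prompt` and `full_history_prompt` and the sample conversation are assumptions made purely for illustration. It only shows how much earlier context reaches the model under each approach.

```python
from typing import List, Tuple

# A turn is (role, text). The roles "user" and "assistant" are illustrative.
Turn = Tuple[str, str]


def format_turns(turns: List[Turn]) -> str:
    """Render turns as 'role: text' lines."""
    return "\n".join(f"{role}: {text}" for role, text in turns)


def previous_step_prompt(history: List[Turn], question: str) -> str:
    """Keep only the last user/assistant exchange as context."""
    return f"{format_turns(history[-2:])}\nuser: {question}"


def full_history_prompt(history: List[Turn], question: str) -> str:
    """Pass every earlier turn, so any part of the conversation can be referenced."""
    return f"{format_turns(history)}\nuser: {question}"


# Hypothetical conversation, loosely based on the chemistry notes used above.
history = [
    ("user", "What are d-block elements?"),
    ("assistant", "They are the transition elements in groups 3 to 12."),
    ("user", "What are f-block elements?"),
    ("assistant", "They are the lanthanoids and actinoids."),
]

follow_up = "How do they differ from the first ones I asked about?"
short_prompt = previous_step_prompt(history, follow_up)
long_prompt = full_history_prompt(history, follow_up)

# Only the full-history prompt still contains the first exchange.
print("d-block" in short_prompt, "d-block" in long_prompt)
```

With only the previous step, the follow-up loses the first exchange it refers to, while the full history keeps it at the cost of a longer prompt on every turn.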

Conclusion

We've covered a lot in this post, from understanding RAG to getting to know LangChain. We started with the absolute basics of how to set up chains and went all the way to implementing our own RAG-based chatbot, which we then enhanced with memory. We learned two different ways of implementing memory: ChatMessageHistory and Refine Queries.

With further experimentation and a better understanding of LangChain's other core components, you can apply these tools to your own use cases and projects. LangChain makes pipelines like these easier to implement and to iterate on.

If you found this guide helpful and you're looking to learn more or get the latest tips on using language models and other cool tech, don't forget to follow me.

Resources and Further Reading

For additional information and resources on LangChain, check out the following:

LangChain Official Website

LangChain GitHub Repository

Tutorial Code Notebook

These resources provide comprehensive documentation and community support to assist you in delving deeper into the functionalities of LangChain and RAG in chatbot development.