<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Building Custom NLP Solutions using state of the art NLP models | FutureSmart AI]]></title><description><![CDATA[Learn how to build custom Natural Language Processing (NLP) solutions using state-of-the-art models. Use Pytorch, Amazon Lex, FastAPI, and Hugging Face Transformers to create powerful applications.]]></description><link>https://blog.futuresmart.ai</link><generator>RSS for Node</generator><lastBuildDate>Mon, 13 Apr 2026 07:20:53 GMT</lastBuildDate><atom:link href="https://blog.futuresmart.ai/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Building AI Knowledge Graph Using Graphiti & Neo4j]]></title><description><![CDATA[Graphiti is a compact yet powerful Python library that converts raw text or JSON into an AI Knowledge Graph—a structured store of facts that acts as AI Agent Memory. Below is the exact workflow I used to load FutureSmart AI data into Neo4j, explore t...]]></description><link>https://blog.futuresmart.ai/building-ai-knowledge-graph-using-graphiti-and-neo4j</link><guid isPermaLink="true">https://blog.futuresmart.ai/building-ai-knowledge-graph-using-graphiti-and-neo4j</guid><category><![CDATA[graphiti]]></category><category><![CDATA[ai agent memory]]></category><category><![CDATA[ai-knowledge-graphs]]></category><dc:creator><![CDATA[Pradip Nichite]]></dc:creator><pubDate>Fri, 01 Aug 2025 08:32:34 GMT</pubDate><content:encoded><![CDATA[<p>Graphiti is a compact yet powerful Python library that converts raw text or JSON into an <strong>AI Knowledge Graph</strong>—a structured store of facts that acts as <strong>AI Agent Memory</strong>. 
Below is the exact workflow I used to load <strong>FutureSmart AI</strong> data into Neo4j, explore the resulting <strong>AI Graph Memory</strong>, and run hybrid (semantic + keyword) searches.</p>
<hr />
<h2 id="heading-1-why-graphiti-for-ai-agent-memory">1. Why Graphiti for AI Agent Memory?</h2>
<ul>
<li><p><strong>Real‑time inserts</strong> – Add new facts without bulk re‑processing.</p>
</li>
<li><p><strong>LLM‑powered parsing</strong> – Entity extraction, relationship mapping, summaries, and embeddings handled automatically.</p>
</li>
<li><p><strong>Neo4j under the hood</strong> – First‑class graph database with Cypher, indexes, and visualization tools.</p>
</li>
<li><p><strong>Perfect for AI Agent Memory</strong> – Your agent can recall structured facts on demand.</p>
</li>
</ul>
<hr />
<h2 id="heading-2-quick-refresher-knowledgegraph-basics">2. Quick Refresher: Knowledge‑Graph Basics</h2>
<pre><code class="lang-plaintext">“Pradip Nichite founded FutureSmart AI.”
</code></pre>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Component</td><td>Example</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Entity</strong></td><td><code>Pradip Nichite</code></td></tr>
<tr>
<td><strong>Entity</strong></td><td><code>FutureSmart AI</code></td></tr>
<tr>
<td><strong>Edge</strong></td><td><code>founded</code></td></tr>
<tr>
<td><strong>Triple</strong></td><td>(Pradip Nichite, founded, FutureSmart AI)</td></tr>
</tbody>
</table>
</div><p>Graphiti stores each triple as Neo4j nodes (<strong>Entity</strong>) and edges (<strong>RELATES_TO</strong>), plus an <strong>Episode</strong> node that records the original text—together forming your <strong>AI Graph Memory</strong>.</p>
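<p>Before Neo4j enters the picture, the triple structure can be sketched in plain Python. This is a conceptual illustration only (the <code>triples</code> list and <code>facts_about</code> helper are hypothetical, not Graphiti APIs); Graphiti builds and stores these structures for you:</p>

```python
# A triple is just (subject, predicate, object).
triples = [
    ("Pradip Nichite", "founded", "FutureSmart AI"),
    ("FutureSmart AI", "builds", "custom AI solutions"),
]

def facts_about(entity: str, triples: list[tuple[str, str, str]]) -> list[tuple[str, str, str]]:
    """Return every triple in which the entity appears as subject or object."""
    return [t for t in triples if entity in (t[0], t[2])]

# "FutureSmart AI" appears in both triples, once as object and once as subject.
print(facts_about("FutureSmart AI", triples))
```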
<hr />
<h2 id="heading-3-setup">3. Setup</h2>
<pre><code class="lang-bash">pip install graphiti-core  <span class="hljs-comment"># build your AI Knowledge Graph in minutes</span>
</code></pre>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> google.colab <span class="hljs-keyword">import</span> userdata

os.environ[<span class="hljs-string">"OPENAI_API_KEY"</span>] = userdata.get(<span class="hljs-string">"OPENAI_API_KEY"</span>)
neo4j_uri      = <span class="hljs-string">"neo4j+s://&lt;your‑instance&gt;.databases.neo4j.io"</span>
neo4j_user     = <span class="hljs-string">"neo4j"</span>
neo4j_password = userdata.get(<span class="hljs-string">"NEO4J_PASSWORD"</span>)
</code></pre>
<hr />
<h2 id="heading-4-initialise-graphiti-clean-slate">4. Initialise Graphiti (clean slate)</h2>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> graphiti_core <span class="hljs-keyword">import</span> Graphiti
<span class="hljs-keyword">from</span> graphiti_core.utils.maintenance.graph_data_operations <span class="hljs-keyword">import</span> clear_data

graphiti = Graphiti(neo4j_uri, neo4j_user, neo4j_password)
<span class="hljs-keyword">await</span> clear_data(graphiti.driver)              <span class="hljs-comment"># optional for a fresh start</span>
<span class="hljs-keyword">await</span> graphiti.build_indices_and_constraints() <span class="hljs-comment"># creates indexes once</span>
</code></pre>
<hr />
<h2 id="heading-5-insert-episodes-into-your-ai-knowledge-graph">5. Insert Episodes into Your AI Knowledge Graph</h2>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime, timezone
<span class="hljs-keyword">from</span> graphiti_core.nodes <span class="hljs-keyword">import</span> EpisodeType

episodes = [
    {
        <span class="hljs-string">"name"</span>: <span class="hljs-string">"About Me"</span>,
        <span class="hljs-string">"content"</span>: <span class="hljs-string">"Hi, I'm Pradip Nichite. I am the founder and CEO of FutureSmart AI."</span>,
        <span class="hljs-string">"type"</span>: EpisodeType.text,
        <span class="hljs-string">"description"</span>: <span class="hljs-string">"intro"</span>
    },
    {
        <span class="hljs-string">"name"</span>: <span class="hljs-string">"About FutureSmart AI"</span>,
        <span class="hljs-string">"content"</span>: <span class="hljs-string">"FutureSmart AI builds custom AI solutions for clients."</span>,
        <span class="hljs-string">"type"</span>: EpisodeType.text,
        <span class="hljs-string">"description"</span>: <span class="hljs-string">"company overview"</span>
    }
]

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">add_data</span>(<span class="hljs-params">episodes</span>):</span>
    <span class="hljs-comment"># helper: loop over episodes and store each one in the graph</span>
    <span class="hljs-keyword">for</span> ep <span class="hljs-keyword">in</span> episodes:
        <span class="hljs-keyword">await</span> graphiti.add_episode(
            name=ep[<span class="hljs-string">"name"</span>],
            episode_body=ep[<span class="hljs-string">"content"</span>],
            source=ep[<span class="hljs-string">"type"</span>],
            source_description=ep[<span class="hljs-string">"description"</span>],
            reference_time=datetime.now(timezone.utc),
        )

<span class="hljs-keyword">await</span> add_data(episodes)
</code></pre>
<h3 id="heading-underthehood-workflow">Under‑the‑Hood Workflow</h3>
<ol>
<li><p><strong>Entity extraction</strong> (people, orgs, products)</p>
</li>
<li><p><strong>Relationship extraction</strong> (<code>founded</code>, <code>associated_with</code>, …)</p>
</li>
<li><p><strong>Deduplication</strong> of entities</p>
</li>
<li><p><strong>Summary</strong> + <strong>embedding</strong> generation</p>
</li>
<li><p><strong>Graph write</strong> → Neo4j (your <strong>AI Graph Memory</strong>)</p>
</li>
</ol>
<p>All steps run via LLM calls you can inspect in your OpenAI usage logs.</p>
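<p>Step 3 (deduplication) is worth a closer look. Graphiti resolves duplicate entity mentions with LLM assistance; the toy sketch below uses simple string normalisation to show the idea (the <code>dedupe_entities</code> function is illustrative, not part of Graphiti):</p>

```python
# Toy illustration of entity deduplication: mentions that normalise to the
# same name are merged, keeping the first surface form seen.
def dedupe_entities(mentions: list[str]) -> list[str]:
    seen: dict[str, str] = {}
    for m in mentions:
        key = m.strip().lower()          # normalise for comparison
        seen.setdefault(key, m.strip())  # keep first surface form
    return list(seen.values())

# "FutureSmart AI" and "futuresmart ai " collapse into one entity.
print(dedupe_entities(["FutureSmart AI", "futuresmart ai ", "Pradip Nichite"]))
```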
<hr />
<h2 id="heading-6-visual-exploration-of-ai-graph-memory">6. Visual Exploration of AI Graph Memory</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754036009862/076718dd-f8c0-42a6-8d7a-8e2d8fcb491b.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p><strong>Blue</strong> = Episode nodes (source text)</p>
</li>
<li><p><strong>Brown</strong> = Entity nodes (deduplicated concepts)</p>
</li>
<li><p>Edges labelled <strong>MENTIONS</strong> &amp; <strong>RELATES_TO</strong> connect everything</p>
</li>
</ul>
<p>Run <code>MATCH (n) RETURN n</code> in Neo4j Browser to explore your <strong>AI Knowledge Graph</strong>. Click any node to view its <code>summary</code> and vector <code>embedding</code>.</p>
<hr />
<h2 id="heading-7-csv-export-at-a-glance">7. CSV Export at a Glance</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Metric</td><td>Count</td></tr>
</thead>
<tbody>
<tr>
<td>Nodes total</td><td>9</td></tr>
<tr>
<td>• Entity</td><td>5</td></tr>
<tr>
<td>• Episode</td><td>4</td></tr>
<tr>
<td>Relationships total</td><td>13</td></tr>
<tr>
<td>• <code>MENTIONS</code></td><td>8</td></tr>
<tr>
<td>• <code>RELATES_TO</code></td><td>5</td></tr>
</tbody>
</table>
</div><p>Sample from <code>node‑export.csv</code>:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>name</td><td>label</td><td>summary</td></tr>
</thead>
<tbody>
<tr>
<td>Pradip Nichite</td><td>Entity</td><td>Pradip Nichite is the founder and CEO of FutureSmart AI.</td></tr>
<tr>
<td>FutureSmart AI</td><td>Entity</td><td>FutureSmart AI is a company that develops AI solutions for clients and is known for creating AI demos.</td></tr>
<tr>
<td>About Me</td><td>Episodic</td><td></td></tr>
<tr>
<td>AI Demos</td><td>Entity</td><td>AI Demos is associated with FutureSmart AI and was founded by Pradip Nichite.</td></tr>
<tr>
<td>FutureSmart Agent</td><td>Entity</td><td>FutureSmart Agent is a product developed by FutureSmart AI.</td></tr>
<tr>
<td>About FutureSmart AI</td><td>Episodic</td><td></td></tr>
<tr>
<td>About AI Demos</td><td>Episodic</td><td></td></tr>
<tr>
<td>FutureSmart Agent</td><td>Episodic</td><td></td></tr>
<tr>
<td><a target="_blank" href="https://agent.futuresmart.ai/">https://agent.futuresmart.ai/</a></td><td>Entity</td><td>FutureSmart AI is the company behind the product FutureSmart Agent. Their website is <a target="_blank" href="https://agent.futuresmart.ai/">https://agent.futuresmart.ai/</a>.</td></tr>
</tbody>
</table>
</div><hr />
<h2 id="heading-8-query-your-ai-agent-memory-hybrid-search">8. Query Your AI Agent Memory (Hybrid Search)</h2>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">print_result</span>(<span class="hljs-params">results</span>):</span>
    print(<span class="hljs-string">"Search Results:"</span>)
    <span class="hljs-keyword">for</span> r <span class="hljs-keyword">in</span> results:  <span class="hljs-comment"># each result edge exposes uuid, fact, valid_at</span>
        print(<span class="hljs-string">f"UUID: <span class="hljs-subst">{r.uuid}</span>\nFact: <span class="hljs-subst">{r.fact}</span>\nValid from: <span class="hljs-subst">{r.valid_at}</span>\n---"</span>)

query   = <span class="hljs-string">"What products FutureSmart has"</span>
results = <span class="hljs-keyword">await</span> graphiti.search(query=query, num_results=<span class="hljs-number">3</span>)
print_result(results)
</code></pre>
<pre><code class="lang-plaintext">Search Results:
UUID: dae5cb4d-adb2-401f-991f-aa5259127245
Fact: FutureSmart Agent is a product of FutureSmart AI
Valid from: 2025-08-01 07:54:17+00:00
---
UUID: afc8af04-0af2-4fd8-940a-91cf93da6943
Fact: https://agent.futuresmart.ai/ is the website of FutureSmart Agent
Valid from: 2025-08-01 07:54:17+00:00
---
UUID: c6d353e8-6fb1-4263-a0e9-4df21ea5805f
Fact: FutureSmart AI is the company behind AI Demos
Valid from: 2025-08-01 07:54:08+00:00
</code></pre>
<p>Graphiti blends embedding similarity with BM25 keywords to surface relevant facts for downstream agents.</p>
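<p>As a rough mental model of that blending (the helper names and the 50/50 weighting below are made up for illustration; this is not Graphiti's actual scoring code):</p>

```python
# Toy hybrid ranking: blend a semantic (cosine) score with a simple
# keyword-overlap score. Real systems use embeddings plus BM25.
def keyword_score(query: str, fact: str) -> float:
    q, f = set(query.lower().split()), set(fact.lower().split())
    return len(q & f) / len(q) if q else 0.0

def hybrid_score(semantic: float, query: str, fact: str, alpha: float = 0.5) -> float:
    # alpha weights the semantic score against keyword overlap
    return alpha * semantic + (1 - alpha) * keyword_score(query, fact)

print(hybrid_score(0.8, "FutureSmart products",
                   "FutureSmart Agent is a product of FutureSmart AI"))
```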
<hr />
<h2 id="heading-9-practical-uses-of-an-ai-graph-memory">9. Practical Uses of an AI Graph Memory</h2>
<ul>
<li><p><strong>AI Agent Memory</strong> for chatbots and assistants</p>
</li>
<li><p>Enterprise <strong>AI Knowledge Graph</strong> for internal search</p>
</li>
<li><p>Neo4j Bloom dashboards for leadership teams</p>
</li>
<li><p>Light‑weight RAG pipelines powered by graph facts</p>
</li>
</ul>
<hr />
<h2 id="heading-10-whats-next">10. What’s Next?</h2>
<ul>
<li><p>Integrate Graphiti memory into LangGraph / LangChain agents</p>
</li>
<li><p>Combine vector store retrieval with graph queries</p>
</li>
<li><p>Package everything in FastAPI + Streamlit for a live demo</p>
</li>
</ul>
<hr />
<h3 id="heading-prefer-video-over-text">Prefer video over text?</h3>
<p>Watch the full walkthrough here  </p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/H2Cb5wbcRzo">https://youtu.be/H2Cb5wbcRzo</a></div>
<p> </p>
<hr />
<h3 id="heading-need-a-custom-ai-agent-or-graph-solution">Need a Custom AI Agent or Graph Solution?</h3>
<p>FutureSmart AI helps companies turn raw data into production-ready <strong>AI Knowledge Graphs</strong> and <strong>AI Agent Memory</strong> systems.</p>
<ul>
<li><p>End-to-end Graphiti + Neo4j pipelines</p>
</li>
<li><p>LangChain / LangGraph agent integrations</p>
</li>
<li><p>Scalable FastAPI &amp; Streamlit front-ends</p>
</li>
<li><p>Secure, cloud-native deployment</p>
</li>
</ul>
<p>Want proof we deliver? 👉 <strong>See our real-world results:</strong> <a target="_blank" href="https://futuresmart.ai/case-studies">https://futuresmart.ai/case-studies</a></p>
<blockquote>
<p><strong>Book a free consult:</strong> <a target="_blank" href="mailto:contact@futuresmart.ai">contact@futuresmart.ai</a></p>
</blockquote>
<hr />
<h3 id="heading-questions">Questions?</h3>
<p>Drop a comment on YouTube or email <a target="_blank" href="mailto:contact@futuresmart.ai"><strong>contact@futuresmart.ai</strong></a>.</p>
]]></content:encoded></item><item><title><![CDATA[AI Agents Memory: Mem0 + LangGraph Agent Integration]]></title><description><![CDATA[In this blog we’ll walk through practical steps to add long‑term memory to your AI agents using Mem0 and LangGraph. We’ll build incrementally, tackling one section at a time so you can follow along and run the code as you read.
Table of Contents

Mem...]]></description><link>https://blog.futuresmart.ai/ai-agents-memory-mem0-langgraph-agent-integration</link><guid isPermaLink="true">https://blog.futuresmart.ai/ai-agents-memory-mem0-langgraph-agent-integration</guid><category><![CDATA[ai agent memory]]></category><category><![CDATA[ai memory]]></category><category><![CDATA[langgraph]]></category><category><![CDATA[Mem0 ]]></category><dc:creator><![CDATA[Pradip Nichite]]></dc:creator><pubDate>Mon, 14 Jul 2025 11:24:29 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1752482471202/82431fa5-8b73-4d51-8811-f09e6e921957.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this blog we’ll walk through practical steps to add long‑term memory to your AI agents using <strong>Mem0</strong> and <strong>LangGraph</strong>. We’ll build incrementally, tackling one section at a time so you can follow along and run the code as you read.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><p><strong>Mem0 Basics</strong> – Adding, updating, and searching memories</p>
</li>
<li><p><strong>LangGraph Integration</strong> – Wiring Mem0 into a LangGraph agent</p>
</li>
<li><p><strong>Vector DB Setup</strong> – Swapping the default SQLite store for Qdrant</p>
</li>
<li><p><strong>Cloud Usage</strong> – Using the Mem0 Cloud Platform for scalable memory management</p>
</li>
</ol>
<hr />
<h2 id="heading-why-do-ai-agents-need-memory">Why do AI agents need memory?</h2>
<p>When an LLM‑powered agent starts a brand‑new conversation it has no context about who it’s talking to or what happened in earlier sessions. Relying on the raw <em>chat history</em> works only inside a single session and quickly bloats your prompt window.<br />Long‑term <strong>memory</strong> lets the agent:</p>
<ul>
<li><p><strong>Remember user‑level facts</strong> (name, preferences, past actions) across sessions</p>
</li>
<li><p><strong>Personalise responses</strong> without re‑asking the same questions</p>
</li>
<li><p><strong>Stay efficient</strong> by storing distilled facts instead of the entire transcript</p>
</li>
</ul>
<h4 id="heading-chat-history-vs-memory">Chat history vs memory</h4>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Aspect</td><td>Chat history (session)</td><td>Long‑term memory (Mem0)</td></tr>
</thead>
<tbody>
<tr>
<td>Lifespan</td><td>Only current session</td><td>Persists across sessions</td></tr>
<tr>
<td>Granularity</td><td>Full message text</td><td>Distilled facts &amp; metadata</td></tr>
<tr>
<td>Storage</td><td>In‑prompt list of messages</td><td>External DB / vector store</td></tr>
<tr>
<td>Cost impact</td><td>Grows token count quickly</td><td>Minimal extra tokens</td></tr>
</tbody>
</table>
</div><hr />
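<p>The "cost impact" row is easy to quantify with a toy word count (illustrative numbers only, not real token accounting):</p>

```python
# Replaying a full transcript grows the prompt linearly with conversation
# length, while distilled facts stay near-constant in size.
transcript = ["Hi, I'm Pradip Nichite. I run FutureSmart AI."] * 50   # 50 turns
facts = ["Name is Pradip Nichite", "Runs FutureSmart AI"]             # distilled once

transcript_words = sum(len(m.split()) for m in transcript)
fact_words = sum(len(f.split()) for f in facts)
print(transcript_words, fact_words)  # the transcript dwarfs the facts
```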
<h2 id="heading-1-mem0-basics-adding-updating-amp-searching-memories">1. Mem0 Basics – Adding, Updating &amp; Searching Memories</h2>
<h3 id="heading-quick-setup">Quick setup</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> mem0 <span class="hljs-keyword">import</span> Memory
memory = Memory.from_config({<span class="hljs-string">"history_db_path"</span>: <span class="hljs-string">"history.db"</span>})  <span class="hljs-comment"># local SQLite file</span>
</code></pre>
<p><strong>Why the explicit config?</strong> Mem0 defaults to a <em>read‑only</em> temp database, so writes will fail. Pointing it to <code>history.db</code> (or any path you prefer) gives the library a place to persist memories. You can extend the same <code>config</code> dict to</p>
<ul>
<li><p><strong>Override the LLM</strong> (provider, model, temperature, etc.)</p>
</li>
<li><p><strong>Plug in a vector store</strong> for semantic search (we’ll wire up Qdrant in Section 3).</p>
</li>
</ul>
<p>Example – switching to GPT‑4.1‑mini:</p>
<pre><code class="lang-python">config = {
    <span class="hljs-string">"history_db_path"</span>: <span class="hljs-string">"history.db"</span>,
    <span class="hljs-string">"llm"</span>: {
        <span class="hljs-string">"provider"</span>: <span class="hljs-string">"openai"</span>,
        <span class="hljs-string">"config"</span>: {
            <span class="hljs-string">"model"</span>: <span class="hljs-string">"gpt-4.1-mini"</span>,
            <span class="hljs-string">"temperature"</span>: <span class="hljs-number">0.2</span>,
            <span class="hljs-string">"max_tokens"</span>: <span class="hljs-number">2000</span>
        }
    }
}
memory = Memory.from_config(config)
</code></pre>
<h3 id="heading-add-your-first-memories">Add your first memories</h3>
<pre><code class="lang-python">memory.add([
    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"Hi, I'm Pradip Nichite. I run FutureSmart AI, where we build custom AI solutions."</span>}
], user_id=<span class="hljs-string">"pradip"</span>)

memory.add([
    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"I love building RAG and AI Agent solutions that actually work in production."</span>}
], user_id=<span class="hljs-string">"pradip"</span>, metadata={<span class="hljs-string">"category"</span>: <span class="hljs-string">"preferences"</span>})
</code></pre>
<p>Sample response:</p>
<pre><code class="lang-python">{<span class="hljs-string">'results'</span>: [{<span class="hljs-string">'id'</span>: <span class="hljs-string">'5408e326‑b26b‑4737‑a404‑299887b8d597'</span>,
  <span class="hljs-string">'memory'</span>: <span class="hljs-string">'Loves building RAG and AI Agent solutions that work in production'</span>,
  <span class="hljs-string">'event'</span>: <span class="hljs-string">'ADD'</span>}]}
</code></pre>
<p>Mem0 distills each raw chat message into a concise fact so retrieval stays lightweight.</p>
<hr />
<h3 id="heading-search">Search</h3>
<pre><code class="lang-python">related = memory.search(<span class="hljs-string">"who am i"</span>, user_id=<span class="hljs-string">"pradip"</span>)
related
</code></pre>
<p>Full output:</p>
<pre><code class="lang-python">{<span class="hljs-string">'results'</span>: [{<span class="hljs-string">'id'</span>: <span class="hljs-string">'647935d5-f913-496d-96e3-2233d7459f38'</span>,
   <span class="hljs-string">'memory'</span>: <span class="hljs-string">'Name is Pradip Nichite'</span>,
   <span class="hljs-string">'hash'</span>: <span class="hljs-string">'fa942a6331bb89da286d4a9e296d1008'</span>,
   <span class="hljs-string">'metadata'</span>: <span class="hljs-literal">None</span>,
   <span class="hljs-string">'score'</span>: <span class="hljs-number">0.2294486506181006</span>,
   <span class="hljs-string">'created_at'</span>: <span class="hljs-string">'2025-07-12T10:58:49.132915-07:00'</span>,
   <span class="hljs-string">'updated_at'</span>: <span class="hljs-literal">None</span>,
   <span class="hljs-string">'user_id'</span>: <span class="hljs-string">'pradip'</span>},
  {<span class="hljs-string">'id'</span>: <span class="hljs-string">'c763c19a-7e9f-4180-8c82-012f4da5f637'</span>,
   <span class="hljs-string">'memory'</span>: <span class="hljs-string">'Runs FutureSmart AI'</span>,
   <span class="hljs-string">'hash'</span>: <span class="hljs-string">'68a143a88a3e67ae9ebfb9575bcf49a7'</span>,
   <span class="hljs-string">'metadata'</span>: <span class="hljs-literal">None</span>,
   <span class="hljs-string">'score'</span>: <span class="hljs-number">0.1551292009096673</span>,
   <span class="hljs-string">'created_at'</span>: <span class="hljs-string">'2025-07-12T10:58:49.158843-07:00'</span>,
   <span class="hljs-string">'updated_at'</span>: <span class="hljs-literal">None</span>,
   <span class="hljs-string">'user_id'</span>: <span class="hljs-string">'pradip'</span>},
  {<span class="hljs-string">'id'</span>: <span class="hljs-string">'5408e326-b26b-4737-a404-299887b8d597'</span>,
   <span class="hljs-string">'memory'</span>: <span class="hljs-string">'Loves building RAG and AI Agent solutions that work in production'</span>,
......
   <span class="hljs-string">'user_id'</span>: <span class="hljs-string">'pradip'</span>},
  {<span class="hljs-string">'id'</span>: <span class="hljs-string">'1baa6793-507f-46b1-8e01-b90dfa1e73b6'</span>,
   <span class="hljs-string">'memory'</span>: <span class="hljs-string">'Builds custom AI solutions'</span>,
.......
   <span class="hljs-string">'user_id'</span>: <span class="hljs-string">'pradip'</span>}]}
</code></pre>
<p><code>score</code> is cosine similarity—higher means closer semantic match.</p>
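<p>For intuition, here is a minimal cosine-similarity sketch (the standard formula, not Mem0's internal code):</p>

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

print(cosine([1.0, 0.0], [1.0, 0.0]))  # identical direction → 1.0
print(cosine([1.0, 0.0], [0.0, 1.0]))  # orthogonal → 0.0
```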
<h3 id="heading-get-all-memories-for-a-user">Get all memories for a user</h3>
<pre><code class="lang-python">all_memories = memory.get_all(user_id=<span class="hljs-string">"pradip"</span>)
</code></pre>
<p>Returns the full list (same schema as <code>search</code>, without scores).</p>
<h3 id="heading-retrieve-a-single-memory">Retrieve a single memory</h3>
<pre><code class="lang-python">mem_id = <span class="hljs-string">"1baa6793-507f-46b1-8e01-b90dfa1e73b6"</span>
memory.get(mem_id)
</code></pre>
<p>Full output:</p>
<pre><code class="lang-python">{<span class="hljs-string">'id'</span>: <span class="hljs-string">'1baa6793-507f-46b1-8e01-b90dfa1e73b6'</span>,
 <span class="hljs-string">'memory'</span>: <span class="hljs-string">'Builds custom Gen AI solutions'</span>,
 <span class="hljs-string">'hash'</span>: <span class="hljs-string">'502bdf5771e4ef9a812453b51870f0b2'</span>,
 <span class="hljs-string">'metadata'</span>: <span class="hljs-literal">None</span>,
 <span class="hljs-string">'score'</span>: <span class="hljs-literal">None</span>,
 <span class="hljs-string">'created_at'</span>: <span class="hljs-string">'2025-07-12T10:58:49.182159-07:00'</span>,
 <span class="hljs-string">'updated_at'</span>: <span class="hljs-string">'2025-07-12T11:02:34.594521-07:00'</span>,
 <span class="hljs-string">'user_id'</span>: <span class="hljs-string">'pradip'</span>}
</code></pre>
<h3 id="heading-update-a-memory">Update a memory</h3>
<pre><code class="lang-python">memory.update(memory_id=mem_id, data=<span class="hljs-string">"Builds custom Gen AI solutions"</span>)
<span class="hljs-comment"># → {'message': 'Memory updated successfully!'}</span>
</code></pre>
<h3 id="heading-view-change-history">View change history</h3>
<pre><code class="lang-python">history = memory.history(memory_id=mem_id)
</code></pre>
<p>Full output:</p>
<pre><code class="lang-python">[{<span class="hljs-string">'id'</span>: <span class="hljs-string">'e7242249-430a-4bf2-b4df-ca0e4b99e69a'</span>,
  <span class="hljs-string">'memory_id'</span>: <span class="hljs-string">'1baa6793-507f-46b1-8e01-b90dfa1e73b6'</span>,
  <span class="hljs-string">'old_memory'</span>: <span class="hljs-literal">None</span>,
  <span class="hljs-string">'new_memory'</span>: <span class="hljs-string">'Builds custom AI solutions'</span>,
  <span class="hljs-string">'event'</span>: <span class="hljs-string">'ADD'</span>,
  <span class="hljs-string">'created_at'</span>: <span class="hljs-string">'2025-07-12T10:58:49.182159-07:00'</span>,
  <span class="hljs-string">'updated_at'</span>: <span class="hljs-literal">None</span>,
  <span class="hljs-string">'is_deleted'</span>: <span class="hljs-literal">False</span>,
  <span class="hljs-string">'actor_id'</span>: <span class="hljs-literal">None</span>,
  <span class="hljs-string">'role'</span>: <span class="hljs-literal">None</span>},
 {<span class="hljs-string">'id'</span>: <span class="hljs-string">'9d2e9706-b0c2-480b-b37a-07bb6143767d'</span>,
  <span class="hljs-string">'memory_id'</span>: <span class="hljs-string">'1baa6793-507f-46b1-8e01-b90dfa1e73b6'</span>,
  <span class="hljs-string">'old_memory'</span>: <span class="hljs-string">'Builds custom AI solutions'</span>,
  <span class="hljs-string">'new_memory'</span>: <span class="hljs-string">'Builds custom Gen AI solutions'</span>,
  <span class="hljs-string">'event'</span>: <span class="hljs-string">'UPDATE'</span>,
  <span class="hljs-string">'created_at'</span>: <span class="hljs-string">'2025-07-12T10:58:49.182159-07:00'</span>,
  <span class="hljs-string">'updated_at'</span>: <span class="hljs-string">'2025-07-12T11:02:34.594521-07:00'</span>,
  <span class="hljs-string">'is_deleted'</span>: <span class="hljs-literal">False</span>,
  <span class="hljs-string">'actor_id'</span>: <span class="hljs-literal">None</span>,
  <span class="hljs-string">'role'</span>: <span class="hljs-literal">None</span>}]
</code></pre>
<p>Each entry records the old &amp; new value plus timestamp—handy for auditing:</p>
<pre><code class="lang-python">[{<span class="hljs-string">'event'</span>: <span class="hljs-string">'ADD'</span>,    <span class="hljs-string">'old_memory'</span>: <span class="hljs-literal">None</span>,                     <span class="hljs-string">'new_memory'</span>: <span class="hljs-string">'Builds custom AI solutions'</span>},
 {<span class="hljs-string">'event'</span>: <span class="hljs-string">'UPDATE'</span>, <span class="hljs-string">'old_memory'</span>: <span class="hljs-string">'Builds custom AI solutions'</span>,
  <span class="hljs-string">'new_memory'</span>: <span class="hljs-string">'Builds custom Gen AI solutions'</span>}]
</code></pre>
<p>That wraps up the core CRUD API.</p>
<hr />
<h2 id="heading-2-langgraph-integration-wiring-mem0-into-an-agent">2. LangGraph Integration – Wiring Mem0 into an Agent</h2>
<p><strong>New to LangGraph?</strong> Watch my YouTube walkthrough that covers LangGraph basics all the way to advanced patterns.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/60XDTWhklLA?si=wrxb_SJ7XaMsel2B">https://youtu.be/60XDTWhklLA?si=wrxb_SJ7XaMsel2B</a></div>
<p> </p>
<p>Below we build the simplest possible LangGraph agent that:</p>
<ol>
<li><p>Accepts user messages.</p>
</li>
<li><p>Retrieves relevant memories from Mem0.</p>
</li>
<li><p>Injects them into the system prompt for personalised replies.</p>
</li>
<li><p>Writes the new interaction back to Mem0.</p>
</li>
</ol>
<h3 id="heading-a-define-the-shared-state">a) Define the shared state</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Annotated, TypedDict
<span class="hljs-keyword">from</span> langgraph.graph.message <span class="hljs-keyword">import</span> add_messages
<span class="hljs-keyword">from</span> langchain_core.messages <span class="hljs-keyword">import</span> BaseMessage

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">State</span>(<span class="hljs-params">TypedDict</span>):</span>
    <span class="hljs-string">"""Conversation state passed between nodes"""</span>
    messages: Annotated[list[BaseMessage], add_messages]  <span class="hljs-comment"># chat history for this request</span>
    mem0_user_id: str                                     <span class="hljs-comment"># maps to Mem0 user record</span>
</code></pre>
<h3 id="heading-b-init-the-llm">b) Init the LLM</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI
llm = ChatOpenAI(model=<span class="hljs-string">"gpt-4.1-mini"</span>, temperature=<span class="hljs-number">0.7</span>)
</code></pre>
<h3 id="heading-c-create-the-chatbot-node">c) Create the chatbot node</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_core.messages <span class="hljs-keyword">import</span> HumanMessage, AIMessage, SystemMessage

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chatbot</span>(<span class="hljs-params">state: State</span>):</span>
    <span class="hljs-keyword">global</span> memory  <span class="hljs-comment"># re‑use the Mem0 instance from Section 1</span>
    msgs = state[<span class="hljs-string">"messages"</span>]
    uid = state[<span class="hljs-string">"mem0_user_id"</span>]

    <span class="hljs-comment"># 1️⃣ Retrieve memories relevant to the latest user msg</span>
    mems = memory.search(msgs[<span class="hljs-number">-1</span>].content, user_id=uid)
    print(<span class="hljs-string">f"Retrieved Memories: <span class="hljs-subst">{mems}</span>"</span>)

    <span class="hljs-comment"># Build context string</span>
    <span class="hljs-keyword">if</span> mems[<span class="hljs-string">"results"</span>]:
        context = <span class="hljs-string">"\n"</span>.join(<span class="hljs-string">f"- <span class="hljs-subst">{m[<span class="hljs-string">'memory'</span>]}</span>"</span> <span class="hljs-keyword">for</span> m <span class="hljs-keyword">in</span> mems[<span class="hljs-string">"results"</span>])
    <span class="hljs-keyword">else</span>:
        context = <span class="hljs-string">"No relevant information found."</span>

    system = SystemMessage(content=<span class="hljs-string">f"""You are a helpful assistant. Use the provided context to personalise your responses.
Relevant information from previous conversations:
<span class="hljs-subst">{context}</span>"""</span>)

    <span class="hljs-comment"># 2️⃣ Invoke the LLM</span>
    response = llm.invoke([system] + msgs)

    <span class="hljs-comment"># 3️⃣ Persist the new turn</span>
    memory.add([
        {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: msgs[<span class="hljs-number">-1</span>].content},
        {<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"content"</span>: response.content}
    ], user_id=uid)

    <span class="hljs-keyword">return</span> {<span class="hljs-string">"messages"</span>: [response]}
</code></pre>
<p><strong>How the node uses Mem0</strong></p>
<ol>
<li><p><strong>Search:</strong> For every incoming user message, we call <code>memory.search()</code> with the text and <code>user_id</code>. This performs a vector‑similarity lookup and returns any facts previously stored about the user.</p>
</li>
<li><p><strong>Prompt injection:</strong> Those facts are concatenated into a bullet list (<code>context</code>) and inserted into a system prompt so the LLM can personalise its reply.</p>
</li>
<li><p><strong>Add:</strong> After the LLM responds, we persist <strong>both</strong> the latest user message and the assistant reply via <code>memory.add()</code>. Mem0 distils them into new memories ready for the next turn.</p>
</li>
</ol>
<h3 id="heading-d-build-amp-compile-the-graph">d) Build &amp; compile the graph</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langgraph.graph <span class="hljs-keyword">import</span> StateGraph, START, END

graph_builder = StateGraph(State)

graph_builder.add_node(<span class="hljs-string">"chatbot"</span>, chatbot)

graph_builder.add_edge(START, <span class="hljs-string">"chatbot"</span>)

graph_builder.add_edge(<span class="hljs-string">"chatbot"</span>, END)

graph = graph_builder.compile()
print(<span class="hljs-string">"Graph compiled successfully ✅"</span>)
</code></pre>
<h3 id="heading-e-commandline-loop-for-quick-testing">e) Command‑line loop for quick testing</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_core.messages <span class="hljs-keyword">import</span> HumanMessage

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">run_conversation</span>(<span class="hljs-params">user_input: str, mem0_user_id: str</span>):</span>
    state = {<span class="hljs-string">"messages"</span>: [HumanMessage(content=user_input)], <span class="hljs-string">"mem0_user_id"</span>: mem0_user_id}
    result = graph.invoke(state)
    print(<span class="hljs-string">"🤖"</span>, result[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    uid = <span class="hljs-string">"customer_pradip"</span>
    <span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
        inp = input(<span class="hljs-string">"You: "</span>)
        <span class="hljs-keyword">if</span> inp.lower() <span class="hljs-keyword">in</span> {<span class="hljs-string">"quit"</span>, <span class="hljs-string">"exit"</span>, <span class="hljs-string">"bye"</span>}:
            <span class="hljs-keyword">break</span>
        run_conversation(inp, uid)
</code></pre>
<p>Run it, send two or three messages, then restart the script and ask <em>“who am I?”</em>—you’ll see the agent recall facts from the earlier run thanks to Mem0’s long‑term store.</p>
<hr />
<h2 id="heading-3-vector-db-setup-configuring-mem0-with-qdrant">3. Vector DB Setup – Configuring Mem0 with Qdrant</h2>
<p>SQLite works for quick tests, but once memories grow you’ll want a proper vector store. <strong>Qdrant Cloud</strong> offers a generous free tier and plugs straight into Mem0.</p>
<h3 id="heading-a-spin-up-locate-a-qdrant-cloud-cluster">a) Spin up / locate a Qdrant Cloud cluster</h3>
<p>Grab the cluster URL and create an API key from the Qdrant dashboard.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Install the Python client</span>
!pip -q install qdrant_client
</code></pre>
<h3 id="heading-b-verify-connectivity-optional">b) Verify connectivity (optional)</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> qdrant_client <span class="hljs-keyword">import</span> QdrantClient
<span class="hljs-keyword">from</span> google.colab <span class="hljs-keyword">import</span> userdata  <span class="hljs-comment"># Colab secrets helper (assumes a Colab notebook)</span>

qdrant = QdrantClient(
    url=<span class="hljs-string">"https://&lt;cluster-id&gt;.&lt;region&gt;.aws.cloud.qdrant.io:6333"</span>,
    api_key=userdata.get(<span class="hljs-string">"Qdrant_API_KEY"</span>)
)
print(qdrant.get_collections())  <span class="hljs-comment"># sanity‑check</span>
</code></pre>
<h3 id="heading-c-tell-mem0-to-use-qdrant">c) Tell Mem0 to use Qdrant</h3>
<pre><code class="lang-python">collection_name = <span class="hljs-string">"mem0_yt"</span>

config = {
    <span class="hljs-string">"vector_store"</span>: {
        <span class="hljs-string">"provider"</span>: <span class="hljs-string">"qdrant"</span>,
        <span class="hljs-string">"config"</span>: {
            <span class="hljs-string">"collection_name"</span>: collection_name,
            <span class="hljs-string">"host"</span>: <span class="hljs-string">"&lt;cluster-host&gt;"</span>,
            <span class="hljs-string">"port"</span>: <span class="hljs-number">6333</span>,
            <span class="hljs-string">"api_key"</span>: userdata.get(<span class="hljs-string">"Qdrant_API_KEY"</span>)
        }
    }
}

memory = Memory.from_config(config)
</code></pre>
<h3 id="heading-d-onetime-payload-index">d) One‑time payload index</h3>
<p>Mem0 filters by <code>user_id</code> when searching, so Qdrant needs a keyword index on that field. If you skip this step you’ll get:</p>
<pre><code class="lang-text">400 Bad Request – Index required but not found for "user_id" of type [keyword]
</code></pre>
<p>Create it once, then you’re good:</p>
<pre><code class="lang-python">qdrant.create_payload_index(
    collection_name=collection_name,
    field_name=<span class="hljs-string">"user_id"</span>,
    field_schema=<span class="hljs-string">"keyword"</span>
)
</code></pre>
<h3 id="heading-e-insert-and-query-as-usual">e) Insert and query as usual</h3>
<pre><code class="lang-python">messages = [
    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"Hi, I'm Pradip Nichite. I run FutureSmart AI."</span>},
    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"I love building RAG and AI Agent solutions that work in production."</span>}
]
memory.add(messages, user_id=<span class="hljs-string">"pradip"</span>)
</code></pre>
<p>From here all CRUD and LangGraph logic stays exactly the same—only the storage layer has changed.</p>
<hr />
<h2 id="heading-4-cloud-usage-using-the-mem0-cloud-platform">4. Cloud Usage – Using the Mem0 Cloud Platform</h2>
<p>If you’d rather skip managing your own DBs, Mem0 offers a hosted platform with a clean UI to inspect and edit memories.</p>
<h3 id="heading-a-authenticate">a) Authenticate</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> mem0 <span class="hljs-keyword">import</span> MemoryClient
<span class="hljs-keyword">from</span> google.colab <span class="hljs-keyword">import</span> userdata  <span class="hljs-comment"># Colab secrets helper (assumes a Colab notebook)</span>

client = MemoryClient(api_key=userdata.get(<span class="hljs-string">"Mem0_API_KEY"</span>))
</code></pre>
<h3 id="heading-b-add-messages-same-schema-as-before">b) Add messages (same schema as before)</h3>
<pre><code class="lang-python">messages = [
    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,      <span class="hljs-string">"content"</span>: <span class="hljs-string">"Hi, I am Pradip. I am Founder of FutureSmart AI"</span>},
    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"Hi Pradip"</span>}
]

client.add(messages, user_id=<span class="hljs-string">"Pradip_Founder"</span>)
</code></pre>
<h3 id="heading-c-inspect-in-the-dashboard">c) Inspect in the dashboard</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1752475996169/f275c471-8fd7-4dc4-bf00-8e5308d124cf.png" alt class="image--center mx-auto" /></p>
<blockquote>
<p>You’ll see two distilled memories automatically extracted, complete with timestamps and editable fields.</p>
</blockquote>
<p>The hosted store supports the same search/update/history API, so you can swap <code>Memory</code> for <code>MemoryClient</code> with minimal changes.</p>
<hr />
<h3 id="heading-watch-the-full-walkthrough">Watch the full walkthrough</h3>
<p>Prefer video? I recorded a step‑by‑step YouTube demo that mirrors this blog, including live coding and UI tours – check it out here 👇</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/e-wBojpJrrQ">https://youtu.be/e-wBojpJrrQ</a></div>
<hr />
<h2 id="heading-need-a-custom-ai-solution">Need a Custom AI Solution?</h2>
<p>At <strong>FutureSmart AI</strong> we specialise in designing and shipping production‑grade AI systems—RAG agents, document parsers, NL2SQL bots, multi‑agent workflows, and more.</p>
<p>→ <strong>See our case studies:</strong> <a target="_blank" href="https://futuresmart.ai/case-studies">https://futuresmart.ai/case-studies</a><br />→ <strong>Try the LangGraph‑powered FutureSmart Agent:</strong> <a target="_blank" href="https://agent.futuresmart.ai/">https://agent.futuresmart.ai/</a><br />→ <strong>Get in touch:</strong> email us at <a target="_blank" href="mailto:contact@futuresmart.ai"><strong>contact@futuresmart.ai</strong></a> to discuss how we can build or fine‑tune an AI solution for your business.</p>
]]></content:encoded></item><item><title><![CDATA[LangGraph RAG Agent Tutorial | Basics to Advanced Multi-Agent AI Chatbot]]></title><description><![CDATA[Retrieval-Augmented Generation (RAG) is becoming the go-to pattern for building AI systems that can fetch real-time or domain-specific knowledge on demand. But RAG alone doesn’t make your chatbot smart.
With LangGraph, you can build stateful, agent-l...]]></description><link>https://blog.futuresmart.ai/langgraph-rag-agent-tutorial-basics-to-advanced-multi-agent-ai-chatbot</link><guid isPermaLink="true">https://blog.futuresmart.ai/langgraph-rag-agent-tutorial-basics-to-advanced-multi-agent-ai-chatbot</guid><category><![CDATA[langgraph]]></category><category><![CDATA[langchain]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[agentic rag]]></category><category><![CDATA[ai-agent]]></category><dc:creator><![CDATA[Pradip Nichite]]></dc:creator><pubDate>Sat, 21 Jun 2025 13:40:26 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1750513172885/08c560e5-1e70-4433-8ede-da5208269de6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Retrieval-Augmented Generation (RAG) is becoming the go-to pattern for building AI systems that can fetch real-time or domain-specific knowledge on demand. But RAG alone doesn’t make your chatbot smart.</p>
<p>With LangGraph, you can build stateful, agent-like flows that combine tools, memory, structured decision logic, and retrieval—all driven by LLMs.</p>
<p>In this blog, we’ll build up to a full LangGraph-based RAG Agent from scratch. We'll follow a practical path:</p>
<ol>
<li><p>Start with basic LLM usage</p>
</li>
<li><p>Bind tools to the LLM</p>
</li>
<li><p>Use LangGraph to build stateful agents</p>
</li>
<li><p>Add memory, routing logic, and tool execution</p>
</li>
<li><p>Finally, combine all of it with document retrieval to create a RAG-powered agent</p>
</li>
</ol>
<p>Each section mirrors what you’d build in a notebook, but with clear explanations to help you understand <em>why</em> each piece matters.</p>
<p>Let’s start with the first building block: invoking an LLM.</p>
<hr />
<h3 id="heading-step-1-invoking-a-language-model-llm">🧠 Step 1: Invoking a Language Model (LLM)</h3>
<p>To begin, we’ll use <code>ChatOpenAI</code> from LangChain to invoke a language model. We’ll keep it simple:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI

<span class="hljs-comment"># Initialize the LLM</span>
llm = ChatOpenAI(model=<span class="hljs-string">"gpt-4.1-mini"</span>, temperature=<span class="hljs-number">0.7</span>)

<span class="hljs-comment"># Basic prompt</span>
response = llm.invoke(<span class="hljs-string">"What is artificial intelligence?"</span>)
print(response.content)
</code></pre>
<p>This returns a standard response from the LLM. But the real value comes when you treat the LLM like a conversation partner using message objects:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_core.messages <span class="hljs-keyword">import</span> HumanMessage, SystemMessage

messages = [
    SystemMessage(content=<span class="hljs-string">"You are a helpful AI assistant that explains complex topics simply."</span>),
    HumanMessage(content=<span class="hljs-string">"Explain machine learning in 2 sentences."</span>)
]

response = llm.invoke(messages)
print(response.content)
</code></pre>
<p>Using <code>SystemMessage</code> and <code>HumanMessage</code> gives you more control over behavior and tone. It’s also how you’ll structure inputs later when building memory-enabled and multi-step agents.</p>
<p>Now that we can invoke an LLM in both simple and structured ways, we’re ready to start integrating tools.</p>
<hr />
<h3 id="heading-step-2-extending-llms-with-tools">🔧 Step 2: Extending LLMs with Tools</h3>
<p>LLMs are powerful, but they can’t do math or fetch real-time information on their own. To make your LLM truly useful, you can bind it with external tools. Here’s how:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_core.tools <span class="hljs-keyword">import</span> tool
<span class="hljs-keyword">from</span> langchain_community.tools <span class="hljs-keyword">import</span> DuckDuckGoSearchRun

<span class="hljs-meta">@tool</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculator</span>(<span class="hljs-params">expression: str</span>) -&gt; str:</span>
    <span class="hljs-string">"""Calculate mathematical expressions. Use this for any math calculations."""</span>
    <span class="hljs-keyword">try</span>:
        result = eval(expression)  <span class="hljs-comment"># demo only – eval() on untrusted input is unsafe</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"The result of <span class="hljs-subst">{expression}</span> is <span class="hljs-subst">{result}</span>"</span>
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"Error calculating <span class="hljs-subst">{expression}</span>: <span class="hljs-subst">{str(e)}</span>"</span>

search_tool = DuckDuckGoSearchRun()
</code></pre>
<p>We now have two tools:</p>
<ul>
<li><p><code>calculator</code> to perform basic arithmetic</p>
</li>
<li><p><code>search_tool</code> to fetch info from the web</p>
</li>
</ul>
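<p>One caveat: <code>eval()</code> on model-generated text is fine for a demo but unsafe in production. A minimal safer sketch using the standard-library <code>ast</code> module, restricted to plain arithmetic (my own illustration, not part of LangChain):</p>
<pre><code class="lang-python">import ast
import operator

# Whitelist of allowed binary operators
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.Mod: operator.mod,
}

def safe_eval(expression):
    """Evaluate a basic arithmetic expression without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"Unsupported expression: {expression}")
    return _eval(ast.parse(expression, mode="eval"))

print(safe_eval("25 * 4 + 17"))  # 117
</code></pre>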
<p>To bind these tools to the LLM:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Bind tools to the LLM</span>
tools = [calculator, search_tool]
llm_with_tools = llm.bind_tools(tools)
</code></pre>
<p>Let’s test the LLM with tools:</p>
<pre><code class="lang-python">response = llm_with_tools.invoke(<span class="hljs-string">"What's 25 * 4 + 17?"</span>)
print(response.content)
</code></pre>
<p>However, when an LLM is tool-enabled, its response might include <code>tool_calls</code> instead of just plain text. To handle that:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handle_tool_calls</span>(<span class="hljs-params">response, tool_map</span>):</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> getattr(response, <span class="hljs-string">'tool_calls'</span>, <span class="hljs-literal">None</span>):
        <span class="hljs-keyword">return</span>

    <span class="hljs-keyword">for</span> tool_call <span class="hljs-keyword">in</span> response.tool_calls:
        tool_name = tool_call[<span class="hljs-string">'name'</span>]
        args = tool_call[<span class="hljs-string">'args'</span>]
        tool = tool_map.get(tool_name)
        <span class="hljs-keyword">if</span> tool:
            result = tool.invoke(args)
            print(<span class="hljs-string">f"Tool result: <span class="hljs-subst">{result}</span>"</span>)
</code></pre>
<p>Then:</p>
<pre><code class="lang-python">tool_map = {
    <span class="hljs-string">'calculator'</span>: calculator,
    <span class="hljs-string">'duckduckgo_search'</span>: search_tool,
}

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_llm_tool</span>(<span class="hljs-params">query</span>):</span>
    response = llm_with_tools.invoke(query)
    print(response.content)
    handle_tool_calls(response, tool_map)

<span class="hljs-comment"># Run some queries</span>
test_llm_tool(<span class="hljs-string">"What's 25 * 4 + 17?"</span>)
test_llm_tool(<span class="hljs-string">"Search for recent news about artificial intelligence"</span>)
</code></pre>
<p>With this setup, your LLM is now a <em>tool-using agent</em>.</p>
<p>Next, we’ll take this a step further by wiring everything into a LangGraph to make it stateful and multi-turn.</p>
<hr />
<h3 id="heading-step-3-building-a-basic-langgraph-chatbot">🧩 Step 3: Building a Basic LangGraph Chatbot</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750513436662/a0dde8f7-3b8d-40df-a6c4-526914621c70.png" alt class="image--center mx-auto" /></p>
<p>At its core, LangGraph lets you define a graph of nodes that process conversational state. Let’s begin with a minimal chatbot graph.</p>
<h4 id="heading-define-chatbot-state">Define Chatbot State</h4>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Annotated, TypedDict
<span class="hljs-keyword">from</span> langchain_core.messages <span class="hljs-keyword">import</span> BaseMessage, HumanMessage, AIMessage
<span class="hljs-keyword">from</span> langgraph.graph.message <span class="hljs-keyword">import</span> add_messages

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">State</span>(<span class="hljs-params">TypedDict</span>):</span>
    messages: Annotated[list[BaseMessage], add_messages]
</code></pre>
<p>Here, we define a <code>State</code> object that will carry the conversation. The <code>add_messages</code> function ensures new messages are appended correctly.</p>
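<p>Under the hood, <code>add_messages</code> is a reducer: an update whose <code>id</code> matches an existing message replaces it, and anything else is appended. A toy stdlib sketch of that merge rule (plain dicts stand in for message objects):</p>
<pre><code class="lang-python">def merge_messages(existing, updates):
    """Toy version of the add_messages reducer: replace by id, else append."""
    merged = list(existing)
    index_by_id = {m["id"]: i for i, m in enumerate(merged)}
    for msg in updates:
        if msg["id"] in index_by_id:
            merged[index_by_id[msg["id"]]] = msg  # same id – replace in place
        else:
            merged.append(msg)                    # new id – append
    return merged

history = [{"id": "1", "role": "user", "content": "Hello!"}]
history = merge_messages(history, [{"id": "2", "role": "assistant", "content": "Hi there"}])
print(len(history))  # 2
</code></pre>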
<h4 id="heading-create-the-chatbot-node">Create the Chatbot Node</h4>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chatbot_node</span>(<span class="hljs-params">state: State</span>) -&gt; State:</span>
    response = llm.invoke(state[<span class="hljs-string">"messages"</span>])
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"messages"</span>: [response]}
</code></pre>
<p>This node accepts messages and returns the updated state with the AI's response.</p>
<h4 id="heading-build-and-compile-the-graph">Build and Compile the Graph</h4>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langgraph.graph <span class="hljs-keyword">import</span> StateGraph, START, END

graph_builder = StateGraph(State)
graph_builder.add_node(<span class="hljs-string">"chatbot"</span>, chatbot_node)
graph_builder.add_edge(START, <span class="hljs-string">"chatbot"</span>)
graph_builder.add_edge(<span class="hljs-string">"chatbot"</span>, END)
graph = graph_builder.compile()
</code></pre>
<p>This sets up a simple one-node chatbot pipeline. You can now test it:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_chatbot</span>(<span class="hljs-params">message: str</span>):</span>
    initial_state = {<span class="hljs-string">"messages"</span>: [HumanMessage(content=message)]}
    result = graph.invoke(initial_state)
    print(<span class="hljs-string">"🤖 Assistant:"</span>, result[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content)

test_chatbot(<span class="hljs-string">"Hello! My name is Pradip"</span>)
test_chatbot(<span class="hljs-string">"Do you remember my name?"</span>)
</code></pre>
<p>You’ll notice it doesn’t remember past messages yet. That’s what we’ll fix in the next step—by adding memory.</p>
<hr />
<h3 id="heading-step-4-adding-memory-to-the-chatbot">🧠 Step 4: Adding Memory to the Chatbot</h3>
<p>To make the chatbot remember previous conversations, we need to add a memory backend.<br />LangGraph provides <code>MemorySaver</code> for this purpose.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langgraph.checkpoint.memory <span class="hljs-keyword">import</span> MemorySaver

memory = MemorySaver()

<span class="hljs-comment"># Compile the graph again with memory enabled</span>
graph_with_memory = graph_builder.compile(checkpointer=memory)
</code></pre>
<p>We can now run the chatbot in a threaded manner, and it will retain context:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chat_with_memory</span>(<span class="hljs-params">message: str, thread_id: str</span>):</span>
    config = {<span class="hljs-string">"configurable"</span>: {<span class="hljs-string">"thread_id"</span>: thread_id}}
    initial_state = {<span class="hljs-string">"messages"</span>: [HumanMessage(content=message)]}
    result = graph_with_memory.invoke(initial_state, config)
    print(<span class="hljs-string">"🤖 Assistant:"</span>, result[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content)

<span class="hljs-comment"># Start a conversation</span>
chat_with_memory(<span class="hljs-string">"Hi, my name is Pradip"</span>, thread_id=<span class="hljs-string">"thread-1"</span>)
chat_with_memory(<span class="hljs-string">"What's my name?"</span>, thread_id=<span class="hljs-string">"thread-1"</span>)
</code></pre>
<p>With memory in place, the assistant can now recall previous messages.<br />This forms the foundation for building multi-turn, context-aware agents.</p>
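<p>The <code>thread_id</code> is what scopes the saved state: conceptually, the checkpointer keys each conversation's history by that id, so two threads never see each other's messages. A toy sketch of the idea:</p>
<pre><code class="lang-python"># Toy sketch: a checkpointer conceptually keys saved state by thread_id
checkpoints = {}

def save_turn(thread_id, message):
    checkpoints.setdefault(thread_id, []).append(message)

save_turn("thread-1", "Hi, my name is Pradip")
save_turn("thread-1", "What's my name?")
save_turn("thread-2", "Unrelated conversation")

print(len(checkpoints["thread-1"]))  # 2
print(len(checkpoints["thread-2"]))  # 1
</code></pre>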
<p>Next, we’ll add more intelligence to the flow using routing and tools.</p>
<h2 id="heading-step-5-langgraph-agent-with-tools">🛠️ Step 5 – LangGraph Agent with Tools</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750513455175/b7ea4e76-8e68-4341-b9e7-6d2925859b8f.png" alt class="image--center mx-auto" /></p>
<p>So far, our chatbot can talk (Step 3) and remember context (Step 4). Now we want it to <strong>recognise when a tool is needed and call it automatically</strong>.</p>
<p>At a high level, we’ll add two new pieces:</p>
<ol>
<li><p><code>chatbot</code> node – decides whether it can answer directly or should call a tool.</p>
</li>
<li><p><code>tools</code> node – actually runs the requested tool‑call and passes the result back.</p>
</li>
</ol>
<p>The conversation state stays the same – a list of LangChain <code>Message</code> objects – so we just rename it to emphasise the agent role:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Annotated, TypedDict
<span class="hljs-keyword">from</span> langgraph.graph.message <span class="hljs-keyword">import</span> add_messages
<span class="hljs-keyword">from</span> langchain_core.messages <span class="hljs-keyword">import</span> BaseMessage

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AgentState</span>(<span class="hljs-params">TypedDict</span>):</span>
    <span class="hljs-string">"""State for our two‑node agent"""</span>
    messages: Annotated[list[BaseMessage], add_messages]
</code></pre>
<hr />
<h3 id="heading-1-bind-the-llm-to-our-existing-tools">1. Bind the LLM to our existing tools</h3>
<pre><code class="lang-python">llm = ChatOpenAI(model=<span class="hljs-string">"gpt-4.1-mini"</span>, temperature=<span class="hljs-number">0.7</span>)
llm_with_tools = llm.bind_tools(tools)  <span class="hljs-comment"># `tools` already contains `calculator` and `search_tool`</span>
</code></pre>
<p>Binding keeps the API exactly the same – we just swap <code>llm</code> for <code>llm_with_tools</code> when we need tool‑usage.</p>
<hr />
<h3 id="heading-2-the-chatbot-node-decide-answer-vs-tool">2. The <strong>chatbot</strong> node – decide <em>answer vs. tool</em></h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_core.messages <span class="hljs-keyword">import</span> HumanMessage, AIMessage

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chatbot_node</span>(<span class="hljs-params">state: AgentState</span>) -&gt; AgentState:</span>
    <span class="hljs-string">"""Gatekeeper: answer directly or request a tool"""</span>
    system_message = (
        <span class="hljs-string">"You are a helpful assistant.\n"</span>
        <span class="hljs-string">"Use the `duckduckgo_search` tool for real‑time facts and `calculator` for maths.\n"</span>
        <span class="hljs-string">"Otherwise answer directly."</span>
    )

    messages = [
        {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: system_message},
        *state[<span class="hljs-string">"messages"</span>],
    ]

    response = llm_with_tools.invoke(messages)
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"messages"</span>: [response]}  <span class="hljs-comment"># LangGraph merges this into the running state</span>
</code></pre>
<p><em>Key idea</em>: we embed the routing logic inside the prompt – the LLM decides whether tool calls are needed and, if so, emits a <code>tool_calls</code> entry in its JSON response.</p>
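<p>For intuition, an assistant turn that requests a tool roughly looks like the dict below (the exact shape is illustrative; LangChain parses the provider response into a <code>tool_calls</code> list for you). Routing then only has to check whether that list is present:</p>
<pre><code class="lang-python"># Illustrative (assumed) shape of an assistant turn requesting a tool
assistant_turn = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {"name": "calculator", "args": {"expression": "0.15 * 240"}, "id": "call_1"},
    ],
}

def route(turn):
    """Send the turn to the tools node if it carries tool calls."""
    return "tools" if turn.get("tool_calls") else "end"

print(route(assistant_turn))                                 # tools
print(route({"role": "assistant", "content": "All done."}))  # end
</code></pre>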
<hr />
<h3 id="heading-3-the-tools-node-run-any-requested-toolcalls">3. The <strong>tools</strong> node – run any requested tool‑calls</h3>
<p>Instead of re‑implementing the execution loop, we reuse the pre‑built <code>ToolNode</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langgraph.prebuilt <span class="hljs-keyword">import</span> ToolNode

tool_node = ToolNode(tools)  <span class="hljs-comment"># automatically dispatches and streams results back</span>
</code></pre>
<hr />
<h3 id="heading-4-routing-logic">4. Routing logic</h3>
<p>We just need a small helper that checks whether the last message contains tool calls:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Literal

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">should_continue</span>(<span class="hljs-params">state: AgentState</span>) -&gt; Literal["tools", "end"]:</span>
    last = state[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>]
    <span class="hljs-keyword">return</span> <span class="hljs-string">"tools"</span> <span class="hljs-keyword">if</span> getattr(last, <span class="hljs-string">"tool_calls"</span>, <span class="hljs-literal">None</span>) <span class="hljs-keyword">else</span> <span class="hljs-string">"end"</span>
</code></pre>
<hr />
<h3 id="heading-5-wire-it-all-together-with-stategraph">5. Wire it all together with <code>StateGraph</code></h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langgraph.graph <span class="hljs-keyword">import</span> StateGraph, START, END
<span class="hljs-keyword">from</span> langgraph.checkpoint.memory <span class="hljs-keyword">import</span> MemorySaver

workflow = StateGraph(AgentState)
workflow.add_node(<span class="hljs-string">"chatbot"</span>, chatbot_node)
workflow.add_node(<span class="hljs-string">"tools"</span>,   tool_node)

workflow.add_edge(START, <span class="hljs-string">"chatbot"</span>)
workflow.add_conditional_edges(<span class="hljs-string">"chatbot"</span>, should_continue, {<span class="hljs-string">"tools"</span>: <span class="hljs-string">"tools"</span>, <span class="hljs-string">"end"</span>: END})
workflow.add_edge(<span class="hljs-string">"tools"</span>, <span class="hljs-string">"chatbot"</span>)  <span class="hljs-comment"># come back after tools run</span>

app = workflow.compile(checkpointer=MemorySaver())
</code></pre>
<blockquote>
<p><strong>Why a loop back to</strong> <code>chatbot</code>? After a tool runs we want the LLM to integrate the tool output and craft the final answer – so the graph cycles once.</p>
</blockquote>
<hr />
<h3 id="heading-6-quick-manual-test">6. Quick manual test</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chat_with_agent</span>(<span class="hljs-params">msg: str, thread_id=<span class="hljs-string">"demo"</span></span>):</span>
    cfg = {<span class="hljs-string">"configurable"</span>: {<span class="hljs-string">"thread_id"</span>: thread_id}}
    state = {<span class="hljs-string">"messages"</span>: [HumanMessage(content=msg)]}
    result = app.invoke(state, cfg)
    print(result[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content)

chat_with_agent(<span class="hljs-string">"What's 15% of 240?"</span>)
chat_with_agent(<span class="hljs-string">"Search for recent news about artificial intelligence"</span>)
</code></pre>
<p>You should see the <em>calculator</em> and <em>duckduckgo_search</em> tools being triggered automatically, followed by a neat, fully‑formed answer.</p>
<hr />
<p>That’s a self‑routing, tool‑aware agent. In the next step we’ll <strong>plug a knowledge‑base retriever into the tool‑chain</strong> and teach the agent when to switch from web search to internal RAG – bringing us one step closer to a production‑ready assistant.</p>
<h2 id="heading-step-6-langgraph-rag-agent">🔍 Step 6 – LangGraph RAG Agent</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750513305336/aa14ccd5-f220-4d5f-b7e8-2007004b5979.png" alt class="image--center mx-auto" /></p>
<blockquote>
<p><strong>Goal:</strong> Give your agent up‑to‑date, domain‑specific knowledge so it can answer beyond the LLM’s training data.</p>
<p>We’ll layer <strong>retrieval</strong>, <strong>routing</strong>, and an optional <strong>web‑search fallback</strong> on top of the tool‑enabled agent from Step 5.</p>
</blockquote>
<h3 id="heading-1-index-your-documents-once">1️⃣ Index your documents once</h3>
<pre><code class="lang-python"><span class="hljs-comment"># ── Build &amp; persist a Chroma index ────────────────────────────────</span>
<span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
<span class="hljs-keyword">from</span> langchain_community.document_loaders <span class="hljs-keyword">import</span> PyPDFLoader, Docx2txtLoader
<span class="hljs-keyword">from</span> langchain_text_splitters <span class="hljs-keyword">import</span> RecursiveCharacterTextSplitter
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> OpenAIEmbeddings
<span class="hljs-keyword">from</span> langchain_community.vectorstores <span class="hljs-keyword">import</span> Chroma

SOURCE_DIR   = Path(<span class="hljs-string">"docs"</span>)          <span class="hljs-comment"># put your files here</span>
INDEX_DIR    = Path(<span class="hljs-string">"chroma_db_1"</span>)   <span class="hljs-comment"># will be created if missing</span>
EMBED_MODEL  = <span class="hljs-string">"text-embedding-3-small"</span>

<span class="hljs-comment"># Load docs (keep only pdf/docx for brevity)</span>
docs = []
<span class="hljs-keyword">for</span> f <span class="hljs-keyword">in</span> SOURCE_DIR.glob(<span class="hljs-string">"*.*"</span>):
    <span class="hljs-keyword">if</span> f.suffix == <span class="hljs-string">".pdf"</span>:
        docs += PyPDFLoader(str(f)).load()
    <span class="hljs-keyword">elif</span> f.suffix == <span class="hljs-string">".docx"</span>:
        docs += Docx2txtLoader(str(f)).load()

<span class="hljs-comment"># Split &amp; embed</span>
chunks     = RecursiveCharacterTextSplitter(chunk_size=<span class="hljs-number">1_000</span>, chunk_overlap=<span class="hljs-number">200</span>).split_documents(docs)
embeddings = OpenAIEmbeddings(model=EMBED_MODEL)

vectordb = Chroma.from_documents(
    documents         = chunks,
    embedding         = embeddings,
    persist_directory = str(INDEX_DIR),
    collection_name   = <span class="hljs-string">"kb_collection"</span>,
)
vectordb.persist()
print(<span class="hljs-string">"✅ Index built →"</span>, INDEX_DIR.resolve())
</code></pre>
<p><em>Run this once; the agent will query the saved index at runtime.</em></p>
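<p>The two numbers passed to the splitter are a window size and an overlap. The real <code>RecursiveCharacterTextSplitter</code> prefers natural boundaries (paragraphs, then sentences), but a fixed-window toy version shows what the parameters mean:</p>
<pre><code class="lang-python">def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Toy fixed-window splitter: each chunk shares chunk_overlap
    characters with the previous one so context is not cut mid-thought."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - chunk_overlap, 1), step)]

chunks = chunk_text("x" * 2200, chunk_size=1000, chunk_overlap=200)
print(len(chunks))  # 3
</code></pre>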
<h3 id="heading-2-expose-a-retriever-as-a-langchain-tool">2️⃣ Expose a Retriever as a LangChain Tool</h3>
<pre><code class="lang-python">retriever = vectordb.as_retriever(search_kwargs={<span class="hljs-string">"k"</span>: <span class="hljs-number">2</span>})

<span class="hljs-meta">@tool</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">rag_search_tool</span>(<span class="hljs-params">query: str</span>) -&gt; str:</span>
    <span class="hljs-string">"""Search the knowledge‑base for relevant chunks"""</span>
    results = retriever.invoke(query)
    <span class="hljs-keyword">return</span> <span class="hljs-string">"\n\n"</span>.join(d.page_content <span class="hljs-keyword">for</span> d <span class="hljs-keyword">in</span> results)
</code></pre>
<h3 id="heading-3-optional-fallback-realtime-web-search">3️⃣ Optional fallback → real‑time web search</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_tavily <span class="hljs-keyword">import</span> TavilySearch

tavily = TavilySearch(max_results=<span class="hljs-number">3</span>, topic=<span class="hljs-string">"general"</span>)

<span class="hljs-meta">@tool</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">web_search_tool</span>(<span class="hljs-params">query: str</span>) -&gt; str:</span>
    <span class="hljs-string">"""Up‑to‑date web info via Tavily"""</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">"\n\n"</span>.join(r[<span class="hljs-string">"content"</span>] <span class="hljs-keyword">for</span> r <span class="hljs-keyword">in</span> tavily.invoke({<span class="hljs-string">"query"</span>: query})[<span class="hljs-string">"results"</span>])  <span class="hljs-comment"># simplified</span>
</code></pre>
<h3 id="heading-4-extend-the-agent-state">4️⃣ Extend the Agent State</h3>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AgentState</span>(<span class="hljs-params">State</span>):</span>          <span class="hljs-comment"># add to previous `State`</span>
    route:    str          <span class="hljs-comment"># "rag", "answer", "web", "end"</span>
    rag:      str | <span class="hljs-literal">None</span>   <span class="hljs-comment"># KB result</span>
    web:      str | <span class="hljs-literal">None</span>   <span class="hljs-comment"># web‑search snippets</span>
</code></pre>
<h3 id="heading-5-decision-execution-nodes">5️⃣ Decision / Execution Nodes</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<th>Node</th><th>What it does</th></tr>
</thead>
<tbody>
<tr>
<td><strong>router_node</strong></td><td>Uses an LLM with structured output to decide the <code>route</code> – <em>rag</em>, <em>answer</em>, or <em>end</em>.</td></tr>
<tr>
<td><strong>rag_node</strong></td><td>Runs <code>rag_search_tool</code>, then asks a <em>judge</em> LLM if the chunks are <strong>sufficient</strong>. Sets <code>route</code> to <em>answer</em> or <em>web</em>.</td></tr>
<tr>
<td><strong>web_node</strong></td><td>Calls <code>web_search_tool</code> and passes snippets along.</td></tr>
<tr>
<td><strong>answer_node</strong></td><td>Crafts the final reply, combining any <code>rag</code> and/or <code>web</code> context.</td></tr>
</tbody>
</table>
</div><p>Key implementation points (condensed):</p>
<pre><code class="lang-python"><span class="hljs-comment"># ── Structured helpers ─────────────────</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">RouteDecision</span>(<span class="hljs-params">BaseModel</span>):</span>
    route: Literal[<span class="hljs-string">"rag"</span>, <span class="hljs-string">"answer"</span>, <span class="hljs-string">"end"</span>]
    reply: str | <span class="hljs-literal">None</span> = <span class="hljs-literal">None</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">RagJudge</span>(<span class="hljs-params">BaseModel</span>):</span>
    sufficient: bool

router_llm = ChatOpenAI(model=<span class="hljs-string">"gpt-4.1-mini"</span>, temperature=<span class="hljs-number">0</span>).with_structured_output(RouteDecision)
judge_llm  = ChatOpenAI(model=<span class="hljs-string">"gpt-4.1-mini"</span>, temperature=<span class="hljs-number">0</span>).with_structured_output(RagJudge)
answer_llm = ChatOpenAI(model=<span class="hljs-string">"gpt-4.1-mini"</span>, temperature=<span class="hljs-number">0.7</span>)

<span class="hljs-comment"># ── Router ─────────────────────────────</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">router_node</span>(<span class="hljs-params">state: AgentState</span>) -&gt; AgentState:</span>
    q = state[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content
    decision = router_llm.invoke([
        (<span class="hljs-string">"system"</span>, <span class="hljs-string">"Decide route: rag / answer / end"</span>),
        (<span class="hljs-string">"user"</span>, q)
    ])
    new_state = {**state, <span class="hljs-string">"route"</span>: decision.route}
    <span class="hljs-keyword">if</span> decision.route == <span class="hljs-string">"end"</span>:
        new_state[<span class="hljs-string">"messages"</span>] += [AIMessage(content=decision.reply <span class="hljs-keyword">or</span> <span class="hljs-string">"Hello!"</span>)]
    <span class="hljs-keyword">return</span> new_state

<span class="hljs-comment"># ── RAG lookup ─────────────────────────</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">rag_node</span>(<span class="hljs-params">state: AgentState</span>) -&gt; AgentState:</span>
    q = state[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content
    chunks = rag_search_tool.invoke(q)
    verdict = judge_llm.invoke([(<span class="hljs-string">"user"</span>, <span class="hljs-string">f"Question: <span class="hljs-subst">{q}</span>\nDocs: <span class="hljs-subst">{chunks[:<span class="hljs-number">300</span>]}</span>…"</span>)])
    <span class="hljs-keyword">return</span> {**state, <span class="hljs-string">"rag"</span>: chunks, <span class="hljs-string">"route"</span>: <span class="hljs-string">"answer"</span> <span class="hljs-keyword">if</span> verdict.sufficient <span class="hljs-keyword">else</span> <span class="hljs-string">"web"</span>}

<span class="hljs-comment"># ── Web search &amp; Answer nodes omitted for brevity (same as notebook) ──</span>
</code></pre>
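<p>The omitted nodes follow the same state-in, state-out pattern as <code>rag_node</code>. Below is a minimal sketch of what they plausibly look like; the <code>search</code> and <code>llm</code> parameters are stand-ins for <code>web_search_tool.invoke</code> and <code>answer_llm.invoke</code> (injected here so the logic can run without API keys), and the <code>Msg</code> dataclass stands in for LangChain's message objects — the actual notebook calls the globals directly:</p>

```python
from dataclasses import dataclass

# Stand-in for langchain_core.messages (HumanMessage / AIMessage).
@dataclass
class Msg:
    content: str

def web_node(state, search=None):
    """Fetch web snippets for the latest question; `search` stands in for web_search_tool.invoke."""
    q = state["messages"][-1].content
    return {**state, "web": search(q), "route": "answer"}

def answer_node(state, llm=None):
    """Compose the final reply from whatever rag/web context is available; `llm` stands in for answer_llm.invoke."""
    q = state["messages"][-1].content
    context = "\n\n".join(c for c in (state.get("rag"), state.get("web")) if c)
    prompt = f"Context:\n{context}\n\nQuestion: {q}"
    reply = llm(prompt)
    return {**state, "messages": state["messages"] + [reply]}
```

<p>Keeping every node a pure function of the state dict is what makes the graph easy to unit-test before wiring in real LLM calls.</p>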
<h3 id="heading-6-wire-up-the-graph">6️⃣ Wire up the Graph</h3>
<pre><code class="lang-python">agent_graph = StateGraph(AgentState)
agent_graph.add_node(<span class="hljs-string">"router"</span>,      router_node)
agent_graph.add_node(<span class="hljs-string">"rag_lookup"</span>,  rag_node)
agent_graph.add_node(<span class="hljs-string">"web_search"</span>,  web_node)
agent_graph.add_node(<span class="hljs-string">"answer"</span>,      answer_node)

agent_graph.set_entry_point(<span class="hljs-string">"router"</span>)
agent_graph.add_conditional_edges(<span class="hljs-string">"router"</span>, from_router,
        {<span class="hljs-string">"rag"</span>: <span class="hljs-string">"rag_lookup"</span>, <span class="hljs-string">"answer"</span>: <span class="hljs-string">"answer"</span>, <span class="hljs-string">"end"</span>: END})
agent_graph.add_conditional_edges(<span class="hljs-string">"rag_lookup"</span>, after_rag,
        {<span class="hljs-string">"answer"</span>: <span class="hljs-string">"answer"</span>, <span class="hljs-string">"web"</span>: <span class="hljs-string">"web_search"</span>})
agent_graph.add_edge(<span class="hljs-string">"web_search"</span>, <span class="hljs-string">"answer"</span>)
agent_graph.add_edge(<span class="hljs-string">"answer"</span>, END)

agent = agent_graph.compile(checkpointer=MemorySaver())
</code></pre>
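<p>The conditional-edge callbacks <code>from_router</code> and <code>after_rag</code> are referenced above but not shown. Since each node already writes its decision into <code>state["route"]</code>, both callbacks most likely just read that key back out (a sketch, assuming the <code>AgentState</code> shape from step 4️⃣):</p>

```python
def from_router(state) -> str:
    # router_node stored its decision in state["route"]; map it to the next node.
    return state["route"]

def after_rag(state) -> str:
    # rag_node set route to "answer" (chunks sufficient) or "web" (fall back to search).
    return state["route"]
```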
<h3 id="heading-7-quick-cli-test">7️⃣ Quick CLI test</h3>
<pre><code class="lang-python"><span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    config = {<span class="hljs-string">"configurable"</span>: {<span class="hljs-string">"thread_id"</span>: <span class="hljs-string">"thread-12"</span>}}
    <span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
        q = input(<span class="hljs-string">"You: "</span>).strip()
        <span class="hljs-keyword">if</span> q <span class="hljs-keyword">in</span> {<span class="hljs-string">"quit"</span>, <span class="hljs-string">"exit"</span>}: <span class="hljs-keyword">break</span>
        result = agent.invoke({<span class="hljs-string">"messages"</span>: [HumanMessage(content=q)]}, config)
        print(result[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content)
</code></pre>
<p>Now your LangGraph agent:</p>
<ul>
<li><p><strong>Routes</strong> intelligently</p>
</li>
<li><p><strong>Retrieves</strong> domain knowledge with RAG</p>
</li>
<li><p><strong>Falls back</strong> to web search when KB is insufficient</p>
</li>
<li><p><strong>Remembers</strong> multi‑turn conversations via the checkpointer</p>
</li>
</ul>
<p>In short, this is a <em>production‑ready skeleton</em> you can plug into any project.</p>
<hr />
<h2 id="heading-conclusion-amp-resources">🚀 Conclusion &amp; Resources</h2>
<p>In this tutorial we climbed the ladder from <strong>basic LLM calls</strong> ➜ <strong>tool‑aware agents</strong> ➜ <strong>memory</strong> ➜ <strong>RAG</strong> ➜ <strong>full multi‑step routing</strong> with LangGraph. You now have a production‑ready skeleton that can:</p>
<ul>
<li><p>Chat naturally across turns (memory)</p>
</li>
<li><p>Decide when to use internal knowledge vs. external tools (router)</p>
</li>
<li><p>Pull trusted data from your own docs (RAG)</p>
</li>
<li><p>Fall back to real‑time web search when the KB is lacking</p>
</li>
</ul>
<hr />
<h3 id="heading-grab-the-code">📂 Grab the code</h3>
<ul>
<li><strong>Full Notebook on GitHub:</strong> <a target="_blank" href="https://github.com/PradipNichite/Youtube-Tutorials/blob/main/RAG_AI_Agent_using_LangGraph.ipynb">LangGraph RAG Agent Notebook</a></li>
</ul>
<p>🕹 Try the live RAG Agent: <a target="_blank" href="https://agent.futuresmart.ai/">https://agent.futuresmart.ai/</a></p>
<h3 id="heading-watch-the-build-walkthrough">🎥 Watch the build walkthrough</h3>
<p><a target="_blank" href="https://youtu.be/60XDTWhklLA">https://youtu.be/60XDTWhklLA</a></p>
<hr />
<h3 id="heading-whats-next">What’s next?</h3>
<ol>
<li><p><strong>Swap in your own docs.</strong> Point the loader at your knowledge base and rebuild the index.</p>
</li>
<li><p><strong>Add streaming.</strong> LangGraph supports async generators so you can pipe partial answers to the UI.</p>
</li>
<li><p><strong>Deploy.</strong> Package the graph inside a FastAPI endpoint or a serverless function and wire up a front‑end.</p>
</li>
</ol>
<p>Got questions or improvement ideas? Drop a comment under the YouTube video – I’d love to hear how you extend this skeleton!</p>
<p>Happy building 🛠️🤖</p>
]]></content:encoded></item><item><title><![CDATA[Building a RAG System with Async FastAPI, Qdrant, Langchain and OpenAI]]></title><description><![CDATA[Introduction
In the era of advanced AI applications, Retrieval-Augmented Generation (RAG) stands out as a game-changing approach. By combining retrieval techniques with generative models, RAG enhances the quality, accuracy, and relevance of generated...]]></description><link>https://blog.futuresmart.ai/rag-system-with-async-fastapi-qdrant-langchain-and-openai</link><guid isPermaLink="true">https://blog.futuresmart.ai/rag-system-with-async-fastapi-qdrant-langchain-and-openai</guid><category><![CDATA[FastAPI]]></category><category><![CDATA[async]]></category><category><![CDATA[asynchronous]]></category><category><![CDATA[qdrant]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[langchain]]></category><category><![CDATA[vector database]]></category><category><![CDATA[llm]]></category><dc:creator><![CDATA[Shreyas Dhaware]]></dc:creator><pubDate>Fri, 31 Jan 2025 06:05:24 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1737126418664/40813509-5077-4a41-a561-60a43a5cfe1f.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In the era of advanced AI applications, <a target="_blank" href="https://blog.futuresmart.ai/building-rag-applications-without-langchain-or-llamaindex"><strong>Retrieval-Augmented Generation (RAG)</strong></a> stands out as a game-changing approach. By combining retrieval techniques with generative models, RAG enhances the quality, accuracy, and relevance of generated outputs. This blog walks you through building a scalable and efficient RAG system using <strong>FastAPI</strong>, <strong>Qdrant</strong>, <strong>LangChain</strong>, and <strong>OpenAI</strong>, all while leveraging asynchronous capabilities for improved performance.</p>
<p>At <a target="_blank" href="https://www.futuresmart.ai/"><strong>FutureSmart AI</strong></a>, we are committed to pioneering innovative solutions and leveraging cutting-edge technologies. Building RAG systems with async FastAPI, Qdrant, LangChain, and OpenAI has helped us create efficient and <strong>highly scalable</strong> AI-powered applications for our clients. This blog reflects our dedication to equipping developers and organizations with the actionable knowledge needed to implement high-performance systems.</p>
<p>While this blog focuses on an on-premise setup for a hands-on approach, we can assure you from experience that these tools also support scalable cloud-based deployments, ensuring flexibility for production-ready solutions. At FutureSmart AI, we’re always exploring and refining methods to push the boundaries of what AI can achieve.</p>
<h2 id="heading-overview-of-retrieval-augmented-generation-rag"><strong>Overview of Retrieval-Augmented Generation (RAG)</strong></h2>
<p>RAG combines two essential components:</p>
<ol>
<li><p><strong>Retrieval:</strong> Finds relevant documents from a large dataset. This stage uses a search mechanism to identify the passages most relevant to the input query.</p>
</li>
<li><p><strong>Generation:</strong> Uses a language model to generate context-aware answers. Once relevant information is retrieved, a language model generates the final response by incorporating the retrieved context into the generated text.</p>
</li>
</ol>
<p>This integration empowers Retrieval-Augmented Generation (RAG) to deliver more accurate and contextually relevant responses compared to standalone Large Language Models (LLMs).</p>
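<p>To make the two stages concrete, here is a deliberately tiny, dependency-free illustration: retrieval ranks documents by word overlap with the query (production systems use vector similarity instead), and the generation step simply prepends the best match to the prompt handed to the LLM:</p>

```python
def retrieve(query, docs, k=1):
    """Toy retrieval: rank docs by how many query words they share."""
    words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(words & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    """Toy generation step: retrieved context is prepended to the question."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Qdrant is a vector database for similarity search.",
    "FastAPI is a Python web framework.",
]
print(build_prompt("What is a vector database?", docs))
```

<p>Everything that follows in this post replaces the word-overlap scorer with real embeddings stored in Qdrant, but the retrieve-then-prompt shape stays the same.</p>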
<p>For a comprehensive understanding, explore our <a target="_blank" href="https://www.youtube.com/watch?v=38aMTXY2usU"><strong>Langchain RAG Course: From Basics to Production-Ready RAG Chatbot</strong></a> or, if you prefer reading, visit our detailed <a target="_blank" href="https://blog.futuresmart.ai/langchain-rag-from-basics-to-production-ready-rag-chatbot">Blog</a> for more insights.</p>
<h2 id="heading-the-tech-stack-what-you-need-amp-why">The Tech Stack: What You Need &amp; Why</h2>
<p>Let's break down our tools and why we chose them. Each one plays a crucial role in building a powerful RAG system.</p>
<p><strong>FastAPI</strong></p>
<p>FastAPI enables the rapid development of performant web APIs with asynchronous capabilities. Its support for Python type hints makes it developer-friendly and robust.</p>
<p>For more information, check out the <a target="_blank" href="https://youtu.be/KVdP4SpWcc4?si=ILIRI398bMPlugv6">FastAPI Tutorial</a>.</p>
<p><a target="_blank" href="https://qdrant.tech/"><strong>Qdrant</strong></a></p>
<p>Qdrant excels in high-dimensional vector storage and retrieval operations. In our enterprise implementations, it has proven invaluable for:</p>
<ul>
<li><p>Efficient management of large-scale vector datasets</p>
</li>
<li><p>Optimal performance in similarity search operations</p>
</li>
<li><p>Seamless horizontal scaling capabilities</p>
</li>
</ul>
<p>For a detailed and in-depth explanation please refer to our <a target="_blank" href="https://blog.futuresmart.ai/comprehensive-guide-to-qdrant-vector-db-installation-and-setup"><strong>Comprehensive Guide to Installing and Using Qdrant VectorDB with Docker Server and Local Setup</strong></a></p>
<p><strong>LangChain</strong></p>
<p>LangChain and its components, such as chains, prompts, and memory, enable efficient interaction with LLMs.</p>
<p><strong>OpenAI</strong></p>
<p>We will use OpenAI’s language models in this tutorial. You'll also need a basic understanding of how to send queries to OpenAI’s API and interpret responses.</p>
<h2 id="heading-prerequisites"><strong>Prerequisites</strong></h2>
<ol>
<li><strong>Create a Python Virtual Environment</strong></li>
</ol>
<p>It’s recommended to use a virtual environment to isolate your dependencies.</p>
<ol start="2">
<li><strong>Install Dependencies</strong></li>
</ol>
<p>Use the provided <code>requirements.txt</code> file to install the necessary Python packages.</p>
<pre><code class="lang-bash">pip install -r requirements.txt
</code></pre>
<ol start="3">
<li><strong>Setting up API Keys</strong></li>
</ol>
<p>To connect to external services like OpenAI and Qdrant, you need to set up API keys securely.</p>
<ol start="4">
<li><strong>Create a</strong> <code>.env</code> File</li>
</ol>
<p>Create a <code>.env</code> file in the root of your project directory to store sensitive information like API keys and configuration details.</p>
<p>Example <code>.env</code> file:</p>
<pre><code class="lang-yaml"><span class="hljs-string">OPENAI_API_KEY=your_openai_api_key</span>
<span class="hljs-string">qdrant_db_path=http://localhost:6333</span>  <span class="hljs-comment"># Replace with your Qdrant URL</span>
<span class="hljs-string">llm_provider="openai"</span>
<span class="hljs-string">model="gpt-4o-mini"</span>
<span class="hljs-string">temperature="0.1"</span>
<span class="hljs-string">chunk_size=</span><span class="hljs-number">2000</span>
<span class="hljs-string">no_of_chunks=</span><span class="hljs-number">3</span>
</code></pre>
<ol start="5">
<li><strong>Load Environment Variables</strong></li>
</ol>
<p>Use libraries like <code>python-dotenv</code> to load the <code>.env</code> file into your application.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv
<span class="hljs-keyword">import</span> os

load_dotenv()

OPENAI_API_KEY = os.getenv(<span class="hljs-string">"OPENAI_API_KEY"</span>)
qdrant_db_path = os.getenv(<span class="hljs-string">"qdrant_db_path"</span>)  <span class="hljs-comment"># matches the key defined in .env</span>
</code></pre>
<h2 id="heading-project-structure">Project Structure</h2>
<pre><code class="lang-yaml"><span class="hljs-string">services/</span>
    <span class="hljs-string">logger.py</span>
    <span class="hljs-string">pydantic_models.py</span>
<span class="hljs-string">uploads/</span>
    <span class="hljs-string">xxx.txt</span>
    <span class="hljs-string">yyy.pdf</span>
    <span class="hljs-string">zzz.docx</span>
<span class="hljs-string">utils/</span>
    <span class="hljs-string">__init__.py</span>
    <span class="hljs-string">db_utils.py</span>
    <span class="hljs-string">langchain_utils.py</span>
    <span class="hljs-string">prompts.py</span>
    <span class="hljs-string">qdrant_utils.py</span>
    <span class="hljs-string">utils.py</span>
<span class="hljs-string">.env</span>
<span class="hljs-string">api.py</span>
</code></pre>
<ul>
<li><p><code>services/:</code> This folder houses essential services that support core functionalities:</p>
<ul>
<li><p><code>logger.py:</code> Manages the logging setup for the application. Logging is critical for debugging, monitoring, and tracking the application's behavior.</p>
</li>
<li><p><code>pydantic_models.py</code>: Defines Pydantic models used for data validation and serialization. <strong>Pydantic</strong> ensures data entering the application is valid and formatted correctly.</p>
</li>
</ul>
</li>
<li><p><code>uploads/:</code> A dedicated folder for file uploads. This is where the application stores temporary or permanent files uploaded by users.</p>
</li>
<li><p><code>utils/:</code> A utility module containing helper scripts that encapsulate reusable logic:</p>
<ul>
<li><p><code>__init__.py</code>: Marks the folder as a Python package.</p>
</li>
<li><p><code>db_utils.py</code>: Contains functions for interacting with the database.</p>
</li>
<li><p><code>langchain_utils.py</code>: Provides utility functions for integrating LangChain, a framework for language model applications.</p>
</li>
<li><p><code>prompts.py</code>: Stores pre-defined prompts for interacting with language models or other AI systems.</p>
</li>
<li><p><code>qdrant_utils.py</code>: Handles operations with Qdrant, a vector search engine for similarity-based search.</p>
</li>
<li><p><code>utils.py</code>: General-purpose utility functions used across the project.</p>
</li>
</ul>
</li>
<li><p><code>.env:</code> A configuration file storing environment variables like database credentials, API keys, and other sensitive data.</p>
</li>
<li><p><code>api.py:</code> The application’s entry point, where the FastAPI app is initialized and routes are defined. This file connects all the components and exposes the API endpoints.</p>
</li>
</ul>
<h2 id="heading-setting-up-qdrant-for-efficient-retrieval"><strong>Setting Up Qdrant for Efficient Retrieval</strong></h2>
<h3 id="heading-imports-and-configuration"><strong>Imports and Configuration</strong></h3>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> time
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv
<span class="hljs-keyword">from</span> uuid <span class="hljs-keyword">import</span> uuid4
<span class="hljs-keyword">import</span> asyncio

<span class="hljs-comment"># Langchain imports</span>
<span class="hljs-keyword">from</span> langchain.text_splitter <span class="hljs-keyword">import</span> RecursiveCharacterTextSplitter
<span class="hljs-keyword">from</span> langchain_core.documents <span class="hljs-keyword">import</span> Document
<span class="hljs-keyword">from</span> langchain_qdrant <span class="hljs-keyword">import</span> QdrantVectorStore
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> OpenAIEmbeddings

<span class="hljs-comment"># Qdrant imports</span>
<span class="hljs-keyword">from</span> qdrant_client <span class="hljs-keyword">import</span> QdrantClient, AsyncQdrantClient
<span class="hljs-keyword">from</span> qdrant_client.http.models <span class="hljs-keyword">import</span> Distance, VectorParams

<span class="hljs-keyword">from</span> services.logger <span class="hljs-keyword">import</span> logger

load_dotenv(override=<span class="hljs-literal">True</span>)

OPENAI_API_KEY = os.getenv(<span class="hljs-string">"OPENAI_API_KEY"</span>)
qdrant_db_path=os.getenv(<span class="hljs-string">"qdrant_db_path"</span>)
</code></pre>
<p>The setup begins with importing essential libraries, loading environment variables (like API keys), and initializing necessary configurations. Notable imports include LangChain's <code>RecursiveCharacterTextSplitter</code> for chunking documents and Qdrant’s async client for vector database interactions.</p>
<h3 id="heading-documentindexer-class"><strong>DocumentIndexer Class</strong></h3>
<p>The <code>DocumentIndexer</code> class handles indexing and retrieval in Qdrant. Let’s break it down step-by-step.</p>
<p><strong>Initialization</strong></p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">DocumentIndexer</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, qdrant_db_path</span>):</span>
        self.db_path = qdrant_db_path
        self.embedding_function = OpenAIEmbeddings(model=<span class="hljs-string">"text-embedding-3-large"</span>, api_key=OPENAI_API_KEY)
        self.vector_store = <span class="hljs-literal">None</span>
        self.client = AsyncQdrantClient(self.db_path)
</code></pre>
<ul>
<li><p><code>embedding_function</code>: Uses OpenAI’s embeddings to convert text into dense vector representations.</p>
</li>
<li><p><code>client</code>: Initializes an async Qdrant client to manage the vector database.</p>
</li>
<li><p><code>vector_store</code>: Qdrant vector store is used to add documents and manage their vector representations.</p>
</li>
</ul>
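<p>Because the collection below is configured with <code>Distance.COSINE</code>, "similarity" here means the cosine of the angle between two embedding vectors. A dependency-free sketch of the metric (toy 3-dimensional vectors; the real embeddings from <code>text-embedding-3-large</code> have 3072 dimensions):</p>

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: the metric Qdrant applies when Distance.COSINE is configured."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query_vec = [1.0, 0.0, 1.0]
doc_vec   = [1.0, 0.0, 0.9]   # nearly parallel to the query -> score near 1
unrelated = [0.0, 1.0, 0.0]   # orthogonal to the query -> score 0
```

<p>Documents whose embeddings point in nearly the same direction as the query embedding score close to 1 and are returned first.</p>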
<p><strong>Indexing Text in Qdrant</strong></p>
<p>The method <code>index_in_qdrantdb</code> handles the extraction and indexing of document text. Here’s how it works:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">index_in_qdrantdb</span>(<span class="hljs-params">self, extracted_text, file_name, doc_type, chunk_size</span>):</span>
    <span class="hljs-keyword">try</span>:
        <span class="hljs-comment"># Create a Document object</span>
        doc = Document(
            page_content=extracted_text,
            metadata={
                <span class="hljs-string">"file_name"</span>: file_name,
                <span class="hljs-string">"doc_type"</span>: doc_type
            }
        )


        chunk_size = int(os.getenv(<span class="hljs-string">"chunk_size"</span>, chunk_size))  <span class="hljs-comment"># env value overrides the argument, falling back to it</span>
        logger.info(<span class="hljs-string">f"Using dynamic chunk size: <span class="hljs-subst">{chunk_size}</span>"</span>)

        <span class="hljs-comment"># Split the document</span>
        text_splitter = RecursiveCharacterTextSplitter(
            separators=[<span class="hljs-string">'\n\n'</span>, <span class="hljs-string">'\n'</span>, <span class="hljs-string">','</span>],
            chunk_size=chunk_size,
            chunk_overlap=<span class="hljs-number">200</span>
        )
        docus = text_splitter.split_documents([doc])

        <span class="hljs-comment"># Generate UUIDs for all chunks</span>
        uuids = [<span class="hljs-string">f"<span class="hljs-subst">{str(uuid4())}</span>"</span> <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(len(docus))]
        collection = <span class="hljs-string">"rag_demo_collection"</span>

        collections = <span class="hljs-keyword">await</span> self.client.get_collections()

        <span class="hljs-keyword">if</span> collection <span class="hljs-keyword">in</span> [collection_name.name <span class="hljs-keyword">for</span> collection_name <span class="hljs-keyword">in</span> collections.collections]:
            logger.info(<span class="hljs-string">f"Collection <span class="hljs-subst">{collection}</span> already exists in QdrantDB"</span>)
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">await</span> self.client.create_collection(
                collection_name=collection,
                vectors_config=VectorParams(size=<span class="hljs-number">3072</span>, distance=Distance.COSINE))

        self.vector_store =  QdrantVectorStore.from_existing_collection(collection_name=collection, embedding=self.embedding_function, url=self.db_path)

        <span class="hljs-keyword">await</span> self.vector_store.aadd_documents(documents=docus, ids=uuids)

        logger.info(<span class="hljs-string">f"Successfully indexed document in QdrantDB"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>

    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        logger.error(<span class="hljs-string">f"Error indexing document in QdrantDB: <span class="hljs-subst">{e}</span>"</span>)
        <span class="hljs-keyword">raise</span>
</code></pre>
<p><strong>Key Points</strong>:</p>
<ol>
<li><p><strong>Document Creation</strong>: A <code>Document</code> object is created to hold the extracted text along with its metadata (file name and document type).</p>
</li>
<li><p><strong>Chunking of Document</strong>: The document is divided into manageable chunks using <code>RecursiveCharacterTextSplitter</code>.</p>
</li>
<li><p><strong>Collection Management</strong>: The Qdrant collection is created only if it doesn’t already exist.</p>
</li>
<li><p><strong>Batch Indexing</strong>: Chunks are added to the Qdrant database with unique UUIDs.</p>
</li>
<li><p>Using the <strong>OpenAI</strong> embedding model (<code>text-embedding-3-large</code>) to create vector representations of documents.</p>
</li>
<li><p><strong>Asynchronously</strong> uploading documents to Qdrant for similarity search.</p>
</li>
</ol>
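<p>The interplay of <code>chunk_size</code> and <code>chunk_overlap</code> is worth seeing in isolation. <code>RecursiveCharacterTextSplitter</code> additionally tries to break on the configured separators, but a plain sliding window captures the core idea (a simplified illustration, not LangChain's actual algorithm):</p>

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Naive splitter: fixed-size windows that each repeat the last
    chunk_overlap characters of the previous window."""
    step = chunk_size - chunk_overlap  # assumes chunk_size > chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = sliding_chunks("abcdefghij" * 10, chunk_size=40, chunk_overlap=10)
# Consecutive chunks share 10 characters, so a sentence split at a
# boundary still appears with some surrounding context in both chunks.
```

<p>The overlap is what keeps retrieval robust: a fact straddling a chunk boundary is never lost entirely to either side.</p>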
<p><strong>Retrieving Documents</strong></p>
<p>To enable querying of indexed data, the <code>get_retriever</code> method returns a retriever:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_retriever</span>(<span class="hljs-params">self, top_k</span>):</span>
    <span class="hljs-keyword">try</span>:
        collection = <span class="hljs-string">"rag_demo_collection"</span>
        <span class="hljs-keyword">if</span> self.vector_store <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
            self.vector_store =  QdrantVectorStore.from_existing_collection(collection_name=collection, embedding=self.embedding_function, url=self.db_path)

        <span class="hljs-keyword">return</span> self.vector_store.as_retriever(search_type=<span class="hljs-string">"similarity"</span>, search_kwargs={<span class="hljs-string">"k"</span>: top_k})
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        logger.error(<span class="hljs-string">f"Error creating retriever: <span class="hljs-subst">{e}</span>"</span>)
        <span class="hljs-keyword">raise</span>
</code></pre>
<p><strong>Key Points</strong>:</p>
<ol>
<li><p><strong>Retriever Initialization</strong>: If the <code>vector_store</code> hasn’t been initialized yet, it is created from the existing collection.</p>
</li>
<li><p><strong>Search Parameters</strong>: Supports similarity-based searches with a configurable <code>top_k</code> parameter.</p>
</li>
</ol>
<h2 id="heading-implementing-the-asynchronous-fastapi-endpoint"><strong>Implementing the Asynchronous FastAPI Endpoint</strong></h2>
<p>Asynchronous endpoints allow the server to handle multiple requests simultaneously, which is essential for applications that process large files or perform complex computations.</p>
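<p>The payoff is easiest to see with two slow coroutines: run sequentially they would take the sum of their delays, but awaited together with <code>asyncio.gather</code> they overlap. A self-contained sketch of the principle behind async endpoints (with <code>asyncio.sleep</code> standing in for a network call such as an embedding or LLM request):</p>

```python
import asyncio
import time

async def slow_io(label, delay):
    """Pretend network call; control returns to the event loop while it waits."""
    await asyncio.sleep(delay)
    return label

async def main():
    start = time.perf_counter()
    # Both "requests" wait concurrently: total time is about max(delays), not their sum.
    results = await asyncio.gather(slow_io("req-1", 0.2), slow_io("req-2", 0.2))
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
```

<p>FastAPI does the equivalent for every <code>async def</code> route: while one request awaits Qdrant or OpenAI, the event loop serves other requests.</p>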
<h3 id="heading-setting-up-fastapi"><strong>Setting Up FastAPI</strong></h3>
<p>The first step is initializing a FastAPI application that supports asynchronous request handling. This allows the server to process multiple incoming requests without blocking other operations, essential for high-performance APIs.</p>
<pre><code class="lang-python">app = FastAPI()
</code></pre>
<p>The application is initialized with the <code>FastAPI()</code> class, which serves as the primary entry point for defining routes and handling requests.</p>
<h3 id="heading-defining-api-routes"><strong>Defining API Routes</strong></h3>
<p>The code leverages <code>async def</code> for efficient non-blocking request handling, ensuring high performance under concurrent loads. Two main routes are implemented: the <code>/upload-knowledge</code> endpoint and the <code>/chat</code> endpoint. These routes demonstrate seamless integration of file processing, database operations, and conversational AI.</p>
<ol>
<li><p><strong>Document Ingestion Endpoint (</strong><code>/upload-knowledge</code>)</p>
<ul>
<li><p>Allows users to upload files containing knowledge documents.</p>
</li>
<li><p>Extracts text from the uploaded file and indexes it in a database for future query responses.</p>
</li>
</ul>
</li>
</ol>
<pre><code class="lang-python"><span class="hljs-meta">    @app.post("/upload-knowledge")</span>
    <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">upload_knowledge</span>(<span class="hljs-params">
        username: str = Form(<span class="hljs-params">...</span>),
        file: Optional[UploadFile] = File(<span class="hljs-params">None</span>)
    </span>):</span>
        <span class="hljs-keyword">try</span>:
            <span class="hljs-comment"># Handle file extraction and indexing</span>
            extracted_text = <span class="hljs-string">""</span>
            <span class="hljs-keyword">if</span> file:
                logger.info(<span class="hljs-string">f"File uploaded: <span class="hljs-subst">{file.filename}</span>"</span>)
                file_content = <span class="hljs-keyword">await</span> file.read()
                file_extension = file.filename.split(<span class="hljs-string">'.'</span>)[<span class="hljs-number">-1</span>].lower()
                extracted_text = <span class="hljs-keyword">await</span> extract_text_from_file(file_content, file_extension)
                logger.info(<span class="hljs-string">f"Extracted text from file: <span class="hljs-subst">{extracted_text}</span>"</span>)
                <span class="hljs-keyword">await</span> index_documents(username, extracted_text, file.filename, file_extension)
            <span class="hljs-keyword">return</span> {<span class="hljs-string">'response'</span>: <span class="hljs-string">'Indexed Documents Successfully'</span>, <span class="hljs-string">'extracted_text'</span>: extracted_text}
        <span class="hljs-keyword">except</span> ValueError <span class="hljs-keyword">as</span> e:
            <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">400</span>, detail=str(e))
        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.error(<span class="hljs-string">f"Error processing indexing request: <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">500</span>, detail=<span class="hljs-string">f"Unexpected error: <span class="hljs-subst">{e}</span>"</span>)
</code></pre>
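One caveat in the endpoint above: <code>file.filename.split('.')[-1]</code> returns the whole filename when there is no dot at all. A <code>pathlib</code>-based helper sidesteps that edge case (the name <code>get_extension</code> is illustrative, not from the original code):

```python
from pathlib import Path

def get_extension(filename: str) -> str:
    """Return the lowercase extension without the dot, or '' if there is none."""
    return Path(filename).suffix.lstrip(".").lower()

# get_extension("report.PDF")     -> "pdf"
# get_extension("notes")          -> ""   (split('.')[-1] would return "notes")
# get_extension("archive.tar.gz") -> "gz"
```

Swapping this helper into the endpoint keeps the dispatch in <code>extract_text_from_file</code> unchanged.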
<p>    <strong>File Text Extraction</strong></p>
<p>    The function <code>extract_text_from_file</code> extracts text from different file types (e.g., TXT, PDF, DOCX).</p>
<pre><code class="lang-python">    <span class="hljs-comment"># Asynchronous file text extraction</span>
    <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">extract_text_from_file</span>(<span class="hljs-params">file_content: bytes, file_type: str</span>) -&gt; str:</span>
        <span class="hljs-string">"""
        Extract text from different file types based on the file type.
        """</span>
        <span class="hljs-keyword">if</span> file_type == <span class="hljs-string">"txt"</span>:
            <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> extract_text_from_txt(file_content)
        <span class="hljs-keyword">elif</span> file_type == <span class="hljs-string">"pdf"</span>:
            <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> extract_text_from_pdf(file_content)
        <span class="hljs-keyword">elif</span> file_type == <span class="hljs-string">"docx"</span>:
            <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> extract_text_from_docx(file_content)
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">400</span>, detail=<span class="hljs-string">"Unsupported file type"</span>)
</code></pre>
<p>    <strong>PDF Text Extraction</strong></p>
<p>    For PDF files, text extraction requires libraries like <code>PyPDF2</code>. Here's the asynchronous implementation:</p>
<pre><code class="lang-python">    <span class="hljs-comment"># Async version of the extract_text_from_pdf</span>
    <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">extract_text_from_pdf</span>(<span class="hljs-params">file_content: bytes</span>) -&gt; str:</span>
        <span class="hljs-string">"""
        Extract text from a PDF file.
        """</span>
        <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> asyncio.to_thread(extract_text_from_pdf_sync, file_content)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">extract_text_from_pdf_sync</span>(<span class="hljs-params">file_content: bytes</span>) -&gt; str:</span>
        <span class="hljs-string">"""
        Extract text from a PDF file (blocking version).
        """</span>
        content = <span class="hljs-string">""</span>
        pdf_reader = PyPDF2.PdfReader(io.BytesIO(file_content))  <span class="hljs-comment"># PdfReader expects a file-like object, not raw bytes; requires "import io"</span>
        num_pages = len(pdf_reader.pages)
        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(num_pages):
            page = pdf_reader.pages[i]
            content += page.extract_text()
        <span class="hljs-keyword">return</span> content
</code></pre>
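The <code>asyncio.to_thread</code> wrapper above is the general pattern for calling a blocking library from async code without stalling the event loop. A minimal, dependency-free sketch of the same pattern (here <code>blocking_parse</code> is a stand-in for a parser like PyPDF2, not real extraction code):

```python
import asyncio
import time

def blocking_parse(data: bytes) -> str:
    """Stand-in for a blocking parser such as PyPDF2."""
    time.sleep(0.05)  # simulate blocking work
    return data.decode("utf-8")

async def main() -> list[str]:
    # Both parses run in worker threads; the event loop stays free
    # to serve other requests while they execute.
    results = await asyncio.gather(
        asyncio.to_thread(blocking_parse, b"page one"),
        asyncio.to_thread(blocking_parse, b"page two"),
    )
    return results

print(asyncio.run(main()))  # ['page one', 'page two']
```

Note that <code>asyncio.to_thread</code> requires Python 3.9+; on older versions, <code>loop.run_in_executor</code> achieves the same effect.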
<p>    <strong>Indexing Documents</strong></p>
<p>    The <code>index_documents</code> function stores the extracted text in a Qdrant database, optimized for vector search and similarity queries.</p>
<pre><code class="lang-python">    <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">index_documents</span>(<span class="hljs-params">username,extracted_text,filename,file_extension</span>):</span>
        <span class="hljs-keyword">try</span>:
            indexer = DocumentIndexer(qdrant_db_path)
            start_time = time.time()
            logger.info(<span class="hljs-string">"Searching for similar documents in Qdrant..."</span>)

            <span class="hljs-keyword">await</span> indexer.index_in_qdrantdb(
                extracted_text=extracted_text,
                file_name=filename,
                doc_type=file_extension,
                chunk_size=<span class="hljs-number">1500</span>  
            )
            logger.info(<span class="hljs-string">f"Document indexing completed in <span class="hljs-subst">{time.time() - start_time:<span class="hljs-number">.2</span>f}</span> seconds"</span>)

        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.error(<span class="hljs-string">f"Error processing documents: <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">raise</span> RuntimeError(<span class="hljs-string">f"Failed to process documents: <span class="hljs-subst">{str(e)}</span>"</span>)
</code></pre>
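The <code>chunk_size=1500</code> argument suggests character-based splitting. The internals of <code>index_in_qdrantdb</code> aren't shown in this article, but a typical fixed-size splitter with overlap looks roughly like this (a sketch, not the actual DocumentIndexer implementation):

```python
def chunk_text(text: str, chunk_size: int = 1500, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks; neighboring chunks
    share `overlap` characters so sentences cut at a boundary still
    appear whole in at least one chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

chunks = chunk_text("a" * 3200, chunk_size=1500, overlap=200)
# starts at 0, 1300, 2600 -> three chunks of 1500, 1500, 600 characters
```

Each chunk would then be embedded and upserted into the Qdrant collection as its own point.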
<p>    Refer to the GitHub code at the end of this article for text extraction from different sources.</p>
<ol start="2">
<li><p><strong>Chat Query Endpoint (</strong><code>/chat</code>)</p>
<ul>
<li><p>Accepts user queries and provides responses based on the ingested knowledge.</p>
</li>
<li><p>Handles previous session data to maintain conversational context.</p>
</li>
</ul>
</li>
</ol>
<pre><code class="lang-python"><span class="hljs-meta">    @app.post("/chat", response_model=ChatResponse)</span>
    <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chat</span>(<span class="hljs-params">request: ChatRequest</span>):</span>
        <span class="hljs-keyword">try</span>:
            <span class="hljs-comment"># Process chat request</span>
            <span class="hljs-keyword">if</span> request.session_id <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>:
                past_messages = <span class="hljs-keyword">await</span> get_past_conversation_async(request.session_id)
            <span class="hljs-keyword">else</span>:
                request.session_id = str(uuid4())
                past_messages = []

            response, refined_query, extracted_documents = <span class="hljs-keyword">await</span> generate_chatbot_response(
                request.query, past_messages, request.no_of_chunks, request.username
            )
            <span class="hljs-keyword">await</span> add_conversation_async(request.session_id, request.query, response)
            <span class="hljs-keyword">return</span> {
                <span class="hljs-string">"username"</span>: request.username,
                <span class="hljs-string">"query"</span>: request.query,
                <span class="hljs-string">"refine_query"</span>: refined_query,
                <span class="hljs-string">"response"</span>: response,
                <span class="hljs-string">"session_id"</span>: request.session_id,
            }
        <span class="hljs-keyword">except</span> ValueError <span class="hljs-keyword">as</span> e:
            <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">400</span>, detail=str(e))
        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.error(<span class="hljs-string">f"Error processing chat request: <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">500</span>, detail=<span class="hljs-string">f"Unexpected error: <span class="hljs-subst">{e}</span>"</span>)
</code></pre>
<h2 id="heading-session-context-management"><strong>Session Context Management</strong></h2>
<p>    <strong>Fetching Past Conversations</strong></p>
<p>    To retain context, we retrieve past conversations from the SQLite database. Each session ID serves as a key to fetch previous interactions.</p>
<pre><code class="lang-python">    <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_past_conversation_async</span>(<span class="hljs-params">session_id: str</span>) -&gt; List[dict]:</span>
        start_time = asyncio.get_event_loop().time()
        messages = []

        <span class="hljs-keyword">try</span>:
            <span class="hljs-comment"># Open an async SQLite connection</span>
            <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> aiosqlite.connect(<span class="hljs-string">"chat_log.db"</span>) <span class="hljs-keyword">as</span> connection:
                <span class="hljs-keyword">await</span> connection.execute(<span class="hljs-string">'''CREATE TABLE IF NOT EXISTS chat_logs (
                    session_id TEXT,
                    user_query TEXT,
                    gpt_response TEXT
                )'''</span>)
                logger.info(<span class="hljs-string">"Database schema ensured."</span>)

                <span class="hljs-comment"># Fetch chat logs for the given session_id</span>
                <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> connection.execute(
                    <span class="hljs-string">"SELECT user_query, gpt_response FROM chat_logs WHERE session_id=?"</span>, (session_id,)
                ) <span class="hljs-keyword">as</span> cursor:
                    <span class="hljs-keyword">async</span> <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> cursor:
                        message_user = {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: row[<span class="hljs-number">0</span>]}
                        message_assistant = {<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"content"</span>: row[<span class="hljs-number">1</span>]}
                        messages.extend([message_user, message_assistant])

            elapsed_time = asyncio.get_event_loop().time() - start_time
            logger.info(<span class="hljs-string">f"History For Context (get_conversation): <span class="hljs-subst">{messages}</span> in <span class="hljs-subst">{elapsed_time:<span class="hljs-number">.2</span>f}</span>s"</span>)
            <span class="hljs-keyword">return</span> messages

        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.exception(<span class="hljs-string">f"Error occurred: <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">raise</span> e
</code></pre>
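The same role-alternating history format can be exercised with the stdlib's synchronous <code>sqlite3</code> driver, which is convenient for unit tests since it needs no <code>aiosqlite</code> dependency (the <code>fetch_history</code> helper below is illustrative, not from the original code):

```python
import sqlite3

def fetch_history(conn: sqlite3.Connection, session_id: str) -> list[dict]:
    """Rebuild chat history as alternating user/assistant messages,
    mirroring get_past_conversation_async."""
    messages = []
    rows = conn.execute(
        "SELECT user_query, gpt_response FROM chat_logs WHERE session_id=?",
        (session_id,),
    )
    for user_query, gpt_response in rows:
        messages.append({"role": "user", "content": user_query})
        messages.append({"role": "assistant", "content": gpt_response})
    return messages

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chat_logs (session_id TEXT, user_query TEXT, gpt_response TEXT)")
conn.execute("INSERT INTO chat_logs VALUES (?, ?, ?)", ("s1", "Hi", "Hello!"))
history = fetch_history(conn, "s1")
# [{'role': 'user', 'content': 'Hi'}, {'role': 'assistant', 'content': 'Hello!'}]
```

The alternating user/assistant structure matches the message format most chat LLM APIs expect, so the list can be passed to the model with no further transformation.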
<p>    <strong>Adding New Conversations</strong></p>
<p>    New conversations are stored in the database after processing. This ensures the chatbot can build upon prior interactions.</p>
<pre><code class="lang-python">    <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">add_conversation_async</span>(<span class="hljs-params">session_id, user_query, gpt_response</span>):</span>
        <span class="hljs-keyword">try</span>:
            <span class="hljs-comment"># Open an async SQLite connection</span>
            <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> aiosqlite.connect(<span class="hljs-string">"chat_log.db"</span>) <span class="hljs-keyword">as</span> connection:  <span class="hljs-comment"># same file the history reader uses; ":memory:" would discard writes on close</span>
                cursor = <span class="hljs-keyword">await</span> connection.cursor()

                <span class="hljs-comment"># Create table if it doesn't exist</span>
                <span class="hljs-keyword">await</span> cursor.execute(<span class="hljs-string">'''CREATE TABLE IF NOT EXISTS chat_logs (
                                            session_id TEXT,
                                            user_query TEXT,
                                            gpt_response TEXT)'''</span>)

                <span class="hljs-comment"># Insert new conversation</span>
                <span class="hljs-keyword">await</span> cursor.execute(<span class="hljs-string">"INSERT INTO chat_logs (session_id, user_query, gpt_response) VALUES (?, ?, ?)"</span>,
                                    (session_id, user_query, gpt_response))

                <span class="hljs-keyword">await</span> connection.commit()
                logger.info(<span class="hljs-string">f"Conversation added for session <span class="hljs-subst">{session_id}</span>"</span>)

        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.exception(<span class="hljs-string">f"Error occurred while adding conversation: <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">raise</span> e
</code></pre>
<h3 id="heading-request-and-response-models"><strong>Request and Response Models</strong></h3>
<p>FastAPI leverages Pydantic models for robust data validation and serialization, ensuring input data adheres to the expected format. For example:</p>
<ul>
<li><p><code>ChatRequest</code> Model</p>
<ul>
<li>Defines the structure of incoming requests to the chat endpoint, including the <code>username</code>, <code>query</code>, and optional <code>session_id</code>.</li>
</ul>
</li>
</ul>
<pre><code class="lang-python">    <span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel, field_validator
    <span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Optional, List

    <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ChatRequest</span>(<span class="hljs-params">BaseModel</span>):</span>
        username: str
        query: str
        session_id: Optional[str] = <span class="hljs-literal">None</span>
        no_of_chunks: Optional[int] = <span class="hljs-number">3</span>
</code></pre>
<ul>
<li><p><code>ChatResponse</code> Model</p>
<ul>
<li>Specifies the format of the API response, including the query, refined query, and chatbot response.</li>
</ul>
</li>
</ul>
<pre><code class="lang-python">    <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ChatResponse</span>(<span class="hljs-params">BaseModel</span>):</span>
        username: str
        query: str
        refine_query: str
        response: str
        session_id: str
        debug_info: Optional[dict] = <span class="hljs-literal">None</span>
</code></pre>
<h2 id="heading-orchestrating-with-langchain"><strong>Orchestrating with LangChain</strong></h2>
<h3 id="heading-implementing-a-rag-chain"><strong>Implementing a RAG Chain</strong></h3>
<p>The RAG chain combines vector search results with prompt engineering so that the model answers from retrieved context rather than from memory alone.</p>
<pre><code class="lang-python"><span class="hljs-meta">@ls.traceable(run_type="chain", name="Chat Pipeline")</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_chatbot_response</span>(<span class="hljs-params">query, past_messages, no_of_chunks,username</span>):</span>
    <span class="hljs-string">"""Main function to generate chatbot responses asynchronously."""</span>
    logger.info(<span class="hljs-string">"Refining user query"</span>)
    refined_query = <span class="hljs-keyword">await</span> refine_user_query(query, past_messages)  <span class="hljs-comment"># Async call</span>
    logger.info(<span class="hljs-string">f"Generated refined query: <span class="hljs-subst">{refined_query}</span>"</span>)

    extracted_text_data, extracted_documents = <span class="hljs-keyword">await</span> retrieve_similar_documents(refined_query, int(no_of_chunks),username)  <span class="hljs-comment"># Async call</span>
    <span class="hljs-comment"># logger.info(f"Extracted text data: {extracted_text_data}")</span>
    logger.info(<span class="hljs-string">"Extracted text data"</span>)


    llm = initialize_llm()  <span class="hljs-comment"># Synchronous initialization</span>
    history = create_history(past_messages)
    logger.info(<span class="hljs-string">f"Created history for session: <span class="hljs-subst">{history}</span>"</span>)

    logger.info(<span class="hljs-string">"Fetching response"</span>)
    start_time = time.time()
    final_response, cb = <span class="hljs-keyword">await</span> invoke_chain(query, extracted_text_data, history, llm)  <span class="hljs-comment"># Async call</span>
    response_time = time.time() - start_time

    <span class="hljs-comment"># logger.info(f"Got response from chain: {final_response}")</span>
    logger.info(<span class="hljs-string">"Got response from chain"</span>)

    <span class="hljs-keyword">return</span> final_response, refined_query, extracted_documents  <span class="hljs-comment"># matches the three values unpacked in the /chat endpoint</span>
</code></pre>
<h3 id="heading-query-refinement">Query Refinement</h3>
<p>User queries are often ambiguous, relying on prior interactions or chat history. To address this, we design a refinement mechanism that converts user input into a standalone question that can be understood without the chat history.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_query_refiner_prompt</span>():</span>
    contextualize_q_system_prompt = (<span class="hljs-string">"""
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as it is."
    """</span>)

    final_prompt = ChatPromptTemplate.from_messages(
        [
            (<span class="hljs-string">"system"</span>, contextualize_q_system_prompt),
            MessagesPlaceholder(variable_name=<span class="hljs-string">"messages"</span>),
            (<span class="hljs-string">"human"</span>,<span class="hljs-string">"{query}"</span>),
        ]
    )
    <span class="hljs-comment"># print(final_prompt)</span>
    <span class="hljs-keyword">return</span> final_prompt

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">refine_user_query</span>(<span class="hljs-params">query, messages</span>):</span>
    <span class="hljs-string">"""Refines the user query asynchronously."""</span>
    llm = ChatOpenAI(temperature=<span class="hljs-number">0</span>, model_name=<span class="hljs-string">"gpt-4o"</span>)
    history = create_history(messages)
    prompt = get_query_refiner_prompt()
    refined_query_chain = prompt | llm | StrOutputParser()
    refined_query = <span class="hljs-keyword">await</span> refined_query_chain.ainvoke({<span class="hljs-string">"query"</span>: query, <span class="hljs-string">"messages"</span>: history.messages})  <span class="hljs-comment"># Async method</span>
    <span class="hljs-keyword">return</span> refined_query
</code></pre>
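The <code>prompt | llm | StrOutputParser()</code> syntax is LangChain's LCEL pipe operator, which is essentially left-to-right function composition: the output of each runnable feeds the next. A tiny stdlib imitation (the <code>Step</code> class is illustrative, not LangChain code) shows the idea:

```python
class Step:
    """Tiny stand-in for an LCEL runnable: wraps a function and composes with |."""
    def __init__(self, fn):
        self.fn = fn
    def __or__(self, other: "Step") -> "Step":
        # (self | other) applies self first, then other.
        return Step(lambda x: other.fn(self.fn(x)))
    def invoke(self, x):
        return self.fn(x)

prompt = Step(lambda q: f"Standalone question: {q}")  # format the input
llm = Step(lambda p: p.upper())                       # pretend "model" call
parser = Step(lambda s: s.strip())                    # parse the output

chain = prompt | llm | parser
chain.invoke("what about pricing?")  # 'STANDALONE QUESTION: WHAT ABOUT PRICING?'
```

LangChain's real runnables add much more on top of this (async <code>ainvoke</code>, batching, streaming, tracing), but the composition semantics are the same.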
<h3 id="heading-document-retrieval">Document Retrieval</h3>
<p>The refined query serves as input for a retrieval mechanism. Using Qdrant, we extract contextually similar documents for subsequent processing.</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">retrieve_similar_documents</span>(<span class="hljs-params">refined_query: str, num_of_chunks: int, username: str</span>) -&gt; tuple:</span>
    <span class="hljs-keyword">try</span>:
        indexer = DocumentIndexer(qdrant_db_path)
        start_time = time.time()
        logger.info(<span class="hljs-string">"Searching for similar documents in Qdrant..."</span>)

        <span class="hljs-keyword">if</span> num_of_chunks <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
            num_of_chunks = int(os.getenv(<span class="hljs-string">'no_of_chunks'</span>, <span class="hljs-number">3</span>))  <span class="hljs-comment"># env vars are strings; cast so the type check below passes</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> isinstance(num_of_chunks, int) <span class="hljs-keyword">or</span> num_of_chunks &lt;= <span class="hljs-number">0</span>:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Invalid number of chunks: <span class="hljs-subst">{num_of_chunks}</span>"</span>)
        retriever = <span class="hljs-keyword">await</span> indexer.get_retriever(top_k=num_of_chunks)
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> retriever:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"Failed to initialize document retriever"</span>)
        extracted_documents = <span class="hljs-keyword">await</span> retriever.ainvoke(refined_query)
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> extracted_documents:
            extracted_text_data=<span class="hljs-string">""</span>
        <span class="hljs-keyword">else</span>:
            extracted_text_data = <span class="hljs-keyword">await</span> format_docs(extracted_documents)
        logger.info(<span class="hljs-string">f"Document retrieval and formatting completed in <span class="hljs-subst">{time.time() - start_time:<span class="hljs-number">.2</span>f}</span> seconds"</span>)
        <span class="hljs-keyword">return</span> extracted_text_data, extracted_documents

    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        logger.error(<span class="hljs-string">f"Error processing documents: <span class="hljs-subst">{str(e)}</span>"</span>)
        <span class="hljs-keyword">raise</span> RuntimeError(<span class="hljs-string">f"Failed to process documents: <span class="hljs-subst">{str(e)}</span>"</span>)
</code></pre>
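The <code>format_docs</code> helper called above isn't defined in this article; a common implementation simply joins each retrieved document's <code>page_content</code> with blank lines, made async here to match the call site. A sketch under that assumption (the <code>Document</code> dataclass stands in for LangChain's):

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class Document:
    """Minimal stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

async def format_docs(docs: list[Document]) -> str:
    """Join retrieved chunks into one context string for the prompt."""
    return "\n\n".join(doc.page_content for doc in docs)

docs = [Document("Chunk one."), Document("Chunk two.")]
context = asyncio.run(format_docs(docs))
# 'Chunk one.\n\nChunk two.'
```

The blank-line separator keeps chunk boundaries visible to the model, which helps it attribute facts to the right passage.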
<h3 id="heading-prompt-engineering"><strong>Prompt Engineering</strong></h3>
<p>Effective prompts are the backbone of any LLM-based pipeline. We design a system prompt that combines user inputs with the retrieved context.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_main_prompt</span>():</span>
    prompt = <span class="hljs-string">""" 
    "You are an assistant for question-answering tasks."
    "Use the following pieces of retrieved context and user information to answer the question."
    "If you don't find the answer to the query, just say: I don't have that information at hand. Please provide more details or check your sources."
    """</span>
    prompt = prompt + <span class="hljs-string">"\n\n"</span> + <span class="hljs-string">"{context}"</span>

    final_prompt = ChatPromptTemplate.from_messages(
    [
        (<span class="hljs-string">"system"</span>, prompt),
        MessagesPlaceholder(variable_name=<span class="hljs-string">"messages"</span>),
        (<span class="hljs-string">"human"</span>, <span class="hljs-string">"{user_query}"</span>)
    ])
    <span class="hljs-keyword">return</span> final_prompt

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">invoke_chain</span>(<span class="hljs-params">query, context, history, llm</span>):</span>
    <span class="hljs-string">"""Handles the streamed response asynchronously."""</span>
    logger.info(<span class="hljs-string">"Initializing chain..."</span>)
    final_chain = get_main_prompt() | llm | StrOutputParser()
    logger.info(<span class="hljs-string">"Chain initialized."</span>)
    input_data = {<span class="hljs-string">"user_query"</span>: query, <span class="hljs-string">"context"</span>: context, <span class="hljs-string">"messages"</span>: history.messages}

    <span class="hljs-keyword">with</span> get_openai_callback() <span class="hljs-keyword">as</span> cb:
        final_response = <span class="hljs-keyword">await</span> final_chain.ainvoke(input_data)  <span class="hljs-comment"># Asynchronous method</span>

    <span class="hljs-keyword">return</span> final_response, cb
</code></pre>
<p>Learn how to leverage <strong>LangChain Expression Language (LCEL)</strong> for seamless chain composition, including prompt formatting, retrieval-augmented generation (RAG), and efficient batching, with practical examples. Discover how LCEL simplifies building advanced LLM applications with features like streaming, parallelism, and async support. Check out the video below:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=NQWfvhw7OcI&amp;list=PLAMHV77MSKJ7Pn_OwuGzbDPs_MOibBRP-&amp;index=8">https://www.youtube.com/watch?v=NQWfvhw7OcI&amp;list=PLAMHV77MSKJ7Pn_OwuGzbDPs_MOibBRP-&amp;index=8</a></div>
<h2 id="heading-future-improvements"><strong>Future Improvements</strong></h2>
<ul>
<li><p><strong>Enhanced Document Preprocessing</strong>: Implement advanced text chunking methods incorporating summarization, document-based chunking, semantic and agentic chunking, and multimodal support.</p>
</li>
<li><p><strong>Storage:</strong> Migrate the SQLite chat log to <strong>async PostgreSQL</strong> for better durability across sessions, plus improved scalability and performance under larger datasets and higher user concurrency.</p>
</li>
<li><p><strong>Dynamic Few-Shot Learning</strong>: Automatically generate examples based on query type.</p>
</li>
<li><p><strong>Adaptive Retrieval</strong>: Use feedback loops to improve retrieval accuracy over time.</p>
</li>
<li><p><strong>Real-Time User Feedback</strong>: Allow users to fine-tune the response in real-time.</p>
</li>
</ul>
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Through our extensive experience with <strong>asynchronous FastAPI</strong> and building RAG systems, we have successfully optimized every operation in the pipeline to work seamlessly in an asynchronous manner. From document ingestion and indexing in <strong>Qdrant</strong> to efficient retrieval of relevant context and history storage, we have made each operation highly efficient by adopting <strong>async-first</strong> principles.</p>
<p>By adopting this approach, developers can craft intelligent systems capable of providing contextually accurate and highly relevant responses. From document indexing to dynamic query refinement and real-time conversational AI, the RAG architecture represents a significant leap forward in harnessing the capabilities of large language models.</p>
<p>At FutureSmart AI, we specialize in delivering state-of-the-art AI solutions tailored to the unique needs of businesses. Leveraging advanced technologies such as RAG, NL2SQL, <a target="_blank" href="https://blog.futuresmart.ai/multi-agent-system-with-langgraph">multi-agent architectures</a>, LangChain, LangGraph, Qdrant, Chroma vector databases, and OpenAI, we have successfully implemented cutting-edge systems for multiple clients—from intelligent customer service automation to advanced AI-driven interview platforms.</p>
<p>If you want to leverage the power of <strong>AI Applications with asynchronous FastAPI</strong>, we’re here to help. Reach out to us at <a target="_blank" href="mailto:contact@futuresmart.ai"><strong>contact@futuresmart.ai</strong></a> and discover how our experience can translate into practical, cutting-edge solutions tailored for your needs.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><a target="_self" href="https://github.com/PradipNichite/FutureSmart-AI-Blog/tree/main/Langchain%20RAG%20using%20Async%20Fastapi%20and%20Qdrant"><strong>Get the Full Code in our GitHub</strong></a></div>
</div>]]></content:encoded></item><item><title><![CDATA[Building an Async Similarity Search System from scratch with FastAPI and Qdrant VectorDB]]></title><description><![CDATA[Introduction
Search engines and retrieval systems have evolved to become remarkably intelligent. They no longer rely on exact keyword matches or rigid rules to find what you're looking for. Instead, they understand the context and meaning behind your...]]></description><link>https://blog.futuresmart.ai/building-an-async-similarity-search-system-from-scratch-with-fastapi-and-qdrant-vectordb</link><guid isPermaLink="true">https://blog.futuresmart.ai/building-an-async-similarity-search-system-from-scratch-with-fastapi-and-qdrant-vectordb</guid><category><![CDATA[qdrant]]></category><category><![CDATA[vector database]]></category><category><![CDATA[FastAPI]]></category><category><![CDATA[asynchronous]]></category><dc:creator><![CDATA[Pruthviraj Mahalunge]]></dc:creator><pubDate>Fri, 31 Jan 2025 05:57:48 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1738255394578/53ec63cb-8d4a-4fa1-966f-bf4e3f95c02f.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-introduction">Introduction</h1>
<p>Search engines and retrieval systems have evolved to become remarkably intelligent. They no longer rely on exact keyword matches or rigid rules to find what you're looking for. Instead, they understand the <strong>context</strong> and <strong>meaning</strong> behind your query, allowing you to retrieve information or objects from a database with just a vague hint. For example, consider an e-commerce platform where users search for products. A traditional keyword-based search might struggle if the search terms do not exactly match product descriptions. However, with semantic search, a user looking for "comfortable running shoes for long-distance" can receive relevant results, even if the product descriptions contain phrases like "marathon-ready sneakers with superior cushioning.”</p>
<p>The capabilities of semantic search systems extend far beyond text-based queries. By taking it a step further, you can enable users to search not just with text, but also with images, voice, and other forms of input. <strong>Semantic search</strong> relies on <strong>vector embeddings</strong>, which capture the deeper relationships between different data points to deliver highly relevant and contextually accurate results. This makes it a powerful tool for a wide range of applications, from <strong>E-commerce</strong> and <strong>Media &amp; Entertainment</strong> to <strong>Smart Assistants</strong>. Let’s explore the key components needed to build such a system and unlock its potential across diverse industries.</p>
<h3 id="heading-implementing-a-semantic-search-system">Implementing a Semantic Search system</h3>
<p>In this blog, we’ll walk you through implementing a <strong>semantic search system</strong> using <strong>Async FastAPI</strong>, <strong>Qdrant</strong>, and <strong>Sentence-Transformers</strong> to deliver efficient, context-aware search capabilities. Among the many vector database options available, we’ve chosen <strong>Qdrant</strong> for its exceptional support for <strong>fast, asynchronous retrieval</strong>, ensuring high performance and scalability—perfect for real-time applications. Let’s get started.</p>
<h1 id="heading-technologies-well-use">Technologies We’ll Use</h1>
<p>To build this system, we’ll leverage:</p>
<ul>
<li><p><strong>Qdrant</strong>: A powerful, open-source vector database for efficient similarity searches.</p>
</li>
<li><p><strong>Sentence-Transformers</strong>: For generating high-quality embeddings from text data.</p>
</li>
<li><p><strong>Async FastAPI</strong>: To create a lightweight, asynchronous API for seamless query handling.</p>
</li>
</ul>
<h3 id="heading-qdrant">Qdrant</h3>
<p>Qdrant is an advanced vector database designed to store and search high-dimensional vector embeddings. Its scalability and efficiency make it ideal for semantic search use cases, providing powerful support for similarity-based queries. We chose Qdrant not only for its speed and accuracy but also for its async support, which reduces memory usage, serves many users simultaneously, and increases the number of concurrent requests the system can handle. You can read more about the Qdrant Async API and what it brings to the table in the official <a target="_blank" href="https://qdrant.tech/documentation/database-tutorials/async-api/">documentation</a>.</p>
<p>After thoroughly evaluating Qdrant and adopting it into <a target="_blank" href="https://futuresmart.ai"><strong>FutureSmart AI's</strong></a> workflows, we’ve seen the value it brings firsthand. We encourage you to <a target="_blank" href="https://try.qdrant.tech/high-performance-vector-search">explore Qdrant</a> further and visit their official page to learn more about their offerings and capabilities.</p>
<h3 id="heading-sentence-transformers">Sentence-Transformers</h3>
<p>Sentence-Transformers is a powerful library designed to transform text into dense vector embeddings, enabling a deeper semantic understanding of textual data. These high-dimensional vectors are essential for tasks like semantic search, similarity matching, and clustering.</p>
<p>Want to explore Sentence-Transformers in action? Check out this <a target="_blank" href="https://youtu.be/OlhNZg4gOvA?si=ICcsTZoW-9kIq8S0">YouTube tutorial</a> for a hands-on guide to real-world applications. 🚀</p>
<h3 id="heading-fastapi">FastAPI</h3>
<p>FastAPI is a cutting-edge, high-performance web framework for building APIs with Python. It leverages asynchronous programming for creating fast, scalable applications and uses Python type hints to ensure clarity, robustness, and developer productivity.</p>
<p>Want to master FastAPI? Follow this <a target="_blank" href="https://youtu.be/KVdP4SpWcc4?si=ILIRI398bMPlugv6">FastAPI tutorial</a> for step-by-step guidance on its core features and best practices. 🌟</p>
<h1 id="heading-why-use-qdrant-for-semantic-search">Why Use Qdrant for Semantic Search?</h1>
<p><strong>Qdrant</strong> is a high-performance vector database designed specifically to power semantic search systems. It allows efficient and scalable storage and retrieval of embeddings, which represent the meaning of data points as vectors.</p>
<h3 id="heading-key-advantages-of-using-qdrant">Key Advantages of Using Qdrant:</h3>
<ul>
<li><p><strong>Efficient Vector Search:</strong> 🚀 Qdrant is optimized for handling large-scale vector search queries with low latency, making it ideal for real-time semantic search applications.</p>
</li>
<li><p><strong>Scalable Architecture:</strong> 🌍 Built for horizontal scaling, Qdrant can support massive datasets while maintaining quick query responses, ensuring it handles growing data demands efficiently.</p>
</li>
<li><p><strong>Real-Time Updates:</strong> 🔄 Qdrant supports real-time indexing and updating of vector embeddings, allowing quick integration of new data points as they become available.</p>
</li>
<li><p><strong>Advanced Indexing Techniques:</strong> 🔍 Qdrant uses efficient indexing algorithms like HNSW (Hierarchical Navigable Small World), making vector searches fast even with a huge volume of data.</p>
</li>
<li><p><strong>Seamless Integration:</strong> 🔗 Qdrant integrates easily with frameworks like FastAPI, enabling smooth deployment of semantic search systems in modern applications.</p>
</li>
</ul>
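<p>For intuition about what HNSW is optimizing away, here is a small, dependency-free sketch of the exact alternative: a brute-force scan that scores every stored vector against the query with cosine similarity. This linear scan costs O(n) per query; HNSW's layered graph reaches comparable results in roughly logarithmic time at the price of being approximate. The three-dimensional vectors below are illustrative toy values, not real embeddings:</p>

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def brute_force_top_k(query, points, k=2):
    """Exact k-NN: score every stored vector against the query.

    This O(n) scan is what an HNSW index approximates in sub-linear time.
    """
    scored = [(pid, cosine(query, vec)) for pid, vec in points.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

# Toy 3-dimensional "embeddings" (illustrative only)
points = {
    "sedan":      [0.9, 0.1, 0.0],
    "suv":        [0.6, 0.5, 0.2],
    "motorcycle": [0.1, 0.9, 0.2],
}
print(brute_force_top_k([0.85, 0.2, 0.05], points))
```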
<h1 id="heading-why-use-async">Why Use Async?</h1>
<p>Incorporating <strong>asynchronous processing</strong> enhances the efficiency and scalability of semantic search. By allowing multiple search queries to be processed concurrently without blocking the system, async offers significant benefits, particularly for large-scale applications.</p>
<h3 id="heading-benefits-of-async-in-semantic-search-with-qdrant">Benefits of Async in Semantic Search with Qdrant:</h3>
<ul>
<li><p><strong>High-Volume Applications:</strong> 📈 Async enables high-throughput systems, allowing efficient handling of large numbers of simultaneous search queries—critical for large-scale or enterprise-level applications.</p>
</li>
<li><p><strong>Low Latency and Seamless User Experience:</strong> ⏱️ Non-blocking asynchronous operations reduce response times, improving the user experience by minimizing delays in real-time semantic search, especially when powered by Qdrant.</p>
</li>
<li><p><strong>Optimized Resource Utilization:</strong> ⚙️ Async operations maximize system efficiency by handling multiple tasks concurrently, leveraging non-blocking I/O, and minimizing resource wastage.</p>
</li>
<li><p><strong>Native Support in FastAPI:</strong> 🌐 FastAPI’s built-in async support makes it a natural fit for building scalable, high-performance semantic search systems.</p>
</li>
</ul>
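<p>The effect is easy to demonstrate in miniature with Python's own asyncio. In the sketch below, <code>fake_search</code> is a hypothetical stand-in for an async Qdrant call that simply sleeps to simulate I/O; three queries run concurrently, so total wall time stays close to the slowest single query instead of the sum of all three:</p>

```python
import asyncio
import time

async def fake_search(query: str) -> str:
    """Stand-in for an async Qdrant call: 'waits on I/O' for 0.1 s."""
    await asyncio.sleep(0.1)
    return f"results for {query!r}"

async def main():
    start = time.perf_counter()
    # Three queries processed concurrently, not one after another
    results = await asyncio.gather(
        fake_search("sporty bikes"),
        fake_search("family SUV"),
        fake_search("fuel-efficient sedan"),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(f"{len(results)} queries in {elapsed:.2f}s")  # ~0.1 s, not ~0.3 s
```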
<h1 id="heading-understanding-the-workflow">Understanding the Workflow</h1>
<ol>
<li><p>The user submits a search query via an <strong>asynchronous FastAPI endpoint</strong>.</p>
</li>
<li><p><strong>Qdrant</strong> processes the query and performs <strong>semantic similarity matching</strong> using precomputed embeddings.</p>
</li>
<li><p>Results are ranked by <strong>relevance</strong>, ensuring accurate information delivery.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738219970526/9c80013f-eaa5-4843-a200-dd9d3232d3f7.png" alt class="image--center mx-auto" /></p>
<hr />
<h1 id="heading-prerequisites">Prerequisites</h1>
<h2 id="heading-1-system-setup">1. System Setup</h2>
<p>Before you begin, ensure that the following components are installed:</p>
<h3 id="heading-11-python">1.1 Python</h3>
<ul>
<li>Ensure you're using <strong>Python 3.8 or later</strong>.</li>
</ul>
<h3 id="heading-12-qdrant">1.2 Qdrant</h3>
<ul>
<li><p>Qdrant should be up and running, either locally or hosted on a cloud service.</p>
</li>
<li><p>If you're unfamiliar with <strong>Qdrant</strong> or need assistance with its setup, check out the in-depth guide we wrote as the first part of this series: <a target="_blank" href="https://blog.futuresmart.ai/comprehensive-guide-to-qdrant-vector-db-installation-and-setup"><strong>Comprehensive Guide to Qdrant Vector DB: Installation and Setup</strong></a>.</p>
</li>
</ul>
<h3 id="heading-13-required-libraries">1.3 Required Libraries</h3>
<ul>
<li><p>Install the necessary libraries using the following command:</p>
<pre><code class="lang-bash">  pip install fastapi uvicorn sentence-transformers qdrant-client
</code></pre>
</li>
</ul>
<hr />
<h1 id="heading-project-structure">Project Structure</h1>
<p>Here’s the project structure:</p>
<pre><code class="lang-plaintext">
semantic-search-system/
├── app/
│   ├── __init__.py
│   ├── main.py          # FastAPI application and routes
│   ├── utils.py         # Helper functions (e.g., read_and_store_data)
│   ├── qdrant_utils.py  # Qdrant client setup and collection creation
│   ├── search.py        # Semantic search helper (qdrant_search)
│   └── models.py        # Pydantic request models
├── vehicle.csv          # CSV file with vehicle data
└── requirements.txt     # List of dependencies
</code></pre>
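<p>For completeness, a <code>requirements.txt</code> consistent with the install command above could look like this (pandas is included because the data-loading step reads the CSV with it; pin exact versions as appropriate for your environment):</p>

```plaintext
fastapi
uvicorn
sentence-transformers
qdrant-client
pandas
```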
<hr />
<h1 id="heading-adding-data-to-qdrant-via-csv">Adding Data to Qdrant via CSV</h1>
<h2 id="heading-1-csv-file-preparation">1. CSV File Preparation</h2>
<p>The CSV file supplies the records we will embed and store in Qdrant for semantic search. Here is a sample of the <code>vehicle.csv</code> file, which contains the following columns:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>id</td><td>name</td><td>category</td><td>description</td></tr>
</thead>
<tbody>
<tr>
<td>1</td><td>Toyota Camry</td><td>Sedan</td><td>A reliable and fuel-efficient car</td></tr>
<tr>
<td>2</td><td>Ford Explorer</td><td>SUV</td><td>A spacious SUV ideal for family trips</td></tr>
<tr>
<td>3</td><td>Chevrolet Silverado</td><td>Pickup Truck</td><td>A rugged and versatile pickup truck, perfect for heavy-duty tasks</td></tr>
</tbody>
</table>
</div><p>This .csv file should be saved in your project directory for easy access during data import.</p>
<h2 id="heading-2-initializing-the-qdrant-client">2. Initializing the Qdrant Client</h2>
<p>Before inserting data into Qdrant, initialize the Qdrant client with the AsyncQdrantClient from the qdrant-client library. The create_qdrant_collection function will connect to Qdrant and create a collection for your data, provided one doesn't already exist. A <strong>vector size</strong> and a <strong>distance metric</strong> (Cosine similarity in this case) need to be defined for accurate semantic search.</p>
<p><code>app/qdrant_utils.py</code>:</p>
<pre><code class="lang-python">from qdrant_client import AsyncQdrantClient
from qdrant_client.models import VectorParams, Distance

# Initialize the async Qdrant client
qdrant = AsyncQdrantClient("http://localhost:6333")

# Create the collection in the Qdrant vector database,
# skipping creation if it already exists
async def create_qdrant_collection():
    if not await qdrant.collection_exists("vehicles"):
        await qdrant.create_collection(
            collection_name="vehicles",
            vectors_config=VectorParams(size=384, distance=Distance.COSINE),
        )
</code></pre>
<ul>
<li><p><strong>Vector Size</strong>: The dimensionality of the embedding vector, which must match the output of the Sentence-Transformer model. The <code>all-MiniLM-L6-v2</code> model used here produces 384-dimensional embeddings.</p>
</li>
<li><p><strong>Cosine</strong>: A similarity measure used to compute how similar two vectors are, which is perfect for comparing semantic meaning.</p>
</li>
</ul>
<h2 id="heading-3-async-data-insertion">3. Async Data Insertion</h2>
<p>Now that we have set up the Qdrant collection, it's time to insert data from the CSV file. Each textual column in the CSV (name, category, description) is passed through the Sentence-Transformer model to create vector embeddings, which are then stored in Qdrant.</p>
<p><code>app/utils.py:</code></p>
<pre><code class="lang-python">import pandas as pd
from qdrant_client.models import PointStruct
from sentence_transformers import SentenceTransformer

# Initialize the Sentence-Transformer model globally (384-dim output)
model = SentenceTransformer("all-MiniLM-L6-v2")

def generate_vector(text: str):
    """Generate vector embeddings for a given text."""
    return model.encode(text).tolist()

async def upsert_data_to_qdrant(qdrant, df):
    """Upsert data from a DataFrame into the Qdrant vector database."""
    points = [
        PointStruct(
            id=int(row["id"]),
            vector=generate_vector(f"{row['name']} {row['category']} {row['description']}"),
            payload={"name": row["name"],
                     "category": row["category"],
                     "description": row["description"]},
        )
        for _, row in df.iterrows()
    ]
    try:
        # Insert the data into Qdrant
        await qdrant.upsert(collection_name="vehicles", points=points)
    except Exception as e:
        print(f"Error inserting data into Qdrant: {e}")
        raise

async def read_and_store_data(csv_file_path, qdrant):
    """Reads the CSV and upserts its rows into Qdrant."""
    df = pd.read_csv(csv_file_path)  # Load CSV into a DataFrame

    # Upsert to Qdrant
    await upsert_data_to_qdrant(qdrant, df)
</code></pre>
<p>This approach ensures efficient handling of large datasets with asynchronous processing. The <code>generate_vector</code> function utilizes the <code>all-MiniLM-L6-v2</code> Sentence-Transformer model to create meaningful vector embeddings by combining the name, category, and description fields. These embeddings capture the semantic meaning of each vehicle, improving search accuracy.</p>
<p>By using Qdrant's upsert operation, we prevent duplicate entries while allowing seamless data updates. Additionally, you can manage and inspect your Qdrant collection using the Qdrant dashboard, where you can view inserted data, monitor collection health, and perform advanced searches.</p>
<p>To learn more about accessing the Qdrant Web UI, refer to our previous blog: <a target="_blank" href="https://blog.futuresmart.ai/comprehensive-guide-to-qdrant-vector-db-installation-and-setup#heading-accessing-the-qdrant-web-ui">Comprehensive Guide to Qdrant Vector DB: Installation and Setup</a>.</p>
<h1 id="heading-building-the-semantic-search-api">Building the Semantic Search API</h1>
<h2 id="heading-pydantic-model">Pydantic Model</h2>
<p>Pydantic models provide a robust way to validate and serialize data, and they are integral to FastAPI's request handling. We define a <code>QueryRequest</code> model to structure incoming data for the search endpoint. This model will specify the expected query string and the limit for the number of search results:</p>
<p><code>app/models.py:</code></p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">QueryRequest</span>(<span class="hljs-params">BaseModel</span>):</span>
    query: str  <span class="hljs-comment"># The search query entered by the user</span>
    limit: int  <span class="hljs-comment"># The number of search results to return</span>
</code></pre>
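<p>As a quick, standalone illustration of what this model buys us (assuming only that pydantic is installed), malformed input is rejected before it ever reaches the search logic, and FastAPI turns such failures into automatic 422 responses:</p>

```python
from pydantic import BaseModel, ValidationError

class QueryRequest(BaseModel):
    query: str  # The search query entered by the user
    limit: int  # The number of search results to return

# Well-formed input parses cleanly; the string "3" is coerced to int 3
ok = QueryRequest(query="sporty bikes", limit="3")
print(ok.limit)  # 3

# A missing field raises ValidationError instead of reaching the handler
try:
    QueryRequest(query="sporty bikes")
except ValidationError as exc:
    print("rejected, missing:", exc.errors()[0]["loc"])
```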
<h2 id="heading-qdrant-search-function">Qdrant Search Function</h2>
<p>For semantic search, we leverage <strong>Qdrant</strong>, a vector database. The function generates a query vector from the input query using the SentenceTransformer model, which translates the user's query into a vector representation. We then search Qdrant for vectors similar to the query vector.</p>
<p><code>app/search.py:</code></p>
<pre><code class="lang-python">from app.qdrant_utils import qdrant
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

async def qdrant_search(query: str, limit: int):
    # Convert the user's query into a vector
    vector = model.encode(query).tolist()

    search_result = await qdrant.search(
        collection_name="vehicles",  # The collection we're searching in
        query_vector=vector,
        limit=limit,  # Limit the number of results returned by Qdrant
    )

    # Format the search results for consistency
    results = [{
        "id": res.id,
        "name": res.payload.get("name"),
        "category": res.payload.get("category"),
        "description": res.payload.get("description"),
        "score": res.score,  # Semantic similarity score
    } for res in search_result]

    return results
</code></pre>
<h2 id="heading-fastapi-endpoint-for-search">FastAPI Endpoint for Search</h2>
<p>Now, we implement the FastAPI endpoint that ties everything together. This endpoint listens for a POST request at <code>/semantic_search</code>, where it receives a query and a limit for the results. It fetches results from Qdrant using vector-based semantic search.</p>
<h3 id="heading-api-definition">API Definition:</h3>
<p><code>app/main.py:</code></p>
<pre><code class="lang-python">from contextlib import asynccontextmanager

from fastapi import FastAPI
from app.search import qdrant_search
from app.models import QueryRequest
from app.qdrant_utils import qdrant, create_qdrant_collection
from app.utils import read_and_store_data

# FastAPI lifespan event to initialize Qdrant on startup
@asynccontextmanager
async def async_lifespan(app: FastAPI):
    await create_qdrant_collection()  # Ensures the Qdrant collection exists
    await read_and_store_data('vehicle.csv', qdrant)  # Load initial data
    yield  # Yield control; FastAPI will now handle incoming requests

app = FastAPI(lifespan=async_lifespan)

@app.post("/semantic_search")
async def search(query_request: QueryRequest):
    # Perform semantic search in Qdrant
    results = await qdrant_search(query_request.query, query_request.limit)

    # Return the results
    return {"results": results}
</code></pre>
<h3 id="heading-how-this-works">How This Works:</h3>
<ol>
<li><p><strong>Qdrant Search</strong>: The function queries Qdrant for results based on semantic similarity using vector-based search.</p>
</li>
<li><p><strong>Return Results</strong>: The search results from Qdrant are returned, respecting the limit set by the user.</p>
</li>
</ol>
<h1 id="heading-running-and-testing-the-application">Running and Testing the Application</h1>
<p>To run the application, use <strong>Uvicorn</strong>, a fast ASGI server:</p>
<pre><code class="lang-bash">uvicorn app.main:app --reload
</code></pre>
<p>Once the server is running, you can use the <code>/semantic_search</code> endpoint to query the Qdrant database.</p>
<p>If everything is configured correctly, you will receive semantic search results similar to the example shown below.</p>
<p><strong>Demo:</strong></p>
<center>
  <iframe width="800" height="450" src="https://www.youtube.com/embed/blSUFbqYtbI?si=3EGRaaW32x9JTL-3"></iframe>
</center>

<h1 id="heading-example-query-amp-result-breakdown">Example Query &amp; Result Breakdown</h1>
<h2 id="heading-sample-query">Sample Query:</h2>
<p>Make a POST request to <code>/semantic_search</code> with the following payload:</p>
<pre><code class="lang-json">{
<span class="hljs-attr">"query"</span>: <span class="hljs-string">"sporty bikes"</span>,
<span class="hljs-attr">"limit"</span>: <span class="hljs-number">3</span>
}
</code></pre>
<h2 id="heading-expected-output">Expected Output:</h2>
<pre><code class="lang-json">{
  <span class="hljs-attr">"results"</span>: [
    {
      <span class="hljs-attr">"id"</span>: <span class="hljs-number">17</span>,
      <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Ducati Monster"</span>,
      <span class="hljs-attr">"category"</span>: <span class="hljs-string">"Motorcycle"</span>,
      <span class="hljs-attr">"description"</span>: <span class="hljs-string">"A sporty motorcycle with a distinctive design and powerful performance."</span>,
      <span class="hljs-attr">"score"</span>: <span class="hljs-number">0.5491828</span>
    },
    {
      <span class="hljs-attr">"id"</span>: <span class="hljs-number">29</span>,
      <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Yamaha YZF-R3"</span>,
      <span class="hljs-attr">"category"</span>: <span class="hljs-string">"Motorcycle"</span>,
      <span class="hljs-attr">"description"</span>: <span class="hljs-string">"A lightweight sport motorcycle ideal for beginners and experienced riders alike."</span>,
      <span class="hljs-attr">"score"</span>: <span class="hljs-number">0.52757794</span>
    },
    {
      <span class="hljs-attr">"id"</span>: <span class="hljs-number">9</span>,
      <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Harley-Davidson Street 750"</span>,
      <span class="hljs-attr">"category"</span>: <span class="hljs-string">"Motorcycle"</span>,
      <span class="hljs-attr">"description"</span>: <span class="hljs-string">"A cruiser motorcycle with a classic design and powerful engine."</span>,
      <span class="hljs-attr">"score"</span>: <span class="hljs-number">0.47901446</span>
    }
  ]
}
</code></pre>
<h1 id="heading-advantage-with-semantic-search">Advantage with Semantic Search</h1>
<p>Unlike traditional keyword-based search, which depends on exact keyword matches, <strong>vector search</strong> retrieves results based on <strong>semantic similarity</strong>. This means the system focuses on the intent behind queries and the contextual meaning of terms, providing a more intelligent and effective search experience for modern applications.</p>
<p><strong>Example Query:</strong> "<em>sporty bikes</em>"</p>
<p><strong>Traditional Keyword Search:</strong></p>
<ul>
<li><p>Returns only results that explicitly include the word <strong>"bikes"</strong> in their name or description.</p>
</li>
<li><p>Misses relevant items such as <strong>Ducati Monster</strong>, a motorcycle not explicitly labeled as "<strong>sporty bikes.</strong>"</p>
</li>
</ul>
<p><strong>Vector Search (Semantic Approach):</strong></p>
<ul>
<li><p>Recognizes that <strong>"bikes"</strong> and <strong>"motorcycles"</strong> are semantically similar and retrieves items that are contextually related, like <strong>sporty motorcycles</strong>, even when the word "bikes" is not directly mentioned.</p>
</li>
<li><p>The system also ranks results using <strong>relevance scores</strong>, indicating how closely each result aligns with the intent behind the query.</p>
</li>
</ul>
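<p>The contrast can be reproduced in a few lines of pure Python. The catalogue entries and the tiny hand-crafted "embeddings" below are illustrative stand-ins for real model output, but the mechanics are the same: the keyword scan misses the Ducati entirely, while the vector comparison ranks it first:</p>

```python
catalogue = {
    "Ducati Monster": "A sporty motorcycle with powerful performance.",
    "Toyota Camry": "A reliable and fuel-efficient car.",
}

def keyword_search(query: str) -> list[str]:
    """Return items whose description contains every query word."""
    words = query.lower().split()
    return [name for name, desc in catalogue.items()
            if all(w in desc.lower() for w in words)]

# Hand-crafted toy vectors: (two-wheeled, sporty) -- NOT real embeddings
embeddings = {
    "Ducati Monster": (0.9, 0.8),
    "Toyota Camry": (0.1, 0.2),
}
query_vec = (0.8, 0.9)  # pretend embedding of "sporty bikes"

def vector_search(qv) -> list[str]:
    """Rank items by dot-product similarity to the query vector."""
    score = lambda v: sum(a * b for a, b in zip(qv, v))
    return sorted(embeddings, key=lambda n: score(embeddings[n]), reverse=True)

print(keyword_search("sporty bikes"))  # [] -- "bikes" never appears
print(vector_search(query_vec)[0])     # Ducati Monster
```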
<h2 id="heading-comparison-of-search-approaches">Comparison of Search Approaches</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Search Type</strong></td><td><strong>Search Process</strong></td><td><strong>Result Quality</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>Keyword-Based (Traditional)</strong></td><td>Matches only exact terms in queries and results</td><td>Limited, misses semantically related results</td></tr>
<tr>
<td><strong>Vector Search (Semantic)</strong></td><td>Focuses on contextual and semantic meaning</td><td>Relevant, context-aware results</td></tr>
</tbody>
</table>
</div><h1 id="heading-improvements">Improvements</h1>
<ul>
<li><p>You can enhance the performance of the semantic search system by fine-tuning the embedding model on domain-specific data to improve result relevance.</p>
</li>
<li><p>You can experiment with multimodal search, allowing for querying with images, voice, or other media types, expanding the search functionality.</p>
</li>
<li><p>Adding filtering capabilities would allow for more precise search results, enabling users to refine searches based on metadata like date ranges.</p>
</li>
</ul>
<h1 id="heading-conclusion">Conclusion</h1>
<p>Through this step-by-step guide, you've learned how to build a semantic search system with FastAPI and Qdrant, integrating modern vector-based search techniques to create an efficient and scalable solution. By leveraging embeddings, you can enhance search relevance and deliver more accurate results for your applications.</p>
<p>But this is just the beginning. As you continue exploring Qdrant, consider diving into its advanced capabilities, such as hybrid search and large language model integrations, to further refine your search experience.</p>
<p>At <strong>FutureSmart AI</strong>, we specialize in developing AI-driven solutions tailored to your business needs, from semantic search to custom AI Solutions. If you're looking for expert guidance or custom AI solutions, reach out to us at <strong>contact@futuresmart.ai</strong>.</p>
<p>Let’s build the future of AI together! 🚀</p>
]]></content:encoded></item><item><title><![CDATA[Comprehensive guide to Qdrant Vector DB: Installation and Setup]]></title><description><![CDATA[Introduction:
In the era of AI-driven applications and unstructured data management, vector databases have become essential tools for enabling semantic search and similarity matching. Qdrant is one such cutting-edge, open-source vector database desig...]]></description><link>https://blog.futuresmart.ai/comprehensive-guide-to-qdrant-vector-db-installation-and-setup</link><guid isPermaLink="true">https://blog.futuresmart.ai/comprehensive-guide-to-qdrant-vector-db-installation-and-setup</guid><category><![CDATA[semantic search]]></category><category><![CDATA[vector database]]></category><category><![CDATA[qdrant]]></category><dc:creator><![CDATA[Shreyas Dhaware]]></dc:creator><pubDate>Fri, 24 Jan 2025 13:56:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1736763417210/95bc05d9-5e54-4a33-9497-ea12f12a15a4.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction"><strong>Introduction:</strong></h2>
<p>In the era of AI-driven applications and unstructured data management, <strong>vector databases</strong> have become essential tools for enabling semantic search and similarity matching. <a target="_blank" href="https://qdrant.tech/"><strong>Qdrant</strong></a> is one such cutting-edge, <strong>open-source vector database</strong> designed to simplify working with high-dimensional data embeddings. Whether you're building a recommendation system, integrating semantic search, or powering an AI chatbot, Qdrant makes it seamless to store, query, and manage vector embeddings at scale.</p>
<p>At <a target="_blank" href="https://futuresmart.ai">FutureSmart AI</a>, we pride ourselves on staying adaptable and embracing new technologies. If a solution has great potential and delivers value, we’re always eager to explore and adopt it. After testing Qdrant ourselves and seeing its impressive results, we’ve prepared this comprehensive guide to walk you through its <strong>installation and setup on Docker and locally</strong>. While the blog primarily focuses on on-premise deployment, it’s worth noting that Qdrant also offers a robust <strong>cloud platform</strong> that simplifies scaling and management for production use cases.</p>
<h3 id="heading-what-is-a-vector-database"><strong>What is a Vector Database</strong></h3>
<p>Let us first understand what a vector database is and what advantages it has over traditional databases. In simple terms, a <strong>vector database</strong> is a specialized database designed to store and work with <strong>vector embeddings</strong>. Vector embeddings are numerical representations of data that capture its meaning, features, or relationships. These embeddings are often generated by AI models and are used to process <strong>unstructured data</strong> like text, images, audio, or videos.</p>
<p>Unlike traditional relational databases, which rely on exact matches (like finding a name or ID), <strong>vector databases</strong> focus on finding <strong>similarity</strong> between data points, even when the input isn’t identical. This enables <strong>semantic search</strong>, advanced similarity retrieval, and <strong>unstructured data management</strong>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736753931814/6c185e57-0c30-491c-89d7-bdc49ade8842.png" alt="Unstructured data management with vector embeddings" class="image--center mx-auto" /></p>
<p>The image above shows how similarity search can find relevant or similar objects even when they are structurally completely different; it works just as well on JSON objects as on plain text. That’s the power of similarity search, or vector search. You can try out this <a target="_blank" href="https://aidemos.com/ai-playground/semantic-similarity-checker">similarity checker</a> yourself at the AI Demos Playground.</p>
<h3 id="heading-what-is-qdrant"><strong>What is Qdrant</strong></h3>
<p>Among the top <strong>open-source vector databases</strong>, <strong>Qdrant</strong> stands out as a powerful, Rust-based <strong>vector database</strong> and similarity search engine. It supports seamless <strong>integration with LangChain</strong> for building sophisticated AI solutions. It offers robust performance, a user-friendly API, and support for Python. With its use of indexing, Qdrant delivers both speed and precision, making it a competitive choice for modern applications.</p>
<p>Key Features of Qdrant:</p>
<ol>
<li><p><strong>HNSW Indexing</strong>: Uses the <strong>Hierarchical Navigable Small World (HNSW)</strong> algorithm for fast and accurate similarity searches.</p>
</li>
<li><p><strong>Distance Metrics</strong>: Supports <strong>Cosine Similarity</strong>, <strong>Dot Product</strong>, and <strong>Euclidean Distance</strong> for flexible vector search.</p>
</li>
<li><p><strong>Free Vector Database</strong>: Qdrant is <strong>open-source</strong> and perfect for creating a <strong>local vector database</strong>.</p>
</li>
<li><p><strong>Multi-Language APIs</strong>: Offers APIs for <strong>Python</strong>, <strong>JavaScript/TypeScript</strong>, <strong>Rust</strong>, and <strong>Go</strong>, ensuring smooth integration with various tech stacks.</p>
</li>
<li><p><strong>Recommendation API</strong>: Includes a built-in API for creating efficient recommendation systems.</p>
</li>
<li><p><strong>Scalability and Production-Ready</strong>: Designed for real-world applications, it scales to handle millions or billions of vectors seamlessly.</p>
</li>
<li><p><strong>Hybrid Compatibility</strong>: Works well with databases like <strong>PostgreSQL</strong> for blending relational and vector data.</p>
</li>
</ol>
<h2 id="heading-semantic-vs-vector-search">Semantic vs. Vector Search</h2>
<p><strong>Semantic Search</strong>: Interprets user intent and context to deliver relevant results beyond simple keyword matching. It utilizes vector search to provide more accurate and contextually relevant outcomes.</p>
<p><strong>Vector Search</strong>: Transforms text into vectors representing semantic meaning, enabling rapid similarity comparisons within large datasets.</p>
<p>In essence, vector search serves as a foundational component of semantic search, facilitating the understanding and retrieval of information based on meaning rather than mere keywords.</p>
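<p>Both build on a similarity metric between embedding vectors. A minimal, self-contained illustration of cosine similarity with toy 3-dimensional vectors (real embeddings typically have hundreds of dimensions; the vectors below are made up purely for demonstration):</p>

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embeddings of three texts
v_movie = [0.9, 0.1, 0.0]
v_film = [0.8, 0.2, 0.1]
v_invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(v_movie, v_film))     # high: related meaning
print(cosine_similarity(v_movie, v_invoice))  # low: unrelated meaning
```

<p>A vector database like Qdrant performs exactly this comparison, but over millions of vectors using approximate-nearest-neighbor indexes such as HNSW instead of a brute-force loop.</p>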
<h2 id="heading-setting-up-the-environment"><strong>Setting Up the Environment:</strong></h2>
<p>We'll cover two approaches to using Qdrant:</p>
<ol>
<li><p><strong>Local Setup</strong>: Create a <strong>local LangChain vector database</strong> for prototyping.</p>
</li>
<li><p><strong>Docker-based Server</strong>: Use a containerized approach for scalability and production.</p>
</li>
</ol>
<h3 id="heading-installing-qdrant-using-docker"><strong>Installing Qdrant Using Docker:</strong></h3>
<ul>
<li><strong>Pull the Qdrant Docker Image:</strong></li>
</ul>
<pre><code class="lang-bash">docker pull qdrant/qdrant
</code></pre>
<ul>
<li><strong>Run the Qdrant Container:</strong></li>
</ul>
<pre><code class="lang-bash">docker run -p 6333:6333 -p 6334:6334 -v <span class="hljs-string">"${PWD}/qdrant_storage:/qdrant/storage:z"</span> qdrant/qdrant
</code></pre>
<p>This Docker command runs the Qdrant container, exposing ports <code>6333</code> (REST API) and <code>6334</code> (gRPC API), while mapping the host directory <code>${PWD}/qdrant_storage</code> to <code>/qdrant/storage</code> inside the container to persist data.</p>
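<p>Before connecting a client, it can be useful to confirm that the container is reachable. The sketch below probes the REST endpoint with only the Python standard library; it assumes, as current Qdrant versions do, that the root endpoint returns a small JSON document with a <code>title</code> field:</p>

```python
import json
import urllib.request
import urllib.error

def qdrant_is_up(base_url="http://localhost:6333", timeout=3):
    """Return True if a Qdrant REST endpoint answers at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            info = json.load(resp)
            # The root endpoint reports the server title and version
            return "qdrant" in str(info.get("title", "")).lower()
    except (urllib.error.URLError, OSError, ValueError):
        return False

print(qdrant_is_up())  # True once the container from the command above is running
```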
<h3 id="heading-python-client"><strong>Python Client</strong>:</h3>
<p>To create a <strong>vector database in Python</strong>, start by installing the Python client:</p>
<pre><code class="lang-bash">pip install "qdrant-client[fastembed]"
</code></pre>
<h2 id="heading-securing-your-qdrant-instance"><strong>Securing Your Qdrant Instance:</strong></h2>
<p>When running the Qdrant container, you can enable API key authentication by setting specific environment variables. This is an <strong>Optional Step</strong>, but it helps secure your <strong>Qdrant vector database</strong> from unauthorized requests, ensuring only trusted clients can access it.</p>
<ul>
<li><code>QDRANT__SERVICE__API_KEY</code>: Set this to the desired API key.</li>
</ul>
<p>Here’s the command:</p>
<pre><code class="lang-bash">docker run -d  -p 6333:6333  -e QDRANT__SERVICE__API_KEY=your-api-key-here -v <span class="hljs-string">"<span class="hljs-variable">${PWD}</span>/qdrant_storage:/qdrant/storage:z"</span>  qdrant/qdrant
</code></pre>
<p>Replace <code>your-api-key-here</code> with a strong API key of your choice.</p>
<h2 id="heading-understanding-the-data">Understanding the Data</h2>
<ul>
<li>The JSON data represents movie entries, each including key details like the name, description, director, and release year. In Qdrant, each entry will be stored as a <strong>point</strong>, where the <strong>vector</strong> is derived from the movie description (e.g., an embedding for semantic search), and the additional fields like director, year, and name are stored as <strong>metadata</strong> or <strong>payload</strong> for enriched querying and context.</li>
</ul>
<pre><code class="lang-json">[
    {
        <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Sholay"</span>,
        <span class="hljs-attr">"description"</span>: <span class="hljs-string">"Two ex-convicts are hired by a retired policeman to capture a ruthless dacoit terrorizing a village."</span>,
        <span class="hljs-attr">"director"</span>: <span class="hljs-string">"Ramesh Sippy"</span>,
        <span class="hljs-attr">"year"</span>: <span class="hljs-number">1975</span>
    },
    {
        <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Lagaan"</span>,
        <span class="hljs-attr">"description"</span>: <span class="hljs-string">"Villagers unite to play a cricket match against British officers to abolish oppressive taxes."</span>,
        <span class="hljs-attr">"director"</span>: <span class="hljs-string">"Ashutosh Gowariker"</span>,
        <span class="hljs-attr">"year"</span>: <span class="hljs-number">2001</span>
    }
        ...
        ...
]
</code></pre>
<h2 id="heading-loading-the-data">Loading the Data</h2>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> json
<span class="hljs-keyword">import</span> os

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">read_files_from_folder</span>(<span class="hljs-params">folder_path</span>):</span>
    file_data = []

    <span class="hljs-keyword">for</span> file_name <span class="hljs-keyword">in</span> os.listdir(folder_path):
        <span class="hljs-keyword">if</span> file_name.endswith(<span class="hljs-string">".json"</span>):
            <span class="hljs-keyword">with</span> open(os.path.join(folder_path, file_name), <span class="hljs-string">'r'</span>) <span class="hljs-keyword">as</span> file:
                content = json.load(file)
                file_data.append({<span class="hljs-string">"file_name"</span>: file_name, <span class="hljs-string">"content"</span>: content})

    <span class="hljs-keyword">return</span> file_data

folder_path = <span class="hljs-string">"data"</span>
file_data = read_files_from_folder(folder_path)
</code></pre>
<p>Then we create separate lists for documents, metadata, and IDs, which we will add to our collection.</p>
<pre><code class="lang-python">documents = []
metadatas = []
ids = []

<span class="hljs-keyword">import</span> uuid 

<span class="hljs-keyword">for</span> file_index, data <span class="hljs-keyword">in</span> enumerate(file_data):
    context = data[<span class="hljs-string">"content"</span>]
    documents.extend(movie[<span class="hljs-string">"description"</span>] <span class="hljs-keyword">for</span> movie <span class="hljs-keyword">in</span> context)
    metadatas.extend(
        {**{key: value <span class="hljs-keyword">for</span> key, value <span class="hljs-keyword">in</span> movie.items() <span class="hljs-keyword">if</span> key != <span class="hljs-string">"description"</span>}, <span class="hljs-string">"source"</span>: data[<span class="hljs-string">"file_name"</span>]}
        <span class="hljs-keyword">for</span> movie <span class="hljs-keyword">in</span> context
    )
    ids.extend(str(uuid.uuid4()) <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> context)  <span class="hljs-comment"># Generate a unique UUID for each movie</span>
</code></pre>
<pre><code class="lang-python">documents = [<span class="hljs-string">'Two ex-convicts are hired by a retired policeman to capture a ruthless dacoit terrorizing a village.'</span>,
...]

metadatas = [{<span class="hljs-string">'name'</span>: <span class="hljs-string">'Sholay'</span>,
 <span class="hljs-string">'director'</span>: <span class="hljs-string">'Ramesh Sippy'</span>,
 <span class="hljs-string">'year'</span>: <span class="hljs-number">1975</span>,
 <span class="hljs-string">'source'</span>: <span class="hljs-string">'movies.json'</span>},
...]

ids = [<span class="hljs-string">'7aa5f6f5-5de5-4776-8090-9a7c38a4cfcf'</span>,
...]
</code></pre>
<h2 id="heading-initializing-qdrant-client-through-local-or-docker">Initializing the Qdrant Client: Local or Docker</h2>
<h3 id="heading-setting-up-a-local-in-memory-qdrant-instance">Setting Up a Local In-Memory Qdrant Instance:</h3>
<p>This approach is ideal for development, prototyping, or testing without the need for a running server.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> qdrant_client <span class="hljs-keyword">import</span> QdrantClient, models
<span class="hljs-comment"># Initialize the local client</span>
qdrant = QdrantClient(<span class="hljs-string">":memory:"</span>)  <span class="hljs-comment"># or QdrantClient(path="path/to/db")</span>
</code></pre>
<h3 id="heading-connecting-to-a-qdrant-server-via-docker">Connecting to a Qdrant Server via Docker:</h3>
<p>Ensure that your Qdrant server is running, typically accessible at <code>http://localhost:6333</code>.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> qdrant_client <span class="hljs-keyword">import</span> QdrantClient, models
<span class="hljs-comment"># Connect to Docker server Qdrant instance</span>
qdrant = QdrantClient(<span class="hljs-string">"http://localhost:6333"</span>, api_key=<span class="hljs-string">"your-api-key-here"</span>)
</code></pre>
<blockquote>
<p>You can choose between running Qdrant in a Docker container or using the local in-memory setup, depending on your specific requirements and environment. The code that follows is the same for both approaches; only the client initialization differs.</p>
</blockquote>
<h2 id="heading-adding-documents"><strong>Adding Documents</strong></h2>
<p><strong>Inserting Data into Your Collection:</strong></p>
<p>The same call works whether <code>qdrant</code> points to the Docker server or the local in-memory client.</p>
<pre><code class="lang-python"><span class="hljs-comment"># Use the new add method</span>
qdrant.add(
    collection_name=<span class="hljs-string">"new_movie_collection"</span>,
    documents=documents,
    metadata=metadatas
)
</code></pre>
<pre><code class="lang-python"><span class="hljs-comment"># Output: list of IDs of the added documents; generated randomly if none are provided</span>
[<span class="hljs-string">'d8984772ed664b2b8e2da23b0660989c'</span>,...,<span class="hljs-string">'fb32ac5bc5a74e48ba0a8fbb578fedfe'</span>]
</code></pre>
<p>This method adds text documents to a Qdrant collection, creating the collection with the default vector configuration if it doesn’t exist. Documents are embedded using the default embedding model. If you need custom embeddings or vectors, we will cover a different approach shortly.</p>
<h2 id="heading-querying"><strong>Querying</strong></h2>
<p><strong>Performing Vector Searches:</strong></p>
<pre><code class="lang-python">search_result = qdrant.query(
    collection_name=<span class="hljs-string">"movie_collection"</span>,
    query_text=<span class="hljs-string">"for adults"</span>,
    limit = <span class="hljs-number">1</span>
)
print(search_result)
</code></pre>
<pre><code class="lang-python"><span class="hljs-comment"># List of QueryResponse object</span>
[QueryResponse(id=<span class="hljs-string">'96d0e002-bc96-47a5-9307-16e0924ff9f6'</span>, embedding=<span class="hljs-literal">None</span>, sparse_embedding=<span class="hljs-literal">None</span>, metadata={<span class="hljs-string">'document'</span>: <span class="hljs-string">'The heartwarming tale of a mute and deaf man and his relationships with two women.'</span>, <span class="hljs-string">'name'</span>: <span class="hljs-string">'Barfi!'</span>, <span class="hljs-string">'director'</span>: <span class="hljs-string">'Anurag Basu'</span>, <span class="hljs-string">'year'</span>: <span class="hljs-number">2012</span>}, document=<span class="hljs-string">'The heartwarming tale of a mute and deaf man and his relationships with two women.'</span>, score=<span class="hljs-number">0.76938045</span>)]
</code></pre>
<p>In Qdrant, a <code>QueryResponse</code> represents the outcome of a search query, typically including the point ID, document, similarity score, associated payload (metadata), and optionally the vector embedding. By default, the vector data is not included in the response; to include it, set the parameter <code>with_vector</code> to <code>True</code>.</p>
<h2 id="heading-using-a-different-embedding-model"><strong>Using a Different Embedding Model</strong></h2>
<p>Qdrant’s default embedding model is <strong>BAAI/bge-small-en</strong>, used for generating vector embeddings. You can use <strong>open-source embedding models</strong> like <code>all-MiniLM-L6-v2</code> for custom embeddings:</p>
<p>To use a different model when creating a new collection, you need to specify its configuration explicitly, including the embedding model's details, during the collection setup in the Qdrant API or dashboard.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> sentence_transformers <span class="hljs-keyword">import</span> SentenceTransformer

encoder = SentenceTransformer(<span class="hljs-string">"all-MiniLM-L6-v2"</span>)
</code></pre>
<p>This resource provides a <a target="_blank" href="https://www.youtube.com/watch?v=OlhNZg4gOvA">comprehensive guide on implementing sentence embeddings</a>, similarity measures, semantic search, and clustering using Sentence Transformers.</p>
<h3 id="heading-create-a-collection"><strong>Create a Collection:</strong></h3>
<pre><code class="lang-python">qdrant.create_collection(
    collection_name=<span class="hljs-string">"my_movies"</span>,
    vectors_config=models.VectorParams(
        size=encoder.get_sentence_embedding_dimension(),  <span class="hljs-comment"># Vector size is defined by used model</span>
        distance=models.Distance.COSINE,
    ),
)
</code></pre>
<h3 id="heading-format-data">Format Data</h3>
<pre><code class="lang-python">documents = []
metadatas = []
ids = []

<span class="hljs-keyword">import</span> uuid 

<span class="hljs-keyword">for</span> file_index, data <span class="hljs-keyword">in</span> enumerate(file_data):
    context = data[<span class="hljs-string">"content"</span>]
    documents.extend(movie[<span class="hljs-string">"description"</span>] <span class="hljs-keyword">for</span> movie <span class="hljs-keyword">in</span> context)
    metadatas.extend(
        {**{key: value <span class="hljs-keyword">for</span> key, value <span class="hljs-keyword">in</span> movie.items() }, <span class="hljs-string">"source"</span>: data[<span class="hljs-string">"file_name"</span>]}
        <span class="hljs-keyword">for</span> movie <span class="hljs-keyword">in</span> context
    )
    ids.extend(str(uuid.uuid4()) <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> context)  <span class="hljs-comment"># Generate a unique UUID for each movie</span>
</code></pre>
<pre><code class="lang-python">documents = [<span class="hljs-string">'Two ex-convicts are hired by a retired policeman to capture a ruthless dacoit terrorizing a village.'</span>,
...]

metadatas = [{<span class="hljs-string">'name'</span>: <span class="hljs-string">'Sholay'</span>,
 <span class="hljs-string">'description'</span>: <span class="hljs-string">'Two ex-convicts are hired by a retired policeman to capture a ruthless dacoit terrorizing a village.'</span>,
 <span class="hljs-string">'director'</span>: <span class="hljs-string">'Ramesh Sippy'</span>,
 <span class="hljs-string">'year'</span>: <span class="hljs-number">1975</span>,
 <span class="hljs-string">'source'</span>: <span class="hljs-string">'movies.json'</span>},
...]

ids = [<span class="hljs-string">'7aa5f6f5-5de5-4776-8090-9a7c38a4cfcf'</span>,
...]
</code></pre>
<h3 id="heading-insert-data"><strong>Insert Data:</strong></h3>
<p>In Qdrant, you can insert data using two methods:</p>
<ul>
<li><p><strong>Record-Oriented Approach:</strong> Utilize the <code>upload_points</code> method with a list of points, each containing an <code>id</code>, <code>vector</code>embedding, and <code>payload</code>.</p>
</li>
<li><p><strong>Column-Oriented Approach:</strong> Employ the <code>upload_collection</code> method providing separate lists for <code>ids</code>, <code>vectors</code>, and <code>payload</code>.</p>
</li>
</ul>
<p>Both methods facilitate efficient data insertion, allowing you to choose the format that best suits your workflow.</p>
<pre><code class="lang-python">qdrant.upload_collection(
    collection_name=<span class="hljs-string">"my_movies"</span>,
    ids=ids,
    vectors=encoder.encode(documents),
    payload=metadatas
)
</code></pre>
<h3 id="heading-querying-1"><strong>Querying:</strong></h3>
<ul>
<li>In the provided query, a filter is applied on the payload data to retrieve points from the <code>my_movies</code> collection where the <code>year</code> field is <strong>greater than or equal to 2005</strong>. This refines the similarity search to include only relevant results.</li>
</ul>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> qdrant_client.models <span class="hljs-keyword">import</span> Filter, FieldCondition, MatchValue

hits = qdrant.query_points(
    collection_name=<span class="hljs-string">"my_movies"</span>,
    query=encoder.encode(<span class="hljs-string">"engineering student life"</span>).tolist(),
    limit=<span class="hljs-number">3</span>,
    with_payload=<span class="hljs-literal">True</span>,
    query_filter=Filter(
        should=[FieldCondition(
            key=<span class="hljs-string">"year"</span>,
            range=models.Range(gte=<span class="hljs-number">2005</span>),
        )]
    ),
).points

<span class="hljs-keyword">for</span> hit <span class="hljs-keyword">in</span> hits:
    print(hit.payload, <span class="hljs-string">"score:"</span>, hit.score)
</code></pre>
<p>Output</p>
<pre><code class="lang-python">{<span class="hljs-string">'name'</span>: <span class="hljs-string">'3 Idiots'</span>, <span class="hljs-string">'description'</span>: <span class="hljs-string">'Three engineering students navigate the pressures of academia while challenging societal norms.'</span>, <span class="hljs-string">'director'</span>: <span class="hljs-string">'Rajkumar Hirani'</span>, <span class="hljs-string">'year'</span>: <span class="hljs-number">2009</span>} score: <span class="hljs-number">0.6074454</span>
{<span class="hljs-string">'name'</span>: <span class="hljs-string">'Dangal'</span>, <span class="hljs-string">'description'</span>: <span class="hljs-string">'A former wrestler trains his daughters to become world-class wrestlers against societal odds.'</span>, <span class="hljs-string">'director'</span>: <span class="hljs-string">'Nitesh Tiwari'</span>, <span class="hljs-string">'year'</span>: <span class="hljs-number">2016</span>} score: <span class="hljs-number">0.23663726</span>
{<span class="hljs-string">'name'</span>: <span class="hljs-string">'Taare Zameen Par'</span>, <span class="hljs-string">'description'</span>: <span class="hljs-string">'A dyslexic boy discovers his artistic talent with the help of a compassionate teacher.'</span>, <span class="hljs-string">'director'</span>: <span class="hljs-string">'Aamir Khan'</span>, <span class="hljs-string">'year'</span>: <span class="hljs-number">2007</span>} score: <span class="hljs-number">0.1795995</span>
</code></pre>
<blockquote>
<p>Our AI Demos Playground provides tools like the <a target="_blank" href="https://aidemos.com/ai-playground/semantic-similarity-checker"><strong>Semantic Similarity Checker</strong></a>, which allows you to upload text or files and receive similarity scores. It utilizes models from OpenAI, Hugging Face, Google, and Mistral to provide accurate assessments of semantic similarity.</p>
</blockquote>
<h2 id="heading-integration-with-openai">Integration with OpenAI</h2>
<p>To utilize OpenAI embeddings, encode your text using the OpenAI API and provide the resulting vector when uploading or querying points in your database. When creating a collection in Qdrant, set the vector size to 1536 to match the dimensionality of the embeddings generated by models like <code>text-embedding-ada-002</code> and <code>text-embedding-3-small</code>.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> openai

embedding_model = <span class="hljs-string">"text-embedding-3-small"</span>

openai_client = openai.Client(
    api_key=<span class="hljs-string">"&lt;YOUR_API_KEY&gt;"</span>
)

result = openai_client.embeddings.create(input=<span class="hljs-string">"hey texts how are you?"</span>, model=embedding_model)

<span class="hljs-comment"># Extract the embedding vector</span>
embedding_vector = result.data[<span class="hljs-number">0</span>].embedding

<span class="hljs-comment"># Determine the dimensionality</span>
embedding_dimension = len(embedding_vector)
print(<span class="hljs-string">f"The embedding dimensionality is: <span class="hljs-subst">{embedding_dimension}</span>"</span>)
<span class="hljs-comment"># The embedding dimensionality is: 1536</span>
</code></pre>
<h2 id="heading-accessing-the-qdrant-web-ui"><strong>Accessing the Qdrant Web UI:</strong></h2>
<p>You can manage Qdrant deployments through the WebUI, accessible at <a target="_blank" href="http://localhost:6333/dashboard">Dashboard</a>.</p>
<p>The dashboard provides two primary sections: <strong>Console</strong> and <strong>Collection</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736753963071/f56e5738-ce8c-4fda-b1b9-8fa6f8231f46.png" alt class="image--center mx-auto" /></p>
<ol>
<li><p><strong>Console</strong>: Use the REST API to interact with Qdrant for tasks like querying or managing data.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736753987864/42f942d5-f40f-455c-a1e7-2d25fbf830ac.png" alt class="image--center mx-auto" /></p>
</li>
<li><p><strong>Collections</strong>: This section lets you organize, manage, and upload collections, as well as handle <strong>snapshots</strong> for backups or migrations.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736754027361/47976c24-7176-4dcc-a522-91033abec534.png" alt class="image--center mx-auto" /></p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736754052206/4cc226b5-8522-4c46-8baa-0151b46b1ac8.png" alt class="image--center mx-auto" /></p>
</li>
</ol>
<p>The WebUI also features a <strong>Graph Tool</strong> to visualize relationships within datasets. Found under the <strong>Graph Tab</strong> in Collections, it represents data points as an interactive tree graph. Clusters of similar points are grouped, helping users uncover hidden patterns and explore connections. The tool is flexible, allowing zooming and manipulation for enhanced clarity and deeper insights into the data structure.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736754122049/6e56202b-aa19-441f-9ea2-fbbc5fd087a8.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-key-improvements-for-using-qdrant">Key Improvements for Using Qdrant</h2>
<ol>
<li><p><strong>Caching</strong>: Cache frequent query results using <strong>Redis</strong> or store precomputed embeddings to reduce latency.</p>
</li>
<li><p><strong>LangChain Integration</strong>: Ensure embedding dimensions match Qdrant vectors and streamline query pipelines for LLM-based systems.</p>
</li>
<li><p><strong>Scalability</strong>: Use <strong>Qdrant Cloud</strong> for scaling and dividing large datasets into smaller collections to enhance performance.</p>
</li>
<li><p><strong>Vector Search Techniques</strong>: Experiment with distance metrics and chunk large documents to improve search relevance and precision.</p>
</li>
</ol>
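<p>As a minimal illustration of the caching idea: even process-local memoization avoids recomputing embeddings for repeated queries, and Redis extends the same pattern across processes. The <code>embed</code> function below is a stand-in for a real (and expensive) embedding model call:</p>

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def embed(text: str) -> tuple:
    # Stand-in for an expensive model call; returns a tuple so the result is hashable
    return tuple(float(ord(c)) for c in text[:8])

embed("for adults")  # computed on first call
embed("for adults")  # served from the cache
print(embed.cache_info().hits)  # 1
```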
<h2 id="heading-conclusion"><strong>Conclusion:</strong></h2>
<p>In this blog, we've explored how to set up and utilize Qdrant, a robust <strong>open-source vector database</strong>, using both local and Docker-based approaches. We've covered data preparation, insertion, and querying, demonstrating how Qdrant facilitates efficient similarity searches and <strong>unstructured data</strong> management. By following these steps, you can leverage Qdrant's capabilities to build sophisticated AI solutions tailored to your specific needs.</p>
<p>As you continue to work with Qdrant, consider exploring its advanced features, such as hybrid search capabilities and integration with large language models, to enhance your applications further. For more detailed information and documentation, visit Qdrant's official <a target="_blank" href="https://qdrant.tech/">website</a>.</p>
<hr />
<p>At <strong>FutureSmart AI</strong>, we specialize in developing advanced AI solutions tailored to your business needs. Our expertise encompasses building state-of-the-art vector databases and integrating LangChain-powered applications, enabling efficient semantic search and unstructured data management.</p>
<p>Have questions or need assistance? Contact us at <a target="_blank" href="mailto:contact@futuresmart.ai"><strong>contact@futuresmart.ai</strong></a> for a consultation! Visit our <a target="_blank" href="https://www.futuresmart.ai/case-studies">website</a> to discover how our AI technologies have delivered measurable business value.</p>
<p>Don’t miss our next tutorial, where we’ll dive deeper into semantic search optimizations, vector search, and more advanced LangChain-Qdrant integrations.</p>
<p>Let’s build the future together! 🌟</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><a target="_self" href="https://github.com/PradipNichite/FutureSmart-AI-Blog/tree/main/Comprehensive%20guide%20to%20Qdrant%20Vector%20DB"><strong>Get the Full Code in our GitHub</strong></a></div>
</div>]]></content:encoded></item><item><title><![CDATA[What AI Solutions Are Businesses Building Today? Trends & Real-World Applications]]></title><description><![CDATA[Introduction
AI adoption has shifted significantly in recent years. Earlier, businesses focused on classical machine learning models for tasks like classification and Named Entity Recognition (NER). Now, almost every project we build at FutureSmart A...]]></description><link>https://blog.futuresmart.ai/what-ai-solutions-are-businesses-building-today-trends-real-world-applications</link><guid isPermaLink="true">https://blog.futuresmart.ai/what-ai-solutions-are-businesses-building-today-trends-real-world-applications</guid><category><![CDATA[custom ai solutions ]]></category><category><![CDATA[generative AI solutions]]></category><category><![CDATA[ai agents]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[NL2SQL]]></category><category><![CDATA[document parsing]]></category><dc:creator><![CDATA[Pradip Nichite]]></dc:creator><pubDate>Sat, 18 Jan 2025 08:58:46 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1737189830347/362850ad-f57f-407c-99fc-5aba7a5110e2.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>AI adoption has shifted significantly in recent years. Earlier, businesses focused on classical machine learning models for tasks like classification and Named Entity Recognition (NER). Now, almost every project we build at <a target="_blank" href="https://www.futuresmart.ai/"><strong>FutureSmart AI</strong></a> involves Generative AI. While traditional models still work, companies prefer <strong>LLMs</strong> because they offer <strong>higher accuracy, require little to no labeled data, and enable faster deployment</strong>.</p>
<p>Through our experience building <a target="_blank" href="https://www.futuresmart.ai/services"><strong>custom AI solutions</strong></a> for clients, we have observed a clear trend: businesses are investing in AI for real-world applications that directly impact their workflows. The most in-demand solutions today include:</p>
<ul>
<li><p><strong>Retrieval-Augmented Generation (RAG) chatbots</strong></p>
</li>
<li><p><strong>Natural Language to SQL (NL2SQL) systems</strong></p>
</li>
<li><p><strong>Document parsing and structured data extraction</strong></p>
</li>
<li><p><strong>AI Agents that orchestrate multiple tools</strong></p>
</li>
</ul>
<p>In this blog, I’ll share insights from our work at FutureSmart AI, covering why these solutions are in demand and when businesses choose custom AI over off-the-shelf tools.</p>
<p>Want to see how AI solutions are making an impact? Check out our <a target="_blank" href="https://www.futuresmart.ai/case-studies"><strong>case studies</strong></a> where we showcase real-world AI implementations and their business outcomes.</p>
<hr />
<h2 id="heading-the-ai-shift-from-traditional-ml-to-generative-ai">The AI Shift: From Traditional ML to Generative AI</h2>
<p>A few years ago, classical machine learning models dominated AI development. Companies built custom models for classification, sentiment analysis, or Named Entity Recognition (NER), requiring <strong>large labeled datasets and extensive tuning</strong>. While these models were effective, they came with challenges—data collection, training costs, and maintaining model performance over time.</p>
<p>Now, businesses are turning to <strong>LLMs (Large Language Models)</strong> like GPT-4, which offer <strong>pre-trained knowledge, adaptability, and higher accuracy with minimal data requirements</strong>. Instead of spending months curating labeled datasets, teams can leverage <strong>few-shot learning or prompt engineering</strong> to achieve comparable, if not superior, results in much less time.</p>
<p>At <strong>FutureSmart AI</strong>, we've worked with companies that initially built traditional ML models but switched to LLMs due to the faster implementation and better generalization. However, using LLMs effectively still requires <strong>customization, retrieval optimization, and integration with enterprise workflows</strong>, which is where <strong>custom AI solutions</strong> add value.</p>
<p>For those looking to go a step further, <strong>fine-tuning LLMs</strong> can unlock even more <strong>domain-specific accuracy and performance</strong>. Check out this step-by-step guide on how to <strong>Fine-Tune GPT-4o Model</strong>:</p>
<p>➡️ <a target="_blank" href="https://youtu.be/jiYqbEDPw7A?si=sVxRCXln1cWAR0M0"><strong>Fine-Tune GPT-4o Model Step by Step</strong></a></p>
<p>This video walks through the <strong>process, best practices, and practical applications</strong> of fine-tuning GPT-4o for real-world business scenarios.</p>
<hr />
<h2 id="heading-the-most-in-demand-ai-solutions-businesses-are-adopting">The Most In-Demand AI Solutions Businesses Are Adopting</h2>
<p>Over the past year, we've observed a growing demand for AI solutions that provide tangible business value. While many AI advancements sound futuristic, companies are investing in <strong>practical AI applications</strong> that directly improve efficiency and decision-making.</p>
<h3 id="heading-1-retrieval-augmented-generation-rag-chatbots">1. <strong>Retrieval-Augmented Generation (RAG) Chatbots</strong></h3>
<p>Many businesses now require chatbots that can accurately retrieve information from internal documentation rather than relying solely on pre-trained models. <strong>RAG-based chatbots</strong> enable:</p>
<ul>
<li><p><strong>Real-time document retrieval</strong> to provide precise answers.</p>
</li>
<li><p><strong>Enhanced accuracy</strong> by leveraging structured and unstructured data.</p>
</li>
<li><p><strong>Custom integrations</strong> with enterprise knowledge bases and APIs.</p>
</li>
</ul>
<p>At <strong>FutureSmart AI</strong>, we've implemented RAG-based chatbots for companies needing scalable knowledge assistants. Unlike generic chatbots, these solutions provide <strong>business-specific insights</strong> and can be fine-tuned for <strong>domain-specific knowledge</strong>.</p>
<p>Want to learn more about RAG? Watch our <a target="_blank" href="https://youtu.be/38aMTXY2usU?si=6IHqX4jCRybvX911"><strong>YouTube video on building RAG solutions using LangChain</strong></a> for a hands-on guide, or read our <a target="_blank" href="https://blog.futuresmart.ai/langchain-rag-from-basics-to-production-ready-rag-chatbot"><strong>in-depth blog on RAG chatbots</strong></a> to explore how they work and why they are transforming business interactions.</p>
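<p>To make the retrieval step concrete, here is a deliberately simplified sketch of the RAG loop: retrieve the most relevant document, then ground the LLM prompt in it. The word-overlap scoring and sample documents below are illustrative stand-ins for a real embedding model and vector store.</p>
<pre><code class="lang-python">import re

def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents):
    # Toy relevance score: number of shared words (real systems use embeddings)
    return max(documents, key=lambda d: len(tokens(query).intersection(tokens(d))))

def build_prompt(query, context):
    # Ground the model's answer in the retrieved context
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 5pm.",
]
context = retrieve("What is the refund policy?", docs)
prompt = build_prompt("What is the refund policy?", context)
</code></pre>
<p>In production, <code>prompt</code> is sent to an LLM, and retrieval runs over embeddings stored in a vector database rather than word overlap.</p>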
<h3 id="heading-2-natural-language-to-sql-nl2sql-systems">2. <strong>Natural Language to SQL (NL2SQL) Systems</strong></h3>
<p>Many enterprises struggle with querying large databases efficiently. <strong>NL2SQL solutions</strong> bridge this gap by enabling users to ask questions in plain English and receive structured SQL queries. This has proven useful in:</p>
<ul>
<li><p><strong>Business intelligence &amp; analytics</strong> where non-technical users need insights.</p>
</li>
<li><p><strong>Customer support &amp; operations</strong> for quick data retrieval.</p>
</li>
<li><p><strong>Workflow automation</strong> by simplifying access to structured databases.</p>
</li>
</ul>
<p>However, generic NL2SQL models often fail in production environments due to <strong>complex database structures and domain-specific nuances</strong>. We’ve developed <strong>custom NL2SQL models</strong> that fine-tune responses based on real-world query patterns and database structures. Additionally, some businesses prefer <strong>Text-to-SQL models</strong>, which function similarly but have different optimization strategies depending on the use case.</p>
<p>Interested in how NL2SQL can enhance data accessibility? Watch our <a target="_blank" href="https://youtu.be/fss6CrmQU2Y?si=qQDTQBMp9ZW1HbVi"><strong>YouTube video: Mastering Natural Language to SQL with LangChain</strong></a> for a step-by-step guide, or check out our <a target="_blank" href="https://blog.futuresmart.ai/mastering-natural-language-to-sql-with-langchain-nl2sql"><strong>blog on NL2SQL solutions</strong></a> to see practical implementations and best practices.</p>
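<p>The core of most NL2SQL pipelines is a schema-aware prompt. Below is a minimal sketch of that step; the table and column names are illustrative placeholders, and the actual LLM call and SQL validation are left out:</p>
<pre><code class="lang-python"># Toy NL2SQL prompt builder: inject the live database schema into the prompt
# so the model writes SQL against real tables instead of guessing.
# The schema and question below are illustrative placeholders.

def build_nl2sql_prompt(schema, question):
    schema_text = "\n".join(
        f"TABLE {table} ({', '.join(columns)})" for table, columns in schema.items()
    )
    return (
        "Given this database schema:\n"
        f"{schema_text}\n\n"
        f"Write a single SQL query that answers: {question}\n"
        "Return only the SQL."
    )

schema = {
    "orders": ["id", "customer_id", "total", "created_at"],
    "customers": ["id", "name", "country"],
}
prompt = build_nl2sql_prompt(schema, "What was total revenue per country last month?")
# `prompt` would be sent to an LLM; the returned SQL should be validated
# (e.g. with EXPLAIN) before being executed against the database.
</code></pre>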
<hr />
<h3 id="heading-3-document-parsing-and-structured-data-extraction">3. <strong>Document Parsing and Structured Data Extraction</strong></h3>
<p>Many businesses deal with large volumes of unstructured documents—contracts, invoices, resumes, and more. Extracting structured information from these documents manually is time-consuming and error-prone. <strong>AI-powered document parsing</strong> helps automate this process by:</p>
<ul>
<li><p><strong>Extracting key details</strong> like names, dates, and financial figures.</p>
</li>
<li><p><strong>Improving accuracy</strong> over rule-based approaches using deep learning.</p>
</li>
<li><p><strong>Integrating seamlessly</strong> with business workflows and databases.</p>
</li>
</ul>
<p>At <strong>FutureSmart AI</strong>, we've built <strong>custom document parsing solutions</strong> tailored to various industries. Our models handle <strong>OCR-based text extraction, entity recognition, and format normalization</strong>, ensuring businesses get <strong>clean, structured data</strong> from PDFs and scanned images.</p>
<p>Many businesses also combine Document Parsing + NL2SQL, enabling natural language queries over extracted data (e.g., “Show me resumes of candidates with Python experience”).</p>
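<p>To make this concrete, here is a minimal rule-based extraction sketch. Real pipelines replace these regexes with OCR and learned entity recognition, and the field patterns and invoice text below are made up for illustration:</p>
<pre><code class="lang-python">import re

# Minimal rule-based extraction from invoice-like text. Production parsers add
# OCR for scanned pages and ML-based entity recognition for robustness.

def parse_invoice(text):
    invoice_no = re.search(r"Invoice\s*#\s*(\w+)", text)
    date = re.search(r"\b(\d{4}-\d{2}-\d{2})\b", text)
    total = re.search(r"Total:\s*\$?([\d,]+\.\d{2})", text)
    return {
        "invoice_no": invoice_no.group(1) if invoice_no else None,
        "date": date.group(1) if date else None,
        "total": float(total.group(1).replace(",", "")) if total else None,
    }

sample = "Invoice # INV42 issued 2024-03-01. Total: $1,250.00"
parsed = parse_invoice(sample)
</code></pre>
<p>The resulting structured dictionary can then be written to a database or fed into downstream workflows.</p>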
<h2 id="heading-the-role-of-ai-agents-in-business-automation">The Role of AI Agents in Business Automation</h2>
<p>AI is no longer just about isolated solutions—it’s about <strong>intelligent orchestration</strong>. AI Agents are transforming how businesses automate workflows, making <strong>real-time decisions</strong> on when and how to use different AI tools. If you're interested in exploring <strong>multi-agent systems</strong>, check out our video on <a target="_blank" href="https://youtu.be/g9E7VNeZItM?si=tbJupO4qiWPP7g5D"><strong>Building Multi-Agent Systems with OpenAI Swarm: Practical Example</strong></a>.</p>
<h3 id="heading-what-are-ai-agents"><strong>What Are AI Agents?</strong></h3>
<p>AI Agents are autonomous systems that can:</p>
<ul>
<li><p><strong>Decide dynamically</strong> which AI tool to use based on user input.</p>
</li>
<li><p><strong>Combine multiple AI solutions</strong> like RAG, NL2SQL, and document parsing.</p>
</li>
<li><p><strong>Integrate seamlessly</strong> with enterprise software like HubSpot, Odoo, and Gmail.</p>
</li>
</ul>
<p>For example, an AI Agent can:</p>
<ul>
<li><p><strong>Retrieve documents</strong> using RAG before responding to a query.</p>
</li>
<li><p><strong>Generate SQL queries</strong> using NL2SQL for structured database insights.</p>
</li>
<li><p><strong>Extract key details</strong> from documents before feeding them into a report.</p>
</li>
</ul>
<p>At <strong>FutureSmart AI</strong>, we specialize in designing <strong>custom AI Agents</strong> that fit specific business needs, ensuring they provide <strong>actionable, high-quality insights</strong> without human intervention.</p>
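<p>The tool-selection behaviour described above can be sketched as a toy router. In a real agent the LLM itself selects the tool (for example via function calling); the keyword rules and tool names below are illustrative placeholders:</p>
<pre><code class="lang-python"># Toy AI-Agent router: decide which tool should handle a request.
# In production the LLM picks the tool (e.g. via function calling);
# keyword rules here just make the dispatch logic visible.

def rag_tool(query):
    return f"[RAG] retrieving documents for: {query}"

def sql_tool(query):
    return f"[NL2SQL] generating SQL for: {query}"

def parse_tool(query):
    return f"[Parser] extracting fields for: {query}"

TOOL_RULES = [
    (("invoice", "resume", "extract"), parse_tool),
    (("how many", "revenue", "average"), sql_tool),
    (("policy", "manual", "documentation"), rag_tool),
]

def route(query):
    q = query.lower()
    for keywords, tool in TOOL_RULES:
        if any(k in q for k in keywords):
            return tool(query)
    return rag_tool(query)  # fall back to document retrieval

answer = route("How many orders did we ship last week?")
</code></pre>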
<p>Want to see AI Agents in action? Watch our <a target="_blank" href="https://youtu.be/qaEVUhoKS8M?si=DFXpZpLFLVHQLD1Z"><strong>YouTube video on AI Agents</strong></a> or read our <a target="_blank" href="https://blog.futuresmart.ai/series/langraph-tutorial"><strong>blog series on AI Agents</strong></a> for a deeper understanding of how they work and their business applications.</p>
<hr />
<h2 id="heading-choosing-the-right-ai-solution-for-your-business">Choosing the Right AI Solution for Your Business</h2>
<p>With so many AI technologies available, choosing the right solution for your business can be challenging. Companies must consider several factors before deciding between <strong>off-the-shelf AI solutions</strong> and <strong>custom AI development</strong>.</p>
<h3 id="heading-when-to-choose-off-the-shelf-ai-solutions"><strong>When to Choose Off-the-Shelf AI Solutions</strong></h3>
<p>Pre-built AI solutions can be useful when:</p>
<ul>
<li><p><strong>Your requirements are general</strong> (e.g., basic chatbots, automated transcription).</p>
</li>
<li><p><strong>Speed and cost are priorities</strong>—off-the-shelf solutions are ready to use.</p>
</li>
<li><p><strong>You need quick experimentation</strong> before committing to custom development.</p>
</li>
</ul>
<p>However, these solutions often <strong>lack flexibility</strong> and may not integrate well with existing enterprise workflows.</p>
<h3 id="heading-when-to-opt-for-custom-ai-development"><strong>When to Opt for Custom AI Development</strong></h3>
<p>Businesses choose custom AI solutions when:</p>
<ul>
<li><p><strong>They need domain-specific accuracy</strong> (e.g., legal, financial, or medical AI solutions).</p>
</li>
<li><p><strong>They require seamless integration</strong> with existing software and databases.</p>
</li>
<li><p><strong>Scalability and long-term ownership</strong> are critical factors.</p>
</li>
</ul>
<p>At <strong>FutureSmart AI</strong>, we help businesses evaluate whether an <strong>off-the-shelf solution meets their needs</strong> or if they <strong>require a tailored AI system</strong> that enhances efficiency and drives growth.</p>
<p>Want to discuss which AI solution fits your business? Contact us today at <a target="_blank" href="mailto:contact@futuresmart.ai"><strong>contact@futuresmart.ai</strong></a></p>
<hr />
<h2 id="heading-final-thoughts">Final Thoughts</h2>
<p>AI is rapidly transforming how businesses operate, but success depends on choosing the right solution for your needs. Whether it’s <strong>RAG-powered chatbots, NL2SQL systems, AI Agents, or document parsing</strong>, selecting the right approach can significantly impact efficiency and scalability.</p>
<p>At <strong>FutureSmart AI</strong>, we specialize in building <strong>custom AI solutions</strong> that align with business goals, ensuring AI is not just a tool but a competitive advantage.</p>
<p>If you're considering AI adoption or want to optimize your existing AI strategy, let's talk. <a target="_blank" href="https://www.linkedin.com/in/pradipnichite/"><strong>Get in touch with us today</strong></a> and explore how AI can revolutionize your business.</p>
]]></content:encoded></item><item><title><![CDATA[Integrating Google Authentication with FastAPI: A Step-by-Step Guide]]></title><description><![CDATA[In this blog, we’ll walk through the process of enabling Google Authentication for a backend API system built using FastAPI. FastAPI has emerged as a popular choice for developing advanced GenAI applications and we at FutureSmart AI use it regularly ...]]></description><link>https://blog.futuresmart.ai/integrating-google-authentication-with-fastapi-a-step-by-step-guide</link><guid isPermaLink="true">https://blog.futuresmart.ai/integrating-google-authentication-with-fastapi-a-step-by-step-guide</guid><category><![CDATA[oauth]]></category><category><![CDATA[FastAPI]]></category><category><![CDATA[Cookie based authentication]]></category><category><![CDATA[Google]]></category><category><![CDATA[google cloud]]></category><dc:creator><![CDATA[Sakalya Mitra]]></dc:creator><pubDate>Thu, 16 Jan 2025 15:40:46 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1735466696855/83ca88bb-574f-443b-9e72-d232f33d930c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this blog, we’ll walk through the process of enabling Google Authentication for a backend API system built using FastAPI. FastAPI has emerged as a popular choice for developing advanced GenAI applications and we at <a target="_blank" href="https://www.futuresmart.ai/">FutureSmart AI</a> use it regularly to develop client applications. On top of FastAPI, Authentication is a critical component of any API-driven system as it ensures that only authorized users can access protected resources. By implementing authentication, you can safeguard sensitive data, prevent unauthorized access, and maintain the integrity of your services.</p>
<p>This implementation covers the authentication endpoints, the Google OAuth2 login flow itself, and securing API access with the resulting tokens. Such a setup not only enhances security but also simplifies the user experience by leveraging Google's robust authentication system. Additionally, we’ll add an endpoint that uses this authentication, demonstrating how authenticated APIs can provide tailored and secure responses.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>Before we begin, ensure you have the following ready:</p>
<ul>
<li><p><strong>Google Cloud Console Account:</strong> To set up the OAuth client credentials.</p>
</li>
<li><p><strong>Python Environment:</strong> Installed and configured with the required libraries</p>
</li>
</ul>
<hr />
<h2 id="heading-step-1-setting-up-google-oauth2-client-on-google-cloud-console">Step 1: Setting Up Google OAuth2 Client on Google Cloud Console</h2>
<p>To enable Google authentication, the first step is setting up an OAuth2 client in the Google Cloud Console:</p>
<ol>
<li><p>Navigate to the <a target="_blank" href="https://console.cloud.google.com/">Google Cloud Console</a>.</p>
</li>
<li><p>Create a new project or select an existing one.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735413050051/914438bc-7a8b-461b-baa3-2759a5dd9745.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Once the project is opened/created, pull up the side bar and visit the <strong>APIs and Services</strong> section.</p>
</li>
<li><p>In that section select the “<strong>Credentials</strong>” options. The Credentials page will open.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735413225365/a5707458-1a47-4512-b3c3-618d6ec3eca2.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>If this is a new project or your first project, you will be asked to configure the OAuth Consent Screen with your application details. This is a simple form asking what type of users will use your product (either External or Internal). Select whichever fits your requirement.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735413470900/f4a67414-9d33-4788-9599-16ab60baffeb.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Once the previous step is completed, you will be prompted to configure the consent screen of your application. This is what the consent screen looks like for the <a target="_blank" href="https://aidemos.com/ai-playground">AI Demos Playground</a> and <a target="_blank" href="https://aidemos.com/ai-tools">AI Demos Tools</a>.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735413700961/c439060a-7834-4ab6-bdcc-9757579861a9.png" alt class="image--center mx-auto" /></p>
<p> You can configure the consent screen however you want: the app name it displays, the logo it shows, and more. Simply add your details and continue. For this walkthrough we will name it “<em>sample-auth-app</em>” and set myself as the developer, keeping everything else at its defaults.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735413789751/df2e10fa-07f0-4809-9eb4-6e029864f5f9.png" alt class="image--center mx-auto" /></p>
<p> This is how your OAuth Consent Screen page will look after the setup:</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735413949052/b4430a33-e4af-4c02-b689-b5bda6eec251.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Now that you have the Consent screen configured, head over to the Credentials page and click on <strong>“Create Credentials”.</strong> Choose the <strong>“OAuth Client ID”</strong> option</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735414163625/591ee278-4932-4110-b83d-d04a38761f11.png" alt class="image--center mx-auto" /></p>
<p> This ensures that when someone tries to authenticate to your app, they are asked for consent and shown the consent screen with all the details needed to log in.</p>
</li>
<li><p>Once you are on the OAuth Client ID page, you will be prompted with two compulsory fields. One is the Application Type and other is the Name.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735414418697/ac2ee655-ff84-4e77-a454-336fbfeda18d.png" alt class="image--center mx-auto" /></p>
<p> You can select the Application Type as per your app requirement. For this blog, we will select Web Application as we will be using it to authenticate our APIs via FastAPI. Also, I set the Name the same as our app: <strong>sample-auth-app.</strong></p>
<p> The other two fields, Authorised JavaScript Origins and Authorised Redirect URIs, are also important, but we will configure them later in the blog once their purpose is clear. For now, keep them empty and click <strong>“Create”.</strong></p>
</li>
<li><p>Your credentials are now created. You can download them as a client secret JSON file from the Credentials page.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735414653718/82ef7a69-f022-4792-be04-eac1eda16b3d.png" alt class="image--center mx-auto" /></p>
<p> We have successfully set up our OAuth Client and completed the first step.</p>
</li>
</ol>
<hr />
<h2 id="heading-step-2-install-necessary-python-packages">Step 2: Install Necessary Python Packages</h2>
<p>We will install the packages required to create the FastAPI endpoints and secure them with Google auth. Add the following to <code>requirements.txt</code>, then run <code>pip install -r requirements.txt</code>:</p>
<pre><code class="lang-plaintext">python-dotenv
mysql-connector-python
requests
fastapi
uvicorn
python-jose[cryptography]
authlib
pyjwt
itsdangerous
google-auth
</code></pre>
<hr />
<h2 id="heading-step-3-set-up-the-project-structure">Step 3: Set up the Project Structure</h2>
<p>We will follow a proper project structure for building our authenticated API system. Below is the sample project structure we will follow.</p>
<p>The main chat endpoint will be present in the <code>chat.py</code> file. The authentication setup will be present in the <code>auth.py</code> file. These will be routers and accessed via the main FastAPI app in the <code>api.py</code> file.</p>
<p>If you want to know in detail about how FastAPI works, you can check this detailed blog on <a target="_blank" href="https://blog.futuresmart.ai/beginners-guide-to-fastapi-openai-chatgpt-integration">Beginner’s Guide to FastAPI and OpenAI Integration</a>.</p>
<pre><code class="lang-plaintext">fastapi_auth_demo/
├── apis/
│   ├── chat.py
│   ├── auth.py
├── utils/
├── venv/
├── .env
├── .gitignore
├── api.py
├── client_secret.json
├── README.md
└── requirements.txt
</code></pre>
<hr />
<h2 id="heading-step-4-configure-environment-variables">Step 4: Configure Environment Variables</h2>
<p>We will store sensitive credentials securely using environment variables. Create a <code>.env</code> file in the project directory and add:</p>
<pre><code class="lang-plaintext">GOOGLE_CLIENT_ID=&lt;your-google-client-id&gt;
GOOGLE_CLIENT_SECRET=&lt;your-google-client-secret&gt;
SECRET_KEY=&lt;your-secret-key&gt;
REDIRECT_URL=&lt;your-redirect-url&gt;
JWT_SECRET_KEY=&lt;your-secret-key&gt;
FRONTEND_URL=&lt;your-frontend-url&gt;
</code></pre>
<ul>
<li><p>You can directly get the <code>GOOGLE_CLIENT_ID</code> and <code>GOOGLE_CLIENT_SECRET</code> from the JSON file you downloaded in the first step.</p>
</li>
<li><p>Keep the <code>REDIRECT_URL</code> and <code>FRONTEND_URL</code> empty for now and we will fill it with a proper value in the upcoming steps.</p>
</li>
<li><p>The <code>SECRET_KEY</code> is a random string set by us that is used to set the <code>state</code> parameter in the OAuth flow.</p>
</li>
<li><p>The <code>JWT_SECRET_KEY</code> is a random string set by us that is used as the key to encode and decode the access tokens generated after authentication.</p>
</li>
</ul>
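<p>A practical note on the two secret keys: “random string” should mean cryptographically random. One common way to generate such values (a suggestion, not a requirement of this tutorial) is Python's standard <code>secrets</code> module:</p>
<pre><code class="lang-python"># Generate strong random values for SECRET_KEY / JWT_SECRET_KEY.
# The stdlib `secrets` module is designed for security-sensitive randomness,
# unlike the predictable `random` module.
import secrets

print(secrets.token_hex(32))      # 64 hex characters
print(secrets.token_urlsafe(32))  # URL-safe variant
</code></pre>
<p>Paste the generated values into your <code>.env</code> file and never commit them to version control.</p>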
<hr />
<h2 id="heading-step-5-set-up-google-oauth-integration-in-fastapi">Step 5: Set Up Google OAuth Integration in FastAPI</h2>
<p>The core functionality for Google authentication lies in integrating the OAuth2 flow into your FastAPI application. Below is the implementation:</p>
<h3 id="heading-import-dependencies">Import Dependencies</h3>
<p>Start by importing necessary modules and configuring OAuth:</p>
<pre><code class="lang-python"><span class="hljs-comment"># auth.py</span>
<span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI, APIRouter, Depends, HTTPException, status, Request, Cookie
<span class="hljs-keyword">from</span> fastapi.responses <span class="hljs-keyword">import</span> JSONResponse, RedirectResponse
<span class="hljs-keyword">from</span> authlib.integrations.starlette_client <span class="hljs-keyword">import</span> OAuth
<span class="hljs-keyword">from</span> starlette.config <span class="hljs-keyword">import</span> Config
<span class="hljs-keyword">from</span> starlette.middleware.sessions <span class="hljs-keyword">import</span> SessionMiddleware
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime, timedelta
<span class="hljs-keyword">from</span> jose <span class="hljs-keyword">import</span> jwt, ExpiredSignatureError, JWTError
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> uuid
<span class="hljs-keyword">import</span> traceback

<span class="hljs-comment"># Load environment variables</span>
load_dotenv(override=<span class="hljs-literal">True</span>)
config = Config()  <span class="hljs-comment"># reads the values load_dotenv placed in os.environ</span>

<span class="hljs-comment"># App Configuration</span>
app = FastAPI()
app.add_middleware(SessionMiddleware, secret_key=os.getenv(<span class="hljs-string">"SECRET_KEY"</span>))
router = APIRouter()

<span class="hljs-comment"># OAuth Setup</span>
oauth = OAuth()
oauth.register(
    name=<span class="hljs-string">"auth_demo"</span>,
    client_id=config(<span class="hljs-string">"GOOGLE_CLIENT_ID"</span>),
    client_secret=config(<span class="hljs-string">"GOOGLE_CLIENT_SECRET"</span>),
    authorize_url=<span class="hljs-string">"https://accounts.google.com/o/oauth2/auth"</span>,
    authorize_params=<span class="hljs-literal">None</span>,
    access_token_url=<span class="hljs-string">"https://accounts.google.com/o/oauth2/token"</span>,
    access_token_params=<span class="hljs-literal">None</span>,
    refresh_token_url=<span class="hljs-literal">None</span>,
    authorize_state=config(<span class="hljs-string">"SECRET_KEY"</span>),
    redirect_uri=<span class="hljs-string">"http://127.0.0.1:8000/auth"</span>,
    jwks_uri=<span class="hljs-string">"https://www.googleapis.com/oauth2/v3/certs"</span>,
    client_kwargs={<span class="hljs-string">"scope"</span>: <span class="hljs-string">"openid profile email"</span>},
)

<span class="hljs-comment"># JWT Configurations</span>
SECRET_KEY = os.getenv(<span class="hljs-string">"JWT_SECRET_KEY"</span>)
ALGORITHM = <span class="hljs-string">"HS256"</span>
</code></pre>
<p>This is what happens in this part of the code:</p>
<p>The <code>oauth = OAuth()</code> instance initializes the OAuth client. The <code>oauth.register()</code> method registers a new provider, in this case, Google. Here’s what each parameter does:</p>
<ul>
<li><p><code>name="auth_demo"</code></p>
<ul>
<li>This is a unique name assigned to this OAuth2 provider configuration. It can be referenced in the application when invoking authentication flows.</li>
</ul>
</li>
<li><p><code>client_id=config("GOOGLE_CLIENT_ID")</code></p>
<ul>
<li>The <code>client_id</code> is fetched from configuration (usually environment variables). It identifies your application to Google’s OAuth2 system.</li>
</ul>
</li>
<li><p><code>client_secret=config("GOOGLE_CLIENT_SECRET")</code></p>
<ul>
<li>The <code>client_secret</code>, also fetched from configuration, is a secret key provided by Google that verifies your application during the OAuth2 flow. It should remain confidential.</li>
</ul>
</li>
<li><p><code>authorize_url="</code><a target="_blank" href="https://accounts.google.com/o/oauth2/auth"><code>https://accounts.google.com/o/oauth2/auth</code></a><code>"</code></p>
<ul>
<li>This URL is the endpoint where users are redirected to log in with Google. It starts the OAuth2 authorization flow.</li>
</ul>
</li>
<li><p><code>authorize_params=None</code></p>
<ul>
<li>Any additional parameters for the authorization URL can be specified here. Since it’s <code>None</code>, no extra parameters are added.</li>
</ul>
</li>
<li><p><code>access_token_url="</code><a target="_blank" href="https://accounts.google.com/o/oauth2/token"><code>https://accounts.google.com/o/oauth2/token</code></a><code>"</code></p>
<ul>
<li>After the user logs in and grants permission, your application exchanges the authorization code for an access token at this endpoint.</li>
</ul>
</li>
<li><p><code>access_token_params=None</code></p>
<ul>
<li>Specifies any additional parameters for obtaining the access token. Since it’s <code>None</code>, no extra parameters are included.</li>
</ul>
</li>
<li><p><code>refresh_token_url=None</code></p>
<ul>
<li>If your application needs to refresh tokens, you would specify the refresh token endpoint. Here it’s <code>None</code>, indicating no token refresh functionality is configured.</li>
</ul>
</li>
<li><p><code>authorize_state=config("SECRET_KEY")</code></p>
<ul>
<li>This optional parameter ensures the state parameter (used to prevent CSRF attacks) is securely managed. The <code>SECRET_KEY</code> provides added security.</li>
</ul>
</li>
<li><p><code>redirect_uri="</code><a target="_blank" href="http://127.0.0.1:8000/auth"><code>http://127.0.0.1:8000/auth</code></a><code>"</code></p>
<ul>
<li>The <code>redirect_uri</code> is the URL to which Google redirects the user after a successful login. This must match the redirect URI configured in the Google Cloud Console. We will understand this in detail in the later part of the blog</li>
</ul>
</li>
<li><p><code>jwks_uri="</code><a target="_blank" href="https://www.googleapis.com/oauth2/v3/certs"><code>https://www.googleapis.com/oauth2/v3/certs</code></a><code>"</code></p>
<ul>
<li>The <strong>JWKS URI</strong> (JSON Web Key Set URI) provides the keys needed to validate the tokens issued by Google.</li>
</ul>
</li>
<li><p><code>client_kwargs={"scope": "openid profile email"}</code></p>
<ul>
<li><p>Defines the scopes (permissions) your application requests:</p>
<ul>
<li><p><code>openid</code>: Grants access to the user's identity.</p>
</li>
<li><p><code>profile</code>: Provides access to the user’s basic profile details, such as name and profile picture.</p>
</li>
<li><p><code>email</code>: Grants access to the user’s email address.</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>The second part of the code configures JWT, a token standard for securely transmitting information between parties.</p>
<ol>
<li><p><code>SECRET_KEY = os.getenv("JWT_SECRET_KEY")</code>:</p>
<ul>
<li><p>Retrieves the secret key from the environment variables. This key is used to encode and decode JWT tokens.</p>
</li>
<li><p>It’s essential to keep this key secure because anyone with access to it can forge valid tokens.</p>
</li>
</ul>
</li>
<li><p><code>ALGORITHM = "HS256"</code>:</p>
<ul>
<li>Specifies the algorithm used for encoding and decoding JWT tokens. Here, <strong>HS256</strong> (HMAC using SHA-256) is chosen, which is widely used for its balance of security and performance.</li>
</ul>
</li>
</ol>
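<p>To demystify what <code>jwt.encode</code> with HS256 actually produces, here is a stdlib-only sketch of the signing and verification steps. This is for illustration only; keep using <code>python-jose</code> in the application, since this sketch skips claim validation such as expiry:</p>
<pre><code class="lang-python"># What HS256 JWT signing boils down to (stdlib only, illustrative):
# b64url(header) + "." + b64url(payload) + "." + b64url(HMAC-SHA256 signature)
import base64, hashlib, hmac, json

def b64url(data):
    # JWTs use unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_hs256(payload, secret):
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signature = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(signature)}"

def verify_hs256(token, secret):
    header, body, signature = token.split(".")
    expected = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), signature)

token = sign_hs256({"sub": "user-1", "email": "user@example.com"}, "my-jwt-secret")
assert verify_hs256(token, "my-jwt-secret")
assert not verify_hs256(token, "wrong-secret")
</code></pre>
<p>This also shows why <code>JWT_SECRET_KEY</code> must stay secret: anyone holding it can produce signatures that verify.</p>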
<h3 id="heading-utility-functions-for-jwt">Utility Functions for JWT</h3>
<p>We’ll use JWT (JSON Web Tokens) for managing user sessions securely:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_access_token</span>(<span class="hljs-params">data: dict, expires_delta: timedelta = None</span>):</span>
    to_encode = data.copy()
    expire = datetime.utcnow() + (expires_delta <span class="hljs-keyword">or</span> timedelta(minutes=<span class="hljs-number">30</span>))
    to_encode.update({<span class="hljs-string">"exp"</span>: expire})
    <span class="hljs-keyword">return</span> jwt.encode(to_encode, SECRET_KEY, algorithm=ALGORITHM)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_current_user</span>(<span class="hljs-params">token: str = Cookie(<span class="hljs-params">None</span>)</span>):</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> token:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Not authenticated"</span>)

    <span class="hljs-keyword">try</span>:
        payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"user_id"</span>: payload.get(<span class="hljs-string">"sub"</span>), <span class="hljs-string">"email"</span>: payload.get(<span class="hljs-string">"email"</span>)}
    <span class="hljs-keyword">except</span> ExpiredSignatureError:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Token expired"</span>)
    <span class="hljs-keyword">except</span> JWTError:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Invalid token"</span>)
</code></pre>
<p>These two basic functions are crucial:</p>
<ol>
<li><p><code>create_access_token</code>: This function takes the user data and encodes it into a JWT using the <code>JWT_SECRET_KEY</code> we set in the .env file and the configured algorithm. It also accepts an <code>expires_delta</code> parameter, the duration for which the access token remains valid; after this period the token expires and can no longer authenticate a user.</p>
</li>
<li><p><code>get_current_user</code>: This function reads the access token from the cookie and decodes it to retrieve information such as the user_id and email. Whenever a user goes through the login procedure, the access token is set as a browser cookie and can be read directly from there. We will see in the next part how this happens.</p>
</li>
</ol>
<h3 id="heading-login-and-auth-endpoints">Login and Auth Endpoints</h3>
<p>Create endpoints for logging in and authenticating users:</p>
<pre><code class="lang-python"><span class="hljs-meta">@router.get("/login")</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">login</span>(<span class="hljs-params">request: Request</span>):</span>
    request.session.clear()
    referer = request.headers.get(<span class="hljs-string">"referer"</span>)
    frontend_url = os.getenv(<span class="hljs-string">"FRONTEND_URL"</span>)
    redirect_url = os.getenv(<span class="hljs-string">"REDIRECT_URL"</span>)
    request.session[<span class="hljs-string">"login_redirect"</span>] = frontend_url 

    <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> oauth.auth_demo.authorize_redirect(request, redirect_url, prompt=<span class="hljs-string">"consent"</span>)
</code></pre>
<p>When we hit the <code>/login</code> endpoint, we first fetch the referrer (the domain from which the request was received). We also fetch the <code>FRONTEND_URL</code> and <code>REDIRECT_URL</code> from the .env file. Why these are required, and what values to put in them, is what we will discuss next.</p>
<p>The google authentication process happens in the following way:</p>
<ol>
<li><p>We send the login request to Google using our registered OAuth client. This ensures that information about the user who is logging in is shared with us only with their consent.</p>
</li>
<li><p>Once the user consents to sharing their information, Google allows the login and attaches the user&rsquo;s details to an access token.</p>
</li>
<li><p>Control then returns to our end to fetch the access token, decode it, retrieve the user&rsquo;s information, and verify it against our database or system. The <code>REDIRECT_URL</code> decides which URL Google forwards control to for this verification. In our case, a second router endpoint called <code>/auth</code> handles these steps, so we set our auth endpoint as the <code>REDIRECT_URL</code> environment variable.</p>
</li>
<li><p>Once this is done, we can safely say the user is authenticated and allow them access to the APIs.</p>
</li>
<li><p>The <code>FRONTEND_URL</code> parameter decides where the user is redirected once they have logged in and been verified.</p>
</li>
</ol>
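<p>To make step 1 concrete, here is a stdlib-only sketch of roughly the consent-screen URL that <code>authorize_redirect</code> sends the user to. The helper name and parameter values are illustrative, not part of the project code; Authlib builds the real URL for us.</p>

```python
from urllib.parse import urlencode

def build_google_auth_url(client_id: str, redirect_uri: str, state: str) -> str:
    # Parameters Google's authorization endpoint expects; Authlib assembles
    # an equivalent URL internally when authorize_redirect() is called.
    params = {
        "response_type": "code",       # ask Google for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # must match an Authorised redirect URI
        "scope": "openid email profile",
        "state": state,                # random value, checked later to block CSRF
        "prompt": "consent",
    }
    return "https://accounts.google.com/o/oauth2/v2/auth?" + urlencode(params)

url = build_google_auth_url("my-client-id", "http://127.0.0.1:8000/auth", "xyz")
```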
<p>Let&rsquo;s take the example of <a target="_blank" href="https://aidemos.com/ai-playground">AI Demos Playground</a> to understand the complete flow.</p>
<p>When you visit the AI Demos Playground, you will see the Login button at the top right.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735417326030/4fb0d847-4b59-463e-acf7-0ed3e9066999.png" alt class="image--center mx-auto" /></p>
<p>When you click the Login button, you are redirected to the Google Auth consent page to approve sharing your information with this website. In the backend, clicking the Login button calls the <code>/login</code> API endpoint, which redirects you to the Google consent screen.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735417520979/f6aa1240-1a21-41fc-9845-9958a338e0f3.png" alt class="image--center mx-auto" /></p>
<p>Once you complete the process and consent, control moves to the <code>/auth</code> endpoint, which fetches the access token, verifies it, retrieves the user&rsquo;s information, and stores it in the database. After all this, the user is redirected to the <code>FRONTEND_URL</code>, which in this case is the AI Demos Playground page from where you logged in.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735417542281/7105e1e0-5ecc-4104-b3fd-3fd759b2c584.png" alt class="image--center mx-auto" /></p>
<p>Now that you have got the idea of the entire flow, let’s move to the auth endpoint.</p>
<pre><code class="lang-python"><span class="hljs-meta">@router.get("/auth")</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">auth</span>(<span class="hljs-params">request: Request</span>):</span>
    <span class="hljs-keyword">try</span>:
        token = <span class="hljs-keyword">await</span> oauth.auth_demo.authorize_access_token(request)
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Google authentication failed."</span>)

    <span class="hljs-keyword">try</span>:
        user_info_endpoint = <span class="hljs-string">"https://www.googleapis.com/oauth2/v2/userinfo"</span>
        headers = {<span class="hljs-string">"Authorization"</span>: <span class="hljs-string">f'Bearer <span class="hljs-subst">{token[<span class="hljs-string">"access_token"</span>]}</span>'</span>}
        google_response = requests.get(user_info_endpoint, headers=headers)
        user_info = google_response.json()
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Google authentication failed."</span>)

    user = token.get(<span class="hljs-string">"userinfo"</span>)
    expires_in = token.get(<span class="hljs-string">"expires_in"</span>)
    user_id = user.get(<span class="hljs-string">"sub"</span>)
    iss = user.get(<span class="hljs-string">"iss"</span>)
    user_email = user.get(<span class="hljs-string">"email"</span>)
    first_logged_in = datetime.utcnow()
    last_accessed = datetime.utcnow()

    user_name = user_info.get(<span class="hljs-string">"name"</span>)
    user_pic = user_info.get(<span class="hljs-string">"picture"</span>)

    <span class="hljs-keyword">if</span> iss <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> [<span class="hljs-string">"https://accounts.google.com"</span>, <span class="hljs-string">"accounts.google.com"</span>]:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Google authentication failed."</span>)

    <span class="hljs-keyword">if</span> user_id <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Google authentication failed."</span>)

    <span class="hljs-comment"># Create JWT token</span>
    access_token_expires = timedelta(seconds=expires_in)
    access_token = create_access_token(data={<span class="hljs-string">"sub"</span>: user_id, <span class="hljs-string">"email"</span>: user_email}, expires_delta=access_token_expires)

    session_id = str(uuid.uuid4())
    log_user(user_id, user_email, user_name, user_pic, first_logged_in, last_accessed)
    log_token(access_token, user_email, session_id)

    redirect_url = request.session.pop(<span class="hljs-string">"login_redirect"</span>, <span class="hljs-string">""</span>)
    response = RedirectResponse(redirect_url)
    response.set_cookie(
        key=<span class="hljs-string">"access_token"</span>,
        value=access_token,
        httponly=<span class="hljs-literal">True</span>,
        secure=<span class="hljs-literal">True</span>,  <span class="hljs-comment"># Ensure you're using HTTPS</span>
        samesite=<span class="hljs-string">"strict"</span>,  <span class="hljs-comment"># Only send the cookie on same-site requests</span>
    )

    <span class="hljs-keyword">return</span> response
</code></pre>
<ol>
<li><strong>Initial Token Exchange:</strong></li>
</ol>
<pre><code class="lang-python">token = <span class="hljs-keyword">await</span> oauth.auth_demo.authorize_access_token(request)
</code></pre>
<ul>
<li><p>This exchanges the authorization code (received from Google after user consent) for an access token</p>
</li>
<li><p>It's the final step of the OAuth flow where Google confirms the user's authentication</p>
</li>
</ul>
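<p>For reference, the token dict returned by this exchange looks roughly like the sketch below. The values are made up; the field names follow Google&rsquo;s OAuth token response, and the <code>userinfo</code> key holds the decoded ID-token claims that Authlib adds.</p>

```python
token = {
    "access_token": "ya29.a0Af...",    # bearer token for calling Google APIs
    "expires_in": 3599,                # lifetime in seconds
    "token_type": "Bearer",
    "id_token": "eyJhbGciOi...",       # signed JWT carrying identity claims
    "userinfo": {                      # decoded id_token claims
        "iss": "https://accounts.google.com",
        "sub": "110169484474386276334",  # stable, unique Google user ID
        "email": "user@example.com",
    },
}
```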
<ol start="2">
<li><strong>Getting Additional User Info:</strong></li>
</ol>
<pre><code class="lang-python">user_info_endpoint = <span class="hljs-string">"https://www.googleapis.com/oauth2/v2/userinfo"</span>
headers = {<span class="hljs-string">"Authorization"</span>: <span class="hljs-string">f'Bearer <span class="hljs-subst">{token[<span class="hljs-string">"access_token"</span>]}</span>'</span>}
google_response = requests.get(user_info_endpoint, headers=headers)
user_info = google_response.json()
</code></pre>
<ul>
<li><p>Makes a separate API call to Google to get more detailed user information</p>
</li>
<li><p>Uses the access token to authorize this request</p>
</li>
<li><p>Returns additional details like user's name and profile picture</p>
</li>
</ul>
<ol start="3">
<li><strong>Extracting User Details:</strong></li>
</ol>
<pre><code class="lang-python">user = token.get(<span class="hljs-string">"userinfo"</span>)
expires_in = token.get(<span class="hljs-string">"expires_in"</span>)
user_id = user.get(<span class="hljs-string">"sub"</span>)
iss = user.get(<span class="hljs-string">"iss"</span>)
user_email = user.get(<span class="hljs-string">"email"</span>)
user_name = user_info.get(<span class="hljs-string">"name"</span>)
user_pic = user_info.get(<span class="hljs-string">"picture"</span>)
</code></pre>
<ul>
<li><p>Extracts various user details from both the token and user info response</p>
</li>
<li><p><code>sub</code> is the unique Google user ID</p>
</li>
<li><p><code>iss</code> is the issuer (should be Google)</p>
</li>
<li><p>Gets basic info like email, name, and profile picture</p>
</li>
</ul>
<ol start="4">
<li><strong>Security Validation:</strong></li>
</ol>
<pre><code class="lang-python"><span class="hljs-keyword">if</span> iss <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> [<span class="hljs-string">"https://accounts.google.com"</span>, <span class="hljs-string">"accounts.google.com"</span>]:
    <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Google authentication failed."</span>)
</code></pre>
<ul>
<li><p>Verifies that the token was actually issued by Google</p>
</li>
<li><p>Important security check to prevent token forgery</p>
</li>
</ul>
<ol start="5">
<li><strong>Creating JWT Token:</strong></li>
</ol>
<pre><code class="lang-python">access_token_expires = timedelta(seconds=expires_in)
access_token = create_access_token(
    data={<span class="hljs-string">"sub"</span>: user_id, <span class="hljs-string">"email"</span>: user_email}, 
    expires_delta=access_token_expires
)
</code></pre>
<ul>
<li><p>Creates a JWT token containing the user's ID and email</p>
</li>
<li><p>Uses the same expiration time as the Google token</p>
</li>
<li><p>This JWT will be used for subsequent API calls</p>
</li>
</ul>
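<p>Under the hood, an HS256 JWT like the one <code>create_access_token</code> produces is just two base64url-encoded JSON segments plus an HMAC signature. Below is a stdlib-only illustration of that structure; the real code uses a JWT library (such as python-jose) rather than this hand-rolled version.</p>

```python
import base64, hashlib, hmac, json, time

def b64url(data: bytes) -> str:
    # JWT uses base64url encoding without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(claims: dict, secret: str, expires_in: int) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {**claims, "exp": int(time.time()) + expires_in}
    signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(payload).encode())
    # The signature binds the header and payload to the secret key
    signature = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)

token = make_jwt({"sub": "110169...", "email": "user@example.com"}, "secret-key", 3599)
```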
<ol start="6">
<li><strong>Logging and Session Management:</strong></li>
</ol>
<pre><code class="lang-python">session_id = str(uuid.uuid4())
log_user(user_id, user_email, user_name, user_pic, first_logged_in, last_accessed)
log_token(access_token, user_email, session_id)
</code></pre>
<ul>
<li><p>Generates a unique session ID</p>
</li>
<li><p>Logs the user's information and token for tracking/auditing</p>
</li>
<li><p>Records when the user first logged in and last accessed the system</p>
</li>
</ul>
<p>Here are the <code>log_token</code> and <code>log_user</code> functions, which are simple database CRUD operations:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">log_user</span>(<span class="hljs-params">user_id, user_email, user_name, user_pic, first_logged_in, last_accessed</span>):</span>
    connection = <span class="hljs-literal">None</span>
    <span class="hljs-keyword">try</span>:
        connection = mysql.connector.connect(host=host, database=database, user=user, password=password)

        <span class="hljs-keyword">if</span> connection.is_connected():
            cursor = connection.cursor()
            sql_query = <span class="hljs-string">"""SELECT COUNT(*) from users WHERE email_id = %s"""</span>
            cursor.execute(sql_query, (user_email,))
            row_count = cursor.fetchone()[<span class="hljs-number">0</span>]

            <span class="hljs-keyword">if</span> row_count == <span class="hljs-number">0</span>:
                sql_query = <span class="hljs-string">"""INSERT INTO users (user_id, email_id, user_name, user_pic, first_logged_in, last_accessed) VALUES (%s, %s, %s, %s, %s, %s)"""</span>
                cursor.execute(sql_query, (user_id, user_email, user_name, user_pic, first_logged_in, last_accessed))

            <span class="hljs-comment"># Commit changes</span>
            connection.commit()

    <span class="hljs-keyword">except</span> Error <span class="hljs-keyword">as</span> e:
        print(<span class="hljs-string">"Error while connecting to MySQL"</span>, e)
    <span class="hljs-keyword">finally</span>:
        <span class="hljs-comment"># Guard against a failed connect, where `connection` is still None</span>
        <span class="hljs-keyword">if</span> connection <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span> <span class="hljs-keyword">and</span> connection.is_connected():
            cursor.close()
            connection.close()

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">log_token</span>(<span class="hljs-params">access_token, user_email, session_id</span>):</span>
    connection = <span class="hljs-literal">None</span>
    <span class="hljs-keyword">try</span>:
        connection = mysql.connector.connect(host=host, database=database, user=user, password=password)

        <span class="hljs-keyword">if</span> connection.is_connected():
            cursor = connection.cursor()

            <span class="hljs-comment"># SQL query to insert data</span>
            sql_query = <span class="hljs-string">"""INSERT INTO issued_tokens (token, email_id, session_id) VALUES (%s, %s, %s)"""</span>
            <span class="hljs-comment"># Execute the SQL query</span>
            cursor.execute(sql_query, (access_token, user_email, session_id))

            <span class="hljs-comment"># Commit changes</span>
            connection.commit()

    <span class="hljs-keyword">except</span> Error <span class="hljs-keyword">as</span> e:
        print(<span class="hljs-string">"Error while connecting to MySQL"</span>, e)
    <span class="hljs-keyword">finally</span>:
        <span class="hljs-comment"># Guard against a failed connect, where `connection` is still None</span>
        <span class="hljs-keyword">if</span> connection <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span> <span class="hljs-keyword">and</span> connection.is_connected():
            cursor.close()
            connection.close()
            logger.info(<span class="hljs-string">"MySQL connection is closed"</span>)
</code></pre>
<ol start="7">
<li><strong>Setting up Response:</strong></li>
</ol>
<pre><code class="lang-python">redirect_url = request.session.pop(<span class="hljs-string">"login_redirect"</span>, <span class="hljs-string">""</span>)
response = RedirectResponse(redirect_url)
response.set_cookie(
    key=<span class="hljs-string">"access_token"</span>,
    value=access_token,
    httponly=<span class="hljs-literal">True</span>,
    secure=<span class="hljs-literal">True</span>,
    samesite=<span class="hljs-string">"strict"</span>,
)
</code></pre>
<ul>
<li><p>Gets the redirect URL that was stored before starting the OAuth flow</p>
</li>
<li><p>Creates a response that will redirect the user back to the original page</p>
</li>
<li><p>Sets the JWT token as an HTTP-only cookie with security flags:</p>
<ul>
<li><p><code>httponly</code>: Prevents JavaScript access to the cookie</p>
</li>
<li><p><code>secure</code>: Cookie only sent over HTTPS</p>
</li>
<li><p><code>samesite="strict"</code>: Prevents CSRF attacks by only sending cookie to same site</p>
</li>
</ul>
</li>
</ul>
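<p>For the curious, the <code>set_cookie</code> call above ultimately produces a <code>Set-Cookie</code> response header. The stdlib <code>http.cookies</code> module can illustrate what that header contains (the token value here is made up):</p>

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["access_token"] = "eyJhbGciOi..."
cookie["access_token"]["httponly"] = True      # hide from document.cookie
cookie["access_token"]["secure"] = True        # send over HTTPS only
cookie["access_token"]["samesite"] = "Strict"  # same-site requests only

# Render the header line a server would send
header = cookie.output(header="Set-Cookie:")
```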
<hr />
<p>Now that we have understood the entire login and authentication process, it is time to revisit some of the values we left empty earlier.</p>
<ul>
<li><p>While we were registering the OAuth client, we left the redirect_uri parameter empty, and we also left the <code>REDIRECT_URL</code> variable empty in the .env file. This will be assigned the URL of our auth endpoint: after a successful login with Google, control should shift to the auth endpoint so we can perform the actions discussed above. So it will be set to <code>http://127.0.0.1:8000/auth</code>.</p>
</li>
<li><p>We will be testing the login functionality via the FastAPI Swagger docs. If we run the server on port 8000, they can be accessed at <code>http://127.0.0.1:8000/docs</code>. Since we want to be redirected to this page after login and authentication, we will set <code>FRONTEND_URL</code> to <code>http://127.0.0.1:8000/docs</code>.</p>
<p>  This is how the final .env file looks:</p>
</li>
<li><pre><code class="lang-python">GOOGLE_CLIENT_ID=&lt;your-google-client-id&gt;
GOOGLE_CLIENT_SECRET=&lt;your-google-client-secret&gt;
REDIRECT_URL=http://<span class="hljs-number">127.0</span><span class="hljs-number">.0</span><span class="hljs-number">.1</span>:<span class="hljs-number">8000</span>/auth
JWT_SECRET_KEY=<span class="hljs-string">"secret-key"</span>
FRONTEND_URL=http://<span class="hljs-number">127.0</span><span class="hljs-number">.0</span><span class="hljs-number">.1</span>:<span class="hljs-number">8000</span>/docs
</code></pre>
</li>
<li><p>In the Google Cloud Console we have kept two fields empty while creating our OAuth credentials: <strong>Authorised JavaScript Origins</strong> and <strong>Authorised redirect URIs</strong>.</p>
<ul>
<li><p>The <strong>Authorised JavaScript Origins</strong> is the source from which you are trying to log in and authorize with Google. In our case, we request login from localhost, so it will be set to <code>http://127.0.0.1:8000</code></p>
</li>
<li><p>The <strong>Authorised redirect URIs</strong> are the URLs Google is allowed to hand control to after the login process completes. Since we want Google to shift control to the auth endpoint, we must explicitly list it as an authorised redirect URI so that Google allows the request instead of blocking it. If we don&rsquo;t specify this, the post-login request will be blocked, because Google hands the access token and sensitive information over to authorised URIs only.</p>
<p>  This is how your credentials configuration should look:</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735419247864/09d544f7-f600-412a-b610-c09ccd961261.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
</li>
</ul>
<hr />
<h2 id="heading-step-6-create-the-api-endpoint-and-secure-it-with-google-auth-dependency">Step 6: Create the API endpoint and Secure it with Google Auth Dependency</h2>
<p>For securing the endpoint, we will be required to add a Dependency on the endpoint parameters. This dependency will be the access token which will only be available post login. Let’s create the dependency first. We will create this dependency in the <code>auth.py</code> file.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_current_user</span>(<span class="hljs-params">token: str = Cookie(<span class="hljs-params">None</span>, alias=<span class="hljs-string">"access_token"</span>)</span>):</span>  <span class="hljs-comment"># read the "access_token" cookie set at login</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> token:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Not authenticated"</span>)

    credentials_exception = HTTPException(
        status_code=<span class="hljs-number">401</span>,
        detail=<span class="hljs-string">"Could not validate credentials"</span>,
        headers={<span class="hljs-string">"WWW-Authenticate"</span>: <span class="hljs-string">"Bearer"</span>},
    )
    <span class="hljs-keyword">try</span>:
        payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])

        user_id: str = payload.get(<span class="hljs-string">"sub"</span>)
        user_email: str = payload.get(<span class="hljs-string">"email"</span>)

        <span class="hljs-keyword">if</span> user_id <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span> <span class="hljs-keyword">or</span> user_email <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
            <span class="hljs-keyword">raise</span> credentials_exception

        <span class="hljs-keyword">return</span> {<span class="hljs-string">"user_id"</span>: user_id, <span class="hljs-string">"user_email"</span>: user_email}

    <span class="hljs-keyword">except</span> ExpiredSignatureError:
        <span class="hljs-comment"># Specifically handle expired tokens</span>
        traceback.print_exc()
        <span class="hljs-keyword">raise</span> HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail=<span class="hljs-string">"Session expired. Please login again."</span>)
    <span class="hljs-keyword">except</span> JWTError:
        <span class="hljs-comment"># Handle other JWT-related errors</span>
        traceback.print_exc()
        <span class="hljs-keyword">raise</span> credentials_exception
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        traceback.print_exc()
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Not Authenticated"</span>)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">validate_user_request</span>(<span class="hljs-params">token: str = Cookie(<span class="hljs-params">None</span>, alias=<span class="hljs-string">"access_token"</span>)</span>):</span>
    session_details = get_current_user(token)

    <span class="hljs-keyword">return</span> session_details
</code></pre>
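<p>A quick way to see what this dependency works with: the JWT payload is plain base64url-encoded JSON, so its <code>exp</code> claim can be inspected directly. Signature verification is omitted in this sketch; the real dependency relies on <code>jwt.decode</code> with the <code>SECRET_KEY</code> for that.</p>

```python
import base64, json, time

def jwt_payload(token: str) -> dict:
    # Decode the middle segment of header.payload.signature,
    # restoring the base64 padding that JWTs strip
    seg = token.split(".")[1]
    return json.loads(base64.urlsafe_b64decode(seg + "=" * (-len(seg) % 4)))

def is_expired(token: str) -> bool:
    # Compare the exp claim (Unix seconds) against the current time
    return time.time() > jwt_payload(token).get("exp", 0)
```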
<p>We will now define the chat endpoint in the <code>chat.py</code> file.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> Depends, APIRouter
<span class="hljs-keyword">from</span> auth <span class="hljs-keyword">import</span> get_current_user

router = APIRouter()

<span class="hljs-meta">@router.get("/chat")</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_response</span>(<span class="hljs-params">current_user: dict = Depends(<span class="hljs-params">get_current_user</span>)</span>):</span>
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"message"</span>: <span class="hljs-string">"Welcome!"</span>, <span class="hljs-string">"user"</span>: current_user}
</code></pre>
<hr />
<h2 id="heading-step-7-building-the-fastapi-app">Step 7: Building the FastAPI App</h2>
<p>Now that we have the authentication endpoint and chat endpoint ready, we can now setup our FastAPI app in the root of the project, in the <code>api.py</code> file.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI, Header, HTTPException, Depends, Request
<span class="hljs-keyword">from</span> starlette.config <span class="hljs-keyword">import</span> Config
<span class="hljs-keyword">from</span> fastapi.middleware.cors <span class="hljs-keyword">import</span> CORSMiddleware
<span class="hljs-keyword">from</span> starlette.middleware.sessions <span class="hljs-keyword">import</span> SessionMiddleware
<span class="hljs-keyword">from</span> apis <span class="hljs-keyword">import</span> auth, chat
<span class="hljs-keyword">import</span> time, logging, requests

<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv
load_dotenv(override=<span class="hljs-literal">True</span>)

config = Config(<span class="hljs-string">".env"</span>)

<span class="hljs-comment"># Logger used by the timing middleware below</span>
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

<span class="hljs-comment"># Create the app before attaching middleware and routers</span>
app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=[<span class="hljs-string">"*"</span>],  
    allow_credentials=<span class="hljs-literal">True</span>,
    allow_methods=[<span class="hljs-string">"*"</span>],
    allow_headers=[<span class="hljs-string">"*"</span>],
    expose_headers=[<span class="hljs-string">"*"</span>]
)

<span class="hljs-comment"># Add Session middleware (reuses the JWT secret from the .env file)</span>
app.add_middleware(SessionMiddleware, secret_key=config(<span class="hljs-string">"JWT_SECRET_KEY"</span>))

<span class="hljs-comment"># Log the time taken for each API request</span>
<span class="hljs-meta">@app.middleware("http")</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">log_response_time</span>(<span class="hljs-params">request: Request, call_next</span>):</span>
    start_time = time.time()
    response = <span class="hljs-keyword">await</span> call_next(request)
    process_time = time.time() - start_time
    logger.info(<span class="hljs-string">f"Request: <span class="hljs-subst">{request.url.path}</span> completed in <span class="hljs-subst">{process_time:<span class="hljs-number">.4</span>f}</span> seconds"</span>)
    <span class="hljs-keyword">return</span> response 

app.include_router(chat.router)
app.include_router(auth.router)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    <span class="hljs-keyword">import</span> uvicorn
    <span class="hljs-keyword">import</span> nest_asyncio
    nest_asyncio.apply()
    uvicorn.run(app, host=<span class="hljs-string">"0.0.0.0"</span>, port=<span class="hljs-number">8000</span>)
</code></pre>
<hr />
<h2 id="heading-step-8-running-and-testing-the-application">Step 8: Running and Testing the Application</h2>
<p>Run your application using Uvicorn:</p>
<pre><code class="lang-bash">uvicorn api:app --reload
</code></pre>
<ol>
<li><p><strong>Login Endpoint:</strong> Visit <code>/login</code> to authenticate with Google.</p>
</li>
<li><p><strong>Chat Endpoint:</strong> Use <code>/chat</code> to check whether you can see your details after authentication.</p>
<p> If everything was done correctly, your FastAPI app will behave the same way as the demo below.</p>
</li>
</ol>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/630ZLpP6avA">https://youtu.be/630ZLpP6avA</a></div>
<p>If you were able to set up Google Auth, integrate it with FastAPI, and authenticate your endpoints, then kudos: you have understood the whole process and built a working FastAPI application backed by Google authentication!</p>
<hr />
<h2 id="heading-step-9-deploying-your-application">Step 9: Deploying your Application</h2>
<p>Now that we have successfully set up our FastAPI application locally and integrated Google OAuth with it, it&rsquo;s time to move one step ahead and deploy it on a VM. These APIs will be consumed by the frontend application, so they have to be accessible, which makes deployment a crucial step.</p>
<p>You can check out this beginner-friendly video, <a target="_blank" href="https://youtu.be/7FVPn25mmEQ?si=5p4IhP7u9A5nu2N0">Deploy FastAPI &amp; Open AI ChatGPT on AWS EC2</a>, to learn how to deploy your FastAPI application on an AWS EC2 instance and get it up and running.</p>
<p>Now that our FastAPI application is up and running on the VM (for this tutorial, assume the app runs on <strong>port 8000</strong>), it is time to configure our domain and Nginx, two crucial steps to complete this backend deployment. Since this API will be consumed by the frontend, we have to make sure the frontend can make requests to it and get responses.</p>
<h2 id="heading-domain-mapping-and-ssl-certificate-generation">Domain Mapping and SSL Certificate Generation</h2>
<p>The first step in this configuration is to map our deployed endpoint to a domain and have an SSL certificate assigned to it.</p>
<h3 id="heading-why-our-api-needs-domain-mapping-and-ssl">Why Our API Needs Domain Mapping and SSL</h3>
<p>Modern browsers block direct IP access to protect users from potential security threats.</p>
<ul>
<li><p>Domain names provide a verifiable identity for your service. SSL certificates are issued to domains, not IPs, because domains can be validated through a chain of trust (Domain Registrar → Certificate Authority → Your Server). IPs can change hands frequently, making them unreliable for identity verification.</p>
</li>
<li><p>Modern web browsers enforce HTTPS for security-critical features. They require:</p>
<ul>
<li><p>A valid domain name (not an IP address)</p>
</li>
<li><p>An SSL certificate from a trusted authority</p>
</li>
<li><p>A match between the domain name and SSL certificate</p>
</li>
</ul>
</li>
</ul>
<p>Without these, browsers display warning messages and may block access entirely, especially for features like secure cookies, service workers, or HTTP/2 connections.</p>
<p>Consider accessing a bank's website. We trust <a target="_blank" href="https://mybank.com"><code>https://mybank.com</code></a> because:</p>
<ul>
<li><p>The domain ownership is verified</p>
</li>
<li><p>The SSL certificate confirms you're connecting to the real bank</p>
</li>
<li><p>Your data is encrypted during transmission</p>
</li>
</ul>
<p>If the bank used just an IP (like <a target="_blank" href="https://193.168.1.1"><code>https://193.168.1.1</code></a>), we'd have no way to verify its authenticity, making it vulnerable to impersonation attacks.</p>
<p>Configuring this is straightforward. Follow these steps and we will be good to go:</p>
<ol>
<li><p>The first step is to obtain a domain to map your IP to. For example, we already have a domain named <code>sample-auth-app.futuresmartai.com</code>, and it is mapped to the IP of the VM where the FastAPI application is hosted.</p>
</li>
<li><p>Next, we want to ensure that any request reaching this domain name is forwarded to our FastAPI application. To make this happen, we add some Nginx configuration on the VM where the FastAPI app is hosted.</p>
<ul>
<li><p>Open your EC2 instance terminal, then update the package index and install Nginx by running these two commands in order: <code>sudo apt update</code>, <code>sudo apt install nginx</code>.</p>
</li>
<li><p>Move to the directory <code>/etc/nginx/sites-available</code></p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736407597633/b11824a1-86de-47d1-8c0b-ec1156a82cb7.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Create a file for the Nginx configuration by running <code>sudo nano appconfig</code>. This creates a file named appconfig and opens it in the nano editor, where we can put our configuration and save it.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736407887945/51e11448-ea7f-4ad1-a41c-c77c810f1da8.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Now we can directly put our configuration in this file. Below is the NGINX config we will be using</p>
<pre><code class="lang-nginx">  <span class="hljs-section">server</span> {
      <span class="hljs-attribute">listen</span> <span class="hljs-number">80</span>;
      <span class="hljs-attribute">server_name</span> sample-auth-app.futuresmartai.com;

      <span class="hljs-attribute">location</span> / {
          <span class="hljs-attribute">proxy_pass</span> http://127.0.0.1:8000; <span class="hljs-comment"># Forward requests to the backend server.</span>
          <span class="hljs-attribute">proxy_set_header</span> Host <span class="hljs-variable">$host</span>; <span class="hljs-comment"># Pass the original host header to the backend.</span>
          <span class="hljs-attribute">proxy_set_header</span> X-Real-IP <span class="hljs-variable">$remote_addr</span>; <span class="hljs-comment"># Send the actual client IP to the backend.</span>
          <span class="hljs-attribute">proxy_set_header</span> X-Forwarded-For <span class="hljs-variable">$proxy_add_x_forwarded_for</span>; <span class="hljs-comment"># Include client IPs in forwarded requests.</span>
          <span class="hljs-attribute">proxy_set_header</span> X-Forwarded-Proto <span class="hljs-variable">$scheme</span>; <span class="hljs-comment"># Indicate the original request protocol (HTTP/HTTPS).</span>
      }
  }
</code></pre>
  <div data-node-type="callout">
  <div data-node-type="callout-emoji">💡</div>
  <div data-node-type="callout-text">These headers (Host, X-Real-IP, X-Forwarded-For) are critical for passing client IP addresses and cookies to the backend. Without them, backend logs will show the Nginx IP instead of the client’s IP, and client cookies may not work correctly.</div>
  </div>
</li>
<li><p>Paste this into the appconfig file and save it with Ctrl + O followed by Enter. Finally, press Ctrl + X to exit the editor.</p>
</li>
<li><p>We can now verify the file's contents using <code>cat appconfig</code>:</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736408120715/fdfe2ea4-a6cb-4f65-a31a-f2b98fed5b5f.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Next, we have to create a symbolic link from the Nginx file in the sites-available directory using: <code>sudo ln -s /etc/nginx/sites-available/appconfig /etc/nginx/sites-enabled/</code>. This is a best practice in Nginx server configuration because:</p>
<ol>
<li><p>'sites-available' stores all our server configurations</p>
</li>
<li><p>'sites-enabled' only contains configurations that should be currently active</p>
</li>
<li><p>The symbolic link allows us to easily enable or disable sites without duplicating files</p>
</li>
<li><p>When Nginx starts, it only reads configurations from 'sites-enabled'</p>
</li>
</ol>
</li>
<li><p>Once this is done, verify that the Nginx configuration contains no errors by running: <code>sudo nginx -t</code></p>
</li>
<li><p>If the test succeeds, restart Nginx so the changes take effect: <code>sudo systemctl restart nginx</code></p>
</li>
</ul>
</li>
<li><p>That's it! We can now access our FastAPI application on the mapped domain: <code>sample-auth-app.futuresmart.ai/docs</code></p>
</li>
</ol>
<hr />
<h2 id="heading-common-mistakes">Common Mistakes</h2>
<p>We have successfully built a FastAPI backend application using Google OAuth, deployed it to a VM, and mapped it to a certified domain name. When developing the frontend application that consumes these APIs, there are some common mistakes that can be avoided if the frontend and backend adjust certain configurations:</p>
<h3 id="heading-cross-domain-cookie-access">Cross-Domain Cookie Access</h3>
<p>This is one of the most common mistakes. We have run into it across many of our applications and now treat it as a core issue to address in every project.</p>
<p>Our FastAPI application is hosted on an EC2 instance and mapped to a domain. The frontend application will similarly be deployed on an instance and mapped to its own domain. Because the frontend and backend domains are completely different, sharing the cookie becomes a problem:</p>
<ul>
<li><p>When we are accessing the /login endpoint, we have seen that it verifies the identity via Google, creates an access token and sets it as a Cookie in the browser (more specifically in the domain).</p>
</li>
<li><p>If the frontend sends a request to the /login endpoint, even after the entire authentication process, the token will be set as a cookie on the sample-auth-app.futuresmart.ai domain, not on the domain the frontend actually makes the request from.</p>
</li>
<li><p>As a result, the frontend has no access to the cookie, which is required both to access the /chat endpoint and to fetch user information.</p>
</li>
</ul>
<p>The solution is straightforward: use a proxy on the frontend so that the cookie is set on the same domain the request is sent from.</p>
<ul>
<li><p>The frontend developer has to configure their Nginx setup so that the login request from the frontend is not made directly to the backend APIs.</p>
</li>
<li><p>They have to set up a proxy so that the request from the frontend is sent to the frontend domain's /login endpoint, which then forwards it to the actual backend API.</p>
<ul>
<li><p>Let’s understand this with a simple example. Suppose our frontend is at <code>sample-auth-app-frontend.futuresmart.ai</code>. Any request from this domain is made via <code>sample-auth-app-frontend.futuresmart.ai/login</code>. The proxy configuration must redirect requests arriving at <code>sample-auth-app-frontend.futuresmart.ai/login</code> to the actual backend API, <code>sample-auth-app.futuresmart.ai/login</code>.</p>
</li>
<li><p>In the backend, the REDIRECT_URL in the .env file, as well as in the oauth object, also has to be changed to <code>sample-auth-app-frontend.futuresmart.ai/auth</code>. This ensures that when the login endpoint redirects its authenticated result, it again goes via the frontend and then reaches the backend <code>/auth</code> route.</p>
</li>
<li><p>This ensures that communication always happens via the frontend, and hence the cookie is set on the same domain the request is sent from: <code>sample-auth-app-frontend.futuresmart.ai</code>.</p>
</li>
</ul>
</li>
</ul>
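<p>As an illustration, the frontend-side proxy described above could look like the following Nginx snippet. This is a minimal sketch, assuming the example domains used in this section and a frontend server on port 3000 (a placeholder); the exact setup depends on how the frontend is hosted:</p>
<pre><code class="lang-nginx">server {
    listen 80;
    server_name sample-auth-app-frontend.futuresmart.ai;

    # Auth routes are proxied to the backend so the cookie is set on THIS domain
    location ~ ^/(login|auth) {
        proxy_pass https://sample-auth-app.futuresmart.ai;
        proxy_ssl_server_name on;  # Needed when proxying to an HTTPS upstream by name
        proxy_set_header Host sample-auth-app.futuresmart.ai;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    # Everything else serves the frontend app itself
    location / {
        proxy_pass http://127.0.0.1:3000;  # Placeholder port for the frontend server
    }
}
</code></pre>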
<h3 id="heading-cross-origin-resource-sharing-cors-issue">Cross-Origin Resource Sharing (CORS) Issue</h3>
<p>This is another very common issue we have faced in our projects. During local testing, we tend to allow all origins to access the APIs, but in production we must explicitly allow only the origins/domains that should have access. In our case, only the frontend domain should be able to call our APIs, so we set up the CORS middleware as follows:</p>
<pre><code class="lang-python">app.add_middleware(
    CORSMiddleware,
    allow_origins=[<span class="hljs-string">"https://sample-auth-app-frontend.futuresmart.ai"</span>],
    allow_credentials=<span class="hljs-literal">True</span>,
    allow_methods=[<span class="hljs-string">"*"</span>],
    allow_headers=[<span class="hljs-string">"*"</span>],
    expose_headers=[<span class="hljs-string">"*"</span>]
)
</code></pre>
<p>This keeps our application secure and protected from unauthorized access.</p>
<hr />
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this tutorial, we've built a FastAPI application and integrated it with Google Authentication. The system showcases advanced concepts like JWT in FastAPI, OAuth 2.0 implementation, and cookie-based session and user management, all of which we have already implemented and successfully delivered in numerous enterprise solutions.</p>
<p>This architecture provides several advantages:</p>
<ol>
<li><p><strong>Security</strong>: Combines industry-standard JWT tokens, secure cookies, and Google's robust authentication system to create a multi-layered security approach protecting user data and preventing common attack vectors.</p>
</li>
<li><p><strong>Maintainability</strong>: Implements a clean, middleware-based architecture that centralizes authentication logic and makes it easy to protect new routes while keeping the codebase organized and manageable.</p>
</li>
<li><p><strong>Scalability</strong>: Leverages stateless JWT tokens and Google's infrastructure to create an authentication system that can easily scale across multiple servers and services without additional complexity.</p>
</li>
</ol>
<p>If you found this guide helpful and want to explore more advanced backend development techniques, check out our other <a target="_blank" href="https://blog.futuresmart.ai/"><strong>tutorials</strong></a>. If you want to know what skills truly matter to become a successful software developer or AI engineer, check out our blog on <a target="_blank" href="https://blog.futuresmart.ai/must-have-skills-for-upcoming-software-developers-and-ai-engineers-in-2025"><em>Must-Have Skills for Upcoming Software Developers and AI Engineers in 2025</em></a><em>.</em></p>
<p>At <a target="_blank" href="https://www.futuresmart.ai/">FutureSmart AI</a>, we specialize in building state-of-the-art AI solutions backed by scalable authentication systems and backend infrastructure for modern web applications. We've successfully implemented diverse authentication architectures across industries, from SaaS platforms to enterprise applications, while also specializing in AI integration by building numerous applications that incorporate Langchain, Langgraph, and ChatGPT through FastAPI frameworks.</p>
<p>For custom implementation support or consultations, feel free to reach out to us at <a target="_blank" href="mailto:contact@futuresmart.ai"><strong>contact@futuresmart.ai</strong></a>. For real-world examples of our work, visit our <a target="_blank" href="https://www.futuresmart.ai/case-studies">case studies</a> where we showcase how our expertise in FastAPI, OAuth integration, and security best practices has helped businesses build robust authentication systems.</p>
<p>Stay tuned for our next tutorial in this series, where we'll dive into building the front-end for this system, accessing the API, and developing an end-to-end full-stack application!</p>
<h2 id="heading-resources"><strong>Resources</strong></h2>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><a target="_self" href="https://github.com/PradipNichite/FutureSmart-AI-Blog/tree/main/Google%20OAuth%20Integration%20with%20FastAPI"><strong>Get the Full Code in our GitHub</strong></a></div>
</div>]]></content:encoded></item><item><title><![CDATA[Cliniko API and Python: How We Built a Custom AI Solution for Clinics]]></title><description><![CDATA[Introduction
In today’s fast-paced medical environment, healthcare professionals juggle patient care with administrative tasks that often slow them down. While tools like Cliniko have streamlined practice management—handling scheduling, patient recor...]]></description><link>https://blog.futuresmart.ai/cliniko-api-and-python-how-we-built-a-custom-ai-solution-for-clinics</link><guid isPermaLink="true">https://blog.futuresmart.ai/cliniko-api-and-python-how-we-built-a-custom-ai-solution-for-clinics</guid><category><![CDATA[Cliniko]]></category><category><![CDATA[Python]]></category><category><![CDATA[APIs]]></category><category><![CDATA[AI-automation]]></category><category><![CDATA[Healthcare AI]]></category><dc:creator><![CDATA[Pruthviraj Mahalunge]]></dc:creator><pubDate>Tue, 14 Jan 2025 06:24:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1736835385972/83c1002d-cbef-4225-b10c-a44b24aab5f9.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction"><strong>Introduction</strong></h2>
<p>In today’s fast-paced medical environment, healthcare professionals juggle patient care with administrative tasks that often slow them down. While tools like <a target="_blank" href="https://www.cliniko.com/"><strong>Cliniko</strong></a> have streamlined practice management—handling scheduling, patient records, and billing—there’s still room to make workflows even more efficient with <a target="_blank" href="https://www.futuresmart.ai/case-studies"><strong>custom AI-powered solutions</strong></a><strong>.</strong></p>
<p>At <a target="_blank" href="https://www.futuresmart.ai/"><strong>FutureSmart AI</strong></a>, we specialize in building <strong>AI-driven automation, NLP-based assistants, and workflow optimization solutions</strong> that integrate seamlessly with existing platforms. In this blog, we not only explore how to <strong>leverage Cliniko APIs with Python</strong> but also showcase how we helped a client build a <a target="_blank" href="https://www.abby.clinic/"><strong>custom AI solution on top of Cliniko</strong></a>, automating key processes for clinics and improving efficiency.</p>
<h2 id="heading-1-understanding-cliniko-apis"><strong>1. Understanding Cliniko APIs</strong></h2>
<h3 id="heading-what-is-cliniko"><strong>What is Cliniko?</strong></h3>
<p>Cliniko is cloud-based software designed for comprehensive medical practice management. It combines flexibility and ease of use to streamline daily operations. The platform excels in the following areas:</p>
<ul>
<li><p>Appointment scheduling via a customizable calendar.</p>
</li>
<li><p>Patient management, including medical histories and communication records.</p>
</li>
<li><p>Billing and invoicing with integrated payment tracking.</p>
</li>
<li><p>Multi-location support for organizations operating across various sites.</p>
</li>
</ul>
<h3 id="heading-api-overview"><strong>API Overview</strong></h3>
<p>The Cliniko API enables developers to programmatically access and manipulate data. Here are some key capabilities:</p>
<ul>
<li><p>Retrieve, create, or update appointment records.</p>
</li>
<li><p>Access detailed patient data.</p>
</li>
<li><p>Manage practitioner and business information.</p>
</li>
</ul>
<p>The API’s flexibility allows for custom integrations that cater to unique business workflows, making it a powerful tool for healthcare organizations.</p>
<hr />
<h2 id="heading-2-setting-up-access"><strong>2. Setting Up Access</strong></h2>
<h3 id="heading-1-generate-api-key"><strong>1. Generate API Key</strong></h3>
<p>To generate a Cliniko API key, follow the steps below:</p>
<ul>
<li>Log into your Cliniko account.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736833694345/9c67f4c8-7992-42a7-84aa-f810dce03b16.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Navigate to <strong>My Info</strong> and then go to the <strong>Manage API Keys</strong> section.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736833701280/91b6615a-e270-484c-a356-b51fe5222c49.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Click <strong>Add an API key</strong> to create a new one, which will allow you to interact with Cliniko's API.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736833712564/6e39d9ff-58d5-4de5-b999-a5c1eef5c86b.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-2-manage-permissions"><strong>2. Manage Permissions</strong></h3>
<p>After generating your API key, assign appropriate <strong>scopes</strong> based on the specific data you need access to. Each scope allows access to different parts of Cliniko’s system (appointments, patients, etc.). Be mindful to select only the required permissions for your needs.</p>
<h3 id="heading-3-understand-shards"><strong>3. Understand Shards</strong></h3>
<ul>
<li><p><strong>Shards</strong> represent geographically isolated instances of Cliniko’s system. When you create a new account, you’ll select a region for hosting your data.</p>
</li>
<li><p>Each shard is identified by a short code at the end of the API key (e.g., <code>au1</code>). This shard ID is crucial when constructing API URLs as it ensures proper routing of your requests to the correct server.</p>
<p>  <strong>Example API key:</strong></p>
</li>
</ul>
<pre><code class="lang-plaintext">MS0xLWl4SzYYYYdtR3V2HNOT..............-au1
</code></pre>
<p>In the above example, the shard is <code>au1</code>.</p>
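<p>Since the shard ID appears after the final hyphen of the key (as in the example above), it can be extracted in Python like this. A small sketch; the key below is a made-up placeholder:</p>
<pre><code class="lang-python">API_KEY = "MS0xLWl4SzY-au1"  # Placeholder key; real keys are much longer
shard = API_KEY.rsplit("-", 1)[-1]  # Everything after the last hyphen, e.g. "au1"
base_url = f"https://api.{shard}.cliniko.com/v1"
print(base_url)  # https://api.au1.cliniko.com/v1
</code></pre>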
<h3 id="heading-4-identifying-your-application"><strong>4. Identifying Your Application</strong></h3>
<p>To identify your application, you need to send the <code>User-Agent</code> header. In case of an issue, this will allow Cliniko to track down your requests and contact you. The format for this header should be:</p>
<pre><code class="lang-plaintext">APP_VENDOR_NAME (APP_VENDOR_EMAIL)
</code></pre>
<p>Where:</p>
<ul>
<li><p>APP_VENDOR_NAME is the name of your application.</p>
</li>
<li><p>APP_VENDOR_EMAIL is the contact email for you or your company.</p>
</li>
</ul>
<p><strong>Example of a valid</strong> <code>User-Agent</code> <strong>header:</strong></p>
<pre><code class="lang-plaintext">Really helpful app (contact@futuresmart.ai)
</code></pre>
<h3 id="heading-5-consult-documentation"><strong>5. Consult Documentation</strong></h3>
<p>Cliniko’s <a target="_blank" href="https://docs.api.cliniko.com/">https://docs.api.cliniko.com/</a> provides detailed information on available endpoints, parameters, and request examples. Reviewing this documentation will ensure you understand how to interact with Cliniko’s data programmatically.</p>
<hr />
<h2 id="heading-3-getting-started-with-python"><strong>3. Getting Started with Python</strong></h2>
<h3 id="heading-setting-up-the-environment"><strong>Setting Up the Environment</strong></h3>
<p>Start by preparing a Python environment. Here’s what you’ll need:</p>
<ul>
<li><p><strong>Python 3.x</strong></p>
</li>
<li><p>IDE: PyCharm, VSCode, or Jupyter Notebook.</p>
</li>
<li><p>Libraries: Install the necessary packages using pip:</p>
<pre><code class="lang-plaintext">  pip install requests
</code></pre>
</li>
</ul>
<h3 id="heading-authentication-with-cliniko-api"><strong>Authentication with Cliniko API</strong></h3>
<p>Cliniko uses API Key authentication for secure access. Below is a simple way to set it up:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> requests

API_KEY = <span class="hljs-string">"your_api_key_here"</span>
shard = API_KEY[<span class="hljs-number">-3</span>:]  <span class="hljs-comment"># Shard ID from the end of the key, e.g. "au1"</span>
headers = {<span class="hljs-string">"Accept"</span>: <span class="hljs-string">"application/json"</span>, <span class="hljs-string">"User-Agent"</span>: <span class="hljs-string">""</span>}  <span class="hljs-comment"># Fill in your User-Agent, e.g. "Your App (you@example.com)"</span>
username = API_KEY  <span class="hljs-comment"># Use the API key as the username</span>
password = <span class="hljs-string">""</span>  <span class="hljs-comment"># Password is left blank</span>

url = <span class="hljs-string">f"https://api.<span class="hljs-subst">{shard}</span>.cliniko.com/v1/bookings/"</span>

response = requests.get(url, headers=headers, auth=(username,password))

<span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
    bookings = response.json()
    print(<span class="hljs-string">"Bookings fetched successfully:"</span>, bookings)
<span class="hljs-keyword">else</span>:
    print(<span class="hljs-string">f"Failed to fetch bookings. Status code: <span class="hljs-subst">{response.status_code}</span>, Response: <span class="hljs-subst">{response.text}</span>"</span>)
</code></pre>
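<p>Since every request in the following sections repeats the same URL and header boilerplate, it can help to wrap it in small helpers. This is a sketch under the assumptions above; the function names are our own convenience, not part of any Cliniko SDK:</p>
<pre><code class="lang-python">def cliniko_url(shard, path):
    """Build a Cliniko API URL for a given shard and resource path."""
    return f"https://api.{shard}.cliniko.com/v1/{path.lstrip('/')}"

def cliniko_headers(app_name, contact_email):
    """Headers Cliniko expects: JSON responses plus an identifying User-Agent."""
    return {
        "Accept": "application/json",
        "User-Agent": f"{app_name} ({contact_email})",
    }
</code></pre>
<p>For example, <code>cliniko_url("au1", "patients")</code> returns <code>https://api.au1.cliniko.com/v1/patients</code>, which can then be passed to <code>requests.get</code> together with <code>auth=(API_KEY, "")</code>.</p>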
<hr />
<h2 id="heading-4-making-basic-api-requests"><strong>4. Making Basic API Requests</strong></h2>
<h3 id="heading-fetching-appointment-data"><strong>Fetching Appointment Data</strong></h3>
<pre><code class="lang-python">url = <span class="hljs-string">f"https://api.<span class="hljs-subst">{shard}</span>.cliniko.com/v1/bookings"</span>
response = requests.get(url, headers=headers, auth=(username,password))
<span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
    appointments = response.json()
    print(<span class="hljs-string">"Appointments fetched successfully:"</span>, appointments)
<span class="hljs-keyword">else</span>:
    print(<span class="hljs-string">"Failed to fetch appointments. Status code:"</span>, response.status_code)
</code></pre>
<h3 id="heading-fetching-individual-appointment-data"><strong>Fetching Individual Appointment Data</strong></h3>
<pre><code class="lang-python">booking_id = <span class="hljs-string">"12345"</span>  <span class="hljs-comment"># Replace with a real booking ID</span>
url = <span class="hljs-string">f"https://api.<span class="hljs-subst">{shard}</span>.cliniko.com/v1/bookings/<span class="hljs-subst">{booking_id}</span>"</span>
response = requests.get(url, headers=headers, auth=(username,password))
<span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
    appointments = response.json()
    print(<span class="hljs-string">"Appointments fetched successfully:"</span>, appointments)
<span class="hljs-keyword">else</span>:
    print(<span class="hljs-string">"Failed to fetch appointments. Status code:"</span>, response.status_code)
</code></pre>
<h3 id="heading-retrieving-patient-records"><strong>Retrieving Patient Records</strong></h3>
<pre><code class="lang-python">url = <span class="hljs-string">f"https://api.<span class="hljs-subst">{shard}</span>.cliniko.com/v1/patients"</span>
response = requests.get(url, headers=headers, auth=(username,password))
<span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
    patients = response.json()
    print(<span class="hljs-string">"Patient records fetched successfully:"</span>, patients)
<span class="hljs-keyword">else</span>:
    print(<span class="hljs-string">"Failed to fetch patient records. Status code:"</span>, response.status_code)
</code></pre>
<hr />
<h2 id="heading-5-advanced-api-operations"><strong>5. Advanced API Operations</strong></h2>
<h3 id="heading-filtering-and-query-parameters"><strong>Filtering and Query Parameters</strong></h3>
<p>Cliniko allows the use of powerful query parameters to filter data based on specific conditions. By using the <code>q[]</code> parameter, you can fine-tune your queries for appointments, patients, and other resources. Here are some examples of how to leverage this functionality.</p>
<p><strong>1. Filtering Appointments by Date Range</strong></p>
<p>To fetch appointments within a specific date range, you can apply multiple filters using the <code>q[]</code> parameter for conditions like "greater than" or "less than." For example, retrieving appointments that start after a certain date:</p>
<pre><code class="lang-python">query = {
    <span class="hljs-string">"page"</span>: <span class="hljs-string">"0"</span>,
    <span class="hljs-string">"per_page"</span>: <span class="hljs-string">"1"</span>,
    <span class="hljs-string">"sort"</span>: <span class="hljs-string">"created_at:desc"</span>,
    <span class="hljs-string">"q[]"</span>: <span class="hljs-string">"starts_at:&gt;2014-03-04T20:37:17Z"</span>,  <span class="hljs-comment"># Start date filter</span>
    <span class="hljs-string">"order"</span>: <span class="hljs-string">"asc"</span>
}
url = <span class="hljs-string">f"https://api.<span class="hljs-subst">{shard}</span>.cliniko.com/v1/individual_appointments"</span>
response = requests.get(url, params=query, headers=headers, auth=(username,password))

<span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
    appointments = response.json()
    print(<span class="hljs-string">"Filtered appointments:"</span>, appointments)
<span class="hljs-keyword">else</span>:
    print(<span class="hljs-string">"Failed to fetch filtered appointments. Status code:"</span>, response.status_code)
</code></pre>
<p><strong>2. Using Wildcards for Flexible Search</strong></p>
<p>You can perform more complex searches using wildcards by employing the <code>~~</code> symbol. For instance, to search for patients whose last names match the pattern <code>ja%on%</code> (where <code>%</code> matches any sequence of characters, e.g. "Jason" or "Jameson"):</p>
<pre><code class="lang-python">query = {
    <span class="hljs-string">"page"</span>: <span class="hljs-string">"0"</span>,
    <span class="hljs-string">"per_page"</span>: <span class="hljs-string">"10"</span>,
    <span class="hljs-string">"q[]"</span>: <span class="hljs-string">"last_name:~~ja%on%"</span>,
    <span class="hljs-string">"order"</span>: <span class="hljs-string">"asc"</span>
}
url = <span class="hljs-string">f"https://api.<span class="hljs-subst">{shard}</span>.cliniko.com/v1/patients"</span>
response = requests.get(url, params=query, headers=headers, auth=(username,password))

<span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
    patients = response.json()
    print(<span class="hljs-string">"Filtered patients:"</span>, patients)
<span class="hljs-keyword">else</span>:
    print(<span class="hljs-string">"Failed to fetch filtered patients. Status code:"</span>, response.status_code)
</code></pre>
<p><strong>3. Filtering Archived Records</strong></p>
<p>To fetch archived records, you can filter with <code>archived_at</code> like this:</p>
<pre><code class="lang-python">query = {
    <span class="hljs-string">"page"</span>: <span class="hljs-string">"0"</span>,
    <span class="hljs-string">"per_page"</span>: <span class="hljs-string">"10"</span>,
    <span class="hljs-string">"q[]"</span>: <span class="hljs-string">"archived_at:*"</span>,  <span class="hljs-comment"># Matches records whose archived_at is set, i.e. archived records</span>
    <span class="hljs-string">"order"</span>: <span class="hljs-string">"asc"</span>
}
url = <span class="hljs-string">f"https://api.<span class="hljs-subst">{shard}</span>.cliniko.com/v1/individual_appointments"</span>
response = requests.get(url, params=query, headers=headers, auth=(username,password))

<span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
    appointments = response.json()
    print(<span class="hljs-string">"Archived appointments:"</span>, appointments)
<span class="hljs-keyword">else</span>:
    print(<span class="hljs-string">"Failed to fetch archived appointments. Status code:"</span>, response.status_code)
</code></pre>
<p><strong>4. Sorting and Ordering Results</strong></p>
<p>By default, results are ordered by the <code>created_at</code> field in ascending order. If you want to sort based on a different field, such as <code>starts_at</code>, and specify the order, you can use the <code>sort</code> and <code>order</code> parameters:</p>
<pre><code class="lang-python">query = {
    <span class="hljs-string">"page"</span>: <span class="hljs-string">"0"</span>,
    <span class="hljs-string">"per_page"</span>: <span class="hljs-string">"10"</span>,
    <span class="hljs-string">"sort"</span>: <span class="hljs-string">"starts_at,created_at:desc"</span>,  <span class="hljs-comment"># Sort by start time, then creation time</span>
    <span class="hljs-string">"q[]"</span>: <span class="hljs-string">"starts_at:&gt;2014-03-04T20:37:17Z"</span>,  <span class="hljs-comment"># Example filter</span>
    <span class="hljs-string">"order"</span>: <span class="hljs-string">"desc"</span>
}
url = <span class="hljs-string">f"https://api.<span class="hljs-subst">{shard}</span>.cliniko.com/v1/individual_appointments"</span>
response = requests.get(url, params=query, headers=headers, auth=(username,password))

<span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
    appointments = response.json()
    print(<span class="hljs-string">"Sorted appointments:"</span>, appointments)
<span class="hljs-keyword">else</span>:
    print(<span class="hljs-string">"Failed to fetch sorted appointments. Status code:"</span>, response.status_code)
</code></pre>
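<p>The filter examples above all share the same shape, so a tiny helper can assemble them. A minimal sketch (our own convenience function, not part of the Cliniko API); note that <code>requests</code> encodes a list value for <code>q[]</code> as repeated query parameters, which is how Cliniko combines multiple filters:</p>
<pre><code class="lang-python">def build_query(filters, page=0, per_page=10, sort=None, order="asc"):
    """Assemble Cliniko query parameters from a list of q[] filter strings."""
    query = {
        "page": str(page),
        "per_page": str(per_page),
        "order": order,
        "q[]": filters,  # A list value becomes repeated q[] parameters
    }
    if sort:
        query["sort"] = sort
    return query
</code></pre>
<p>For example, <code>build_query(["starts_at:&gt;2014-03-04T20:37:17Z", "archived_at:*"], sort="created_at:desc")</code> produces the same parameters as the dictionaries written out above.</p>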
<h3 id="heading-pagination"><strong>Pagination</strong></h3>
<p>Large datasets are paginated. Use the <code>next</code> and <code>prev</code> links to navigate:</p>
<pre><code class="lang-python">url = <span class="hljs-string">f"https://api.<span class="hljs-subst">{shard}</span>.cliniko.com/v1/bookings"</span>
<span class="hljs-keyword">while</span> url:
    response = requests.get(url, headers=headers, auth=(username, password))
    <span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
        data = response.json()
        <span class="hljs-keyword">for</span> record <span class="hljs-keyword">in</span> data[<span class="hljs-string">"bookings"</span>]:  <span class="hljs-comment"># Records are keyed by the resource name</span>
            print(record)
        url = data.get(<span class="hljs-string">"links"</span>, {}).get(<span class="hljs-string">"next"</span>)  <span class="hljs-comment"># Follow the "next" link until no pages remain</span>
    <span class="hljs-keyword">else</span>:
        print(<span class="hljs-string">"Failed to fetch data. Status code:"</span>, response.status_code)
        <span class="hljs-keyword">break</span>
</code></pre>
<h3 id="heading-error-handling-in-api-requests"><strong>Error Handling in API Requests</strong></h3>
<p>To ensure your application handles API errors smoothly, follow this simple structure:</p>
<pre><code class="lang-python"><span class="hljs-keyword">try</span>:
    response = requests.get(url, headers=headers, auth=(username, password))
    response.raise_for_status()  <span class="hljs-comment"># Checks for any HTTP error</span>
    data = response.json()
<span class="hljs-keyword">except</span> requests.exceptions.HTTPError <span class="hljs-keyword">as</span> err:
    print(<span class="hljs-string">f"HTTP error occurred: <span class="hljs-subst">{err}</span>"</span>)
<span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
    print(<span class="hljs-string">f"An error occurred: <span class="hljs-subst">{e}</span>"</span>)
</code></pre>
<p><strong>HTTP Response Codes:</strong></p>
<ul>
<li><p><strong>2xx</strong>: Success – Everything worked as expected.</p>
</li>
<li><p><strong>4xx</strong>: Client error – There was an issue with the request (e.g., missing required parameters).</p>
</li>
<li><p><strong>5xx</strong>: Server error – Something went wrong on the Cliniko server's side.</p>
</li>
</ul>
<hr />
<h2 id="heading-6-challenges-you-can-face-when-using-cliniko-apis-in-python">6. Challenges You Can Face When Using Cliniko APIs in Python</h2>
<h3 id="heading-1-authentication-and-authorization-issues">1. Authentication and Authorization Issues</h3>
<p>Managing API keys securely is essential to avoid access issues or security vulnerabilities. Each API key is associated with a specific <strong>shard</strong>—the geographical server where your data is hosted. When making requests, ensure that the API key includes the correct shard information in the request, such as appending the shard ID to the API endpoint to direct the request to the correct server. Expired or invalid keys can lead to authentication failures, so regularly refresh and monitor your API keys.</p>
<h3 id="heading-2-api-deprecation-or-changes"><strong>2. API Deprecation or Changes</strong></h3>
<p>Cliniko may update or discontinue API endpoints, introduce new versions, or change response formats, causing compatibility issues. Regularly monitoring the API documentation and release notes helps ensure your integration stays up to date.</p>
<hr />
<h2 id="heading-7-custom-solutions-for-our-client-abbyhttpswwwabbyclinic"><strong>7. Custom Solutions for our client</strong> <a target="_blank" href="https://www.abby.clinic/"><strong>Abby</strong></a></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736753958064/d14e6591-6e58-4561-85e6-3b943d6248e6.png" alt /></p>
<h3 id="heading-fine-tuning-intent-classification-models-for-abbyhttpswwwabbyclinic"><strong>Fine-Tuning Intent Classification Models for</strong> <a target="_blank" href="https://www.abby.clinic/"><strong>Abby</strong></a></h3>
<p><a target="_blank" href="https://www.abby.clinic/">Abby</a> is an AI-powered solution designed to streamline appointment confirmations in healthcare operations through seamless integration with the Cliniko API. The client required a secure, stand-alone system that does not rely on broad-based language models, ensuring data privacy and adherence to stringent security standards. Available as a Chrome extension for Cliniko users, Abby uses a proprietary AI language model to analyze inbound responses, updating appointment statuses directly within the Cliniko calendar. With over 98% accuracy, it reliably handles most message intents, including emojis, while ongoing refinements address edge cases like conflicting responses.</p>
<p><strong>Key Features:</strong></p>
<ul>
<li><p><strong>AI-Powered SMS Analysis:</strong> Automatically interpret patient responses and instantly update appointment statuses, focusing on what requires action.</p>
</li>
<li><p><strong>Enhanced Visuals for Cliniko:</strong> Temporary color-coded status markers in your Cliniko calendar make it easy to spot unconfirmed appointments.</p>
</li>
<li><p><strong>Integrated Appointment Notes:</strong> Abby posts SMS responses directly to appointment notes, keeping all communication in one place.</p>
</li>
<li><p><strong>No Learning Curve:</strong> A single click provides access to real-time appointment status without disrupting your workflow.</p>
</li>
</ul>
<p>By automating routine tasks, Abby optimizes healthcare operations, improving patient experience while saving valuable administrative time—without compromising on security or patient confidentiality.</p>
<p><strong><em>Check out Abby here</em></strong> : <a target="_blank" href="https://www.abby.clinic/">https://www.abby.clinic/</a>.</p>
<hr />
<h2 id="heading-8-ai-powered-innovations-for-healthcare">8. AI-Powered Innovations for Healthcare</h2>
<p>We specialize in developing <a target="_blank" href="https://www.futuresmart.ai/case-studies"><strong>custom AI solutions</strong></a> that seamlessly integrate with existing platforms like Cliniko, enabling clinics and healthcare businesses to <strong>automate processes, enhance patient interactions, and extract actionable insights from data.</strong></p>
<p>Here’s how our AI expertise can benefit your practice:</p>
<p>✅ <strong>RAG-Based AI Chatbots for Clinics</strong> – AI-powered chatbots capable of retrieving patient information, answering FAQs, and assisting with appointment scheduling based on real-time Cliniko data.</p>
<p>✅ <strong>NL2SQL for Healthcare Analytics</strong> – Query your Cliniko database using natural language and extract structured insights without writing complex SQL queries.</p>
<p>✅ <strong>Document Parsing with Vision LLMs</strong> – Process invoices, prescriptions, medical records, and insurance forms using advanced AI models that can accurately extract key details from scanned documents.</p>
<p>✅ <strong>Automated Patient Reports &amp; Summaries</strong> – Use AI to generate structured patient reports, summarizing appointment details, lab results, and doctor notes, reducing manual effort.</p>
<p>✅ <strong>AI Agents for Administrative Tasks</strong> – Automate routine clinic operations, such as appointment confirmations, follow-ups, and billing inquiries, reducing workload and improving patient experience.</p>
<p>By combining <strong>Generative AI, NLP, and advanced data processing techniques</strong>, we create solutions that improve efficiency, reduce administrative burden, and optimize patient management.</p>
<hr />
<h2 id="heading-9-why-work-with-us">9. Why Work With Us?</h2>
<p>At <a target="_blank" href="https://www.futuresmart.ai/"><strong>FutureSmart AI</strong></a>, we bring a wealth of experience in <strong>Cliniko APIs, Python development, and cutting-edge AI solutions</strong> tailored for the healthcare industry.</p>
<p>🚀 <strong>Cliniko Expertise</strong> – Deep understanding of Cliniko APIs, enabling seamless integrations and automation.<br />🔍 <strong>AI-Driven Efficiency</strong> – We build secure, privacy-compliant AI solutions that enhance decision-making and reduce operational overhead.<br />📈 <strong>Custom-Tailored Solutions</strong> – Whether you need chatbot automation, document parsing, or AI-driven analytics, we create <strong>bespoke AI applications</strong> to fit your unique needs.<br />🔒 <strong>Security &amp; Compliance</strong> – Our solutions adhere to industry best practices, ensuring patient data confidentiality and regulatory compliance.</p>
<p>Want to <strong>automate your clinic operations</strong> or build a <strong>custom AI-powered healthcare solution</strong>?<br />📩 <strong>Contact us at</strong> <a target="_blank" href="mailto:contact@futuresmart.ai">contact@futuresmart.ai</a> to discuss how we can help!</p>
<hr />
<h2 id="heading-10-conclusion">10. Conclusion</h2>
<p>Cliniko’s API ecosystem, combined with <strong>Python and AI</strong>, unlocks powerful opportunities for <strong>automating clinic workflows, streamlining data processing, and enhancing patient engagement.</strong> Whether it's <strong>AI-powered chatbots, document automation, or analytics-driven insights</strong>, FutureSmart AI can help you <strong>build scalable, intelligent solutions</strong> tailored to your needs.</p>
<p>Let's transform healthcare operations with AI. 🚀</p>
]]></content:encoded></item><item><title><![CDATA[Must-Have Skills for Upcoming Software Developers and AI Engineers in 2025]]></title><description><![CDATA[Over my 8+ years of industry experience—ranging from backend development and data engineering to machine learning and the latest generative AI applications—I’ve seen firsthand what it really takes to add value in a team. As the Founder of FutureSmart...]]></description><link>https://blog.futuresmart.ai/must-have-skills-for-upcoming-software-developers-and-ai-engineers-in-2025</link><guid isPermaLink="true">https://blog.futuresmart.ai/must-have-skills-for-upcoming-software-developers-and-ai-engineers-in-2025</guid><category><![CDATA[internships]]></category><category><![CDATA[General Advice]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Developer]]></category><category><![CDATA[mentorship]]></category><dc:creator><![CDATA[Pradip Nichite]]></dc:creator><pubDate>Sun, 12 Jan 2025 18:48:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1736707657620/ddb7d450-532a-42a7-8401-cc9838b65903.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Over my 8+ years of industry experience—ranging from backend development and data engineering to machine learning and the latest generative AI applications—I’ve seen firsthand what it really takes to add value in a team. As the Founder of <a target="_blank" href="https://www.futuresmart.ai/">FutureSmart AI</a>, I’ve mentored numerous students and interviewed candidates, and one clear pattern emerges:</p>
<p><strong>You need solid fundamental skills in coding, APIs, databases, and communication to thrive in the rapidly evolving tech landscape.</strong> This holds true whether you aim to become a traditional software developer or you’re intrigued by the cutting edge of Generative AI. If you’re focused on learning the “shiny” new <a target="_blank" href="https://aidemos.com/">AI tools</a> without first mastering the core foundations, you’ll struggle to deliver real impact.</p>
<p><a target="_blank" href="https://www.linkedin.com/feed/update/urn:li:activity:7242065082073927681/">Being an Expert-Vetted (Top 1%) on Upwork with over $300K+ earned and a 100% Job Success rate</a> has further reinforced my understanding of what skills truly matter in the industry.</p>
<p>Below, I’ll walk through the essential skills that, in my experience, every aspiring developer or AI engineer should prioritize.</p>
<h2 id="heading-1-why-practical-skills-matter">1. Why Practical Skills Matter</h2>
<p>Many newcomers are drawn to the excitement of <a target="_blank" href="https://aidemos.com/ai-playground">advanced AI concepts</a>—like Generative AI, LLMs, RAG (Retrieval Augmented Generation), or vector databases. While these innovations are undoubtedly game-changing, they can overshadow the day-to-day coding, debugging, and deployment tasks that keep a project running smoothly.</p>
<p><strong>Think about it:</strong> Even the most cutting-edge AI system is useless if it can’t be integrated into a functional application. That requires reliable code, proper database connections, and well-structured APIs. It’s why companies—when hiring interns or fresh grads—emphasize practical software development skills just as much as familiarity with new AI trends.</p>
<p><strong>Key point:</strong> Whether your goal is traditional software development or specialized AI engineering, you’ll stand out by showing you can handle the basics and be an asset to the team rather than a liability. Mastering these fundamentals will make it far easier to adopt advanced <a target="_blank" href="https://aidemos.com/">AI tools</a> effectively.</p>
<h2 id="heading-2-embracing-self-learning-amp-adaptability">2. Embracing Self-Learning &amp; Adaptability</h2>
<p>In a world where <a target="_blank" href="https://youtu.be/RALmm6flXII?si=w_gJy6bOginm9IiP"><strong>ChatGPT</strong></a> and <a target="_blank" href="https://www.youtube.com/@FutureSmartAI"><strong>YouTube tutorials</strong></a> are just a click away, memorizing syntax is less important than knowing <strong>how</strong> to discover answers and adapt them to your specific needs. When you face a new challenge—be it <strong>connecting a database</strong> or <strong>troubleshooting an API route</strong>—your ability to learn on the fly is what truly sets you apart.</p>
<ul>
<li><p><strong>Resourcefulness Over Rote Memorization</strong>: Whether you’re copying a snippet from Stack Overflow or ChatGPT, your real value is in customizing and debugging that code for your application.</p>
</li>
<li><p><strong>Hands-On Experience</strong>: Reading documentation is great, but actually <strong>breaking things</strong> and fixing them is how you learn effectively.</p>
</li>
<li><p><strong>Continuous Upskilling</strong>: The AI landscape evolves quickly. If you want to keep up, you need to be comfortable <strong>teaching yourself</strong> new frameworks and libraries.</p>
</li>
</ul>
<h2 id="heading-3-mastering-practical-python">3. Mastering Practical Python</h2>
<p>While you don’t need to be a “Python guru” or a competitive programming champion, you do need to write clear, functional code that gets the job done. Here’s what matters most:</p>
<ul>
<li><p><strong>Use the Right Data Structures:</strong> Know when to use lists for ordered data, dictionaries for key-value pairs, and tuples for lightweight groupings.</p>
</li>
<li><p><strong>Organize Your Code with Functions and Files:</strong> This makes your code more readable, reusable, and easier to debug. Keep files small instead of writing long scripts.</p>
</li>
<li><p><strong>Handle Files and Data:</strong> Most real-world software and AI projects involve ingesting or cleaning data. Be comfortable working with CSV, JSON, or other common formats.</p>
</li>
</ul>
<h3 id="heading-common-pitfalls-to-avoid"><strong>Common Pitfalls to Avoid:</strong></h3>
<p>❌ Writing overly complex, hard-to-read code.</p>
<p>❌ Ignoring error handling—always use <code>try-except</code> blocks.</p>
<p>❌ Copy-pasting from Stack Overflow or ChatGPT without understanding the solution.</p>
<p>❌ Writing long, unstructured Python files instead of modular, function-based code split into small files.</p>
<h3 id="heading-hands-on-challenge"><strong>Hands-on Challenge:</strong></h3>
<blockquote>
<p>Write a Python script that reads a JSON file, processes it, and writes an updated file. Keep the script modular by splitting different functionalities into separate functions and files.</p>
</blockquote>
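<p>A minimal sketch of one way to approach this challenge, using only the standard library (the file names and record fields are illustrative placeholders):</p>

```python
import json
from pathlib import Path


def load_records(path: str) -> list:
    """Read a JSON file containing a list of records."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def add_full_name(records: list) -> list:
    """Example transformation: derive a full_name field from first/last names."""
    return [
        {**r, "full_name": f"{r.get('first', '')} {r.get('last', '')}".strip()}
        for r in records
    ]


def save_records(records: list, path: str) -> None:
    """Write the updated records back to disk as pretty-printed JSON."""
    Path(path).write_text(json.dumps(records, indent=2), encoding="utf-8")


# Usage: save_records(add_full_name(load_records("users.json")), "users_updated.json")
```

Keeping each step in its own small function makes the script easy to test and reuse.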
<p>In the professional world, clarity and reliability are often more valuable than clever but cryptic code. Once your Python basics are solid, you’ll be able to pick up new libraries or advanced frameworks with minimal fuss.</p>
<h2 id="heading-4-working-with-apis">4. Working with APIs</h2>
<p>The ability to consume and provide APIs is crucial in modern software development—whether you’re building e-commerce backends, <a target="_blank" href="https://youtu.be/W3FadhIpSmU?si=QQLZyacFcHo70M4v">AI-driven chatbots</a>, or anything in between.</p>
<h3 id="heading-api-consumption"><strong>API Consumption</strong></h3>
<p>Most applications communicate via RESTful APIs. Know how to:</p>
<p>✅ Send requests (<strong>GET</strong>, <strong>POST</strong>, <strong>PUT</strong>, <strong>DELETE</strong>).</p>
<p>✅ Handle authentication (e.g., bearer tokens, API keys).</p>
<p>✅ Parse JSON responses (often deeply nested).</p>
<h3 id="heading-handling-authentication-amp-security"><strong>Handling Authentication &amp; Security</strong></h3>
<p>Expect to encounter JWT (JSON Web Tokens), cookies, and CORS (Cross-Origin Resource Sharing). Understanding these ensures your app communicates securely and efficiently.</p>
<h3 id="heading-working-with-json"><strong>Working with JSON</strong></h3>
<p>APIs typically return JSON, so you’ll deal with Python dictionaries. Knowing how to navigate nested structures and handle edge cases will save you hours of troubleshooting.</p>
<h3 id="heading-practical-use-cases"><strong>Practical Use Cases</strong></h3>
<ul>
<li>Fetch data from a public API (e.g., weather, currency exchange), transform the response, and store it for analytics.</li>
</ul>
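<p>A sketch of what these pieces look like together. The endpoint, token, and response shape below are invented for illustration; substitute a real public API:</p>

```python
from typing import Optional

import requests

# Hypothetical endpoint -- replace with a real public API.
API_URL = "https://api.example.com/v1/weather"


def fetch_weather(city: str, token: str) -> dict:
    """Send an authenticated GET request and return the parsed JSON body."""
    resp = requests.get(
        API_URL,
        params={"city": city},
        headers={"Authorization": f"Bearer {token}"},  # bearer-token auth
        timeout=10,
    )
    resp.raise_for_status()  # fail loudly on 4xx/5xx responses
    return resp.json()


def extract_temperature(payload: dict) -> Optional[float]:
    """Walk a nested JSON structure defensively with .get() fallbacks."""
    return payload.get("data", {}).get("current", {}).get("temp_c")
```

The defensive `.get()` chain means a missing or malformed field yields `None` instead of a `KeyError` deep inside your pipeline.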
<h2 id="heading-5-essential-sql-and-database-operations">5. Essential SQL and Database Operations</h2>
<p>No matter what you build, you need a reliable way to store and retrieve data. SQL databases like <strong>MySQL</strong> and <strong>PostgreSQL</strong> are still core technologies in production environments.</p>
<ul>
<li><p><strong>Basic SQL Queries:</strong> Learn how to <code>SELECT</code>, <code>INSERT</code>, <code>UPDATE</code>, and <code>DELETE</code>. These cover most interactions with a relational database.</p>
</li>
<li><p><strong>Connecting with Python:</strong> Use libraries like <code>psycopg2</code> or <code>mysql-connector-python</code> to integrate your Python application with the database.</p>
</li>
</ul>
<h3 id="heading-real-world-examplehttpsyoutube38amtxy2ususi1ryzzvkt95zcq64oampt4569"><a target="_blank" href="https://youtu.be/38aMTXY2usU?si=1RyzZVkT95Zcq64O&amp;t=4569"><strong>Real-World Example</strong></a></h3>
<p>Many LLMs don’t persist previous user inputs once an API call finishes. To maintain conversation history (for a chatbot or a Generative AI assistant):</p>
<ol>
<li><p>Assign a <strong>session ID</strong> for each conversation.</p>
</li>
<li><p>Store all messages (both user and AI) in a SQL database.</p>
</li>
<li><p>For subsequent queries, retrieve the conversation history using that session ID.</p>
</li>
<li><p>Pass the history to the LLM for context, then store the new response.</p>
</li>
</ol>
<p>This simulates “memory” and vastly improves user experience.</p>
<h3 id="heading-practice-task"><strong>Practice Task:</strong></h3>
<blockquote>
<p>Create a simple database that stores user messages with timestamps and session IDs. Design a query to retrieve past messages for a specific user session.</p>
</blockquote>
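<p>Here is a minimal sketch of that workflow using Python's built-in <code>sqlite3</code>. The table and column names are my own choices; in production you would swap in MySQL or PostgreSQL with the connector libraries mentioned above:</p>

```python
import sqlite3
from datetime import datetime, timezone

# In-memory database for the demo; use a file path or a server DB in production.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS messages (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        session_id TEXT NOT NULL,
        role       TEXT NOT NULL,   -- 'user' or 'ai'
        content    TEXT NOT NULL,
        created_at TEXT NOT NULL
    )
""")


def save_message(session_id: str, role: str, content: str) -> None:
    """Append one message to the conversation history."""
    conn.execute(
        "INSERT INTO messages (session_id, role, content, created_at) VALUES (?, ?, ?, ?)",
        (session_id, role, content, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def get_history(session_id: str) -> list:
    """Retrieve a session's messages in chronological order, ready to pass to an LLM."""
    cur = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return cur.fetchall()
```

Before each LLM call, fetch the history for the session ID, prepend it to the prompt, and store the new user message and AI response afterward.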
<h2 id="heading-6-creating-your-own-apis-with-fastapihttpsyoutubekvdp4spwcc4siijxk83spo3qlrsim">6. <a target="_blank" href="https://youtu.be/KVdP4SpWcc4?si=IjXk83spo3qlRSim">Creating Your Own APIs with FastAPI</a></h2>
<p>Even if you’re not building complex AI services, knowing how to serve any functionality via an API is invaluable. FastAPI is a popular choice in Python circles due to its simplicity and asynchronous capabilities.</p>
<h3 id="heading-why-fastapi"><strong>Why FastAPI?</strong></h3>
<ul>
<li><p><strong>Speed and Simplicity:</strong> An asynchronous framework that makes handling concurrent requests more efficient.</p>
</li>
<li><p><strong>Auto-Generated Documentation:</strong> Teammates and clients can easily understand and test your endpoints.</p>
</li>
<li><p><strong>Widely Used in AI &amp; ML:</strong> FastAPI has become a go-to for deploying machine learning models quickly.</p>
</li>
</ul>
<h3 id="heading-key-concepts"><strong>Key Concepts</strong></h3>
<ul>
<li><p><strong>Endpoints (Routes):</strong> Clearly define the URLs (e.g., <code>/predict</code>) for specific functionalities.</p>
</li>
<li><p><strong>Sync vs. Async:</strong> Leverage asynchronous functions for better performance when scaling out.</p>
</li>
<li><p><strong>Data Validation:</strong> Validate incoming JSON data before processing to prevent runtime errors.</p>
</li>
</ul>
<h3 id="heading-mini-project"><strong>Mini Project:</strong></h3>
<blockquote>
<p>Build a FastAPI service that accepts text input and returns a sentiment score using an NLP model.</p>
</blockquote>
<h2 id="heading-6-getting-comfortable-with-basic-deploymenthttpsyoutube7fvpn25mmeqsijaygnuikbhizbre3">6. <a target="_blank" href="https://youtu.be/7FVPn25mmEQ?si=JAYGnuiKbHIZbrE3">Getting Comfortable with Basic Deployment</a></h2>
<p>Whether you’re a back-end engineer or a budding AI Engineer, your application needs to be accessible to others to deliver real value. That’s where deployment comes in.</p>
<h3 id="heading-why-deployment-matters"><strong>Why Deployment Matters</strong></h3>
<ul>
<li><p><strong>Accessibility:</strong> A deployed app or API can be accessed by anyone with the right permissions.</p>
</li>
<li><p><strong>Real-World Feedback:</strong> Live usage data and metrics guide iterative improvements.</p>
</li>
<li><p><strong>Collaboration:</strong> Demonstrates you understand the full development lifecycle, not just coding.</p>
</li>
</ul>
<h3 id="heading-common-deployment-options"><strong>Common Deployment Options</strong></h3>
<ol>
<li><p><strong>Cloud VMs:</strong> AWS, Azure, or Google Cloud let you rent servers and install your software stack.</p>
</li>
<li><p><strong>PaaS (Platform as a Service):</strong> Services like Heroku or Render manage much of the infrastructure for you.</p>
</li>
<li><p><strong>Company Infrastructure:</strong> Internships often provide access to enterprise-grade tools, a great way to learn hands-on without personal costs.</p>
</li>
</ol>
<h3 id="heading-hands-on-task"><strong>Hands-on Task:</strong></h3>
<blockquote>
<p>Deploy your FastAPI app to Render or a cloud VM and access it via a public URL.</p>
</blockquote>
<h2 id="heading-7-where-advanced-ai-concepts-fit-in">7. Where Advanced AI Concepts Fit In</h2>
<p>Once you’ve nailed down the basics, exploring advanced AI topics becomes far more rewarding. Whether it’s Generative AI, RAG, vector databases, or frameworks like LangChain, these cutting-edge tools are best leveraged when you can integrate them seamlessly into real applications.</p>
<h3 id="heading-integration-over-isolation"><strong>Integration Over Isolation</strong></h3>
<ul>
<li><p><a target="_blank" href="https://youtu.be/38aMTXY2usU?si=AqdP50LzuuWcPokO"><strong>RAG (Retrieval Augmented Generation)</strong></a><strong>:</strong> Fetch relevant context from a vector database, then pass it to an LLM.</p>
</li>
<li><p><a target="_blank" href="https://youtu.be/5NG8mefEsCU?si=y9AKNx7KsRo2RgnD"><strong>Vector Databases</strong></a><strong>:</strong> Specialized for embedding-based searches, often used in semantic search or question-answering.</p>
</li>
<li><p><a target="_blank" href="https://youtu.be/NQWfvhw7OcI?si=dv_VfrRQYQKwUw2S"><strong>LangChain (and Similar)</strong></a><strong>:</strong> Frameworks that streamline building LLM-powered applications.</p>
</li>
</ul>
<h2 id="heading-real-world-example"><strong>Real-World Example:</strong></h2>
<blockquote>
<p>Suppose you're developing a document search tool for legal professionals. Using RAG with a vector database enables efficient retrieval of relevant case law and legal precedents based on user queries: RAG fetches the right references, and an LLM summarizes them, improving productivity compared to traditional keyword-based search.</p>
</blockquote>
<h3 id="heading-applied-ai-use-case"><strong>Applied AI Use Case:</strong></h3>
<p>Imagine you're building an AI-powered FAQ chatbot. Instead of just <strong>generating random responses</strong>, you:</p>
<p>1️⃣ <strong>Retrieve relevant FAQs</strong> from a vector database.</p>
<p>2️⃣ <strong>Pass that context to an LLM.</strong></p>
<p>3️⃣ <strong>Generate a response using the AI model.</strong></p>
<p>4️⃣ <strong>Deliver the output via an API.</strong></p>
<p>The entire system <strong>depends</strong> on strong coding, API, and database skills.</p>
<blockquote>
<p>Tip: Instead of jumping straight to LangChain, try manually implementing a RAG pipeline to understand how everything connects.</p>
</blockquote>
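<p>To make that tip concrete, here is a toy end-to-end RAG pipeline with no external dependencies. The bag-of-words "embeddings" and hard-coded documents are stand-ins for a real embedding model and vector database:</p>

```python
import math
from collections import Counter

# A tiny "knowledge base" standing in for a vector database.
DOCS = [
    "Our return window is 30 days from the delivery date.",
    "Standard shipping takes 3-5 business days.",
    "We accept credit cards and PayPal.",
]


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words vector. Real pipelines use model embeddings."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, k: int = 1) -> list:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query: str) -> str:
    """Augment the user query with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The final prompt would then be sent to an LLM; swapping the toy pieces for real embeddings and a vector store gives you the production version, which is exactly what frameworks like LangChain wrap for you.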
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/8sSHg1034r0?si=-q_Q35vXNo6R_FcL">https://youtu.be/8sSHg1034r0?si=-q_Q35vXNo6R_FcL</a></div>
<h2 id="heading-8-conclusion-amp-next-steps">8. Conclusion &amp; Next Steps</h2>
<p>After years of working in this industry, I can confidently say these foundational skills are what truly enable success—whether you’re a software developer, a data scientist, or an AI specialist. I’ve interviewed and mentored many candidates, and the ones who excel are those who can <strong>both</strong> understand advanced AI concepts <strong>and</strong> handle the essential coding and deployment work.</p>
<h3 id="heading-key-takeaways"><strong>Key Takeaways</strong></h3>
<ul>
<li><p><strong>Practice &amp; Portfolio:</strong> Build small, complete projects that demonstrate your ability to write clean Python code, work with APIs, interact with a database, and (ideally) deploy your application.</p>
</li>
<li><p><strong>Seek Internships &amp; Mentorship:</strong> Real-world experience accelerates learning. You’ll gain exposure to production environments and valuable feedback from peers or senior developers.</p>
</li>
<li><p><strong>Explore Advanced AI Topics:</strong> Once you have the basics locked in, frameworks like LangChain or vector databases can add significant value to your skill set.</p>
</li>
<li><p><strong>Stay Curious &amp; Adaptable:</strong> Technology moves fast—those who keep learning and adapting will always stay relevant.</p>
</li>
</ul>
<p><strong>Final Thought:</strong></p>
<p>The world of tech—especially AI—is growing at an unprecedented pace. Make sure you have the fundamentals in place so you can ride that wave rather than getting washed away by it. With a strong foundation, you’ll be ready to contribute meaningfully to any team, whether you’re a software developer, data engineer, or the next generative AI whiz.</p>
<p>For real-world examples of our work, take a look at our <a target="_blank" href="https://www.futuresmart.ai/case-studies"><strong>case studies</strong></a>.</p>
]]></content:encoded></item><item><title><![CDATA[Freshdesk Ticketing Automation with AI Agents: A Complete Guide]]></title><description><![CDATA[Introduction
At FutureSmart AI, we've helped many clients improve their support systems with AI customer service solutions. Now, we're sharing our proven approach to creating a custom AI assistant for customer service using Freshdesk. This tutorial s...]]></description><link>https://blog.futuresmart.ai/freshdesk-automation-with-ai-agents</link><guid isPermaLink="true">https://blog.futuresmart.ai/freshdesk-automation-with-ai-agents</guid><category><![CDATA[AI]]></category><category><![CDATA[#agent]]></category><category><![CDATA[automation]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[langchain]]></category><category><![CDATA[freshdesk]]></category><dc:creator><![CDATA[Shreyas Dhaware]]></dc:creator><pubDate>Thu, 09 Jan 2025 11:55:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1736414395312/24a22eb8-e0a7-41e0-b80b-82bdcc58aafd.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>At <a target="_blank" href="https://www.futuresmart.ai/">FutureSmart AI</a>, we've helped many clients improve their support systems with <strong>AI customer service</strong> solutions. Now, we're sharing our proven approach to creating a custom <strong>AI assistant for customer service</strong> using Freshdesk. This tutorial shows you how to build an <strong>AI customer service chatbot</strong> that delivers real results – from translating customer queries into SQL commands to generating smart responses. Having deployed similar solutions across various industries, we know exactly what works. By the end of this guide, you'll have a fully functional <strong>AI customer support agent</strong> ready to streamline your support operations.</p>
<p>Building on the foundational concepts covered in our previous <a target="_blank" href="https://blog.futuresmart.ai/series/langraph-tutorial"><strong>LangGraph tutorial series</strong></a>, where we explored creating simple AI tools for customer service using nodes and edges, this guide takes it a step further. You'll learn how to develop an agent that seamlessly:</p>
<ul>
<li><p>Translates natural language into precise SQL queries</p>
</li>
<li><p>Interacts with <strong>Freshdesk APIs</strong> to manage tickets</p>
</li>
<li><p>Taps into knowledge bases for accurate responses</p>
</li>
<li><p>Retrieves real-time order status and product details</p>
</li>
<li><p>Handles shipping and return policy inquiries</p>
</li>
<li><p>Generates context-aware responses to customer queries</p>
</li>
</ul>
<p>Drawing from our years of implementing AI solutions, we've distilled the essentials of building a robust <strong>Freshdesk chatbot</strong> that truly delivers results. This tutorial combines real-world applications with step-by-step guidance to help you create an AI-powered chatbot capable of retrieving ticket details, executing database queries, and enhancing customer service automation.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735384007948/7c4fad54-92f0-4876-9317-21890434e0d2.jpeg" alt class="image--center mx-auto" /></p>
<h2 id="heading-setting-up-the-environment">Setting up the Environment</h2>
<pre><code class="lang-bash">!pip install -U langchain langchain-chroma pypdf sentence-transformers langgraph langchain_openai langchain_community requests os
</code></pre>
<h2 id="heading-setting-up-api-keys">Setting up API Keys</h2>
<p>Before diving into building your Freshdesk AI integration, it’s crucial to set up your API keys. These keys allow your agent to interact securely with external services like <strong>Freshdesk</strong> and OpenAI’s GPT models. Without them, the tools cannot function.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> getpass
<span class="hljs-keyword">import</span> os

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_set_env</span>(<span class="hljs-params">var: str</span>):</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> os.environ.get(var):
        os.environ[var] = getpass.getpass(<span class="hljs-string">f"<span class="hljs-subst">{var}</span>: "</span>)

_set_env(<span class="hljs-string">"FRESHDESK_API_KEY"</span>)
_set_env(<span class="hljs-string">"OPENAI_API_KEY"</span>)
_set_env(<span class="hljs-string">"FESHDESK_URL"</span>)
</code></pre>
<h2 id="heading-creating-the-llm-object">Creating the LLM Object</h2>
<p>Here’s how to initialize the LLM using LangChain <code>ChatOpenAI</code></p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI

llm = ChatOpenAI(model_name=<span class="hljs-string">"gpt-4o"</span>)
</code></pre>
<h2 id="heading-summary-tool-integration">Summary Tool Integration</h2>
<p><strong>Automated customer support</strong> often requires summarizing ticket messages quickly. This tool fetches and processes data via <strong>Freshdesk APIs</strong> to generate concise summaries.</p>
<ol>
<li><p><strong>Get Ticket Message</strong>: Function that gets Ticket Data through Freshdesk API’s given <code>ticket_id</code></p>
<pre><code class="lang-python"> <span class="hljs-keyword">import</span> requests
 <span class="hljs-keyword">import</span> os
 FRESHDESK_API_KEY = os.getenv(<span class="hljs-string">"FRESHDESK_API_KEY"</span>)
 FESHDESK_URL = os.getenv(<span class="hljs-string">"FESHDESK_URL"</span>)

 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_messages</span>(<span class="hljs-params">ticket_id</span>):</span>
     url = <span class="hljs-string">f'https://<span class="hljs-subst">{FESHDESK_URL}</span>.freshdesk.com/api/v2/tickets/<span class="hljs-subst">{ticket_id}</span>?include=conversations'</span>
     headers = {
         <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/json'</span>
     }
     auth = (FRESHDESK_API_KEY, <span class="hljs-string">'X'</span>)  <span class="hljs-comment"># Replace 'X' with your actual API key</span>
     response = requests.get(url, headers=headers, auth=auth)
     <span class="hljs-keyword">return</span> response.json()
</code></pre>
</li>
<li><p><strong>Process Message:</strong></p>
<pre><code class="lang-python"> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">process_messages</span>(<span class="hljs-params">messages</span>):</span>
     subject = messages[<span class="hljs-string">'subject'</span>]
     description = messages[<span class="hljs-string">'description_text'</span>]
     conversation = messages[<span class="hljs-string">'conversations'</span>]
     messages = <span class="hljs-string">""</span>
     messages += <span class="hljs-string">"Subject : "</span> + subject + <span class="hljs-string">"\n"</span>
     messages += <span class="hljs-string">"User query : \n"</span> + description + <span class="hljs-string">"\n"</span>
     <span class="hljs-keyword">for</span> message <span class="hljs-keyword">in</span> conversation:
         <span class="hljs-keyword">if</span> message[<span class="hljs-string">'incoming'</span>]:
             messages += <span class="hljs-string">"User Query : \n"</span> + message[<span class="hljs-string">'body_text'</span>] + <span class="hljs-string">"\n"</span>
         <span class="hljs-keyword">else</span>:
             messages += <span class="hljs-string">"Agent Response : \n"</span> + message[<span class="hljs-string">'body_text'</span>] + <span class="hljs-string">"\n"</span>
     <span class="hljs-keyword">return</span> messages
</code></pre>
</li>
<li><p><strong>Summary Chain:</strong> This method consolidates all the input Ticket Messages into a single prompt, which the LLM then processes.</p>
<pre><code class="lang-python"> prompt = ChatPromptTemplate.from_messages(
      [(<span class="hljs-string">"system"</span>, <span class="hljs-string">"""FutureSmart TechStore is committed to being the leading destination for technology enthusiasts worldwide by combining quality, value, and exceptional service. At FutureSmart TechStore, we don’t just sell products—we provide solutions that enhance your lifestyle and work environment. We aim to offer great value to our customers through a range of promotions and discounts, ensuring you get the best deals on the latest technology products.

     Our collection includes a wide range of tech products designed to meet the needs of various customers, Our collection includes a wide range of tech products such as Smart Home Solutions (intelligent lighting systems, advanced security devices), Wearable Technology (fitness trackers, smartwatches), Mobile Accessories (chargers, cases, screen protectors), and Computing Peripherals (keyboards, mice, storage devices), designed to meet the needs of various customers, from tech enthusiasts to everyday users.

     If the query is not in English, please identify the language and generate a summary of the conversation in English. The response should be generated in the same language as the query.Also mention order id and customer name if present.\\n\\n{context}. Summarize:"""</span>)]
     )

 <span class="hljs-comment"># Instantiate chain</span>
 chain = prompt | llm
 result = chain.invoke({<span class="hljs-string">"context"</span>: context})  <span class="hljs-comment"># `context` comes from process_messages()</span>
</code></pre>
</li>
<li><p><strong>Build a Summary Tool:</strong> Define a tool that generates ticket summary</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain.tools <span class="hljs-keyword">import</span> tool
 <span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel

 <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">SummarizeToolSchema</span>(<span class="hljs-params">BaseModel</span>):</span>
     ticket_id: int

<span class="hljs-meta"> @tool(args_schema=SummarizeToolSchema)</span>
 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">summarize_tool</span>(<span class="hljs-params">ticket_id: int</span>):</span>
     <span class="hljs-string">"""Tool to Summarize Ticket information based on ticket_id provided
     `ticket_id`: int"""</span>
     print(<span class="hljs-string">"INSIDE SUMMARIZE NODE"</span>, ticket_id)
     <span class="hljs-comment"># Invoke chain</span>
     data = get_messages(ticket_id)
     context = process_messages(data)
     print(context)
     result = chain.invoke({<span class="hljs-string">"context"</span>: context})
     <span class="hljs-keyword">return</span> <span class="hljs-string">f"ticket_id: <span class="hljs-subst">{ticket_id}</span>"</span>, <span class="hljs-string">f"Ticket Summary: <span class="hljs-subst">{result.content}</span>"</span>
</code></pre>
</li>
<li><p><strong>Testing the Tool</strong></p>
<pre><code class="lang-python"> summarize_tool.invoke({<span class="hljs-string">"ticket_id"</span>:<span class="hljs-number">130</span>})
</code></pre>
<pre><code class="lang-python"> <span class="hljs-comment"># output</span>
 INSIDE SUMMARIZE NODE <span class="hljs-number">130</span>
 <span class="hljs-string">'The customer is inquiring about the status of their order #12345, as UPS lost the package. They understood that a new package would be sent immediately, but nearly a month has passed with no update. They are asking for information on what is happening.'</span>
</code></pre>
</li>
</ol>
<h2 id="heading-agentic-rag-tool-integration">Agentic RAG Tool Integration</h2>
<p><a target="_blank" href="https://blog.futuresmart.ai/langchain-rag-from-basics-to-production-ready-rag-chatbot"><strong>Retrieval-Augmented Generation (RAG)</strong></a> enhances your AI agent by enabling it to fetch relevant documents and deliver accurate, context-rich responses. Whether you're developing a <strong>RAG chatbot</strong>, building <strong>RAG pipelines</strong>, or exploring <a target="_blank" href="https://www.youtube.com/watch?v=38aMTXY2usU&amp;t=6256s"><strong>RAG for enterprise</strong></a>, this guide will walk you through the integration process step-by-step.</p>
<h3 id="heading-why-choose-rag">Why Choose RAG?</h3>
<ol>
<li><p><strong>Local Knowledge Integration</strong>: RAG uses local knowledge to customize responses based on private datasets.</p>
</li>
<li><p><strong>Post-Generation Customization</strong>: Tweak generated outputs or enforce a <strong>customized reply output</strong> format.</p>
</li>
</ol>
<h3 id="heading-steps-to-integrate-the-rag-tool"><strong>Steps to Integrate the RAG Tool</strong></h3>
<ol>
<li><strong>Load Documents</strong>: Use the <code>PyPDFLoader</code> and <code>Docx2txtLoader</code> to load documents for your <strong>RAG app</strong>.</li>
</ol>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain_community.document_loaders <span class="hljs-keyword">import</span> PyPDFLoader, Docx2txtLoader

 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">load_documents</span>(<span class="hljs-params">folder_path: str</span>) -&gt; List[Document]:</span>
     documents = []
     <span class="hljs-keyword">for</span> filename <span class="hljs-keyword">in</span> os.listdir(folder_path):
         file_path = os.path.join(folder_path, filename)
         <span class="hljs-keyword">if</span> filename.endswith(<span class="hljs-string">'.pdf'</span>):
             loader = PyPDFLoader(file_path)
         <span class="hljs-keyword">elif</span> filename.endswith(<span class="hljs-string">'.docx'</span>):
             loader = Docx2txtLoader(file_path)
         <span class="hljs-keyword">else</span>:
             print(<span class="hljs-string">f"Unsupported file type: <span class="hljs-subst">{filename}</span>"</span>)
             <span class="hljs-keyword">continue</span>
         documents.extend(loader.load())
     <span class="hljs-keyword">return</span> documents

 folder_path = <span class="hljs-string">"/docs"</span>
 documents = load_documents(folder_path)
 print(<span class="hljs-string">f"Loaded <span class="hljs-subst">{len(documents)}</span> documents from the folder."</span>)
</code></pre>
<ol start="2">
<li><strong>Split Text into Chunks</strong>: Prepare documents for vectorization by splitting them into manageable chunks, which is essential for RAG.</li>
</ol>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_text_splitters <span class="hljs-keyword">import</span> RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
     chunk_size=<span class="hljs-number">1000</span>,
     chunk_overlap=<span class="hljs-number">200</span>,
     length_function=len
)

 splits = text_splitter.split_documents(documents)
 print(<span class="hljs-string">f"Split the documents into <span class="hljs-subst">{len(splits)}</span> chunks."</span>)
</code></pre>
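<p>To build intuition for <code>chunk_size</code> and <code>chunk_overlap</code>, here is a simplified character-window sketch. This is not the actual <code>RecursiveCharacterTextSplitter</code> (which also splits on separators such as paragraphs and sentences), just an illustration of how overlapping windows preserve context across chunk boundaries:</p>
<pre><code class="lang-python">def naive_split(text: str, chunk_size: int, chunk_overlap: int) -> list:
    # Slide a window of chunk_size characters, stepping by
    # chunk_size - chunk_overlap so consecutive chunks share context.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(str(i % 10) for i in range(2500))
chunks = naive_split(text, chunk_size=1000, chunk_overlap=200)
print(len(chunks))                          # 4 chunks for 2500 characters
print(chunks[0][-200:] == chunks[1][:200])  # True: the 200-char overlap
</code></pre>
<p>Because each chunk repeats the last 200 characters of the previous one, a sentence that straddles a boundary still appears whole in at least one chunk.</p>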
<ol start="3">
<li><strong>Generate Embeddings</strong>: Use <a target="_blank" href="https://blog.futuresmart.ai/i-never-knew-sentence-transformers-could-be-so-useful"><strong>SentenceTransformers</strong></a> to create embeddings for efficient similarity searches in your RAG pipeline.</li>
</ol>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_community.embeddings <span class="hljs-keyword">import</span> HuggingFaceEmbeddings

model_name = <span class="hljs-string">"sentence-transformers/all-MiniLM-L6-v2"</span>
model_kwargs = {<span class="hljs-string">'device'</span>: <span class="hljs-string">'cpu'</span>}
encode_kwargs = {<span class="hljs-string">'normalize_embeddings'</span>: <span class="hljs-literal">False</span>}
embedding_function = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)
</code></pre>
<ol start="4">
<li><p><strong>Create and Persist a Vector Store</strong>:</p>
<p> Store embeddings for future retrieval with <strong>Chroma</strong>.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain_chroma <span class="hljs-keyword">import</span> Chroma

 collection_name = <span class="hljs-string">"my_collection"</span>
 vectorstore = Chroma.from_documents(
      collection_name=collection_name,
      documents=splits,
      embedding=embedding_function,
      persist_directory=<span class="hljs-string">"./chroma_db"</span>
 )
</code></pre>
</li>
<li><p><strong>Build the Retriever Tool</strong>:</p>
<p> Implement a retriever to fetch semantically similar documents for user queries. This tool powers <strong>RAG with action reply in specific format</strong> or customized responses.</p>
<pre><code class="lang-python">
 <span class="hljs-keyword">from</span> langchain.tools <span class="hljs-keyword">import</span> tool
 <span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel
 <span class="hljs-keyword">from</span> langchain_chroma <span class="hljs-keyword">import</span> Chroma

 <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">RagToolSchema</span>(<span class="hljs-params">BaseModel</span>):</span>
     question: str

<span class="hljs-meta"> @tool(args_schema=RagToolSchema)</span>
 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">retriever_tool</span>(<span class="hljs-params">question: str</span>):</span>
     <span class="hljs-string">"""Tool to Retrieve Semantically Similar documents to answer User Questions
     related to FutureSmart Tech Store and its Processes.

     `question`: str
     """</span>
     print(<span class="hljs-string">"INSIDE RETRIEVER NODE"</span>, question)

     vectorstore = Chroma(
         collection_name=<span class="hljs-string">"my_collection"</span>,
         embedding_function=embedding_function,
         persist_directory=<span class="hljs-string">"./chroma_db"</span>
     )

     retriever = vectorstore.as_retriever(search_kwargs={<span class="hljs-string">"k"</span>: <span class="hljs-number">5</span>})
     retriever_result = retriever.invoke(question)

     context = <span class="hljs-string">"\n\n"</span>.join(doc.page_content <span class="hljs-keyword">for</span> doc <span class="hljs-keyword">in</span> retriever_result)
     <span class="hljs-keyword">return</span> <span class="hljs-string">f"(context: <span class="hljs-subst">{context}</span>)"</span>
</code></pre>
<p> This tool allows your AI agent to retrieve relevant chunks of information from your document database, making it highly effective for knowledge-based tasks.</p>
</li>
</ol>
<p>You can test <code>retriever_tool</code> right now to see how it performs:</p>
<pre><code class="lang-python"><span class="hljs-comment"># pass a question to the tool</span>
retriever_results = retriever_tool.invoke({<span class="hljs-string">"question"</span>: <span class="hljs-string">"what are different payment methods supported"</span>})
print(retriever_results)
</code></pre>
<pre><code class="lang-python">INSIDE RETRIEVER NODE what are different payment methods supported
<span class="hljs-string">'At FutureSmart TechStore, weaimtomakeyourshoppingexperienceasseamlessandsecureaspossible. Wesupport avarietyof payment methodstosuit yourconvenience.\nSupportedPaymentMethods\n1. CreditandDebitCards○ Visa○ Mastercard○ AmericanExpress○ RuPay2. DigitalWallets○ Paytm○ PhonePe○ GooglePay○ AmazonPay3. NetBanking○ Supportedforall majorIndianbanksincludingHDFC, ICICI, SBI, AxisBank, andmore.4. UPIPayments○ SeamlesspaymentsusingUnifiedPaymentsInterface(UPI).5. CashonDelivery(COD)○ Availableforselect locationsinIndia.○ MaximumCODlimit: INR10,000.6. EMIOptions○ No-cost EMI andstandardEMI optionsareavailableformajorcredit cards.○ EMI optionsdependonthebankandcardtype.7. BankTransfers○ Direct banktransfersaresupportedforbulkorbusinesspurchases.\nPaymentSecurity\n● All onlinetransactionsareprocessedthroughsecureandencryptedgatewaystoensurethesafetyof yourinformation.● Wecomplywiththelatest PCI DSS(Payment CardIndustryDataSecurityStandard)guidelines.\nImportantNotes\n\nAt FutureSmart TechStore, weaimtomakeyourshoppingexperienceasseamlessandsecureaspossible. Wesupport avarietyof payment methodstosuit yourconvenience.\nSupportedPaymentMethods\n1. CreditandDebitCards○ Visa○ Mastercard○ AmericanExpress○ RuPay2. DigitalWallets○ Paytm○ PhonePe○ GooglePay○ AmazonPay3. NetBanking○ Supportedforall majorIndianbanksincludingHDFC, ICICI, SBI, AxisBank, andmore.4. UPIPayments○ SeamlesspaymentsusingUnifiedPaymentsInterface(UPI).5. CashonDelivery(COD)○ Availableforselect locationsinIndia.○ MaximumCODlimit: INR10,000.6. EMIOptions○ No-cost EMI andstandardEMI optionsareavailableformajorcredit cards.○ EMI optionsdependonthebankandcardtype.7. BankTransfers○ Direct banktransfersaresupportedforbulkorbusinesspurchases.\nPaymentSecurity\n● All onlinetransactionsareprocessedthroughsecureandencryptedgatewaystoensurethesafetyof yourinformation.● Wecomplywiththelatest PCI DSS(Payment CardIndustryDataSecurityStandard)guidelines.\nImportantNotes'</span>
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Check out how we <a target="_self" href="https://youtu.be/8sSHg1034r0?si=KWNLKnIQ-AxF7Y0_">built a RAG application from scratch without using LangChain or LlamaIndex</a></div>
</div>

<h2 id="heading-nl2sql-tool-integration">NL2SQL Tool Integration</h2>
<p>We now have the Summarization and RAG tools ready for the AI Agent, leaving only the NL2SQL tool to be built.</p>
<p>The SQL Agent serves as a bridge between natural language and SQL databases by generating and executing SQL queries from user questions. This allows the AI Agent to efficiently handle and answer database-related queries.</p>
<h2 id="heading-steps-to-integrate-the-nl2sql-tool"><strong>Steps to Integrate the NL2SQL Tool</strong></h2>
<p>We're using a MySQL database containing an e-commerce dataset to test SQL queries. The dataset includes Product and Order tables.</p>
<ol>
<li><p><strong>Initialize the Database Connection</strong><br /> First, establish a connection with the <strong>LangChain SQL Database</strong> utility.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain_community.utilities <span class="hljs-keyword">import</span> SQLDatabase
 <span class="hljs-keyword">import</span> os
 db_user = os.getenv(<span class="hljs-string">"DB_USER"</span>)
 db_password = os.getenv(<span class="hljs-string">"DB_PASSWORD"</span>)
 db_host = os.getenv(<span class="hljs-string">"DB_HOST"</span>)
 db_name = os.getenv(<span class="hljs-string">"DB_NAME"</span>)

 db = SQLDatabase.from_uri(<span class="hljs-string">f"mysql+pymysql://<span class="hljs-subst">{db_user}</span>:<span class="hljs-subst">{db_password}</span>@<span class="hljs-subst">{db_host}</span>/<span class="hljs-subst">{db_name}</span>"</span>,sample_rows_in_table_info=<span class="hljs-number">1</span>)
</code></pre>
</li>
<li><p><strong>Clean SQL Queries</strong><br /> This function is essential. In many client projects, we have seen that the SQL query generated by the LLM contains unnecessary symbols, stray text, or backticks, which cause errors at execution time. We therefore need a cleaning function for error-free <strong>text-to-SQL conversion</strong>.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">import</span> re        

 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">clean_sql_query</span>(<span class="hljs-params">text: str</span>) -&gt; str:</span>

     <span class="hljs-comment"># Step 1: Remove code block syntax and any SQL-related tags</span>
     <span class="hljs-comment"># This handles variations like ```sql, ```SQL, ```SQLQuery, etc.</span>
     block_pattern = <span class="hljs-string">r"```(?:sql|SQL|SQLQuery|mysql|postgresql)?\s*(.*?)\s*```"</span>
     text = re.sub(block_pattern, <span class="hljs-string">r"\1"</span>, text, flags=re.DOTALL)

     <span class="hljs-comment"># Step 2: Handle "SQLQuery:" prefix and similar variations</span>
     <span class="hljs-comment"># This will match patterns like "SQLQuery:", "SQL Query:", "MySQL:", etc.</span>
     prefix_pattern = <span class="hljs-string">r"^(?:SQL\s*Query|SQLQuery|MySQL|PostgreSQL|SQL)\s*:\s*"</span>
     text = re.sub(prefix_pattern, <span class="hljs-string">""</span>, text, flags=re.IGNORECASE)

     <span class="hljs-comment"># Step 3: Extract the first SQL statement if there's random text after it</span>
     <span class="hljs-comment"># Look for a complete SQL statement ending with semicolon</span>
     sql_statement_pattern = <span class="hljs-string">r"(SELECT.*?;)"</span>
     sql_match = re.search(sql_statement_pattern, text, flags=re.IGNORECASE | re.DOTALL)
     <span class="hljs-keyword">if</span> sql_match:
         text = sql_match.group(<span class="hljs-number">1</span>)

     <span class="hljs-comment"># Step 4: Remove backticks around identifiers</span>
     text = re.sub(<span class="hljs-string">r'`([^`]*)`'</span>, <span class="hljs-string">r'\1'</span>, text)

     <span class="hljs-comment"># Step 5: Normalize whitespace</span>
     <span class="hljs-comment"># Replace multiple spaces with single space</span>
     text = re.sub(<span class="hljs-string">r'\s+'</span>, <span class="hljs-string">' '</span>, text)

     <span class="hljs-comment"># Step 6: Preserve newlines for main SQL keywords to maintain readability</span>
     keywords = [<span class="hljs-string">'SELECT'</span>, <span class="hljs-string">'FROM'</span>, <span class="hljs-string">'WHERE'</span>, <span class="hljs-string">'GROUP BY'</span>, <span class="hljs-string">'HAVING'</span>, <span class="hljs-string">'ORDER BY'</span>,
                <span class="hljs-string">'LIMIT'</span>, <span class="hljs-string">'JOIN'</span>, <span class="hljs-string">'LEFT JOIN'</span>, <span class="hljs-string">'RIGHT JOIN'</span>, <span class="hljs-string">'INNER JOIN'</span>,
                <span class="hljs-string">'OUTER JOIN'</span>, <span class="hljs-string">'UNION'</span>, <span class="hljs-string">'VALUES'</span>, <span class="hljs-string">'INSERT'</span>, <span class="hljs-string">'UPDATE'</span>, <span class="hljs-string">'DELETE'</span>]

     <span class="hljs-comment"># Case-insensitive replacement for keywords</span>
     pattern = <span class="hljs-string">'|'</span>.join(<span class="hljs-string">r'\b{}\b'</span>.format(k) <span class="hljs-keyword">for</span> k <span class="hljs-keyword">in</span> keywords)
     text = re.sub(<span class="hljs-string">f'(<span class="hljs-subst">{pattern}</span>)'</span>, <span class="hljs-string">r'\n\1'</span>, text, flags=re.IGNORECASE)

     <span class="hljs-comment"># Step 7: Final cleanup</span>
     <span class="hljs-comment"># Remove leading/trailing whitespace and extra newlines</span>
     text = text.strip()
     text = re.sub(<span class="hljs-string">r'\n\s*\n'</span>, <span class="hljs-string">'\n'</span>, text)

     <span class="hljs-keyword">return</span> text
</code></pre>
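<p> To see the cleaner in action, feed it a typical fenced LLM response: the code block, the "SQLQuery:" prefix, and the backticks are all stripped away. The block below repeats a condensed version of the same cleaning steps so the example runs standalone:</p>
<pre><code class="lang-python"> import re

 def clean_sql_query(text: str) -> str:
     # Condensed version of the cleaner above: strip code fences,
     # "SQLQuery:"-style prefixes, and backticks, then normalize spaces.
     text = re.sub(r"```(?:sql|SQL|SQLQuery|mysql|postgresql)?\s*(.*?)\s*```",
                   r"\1", text, flags=re.DOTALL)
     text = re.sub(r"^(?:SQL\s*Query|SQLQuery|MySQL|PostgreSQL|SQL)\s*:\s*",
                   "", text, flags=re.IGNORECASE)
     text = re.sub(r"`([^`]*)`", r"\1", text)
     return re.sub(r"\s+", " ", text).strip()

 raw = "SQLQuery: ```sql\nSELECT `order_status` FROM `Orders` WHERE order_id = 8;\n```"
 print(clean_sql_query(raw))  # SELECT order_status FROM Orders WHERE order_id = 8;
</code></pre>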
</li>
<li><p><strong>Construct a SQL chain</strong></p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain_community.tools <span class="hljs-keyword">import</span> QuerySQLDatabaseTool

 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_sql_chain</span>():</span>
     execute_query = QuerySQLDatabaseTool(db=db)

     prompt = ChatPromptTemplate.from_messages([
         (<span class="hljs-string">"system"</span>, <span class="hljs-string">"""You are a MySQL expert. Given an input question, 
         create a syntactically correct MySQL query to run. 
         Use the LIKE operator for all string matching queries, 
         ensuring proper use of wildcards (% or _) for partial matches. 
         Only use the = operator when querying the order_id or product_id column. 
         Here is the relevant table information: {table_info}"""</span>),
         (<span class="hljs-string">"human"</span>, <span class="hljs-string">"{human_message}"</span>)
     ])

     write_query = prompt | llm | StrOutputParser()

     chain = (
         RunnablePassthrough.assign(
             query=write_query | RunnableLambda(clean_sql_query)
         ).assign(
             result=itemgetter(<span class="hljs-string">"query"</span>) | execute_query
         )
     )

     <span class="hljs-keyword">return</span> chain
</code></pre>
</li>
<li><p><strong>Create the NL2SQL Tool</strong><br /> Here’s how to build and integrate the <strong>NL2SQL tool</strong>:</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain_community.tools <span class="hljs-keyword">import</span> QuerySQLDatabaseTool
 <span class="hljs-keyword">from</span> operator <span class="hljs-keyword">import</span> itemgetter
 <span class="hljs-keyword">import</span> re
 <span class="hljs-keyword">from</span> langchain_core.output_parsers <span class="hljs-keyword">import</span> StrOutputParser
 <span class="hljs-keyword">from</span> langchain_core.prompts <span class="hljs-keyword">import</span> ChatPromptTemplate
 <span class="hljs-keyword">from</span> langchain_core.runnables <span class="hljs-keyword">import</span> RunnablePassthrough, RunnableLambda

 <span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Optional

 <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">SQLToolSchema</span>(<span class="hljs-params">BaseModel</span>):</span>
     question: str
     order_id: str
     product_id: Optional[str] = <span class="hljs-literal">None</span>

<span class="hljs-meta"> @tool(args_schema=SQLToolSchema)</span>
 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">nl2sql_tool</span>(<span class="hljs-params">question: str, order_id: str, product_id: str</span>):</span>
   <span class="hljs-string">"""Tool to get Product detail or Order status
   `question`: str, `order_id`: str, `product_id`: str"""</span>

   print(<span class="hljs-string">"INSIDE NL2SQL TOOL"</span>, question)
   chain = get_sql_chain()
   table_info = db.get_table_info() 
   human_message = <span class="hljs-string">f"question: <span class="hljs-subst">{question}</span>, order id: <span class="hljs-subst">{order_id}</span>, product_id: <span class="hljs-subst">{product_id}</span>, sql_query:"</span> 
   print(<span class="hljs-string">"human_message:"</span>, human_message)
   response = chain.invoke({<span class="hljs-string">"human_message"</span>: human_message, <span class="hljs-string">"table_info"</span>:table_info})
   <span class="hljs-keyword">return</span> <span class="hljs-string">f"Product detail and Order status: <span class="hljs-subst">{response[<span class="hljs-string">'result'</span>]}</span>, SQL Query used:<span class="hljs-subst">{response[<span class="hljs-string">'query'</span>]}</span>"</span>
</code></pre>
</li>
<li><p><strong>Test the Tool</strong><br /> Use a sample query to verify functionality:</p>
<pre><code class="lang-python"> question = <span class="hljs-string">"Check Order status"</span>
 order_id = <span class="hljs-string">"8"</span>
 product_id = <span class="hljs-string">""</span>
 result = nl2sql_tool.invoke({<span class="hljs-string">"question"</span>: question,<span class="hljs-string">"order_id"</span>: order_id, <span class="hljs-string">"product_id"</span>:product_id})
 print(<span class="hljs-string">f"Question: <span class="hljs-subst">{question}</span>"</span>)
 print(<span class="hljs-string">f"Answer: <span class="hljs-subst">{result}</span>"</span>)
</code></pre>
<pre><code class="lang-python">  <span class="hljs-comment"># output</span>
 INSIDE NL2SQL TOOL Check Order status
 human_message: question: Check Order status, order id: <span class="hljs-number">8</span>, product_id: , sql_query:
 Question: Check Order status
 Answer: Product detail <span class="hljs-keyword">and</span> Order status: [(<span class="hljs-string">'Processing'</span>,)], SQL Query used:SELECT order_status 
 FROM Orders 
 WHERE order_id = <span class="hljs-number">8</span>;
</code></pre>
</li>
</ol>
<blockquote>
<p>To master NL2SQL (Natural Language to SQL) with advanced techniques like few-shot example prompts (dynamic and static), dynamic table selection, and enhanced SQL query generation, here's a <a target="_blank" href="https://blog.futuresmart.ai/mastering-natural-language-to-sql-with-langchain-nl2sql">detailed guide</a>.</p>
</blockquote>
<h2 id="heading-reply-tool">Reply Tool</h2>
<p>Let's build a Reply Tool for your <strong>AI-powered automated answering service</strong>. This tool drafts a professional, customized reply for each customer query and posts it back to the Freshdesk ticket. It tracks every response with a unique reply ID, making follow-ups simple and organized, so your support team can deliver consistent, high-quality answers every time.</p>
<ol>
<li><p><strong>Create Reply Function</strong>: This function hits the POST endpoint of the specific ticket. It uses the ticket's ID and the reply content (provided as input)</p>
<pre><code class="lang-python"> <span class="hljs-keyword">import</span> requests

 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_reply</span>(<span class="hljs-params">ticket_id, body</span>):</span>
     <span class="hljs-keyword">try</span>:
         url = <span class="hljs-string">f'https://<span class="hljs-subst">{FRESHDESK_URL}</span>.freshdesk.com/api/v2/tickets/<span class="hljs-subst">{ticket_id}</span>/reply'</span>
         headers = {
             <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/json'</span>
         }
         auth = (FRESHDESK_API_KEY, <span class="hljs-string">'X'</span>)  <span class="hljs-comment"># Freshdesk expects the API key as the username and 'X' as a placeholder password</span>
         data = {
             <span class="hljs-string">"body"</span>: body
         }
         response = requests.post(url, headers=headers, auth=auth, json=data)
         <span class="hljs-keyword">return</span> response.json()
     <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
         print(<span class="hljs-string">"Error in create_reply : "</span>, e)
         <span class="hljs-keyword">return</span> <span class="hljs-string">""</span>
</code></pre>
</li>
<li><p><strong>Reply Tool</strong>: This code creates a tool that generates professional replies for customer tickets using AI. It takes ticket details and generates a customized reply, which is then sent to the customer via the Freshdesk system.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain.tools <span class="hljs-keyword">import</span> tool
 <span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel

 <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ReplyToTicket</span>(<span class="hljs-params">BaseModel</span>):</span>
     information: str
     ticket_id: str

<span class="hljs-meta"> @tool(args_schema=ReplyToTicket)</span>
 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">reply_to_ticket_tool</span>(<span class="hljs-params">information: str, ticket_id: str</span>):</span>
     <span class="hljs-string">"""Tool that will send a reply to customer with a ticket id
     `information`: str, 
     `ticket_id`: str"""</span>

     print(<span class="hljs-string">"INSIDE REPLY TOOL"</span>, ticket_id)
     prompt = ChatPromptTemplate.from_messages([
         (<span class="hljs-string">"system"</span>, <span class="hljs-string">"""
 **Introduction to FutureSmart TechStore:**
 FutureSmart TechStore is a leading retailer specializing in cutting-edge technology products, ensuring you have access to the latest innovations to meet your needs. We pride ourselves on our customer service and offer support in multiple languages to cater to our diverse clientele. Our inventory includes a wide range of electronics, smart home devices, and accessories, all curated to enhance your tech lifestyle.

 **Guidelines for Responding to Customer Queries:**

 **Identify Language and Respond in English:**
    Analyze the language of the customer's query to ensure accurate understanding. While responding, provide all replies in English to maintain consistency.

 **Order Information and Query Specifics:**
    Utilize provided order details to address specific customer queries accurately. Ensure responses are tailored to address only the aspects of the order that the customer has inquired about. Do not include unnecessary order details unless requested.

 By adhering to these guidelines, ensure that your responses are informative, accurate, and aligned with FutureSmart TechStore's commitment to customer satisfaction. Remember to always maintain a friendly and professional tone in all communications.

 You are an AI assistant tasked with crafting a professional customer reply in Gmail format. Mention FutureSmart TechStore Support, 9876543210 Phone number. Avoid placeholders or incomplete information, and provide clear, actionable steps to resolve the issue"""</span>),
         (<span class="hljs-string">"human"</span>, <span class="hljs-string">"context: {information}"</span> )
     ])
     chain = prompt | llm | StrOutputParser()
     body = chain.invoke({<span class="hljs-string">"information"</span>: information})
     res = create_reply(ticket_id, body) <span class="hljs-keyword">or</span> {}  <span class="hljs-comment"># create_reply returns an empty string on error</span>
     <span class="hljs-keyword">return</span> body, <span class="hljs-string">f"reply id: <span class="hljs-subst">{res.get(<span class="hljs-string">'id'</span>)}</span>"</span>
</code></pre>
</li>
<li><p><strong>Test the tool</strong></p>
<pre><code class="lang-python"> information = <span class="hljs-string">"""
 Customer Name: John Doe
 Order Number: FS123456789
 Issue: The customer received a damaged smart home device (a smart speaker) and wants a replacement. 
 Additional Information: The order was delivered on 28th December 2024, and the damage was noted upon unboxing. The customer has attached photos of the damaged item for reference and is requesting a quick resolution.
 """</span>
 ticket_id = <span class="hljs-string">"12345"</span>

 reply_to_ticket_tool.invoke({<span class="hljs-string">"information"</span>: information, <span class="hljs-string">"ticket_id"</span>: ticket_id})
</code></pre>
</li>
</ol>
<h2 id="heading-combining-the-tools">Combining the Tools</h2>
<pre><code class="lang-python">tools = [summarize_tool, retriever_tool, nl2sql_tool, reply_to_ticket_tool]
llm_with_tools = llm.bind_tools(tools)
</code></pre>
<h2 id="heading-building-the-langgraph">Building the LangGraph</h2>
<p>LangGraph enables you to define a stateful workflow for your AI agent. By structuring nodes and edges, you control how the agent processes user inputs and transitions between tools.</p>
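<p>The control flow can be sketched in plain Python before we build it: a chatbot node runs, a condition checks whether the model requested a tool, and execution either loops through the tool node or ends. This is only a toy state machine to illustrate the routing; the names <code>chatbot</code> and the <code>tools_condition</code> analogue mirror the real nodes used below, but the implementation here is hypothetical:</p>
<pre><code class="lang-python"># Toy state machine: chatbot -> (tools -> chatbot)* -> end
def chatbot(messages):
    # Pretend the model requests a tool once, then answers.
    if any(role == "tool" for role, _ in messages):
        return ("assistant", "final answer")
    return ("tool_call", "retriever_tool")

def run_graph(user_input: str) -> list:
    messages = [("user", user_input)]
    while True:
        reply = chatbot(messages)
        messages.append(reply)
        if reply[0] != "tool_call":    # tools_condition analogue
            return messages
        # the "tools" node executes the requested tool; its result feeds back in
        messages.append(("tool", "retrieved context"))

print(run_graph("What payment methods are supported?")[-1])
</code></pre>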
<h3 id="heading-steps-to-build-the-langgraph"><strong>Steps to Build the LangGraph</strong></h3>
<ol>
<li><p><strong>Define the State</strong>: Create a <code>State</code> dictionary to manage the agent’s inputs and outputs.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Annotated
 <span class="hljs-keyword">from</span> typing_extensions <span class="hljs-keyword">import</span> TypedDict
 <span class="hljs-keyword">from</span> langgraph.graph.message <span class="hljs-keyword">import</span> add_messages

 <span class="hljs-comment"># Setting up the graph state</span>
 <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">State</span>(<span class="hljs-params">TypedDict</span>):</span>
     messages: Annotated[list, add_messages]
</code></pre>
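<p> Conceptually, <code>add_messages</code> is a reducer: each node returns a partial state, and the reducer merges the new messages into the existing list instead of overwriting it. A minimal plain-Python stand-in (not LangGraph's actual implementation) makes the idea concrete:</p>
<pre><code class="lang-python">from typing import Annotated, TypedDict

def add_messages(existing: list, new: list) -> list:
    # Stand-in reducer: append new messages rather than replace the list.
    return existing + new

class State(TypedDict):
    messages: Annotated[list, add_messages]

state: State = {"messages": [("user", "hi")]}
update = {"messages": [("assistant", "hello!")]}
state["messages"] = add_messages(state["messages"], update["messages"])
print(state["messages"])  # both messages are kept
</code></pre>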
</li>
<li><p><strong>Define Checkpointer:</strong> <code>MemorySaver</code> allow LangGraph agents to persist their state within and across multiple interactions.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langgraph.checkpoint.memory <span class="hljs-keyword">import</span> MemorySaver

 memory = MemorySaver()
</code></pre>
</li>
<li><p><strong>Add Nodes</strong>: Add nodes for the chatbot and tools to handle user queries and invoke the tools.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langgraph.graph <span class="hljs-keyword">import</span> StateGraph
 <span class="hljs-keyword">from</span> langgraph.prebuilt <span class="hljs-keyword">import</span> ToolNode

 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chatbot</span>(<span class="hljs-params">state: State</span>):</span>
      <span class="hljs-keyword">return</span> {<span class="hljs-string">"messages"</span>: [llm_with_tools.invoke(state[<span class="hljs-string">"messages"</span>])]}

  graph_builder = StateGraph(State)
  graph_builder.add_node(<span class="hljs-string">"chatbot"</span>, chatbot)

  tool_node = ToolNode(tools=tools)
  graph_builder.add_node(<span class="hljs-string">"tools"</span>, tool_node)
</code></pre>
</li>
<li><p><strong>Define Edges</strong>: Use conditional edges to determine when the agent should switch between nodes.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langgraph.prebuilt <span class="hljs-keyword">import</span> tools_condition

  graph_builder.add_conditional_edges(<span class="hljs-string">"chatbot"</span>, tools_condition)
  graph_builder.add_edge(<span class="hljs-string">"tools"</span>, <span class="hljs-string">"chatbot"</span>)
  graph_builder.set_entry_point(<span class="hljs-string">"chatbot"</span>)
</code></pre>
</li>
<li><p><strong>Compile the Graph</strong>: Finalize the graph for execution.</p>
<pre><code class="lang-python"> graph = graph_builder.compile(checkpointer=memory)
</code></pre>
</li>
</ol>
<h2 id="heading-testing-the-ai-agent">Testing the AI Agent</h2>
<p>Once the LangGraph is set up, you can test the agent by simulating user inputs. This ensures the tools and workflows are functioning as expected.</p>
<h3 id="heading-interactive-testing"><strong>Interactive Testing</strong></h3>
<p>Run the following code to test your AI agent interactively:</p>
<pre><code class="lang-python">config = {<span class="hljs-string">"configurable"</span>: {<span class="hljs-string">"thread_id"</span>: <span class="hljs-string">"1"</span>}}

<span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
    user_input = input(<span class="hljs-string">"User: "</span>)
    <span class="hljs-keyword">if</span> user_input.lower() <span class="hljs-keyword">in</span> [<span class="hljs-string">"quit"</span>, <span class="hljs-string">"exit"</span>, <span class="hljs-string">"q"</span>]:
        print(<span class="hljs-string">"Goodbye!"</span>)
        <span class="hljs-keyword">break</span>

    <span class="hljs-keyword">for</span> event <span class="hljs-keyword">in</span> graph.stream({<span class="hljs-string">"messages"</span>: [(<span class="hljs-string">"user"</span>, user_input)]}, config):
        <span class="hljs-keyword">for</span> value <span class="hljs-keyword">in</span> event.values():
            print(<span class="hljs-string">"Assistant:"</span>, value[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content)
</code></pre>
<p>You can provide queries like:</p>
<ul>
<li><p>"Does FutureSmart TechStore support RuPay credit cards?" (Trigger Retriever or RAG tool). You can ask any question based on your knowledge base.</p>
</li>
<li><p>"Brief me about the ticket with ticket id 130?" (Trigger Summary Tool)</p>
</li>
<li><p>"Tell me the delivery status of order id 10" (Trigger NL2SQL tool)</p>
</li>
</ul>
<p>The AI agent will invoke the appropriate tool to generate responses.</p>
<p><strong>A Conversation between Agent and User:</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">User:</span>  <span class="hljs-string">reply</span> <span class="hljs-string">to</span> <span class="hljs-string">ticket</span> <span class="hljs-number">100</span>
<span class="hljs-attr">Assistant:</span> 
<span class="hljs-string">INSIDE</span> <span class="hljs-string">SUMMARIZE</span> <span class="hljs-string">NODE</span> <span class="hljs-number">100</span>
<span class="hljs-attr">Subject :</span> <span class="hljs-string">Order</span> <span class="hljs-number">8</span>
<span class="hljs-attr">User query :</span> 
<span class="hljs-string">Dear</span> <span class="hljs-string">Support</span> <span class="hljs-string">Team,</span>  <span class="hljs-string">I</span> <span class="hljs-string">am</span> <span class="hljs-string">deeply</span> <span class="hljs-string">disappointed</span> <span class="hljs-string">that</span> <span class="hljs-string">my</span> <span class="hljs-string">USB-C</span> <span class="hljs-string">Charger</span> <span class="hljs-string">(Order</span> <span class="hljs-comment">#8), expected by Dec 23, did not arrive in time for an important occasion. To make matters worse, I was informed I cannot cancel the order. Please provide an update on when I will receive the item.  I hope for a prompt resolution.  Thank you.</span>

<span class="hljs-attr">Assistant:</span> [<span class="hljs-string">"ticket_id: 100"</span>, <span class="hljs-string">"Ticket Summary: The customer, deeply disappointed, reports that their USB-C Charger (Order #8) did not arrive by the expected date of December 23 for an important occasion. Additionally, they were informed that they cannot cancel the order. They are requesting an update on when they will receive the item and hope for a prompt resolution. Customer name is not provided."</span>]
<span class="hljs-attr">Assistant:</span> 
<span class="hljs-string">INSIDE</span> <span class="hljs-string">NL2SQL</span> <span class="hljs-string">TOOL</span> <span class="hljs-string">Update</span> <span class="hljs-string">on</span> <span class="hljs-string">delivery</span> <span class="hljs-string">status</span> <span class="hljs-string">for</span> <span class="hljs-string">Order</span> <span class="hljs-comment">#8</span>
<span class="hljs-attr">Assistant: Product detail and Order status:</span> [<span class="hljs-string">('Processing'</span>,<span class="hljs-string">)</span>]<span class="hljs-string">,</span> <span class="hljs-string">SQL</span> <span class="hljs-string">Query</span> <span class="hljs-string">used:SELECT</span> <span class="hljs-string">order_status</span> 
<span class="hljs-string">FROM</span> <span class="hljs-string">Orders</span> 
<span class="hljs-string">WHERE</span> <span class="hljs-string">order_id</span> <span class="hljs-string">=</span> <span class="hljs-number">8</span><span class="hljs-string">;</span>
<span class="hljs-attr">Assistant:</span> 
<span class="hljs-string">INSIDE</span> <span class="hljs-string">REPLY</span> <span class="hljs-string">TOOL</span> <span class="hljs-number">100</span>
<span class="hljs-attr">Assistant:</span> [<span class="hljs-string">"Subject: Update on Your USB-C Charger Order\n\nDear [Customer's Name],\n\nThank you for contacting FutureSmart TechStore and sharing your concerns about the delay in your USB-C Charger delivery. We understand how crucial this item is for your upcoming occasion, and we sincerely apologize for any inconvenience caused by the delay.\n\nAs of now, your order status is still marked as 'Processing'. Please rest assured that we are actively working to expedite the process and get your charger to you as soon as possible. We are committed to keeping you updated with any changes to your order status.\n\nIn the meantime, if you have any further questions or if there's anything else we can do to assist you, please do not hesitate to reach out to us. You can contact FutureSmart TechStore Support at 9876543210.\n\nThank you for your patience and understanding. We appreciate your trust in FutureSmart TechStore and are dedicated to resolving this promptly.\n\nBest regards,\n\n[Your Name]  \nFutureSmart TechStore Support  \n9876543210"</span>, <span class="hljs-string">"reply id: 11000010366579"</span>]
<span class="hljs-attr">Assistant:</span> <span class="hljs-string">I</span> <span class="hljs-string">have</span> <span class="hljs-string">sent</span> <span class="hljs-string">a</span> <span class="hljs-string">reply</span> <span class="hljs-string">to</span> <span class="hljs-string">the</span> <span class="hljs-string">customer</span> <span class="hljs-string">regarding</span> <span class="hljs-string">ticket</span> <span class="hljs-comment">#100. The customer has been informed about the current status of their order and that it is still marked as "Processing". We have assured them that we are actively working to expedite the process and apologize for any inconvenience caused. The reply includes our contact information for further assistance.</span>
<span class="hljs-attr">User:</span>  <span class="hljs-string">is</span> <span class="hljs-string">visa</span> <span class="hljs-string">credit</span> <span class="hljs-string">card</span> <span class="hljs-string">accepted</span> <span class="hljs-string">by</span> <span class="hljs-string">futuresmart</span> <span class="hljs-string">techstore</span>
<span class="hljs-attr">Assistant:</span> 
<span class="hljs-string">INSIDE</span> <span class="hljs-string">RETRIEVER</span> <span class="hljs-string">NODE</span> <span class="hljs-string">Is</span> <span class="hljs-string">Visa</span> <span class="hljs-string">credit</span> <span class="hljs-string">card</span> <span class="hljs-string">accepted</span> <span class="hljs-string">by</span> <span class="hljs-string">FutureSmart</span> <span class="hljs-string">TechStore?</span>
Assistant: (context: At FutureSmart TechStore, we aim to make your shopping experience as seamless and secure as possible. We support a variety of payment methods to suit your convenience.
            Supported Payment Methods
            1. Credit and Debit Cards: Visa, Mastercard, American Express, RuPay
            2. Digital Wallets: Paytm, PhonePe, Google Pay, Amazon Pay
            3. Net Banking: Supported for all major Indian banks including HDFC, ICICI, SBI, Axis Bank, and more.
            4. UPI Payments: Seamless payments using Unified Payments Interface (UPI).
            5. Cash on Delivery (COD): Available for select locations in India. Maximum COD limit: INR 10,000.
            6. EMI Options: No-cost EMI and standard EMI options are available for major credit cards. EMI options depend on the bank and card type.
            7. Bank Transfers: Direct bank transfers are supported for bulk or business purchases.
            Payment Security
            - All online transactions are processed through secure and encrypted gateways to ensure the safety of your information.
            - We comply with the latest PCI DSS (Payment Card Industry Data Security Standard) guidelines.
            Important Notes
            - Software, CDs, or DVDs that have been opened.
            - Items marked as "Final Sale" or "Non-Returnable" at the time of purchase.
            - Gift cards or promotional items.
            How to Initiate a Return
            1. Contact Us: Reach out to our customer service team via email at support@futuresmarttechstore.com or call us at +91-XXXXXXXXXX with your order details.
            2. Approval: Once your return request is approved, you will receive a Return Merchandise Authorization (RMA) number and detailed instructions.
            3. Ship the Item: Pack the item securely, include the RMA number, and send it to our return address: FutureSmart TechStore Returns Department, Mumbai, India, PIN: 400001
            Refund Process
            - Refunds will be processed once the returned item is inspected and approved.
            - Refunds will be issued to the original payment method within 7-10 business days after approval.
            - Shipping fees are non-refundable unless the return is due to a mistake on our part (e.g., wrong or defective item).
            Exchange Policy
            For any payment-related queries, feel free to contact us at support@futuresmarttechstore.com or call us at +91-XXXXXXXXXX.
            At FutureSmart TechStore, we are committed to delivering your orders swiftly and securely. Below are the details of our shipping policies to ensure a smooth and transparent shopping experience.
            Shipping Locations
            - We currently ship across India.
            - International shipping is not available at this time.
            Shipping Charges
            - Standard Shipping: Free for orders above INR 1,000. Orders below INR 1,000 will incur a shipping fee of INR 50.
            - Express Shipping: Available at INR 150 for eligible locations.
            Delivery Timeframes
            - Standard Shipping: 3-7 business days.
            - Express Shipping: 1-3 business days.
            - Delivery times may vary based on location and availability.
            Order Processing
            - Orders are processed within 24-48 hours of payment confirmation (excluding weekends and holidays).
            - You will receive a confirmation email with tracking details once your order is shipped.
            Tracking Your Order
            - All orders are shipped with a tracking number. You can monitor the progress of your delivery via the link provided in your confirmation email.
            Undeliverable Packages
            At FutureSmart TechStore, we strive to ensure customer satisfaction with every purchase. If you are not completely satisfied with your order, we offer a hassle-free return policy to make the process as smooth as possible.
            Eligibility for Returns
            - Products can be returned within 30 days of receipt.
            - Items must be in their original condition, unused, and in the original packaging with all tags, manuals, and accessories included.
            - Returns are applicable for items that are damaged, defective, or not as described.
            Non-Returnable Items
            The following items are not eligible for return:
            - Software, CDs, or DVDs that have been opened.
            - Items marked as "Final Sale" or "Non-Returnable" at the time of purchase.
            - Gift cards or promotional items.
            How to Initiate a Return)
<span class="hljs-attr">Assistant:</span> <span class="hljs-literal">Yes</span><span class="hljs-string">,</span> <span class="hljs-string">FutureSmart</span> <span class="hljs-string">TechStore</span> <span class="hljs-string">accepts</span> <span class="hljs-string">Visa</span> <span class="hljs-string">credit</span> <span class="hljs-string">cards</span> <span class="hljs-string">as</span> <span class="hljs-string">a</span> <span class="hljs-string">payment</span> <span class="hljs-string">method.</span> <span class="hljs-string">They</span> <span class="hljs-string">also</span> <span class="hljs-string">support</span> <span class="hljs-string">Mastercard,</span> <span class="hljs-string">American</span> <span class="hljs-string">Express,</span> <span class="hljs-string">and</span> <span class="hljs-string">RuPay.</span>
</code></pre>
<p>Let's break down how our AI agent handled this support request – it's a perfect example of multiple tools working together:</p>
<p><strong>Summary Tool in Action:</strong> From Ticket #100, it quickly identified the critical details:</p>
<ul>
<li><p>Order #8 for a USB-C Charger</p>
</li>
<li><p>Expected delivery: Dec 23</p>
</li>
<li><p>Customer pain points: Delivery delay and cancellation concerns</p>
</li>
</ul>
<p><strong>NL2SQL Tool at Work:</strong></p>
<ul>
<li><p>Translated the customer's query into precise SQL</p>
</li>
<li><p>Retrieved real-time status: "Processing"</p>
</li>
<li><p>No manual database handling needed</p>
</li>
</ul>
<p><strong>Reply Tool Magic:</strong> Created a professional response that:</p>
<ul>
<li><p>Acknowledged the delay with empathy</p>
</li>
<li><p>Provided current order status</p>
</li>
<li><p>Outlined next steps</p>
</li>
<li><p>Added support contact details</p>
</li>
</ul>
<p><strong>RAG Tool's Smart Retrieval:</strong></p>
<ul>
<li><p>Instantly pulled payment policy details</p>
</li>
<li><p>Confirmed Visa acceptance</p>
</li>
<li><p>Added value by mentioning other payment options (Mastercard, Amex)</p>
</li>
</ul>
<p>See how each tool contributes to creating a complete, professional support experience. This is exactly how your AI agent will handle real customer queries – with precision and context awareness.</p>
<h3 id="heading-visualizing-the-langgraph">Visualizing the LangGraph</h3>
<p>Visualization helps you understand the workflow of your AI agent and how it transitions between nodes and tools.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> IPython.display <span class="hljs-keyword">import</span> Image, display

<span class="hljs-keyword">try</span>:
    display(Image(graph.get_graph().draw_mermaid_png()))
<span class="hljs-keyword">except</span> Exception:
    print(<span class="hljs-string">"Error generating graph visualization."</span>)
</code></pre>
<p>The generated diagram will showcase nodes (<code>chatbot</code>, <code>tools</code>) and the transitions between them, providing a clear overview of your AI agent’s workflow.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1732629457981/85bce952-5850-45c4-86cf-20964c6ab41a.png?auto=compress,format&amp;format=webp" alt class="image--center mx-auto" /></p>
<h2 id="heading-improvements">Improvements</h2>
<ol>
<li><p>At FutureSmart AI, we have observed across many projects that implementing methods such as Contextual Retrieval and reranking measurably improves the performance of an Agentic RAG system.</p>
</li>
<li><p>Set up <strong>Workflow Automation</strong> to handle routine tasks and use AI-driven analytics for actionable insights. Having deployed these solutions for clients, we know which workflows deliver the best ROI.</p>
</li>
<li><p>Empower agents to resolve tickets with AI-suggested replies drawn from knowledge sources, ensuring accuracy with a human in the loop.</p>
</li>
</ol>
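<p>The reranking mentioned in point 1 can be sketched in a few lines. This is a hypothetical illustration rather than code from this post: <code>scorer</code> stands in for a cross-encoder (for example, <code>CrossEncoder(...).predict</code> from <code>sentence-transformers</code>), and the chunk texts are made up.</p>
<pre><code class="lang-python">def rerank(query, chunks, scorer, top_k=3):
    # Score each (query, chunk) pair, then keep only the highest-scoring
    # chunks for the agent's context window.
    pairs = [(query, chunk) for chunk in chunks]
    scores = scorer(pairs)
    ranked = sorted(zip(chunks, scores), key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

# Toy lexical-overlap scorer for demonstration; in production you would
# swap in a trained cross-encoder's predict function here.
def overlap_scorer(pairs):
    return [len(set(q.lower().split()).intersection(c.lower().split())) for q, c in pairs]
</code></pre>
<p>Running <code>rerank("is visa credit card accepted", chunks, overlap_scorer, top_k=1)</code> would surface the payment-policy chunk ahead of unrelated shipping text, which is the effect reranking has on retrieval quality.</p>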
<h2 id="heading-conclusion">Conclusion</h2>
<p>Congratulations! You've just built a powerful Freshdesk AI agent that combines LangGraph, RAG, and NL2SQL capabilities. Your agent can now handle everything from ticket summaries to product queries, making customer support smoother and more efficient.</p>
<p>But this is just the beginning. At FutureSmart AI, we've helped companies transform their customer support from reactive to proactive using these exact techniques. Our solutions have cut response times and boosted customer satisfaction. Want to see similar results for your business? Check out how we've helped companies like yours in our <a target="_blank" href="https://www.futuresmart.ai/case-studies">case studies</a>.</p>
<p>We have another <a target="_blank" href="https://blog.futuresmart.ai/hubspot-ai-agent">AI agent for HubSpot</a> that revolutionizes CRM management by interpreting user intent, automating tasks, and integrating tools for seamless operations. Whether you need support automation or CRM enhancement, we're here to guide you through your AI journey. Reach out at <a target="_blank" href="mailto:contact@futuresmart.ai">contact@futuresmart.ai</a> to learn more.</p>
<p>Stay tuned for more <a target="_blank" href="https://blog.futuresmart.ai/">tutorials</a> and insights as we continue to explore cutting-edge AI advancements, helping you stay ahead in the rapidly evolving tech landscape!</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><a target="_self" href="https://github.com/PradipNichite/FutureSmart-AI-Blog/tree/main/Freshdesk%20Ticketing%20Automation%20with%20AI%20Agents"><strong>Get the Full Code in our GitHub</strong></a></div>
</div>]]></content:encoded></item><item><title><![CDATA[Build a Multi-Agent System with LangGraph: A Complete Tutorial on Agent Orchestration]]></title><description><![CDATA[Introduction
In the last three blogs in our Ultimate Langraph Tutorial Series, we highlighted different components of LangGraph for beginners, Long-term Memory Support, and building an AI agent with custom tools support. After implementing these syst...]]></description><link>https://blog.futuresmart.ai/multi-agent-system-with-langgraph</link><guid isPermaLink="true">https://blog.futuresmart.ai/multi-agent-system-with-langgraph</guid><category><![CDATA[AI]]></category><category><![CDATA[langchain]]></category><category><![CDATA[Multi-Agent Systems (MAS)]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[#agent]]></category><dc:creator><![CDATA[Rounak Show]]></dc:creator><pubDate>Sun, 05 Jan 2025 05:58:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1735370930881/f6664a0d-fbb5-487a-9793-331955273157.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In the last three blogs in our <a target="_blank" href="https://blog.futuresmart.ai/series/langraph-tutorial">Ultimate LangGraph Tutorial</a> series, we covered the core components of LangGraph for beginners, long-term memory support, and building an AI agent with custom tool support. After implementing these systems for various enterprise clients, we at <a target="_blank" href="https://www.futuresmart.ai/">FutureSmart AI</a> have observed that as these systems grow, they can become complex and hard to manage.</p>
<p>To solve this, we break down the application into smaller, <strong>independent agents</strong> - a pattern we've successfully deployed in production environments for multiple clients. This approach offers modularity, specialization, and control. We can work on individual agents separately, create expert agents, and control how they communicate. Having built numerous such systems at FutureSmart AI, we've seen firsthand how this architecture significantly improves <strong>maintainability</strong> and <strong>scalability</strong>.</p>
<p>In this blog post, we will learn to create a complete <strong>Multi-Agent System</strong> from scratch using <a target="_blank" href="https://blog.futuresmart.ai/langgraph-tutorial-for-beginners">LangGraph</a>. Here, a single <strong>Supervisor Agent</strong> communicates with other specialized agents. Each agent has its own set of tools, mirroring how we at FutureSmart AI structure enterprise-grade AI solutions.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735243479772/e2c48b8a-4992-42af-b2ec-3d045082d794.png" alt="multi agent system with LangGraph" class="image--center mx-auto" /></p>
<h2 id="heading-setting-up-the-environment">Setting up the Environment</h2>
<p>To get started, install the dependencies below:</p>
<pre><code class="lang-python">%%capture --no-stderr
%pip install -U langgraph langchain_community langchain_openai langchain_experimental langchain-chroma pypdf sentence-transformers
</code></pre>
<h2 id="heading-setting-up-api-keys"><strong>Setting up API Keys</strong></h2>
<p>Before diving into building your LangGraph AI agents, it's crucial to set up your API keys. These keys allow your agent to interact with external tools like <a target="_blank" href="https://tavily.com/"><strong>Tavily Search</strong></a> and <a target="_blank" href="https://openai.com/index/openai-api/"><strong>OpenAI GPT</strong></a> models securely. Without them, the tools cannot function effectively.</p>
<p>Here we are using the OpenAI model, but you can use any LLM of your choice.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> getpass
<span class="hljs-keyword">import</span> os

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_set_env</span>(<span class="hljs-params">var: str</span>):</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> os.environ.get(var):
        os.environ[var] = getpass.getpass(<span class="hljs-string">f"<span class="hljs-subst">{var}</span>: "</span>)

_set_env(<span class="hljs-string">"TAVILY_API_KEY"</span>)
_set_env(<span class="hljs-string">"OPENAI_API_KEY"</span>)
</code></pre>
<h2 id="heading-creating-the-llm-object"><strong>Creating the LLM Object</strong></h2>
<p>Here’s how to initialize the LLM using LangChain <code>ChatOpenAI</code></p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI

llm = ChatOpenAI(model_name=<span class="hljs-string">"gpt-4o"</span>)
</code></pre>
<h2 id="heading-create-tools">Create Tools</h2>
<p>In this blog, we will build three agents: one performs web research with a search engine tool, one retrieves documents and answers questions based on them, and one queries the SQL database.</p>
<p>In our previous LangGraph blog post, we already showed how to create a Web Search Tool, a Retriever Tool, and an NL2SQL Tool. You can follow those steps to create the tools for this tutorial.</p>
<ul>
<li><p><a target="_blank" href="https://blog.futuresmart.ai/langgraph-agent-with-rag-and-nl2sql#heading-websearch-tool-integration">Create Web Search Tool</a></p>
</li>
<li><p><a target="_blank" href="https://blog.futuresmart.ai/langgraph-agent-with-rag-and-nl2sql#heading-agentic-rag-tool-integration">Create Retriever Tool</a> (For RAG)</p>
</li>
<li><p><a target="_blank" href="https://blog.futuresmart.ai/langgraph-agent-with-rag-and-nl2sql#heading-nl2sql-tool-integration">Create NL2SQL Tool</a></p>
</li>
</ul>
<p>Now, once you have the tools ready, we are ready to proceed with creating the AI Agents.</p>
<h2 id="heading-creating-the-supervisor-agent">Creating the Supervisor Agent</h2>
<p>The supervisor agent is responsible for managing the conversation flow between different specialized agents. It decides which agent should handle the current request and when the task is complete. Let's see how to implement this:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Literal
<span class="hljs-keyword">from</span> typing_extensions <span class="hljs-keyword">import</span> TypedDict
<span class="hljs-keyword">from</span> langgraph.graph <span class="hljs-keyword">import</span> MessagesState, START, END
<span class="hljs-keyword">from</span> langgraph.types <span class="hljs-keyword">import</span> Command

<span class="hljs-comment"># Define available agents</span>
members = [<span class="hljs-string">"web_researcher"</span>, <span class="hljs-string">"rag"</span>, <span class="hljs-string">"nl2sql"</span>]
<span class="hljs-comment"># Add FINISH as an option for task completion</span>
options = members + [<span class="hljs-string">"FINISH"</span>]

<span class="hljs-comment"># Create system prompt for supervisor</span>
system_prompt = (
    <span class="hljs-string">"You are a supervisor tasked with managing a conversation between the"</span>
    <span class="hljs-string">f" following workers: <span class="hljs-subst">{members}</span>. Given the following user request,"</span>
    <span class="hljs-string">" respond with the worker to act next. Each worker will perform a"</span>
    <span class="hljs-string">" task and respond with their results and status. When finished,"</span>
    <span class="hljs-string">" respond with FINISH."</span>
)

<span class="hljs-comment"># Define router type for structured output</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Router</span>(<span class="hljs-params">TypedDict</span>):</span>
    <span class="hljs-string">"""Worker to route to next. If no workers needed, route to FINISH."""</span>
    next: Literal[<span class="hljs-string">"web_researcher"</span>, <span class="hljs-string">"rag"</span>, <span class="hljs-string">"nl2sql"</span>, <span class="hljs-string">"FINISH"</span>]

<span class="hljs-comment"># Create supervisor node function</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">supervisor_node</span>(<span class="hljs-params">state: MessagesState</span>) -&gt; Command[Literal["web_researcher", "rag", "nl2sql", "__end__"]]:</span>
    messages = [
        {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: system_prompt},
    ] + state[<span class="hljs-string">"messages"</span>]
    response = llm.with_structured_output(Router).invoke(messages)
    goto = response[<span class="hljs-string">"next"</span>]
    print(<span class="hljs-string">f"Next Worker: <span class="hljs-subst">{goto}</span>"</span>)
    <span class="hljs-keyword">if</span> goto == <span class="hljs-string">"FINISH"</span>:
        goto = END
    <span class="hljs-keyword">return</span> Command(goto=goto)
</code></pre>
<p>The supervisor agent works by:</p>
<ol>
<li><p>Taking the current conversation state as input</p>
</li>
<li><p>Using the system prompt to understand its role</p>
</li>
<li><p>Making a decision about which agent should act next</p>
</li>
<li><p>Returning a command that directs the flow to the chosen agent</p>
</li>
</ol>
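<p>The routing decision itself can be sketched without calling a model. In the stub below, a plain function stands in for <code>llm.with_structured_output(Router)</code> and picks a worker from keywords; the keyword rules are purely illustrative and not part of the original code:</p>
<pre><code class="lang-python"># Minimal sketch of the supervisor's routing logic, with a stub in place
# of the structured-output LLM call. The keyword rules are illustrative only.
END = "__end__"  # sentinel, mirroring langgraph.graph.END

def stub_llm_route(messages):
    """Pretend structured-output call: returns {'next': worker_name}."""
    text = messages[-1]["content"].lower()
    if "search the web" in text:
        return {"next": "web_researcher"}
    if "database" in text:
        return {"next": "nl2sql"}
    if "document" in text:
        return {"next": "rag"}
    return {"next": "FINISH"}

def supervisor_route(state):
    goto = stub_llm_route(state["messages"])["next"]
    # FINISH is translated to the graph's END sentinel, as in supervisor_node
    return END if goto == "FINISH" else goto

state = {"messages": [{"role": "user", "content": "Query the sales database"}]}
print(supervisor_route(state))  # nl2sql
</code></pre>
<p>The real <code>supervisor_node</code> does exactly this translation: the model's <code>"FINISH"</code> answer becomes the graph's <code>END</code> sentinel before being wrapped in a <code>Command</code>.</p>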
<h2 id="heading-implementing-individual-agents">Implementing Individual Agents</h2>
<p>Now, let's create our specialized agents using a custom function that creates individual LangGraph graphs for each agent. This approach allows each agent to have its own workflow while maintaining consistency in how they process requests and use tools.</p>
<p>First, let's define our custom <code>create_agent</code> function:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Annotated, Sequence
<span class="hljs-keyword">from</span> langchain_core.messages <span class="hljs-keyword">import</span> BaseMessage
<span class="hljs-keyword">from</span> langgraph.graph <span class="hljs-keyword">import</span> StateGraph
<span class="hljs-keyword">from</span> langgraph.graph.message <span class="hljs-keyword">import</span> add_messages
<span class="hljs-keyword">from</span> langgraph.prebuilt <span class="hljs-keyword">import</span> ToolNode, tools_condition

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AgentState</span>(<span class="hljs-params">TypedDict</span>):</span>
    <span class="hljs-string">"""The state of the agent."""</span>
    messages: Annotated[Sequence[BaseMessage], add_messages]

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_agent</span>(<span class="hljs-params">llm, tools</span>):</span>
    llm_with_tools = llm.bind_tools(tools)
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chatbot</span>(<span class="hljs-params">state: AgentState</span>):</span>
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"messages"</span>: [llm_with_tools.invoke(state[<span class="hljs-string">"messages"</span>])]}

    graph_builder = StateGraph(AgentState)
    graph_builder.add_node(<span class="hljs-string">"agent"</span>, chatbot)

    tool_node = ToolNode(tools=tools)
    graph_builder.add_node(<span class="hljs-string">"tools"</span>, tool_node)

    graph_builder.add_conditional_edges(
        <span class="hljs-string">"agent"</span>,
        tools_condition,
    )
    graph_builder.add_edge(<span class="hljs-string">"tools"</span>, <span class="hljs-string">"agent"</span>)
    graph_builder.set_entry_point(<span class="hljs-string">"agent"</span>)
    <span class="hljs-keyword">return</span> graph_builder.compile()
</code></pre>
<p>This custom function:</p>
<ol>
<li><p>Takes an LLM and a list of tools as input</p>
</li>
<li><p>Creates a new StateGraph for the agent</p>
</li>
<li><p>Sets up the necessary nodes for the agent and tool execution</p>
</li>
<li><p>Configures the flow between nodes</p>
</li>
<li><p>Returns a compiled graph that represents the agent's complete workflow</p>
</li>
</ol>
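<p>The conditional edge added in step 4 relies on <code>tools_condition</code>, which simply checks whether the agent's last message requested a tool call. A stdlib-only approximation of that check (the message shape is simplified to plain dicts for illustration):</p>
<pre><code class="lang-python"># Simplified version of langgraph's tools_condition: route to the "tools"
# node when the last AI message carries tool calls, otherwise end the turn.
END = "__end__"

def tools_condition(state):
    last = state["messages"][-1]
    if last.get("tool_calls"):
        return "tools"
    return END

with_call = {"messages": [{"content": "", "tool_calls": [{"name": "web_search_tool"}]}]}
no_call = {"messages": [{"content": "All done."}]}
print(tools_condition(with_call))  # tools
print(tools_condition(no_call))    # __end__
</code></pre>
<p>This is why the compiled graph loops: tool-calling responses are routed to the <code>ToolNode</code>, whose results flow back to the agent via the <code>"tools" → "agent"</code> edge until the model answers without a tool call.</p>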
<p>Now, let's create our specialized agents using this function:</p>
<h3 id="heading-web-research-agent">Web Research Agent</h3>
<pre><code class="lang-python">websearch_agent = create_agent(llm, [web_search_tool])

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">web_research_node</span>(<span class="hljs-params">state: MessagesState</span>) -&gt; Command[Literal["supervisor"]]:</span>
    result = websearch_agent.invoke(state)
    <span class="hljs-keyword">return</span> Command(
        update={
            <span class="hljs-string">"messages"</span>: [
                HumanMessage(content=result[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content, name=<span class="hljs-string">"web_researcher"</span>)
            ]
        },
        goto=<span class="hljs-string">"supervisor"</span>,
    )
</code></pre>
<h3 id="heading-rag-agent">RAG Agent</h3>
<pre><code class="lang-python">rag_agent = create_agent(llm, [retriever_tool])

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">rag_node</span>(<span class="hljs-params">state: MessagesState</span>) -&gt; Command[Literal["supervisor"]]:</span>
    result = rag_agent.invoke(state)
    <span class="hljs-keyword">return</span> Command(
        update={
            <span class="hljs-string">"messages"</span>: [
                HumanMessage(content=result[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content, name=<span class="hljs-string">"rag"</span>)
            ]
        },
        goto=<span class="hljs-string">"supervisor"</span>,
    )
</code></pre>
<h3 id="heading-nl2sql-agent">NL2SQL Agent</h3>
<pre><code class="lang-python">nl2sql_agent = create_agent(llm, [nl2sql_tool])

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">nl2sql_node</span>(<span class="hljs-params">state: MessagesState</span>) -&gt; Command[Literal["supervisor"]]:</span>
    result = nl2sql_agent.invoke(state)
    <span class="hljs-keyword">return</span> Command(
        update={
            <span class="hljs-string">"messages"</span>: [
                HumanMessage(content=result[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content, name=<span class="hljs-string">"nl2sql"</span>)
            ]
        },
        goto=<span class="hljs-string">"supervisor"</span>,
    )
</code></pre>
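<p>The three node functions above differ only in which agent they invoke and which <code>name</code> they stamp on the result, so they could be generated by one factory. Here is a sketch of that pattern with plain dicts and a callable standing in for <code>Command</code> and the compiled agents; <code>make_agent_node</code> is a hypothetical helper, not part of the original code:</p>
<pre><code class="lang-python"># Hypothetical factory generating the per-agent node functions.
# Command and the compiled agents are replaced by plain dicts/callables here.
def make_agent_node(agent, name):
    def node(state):
        result = agent(state)  # stands in for agent.invoke(state)
        last = result["messages"][-1]
        return {  # stands in for Command(update=..., goto="supervisor")
            "update": {"messages": [{"content": last["content"], "name": name}]},
            "goto": "supervisor",
        }
    return node

fake_agent = lambda state: {"messages": [{"content": "42 rows found"}]}
nl2sql_node = make_agent_node(fake_agent, "nl2sql")
out = nl2sql_node({"messages": []})
print(out["goto"], out["update"]["messages"][0]["name"])  # supervisor nl2sql
</code></pre>
<p>Naming the message after the worker matters: the supervisor sees who produced each result when deciding the next step.</p>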
<h2 id="heading-putting-it-all-together">Putting It All Together</h2>
<p>Finally, let's create the main graph that connects all our agents:</p>
<pre><code class="lang-python">builder = StateGraph(MessagesState)
builder.add_edge(START, <span class="hljs-string">"supervisor"</span>)
builder.add_node(<span class="hljs-string">"supervisor"</span>, supervisor_node)
builder.add_node(<span class="hljs-string">"web_researcher"</span>, web_research_node)
builder.add_node(<span class="hljs-string">"rag"</span>, rag_node)
builder.add_node(<span class="hljs-string">"nl2sql"</span>, nl2sql_node)
graph = builder.compile()
</code></pre>
<p>This creates a complete multi-agent system where:</p>
<ol>
<li><p>The supervisor receives the initial request</p>
</li>
<li><p>It routes the request to the appropriate specialized agent</p>
</li>
<li><p>The specialized agent processes the request using its tools</p>
</li>
<li><p>Control returns to the supervisor to decide the next step</p>
</li>
<li><p>The process continues until the supervisor determines the task is complete</p>
</li>
</ol>
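<p>The supervise-dispatch-return cycle in the list above can be simulated end to end with stubs. The worker behavior and the routing schedule below are invented for illustration; in the real graph the supervisor's LLM makes each routing decision:</p>
<pre><code class="lang-python"># End-to-end simulation of the supervisor loop with stubbed workers.
# The fixed routing schedule stands in for the supervisor's LLM decisions.
def stub_supervisor(step):
    schedule = ["rag", "web_researcher", "FINISH"]
    return schedule[min(step, len(schedule) - 1)]

workers = {
    "rag": lambda msgs: "Founder: Pradip Nichite",
    "web_researcher": lambda msgs: "Web results about the founder",
}

messages = [("user", "Find the founder and research him")]
step = 0
while True:
    goto = stub_supervisor(step)
    if goto == "FINISH":
        break  # supervisor routes to END; the graph run terminates
    messages.append((goto, workers[goto](messages)))
    step += 1

for name, content in messages:
    print(f"{name}: {content}")
</code></pre>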
<h2 id="heading-visualizing-the-langgraph"><strong>Visualizing the LangGraph</strong></h2>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> IPython.display <span class="hljs-keyword">import</span> Image, display

<span class="hljs-keyword">try</span>:
    display(Image(graph.get_graph().draw_mermaid_png()))
<span class="hljs-keyword">except</span> Exception:
    <span class="hljs-comment"># Rendering the PNG needs optional dependencies; skip if unavailable</span>
    <span class="hljs-keyword">pass</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735367936570/8a14ff15-e7b0-4fa5-8810-f323eefde557.png" alt="visualizing graph of multi agent langgraph workflow" class="image--center mx-auto" /></p>
<h2 id="heading-testing-the-system">Testing the System</h2>
<p>Let's test our multi-agent system with a query that requires multiple agents:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Example: Complex Query Using Multiple Agents</span>
input_question = <span class="hljs-string">"Find the founder of FutureSmart AI and then do a web research on him"</span>
<span class="hljs-keyword">for</span> s <span class="hljs-keyword">in</span> graph.stream(
    {<span class="hljs-string">"messages"</span>: [(<span class="hljs-string">"user"</span>, input_question)]}, 
    subgraphs=<span class="hljs-literal">True</span>
):
    print(s)
    print(<span class="hljs-string">"----"</span>)
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-python">Next Worker: rag
((), {<span class="hljs-string">'supervisor'</span>: <span class="hljs-literal">None</span>})
----
INSIDE RETRIEVER NODE
((<span class="hljs-string">'rag:7c5458df-0abd-944a-27f7-b0bad49ccf3d'</span>,), {<span class="hljs-string">'agent'</span>: {<span class="hljs-string">'messages'</span>: [AIMessage(content=<span class="hljs-string">''</span>, additional_kwargs={<span class="hljs-string">'tool_calls'</span>: [{<span class="hljs-string">'id'</span>: <span class="hljs-string">'call_fK9lMHGrtubenQ697xpd2ZZ2'</span>, <span class="hljs-string">'function'</span>: {<span class="hljs-string">'arguments'</span>: <span class="hljs-string">'{"question":"Who is the founder of FutureSmart AI?"}'</span>, <span class="hljs-string">'name'</span>: <span class="hljs-string">'retriever_tool'</span>}, <span class="hljs-string">'type'</span>: <span class="hljs-string">'function'</span>}], <span class="hljs-string">'refusal'</span>: <span class="hljs-literal">None</span>}, response_metadata={<span class="hljs-string">'token_usage'</span>: {<span class="hljs-string">'completion_tokens'</span>: <span class="hljs-number">25</span>, <span class="hljs-string">'prompt_tokens'</span>: <span class="hljs-number">70</span>, <span class="hljs-string">'total_tokens'</span>: <span class="hljs-number">95</span>, <span class="hljs-string">'completion_tokens_details'</span>: {<span class="hljs-string">'accepted_prediction_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'audio_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'reasoning_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'rejected_prediction_tokens'</span>: <span class="hljs-number">0</span>}, <span class="hljs-string">'prompt_tokens_details'</span>: {<span class="hljs-string">'audio_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'cached_tokens'</span>: <span class="hljs-number">0</span>}}, <span class="hljs-string">'model_name'</span>: <span class="hljs-string">'gpt-4o-2024-08-06'</span>, <span 
class="hljs-string">'system_fingerprint'</span>: <span class="hljs-string">'fp_d28bcae782'</span>, <span class="hljs-string">'finish_reason'</span>: <span class="hljs-string">'tool_calls'</span>, <span class="hljs-string">'logprobs'</span>: <span class="hljs-literal">None</span>}, id=<span class="hljs-string">'run-e538251e-24e9-45ac-a5b7-b4ce111615ad-0'</span>, tool_calls=[{<span class="hljs-string">'name'</span>: <span class="hljs-string">'retriever_tool'</span>, <span class="hljs-string">'args'</span>: {<span class="hljs-string">'question'</span>: <span class="hljs-string">'Who is the founder of FutureSmart AI?'</span>}, <span class="hljs-string">'id'</span>: <span class="hljs-string">'call_fK9lMHGrtubenQ697xpd2ZZ2'</span>, <span class="hljs-string">'type'</span>: <span class="hljs-string">'tool_call'</span>}], usage_metadata={<span class="hljs-string">'input_tokens'</span>: <span class="hljs-number">70</span>, <span class="hljs-string">'output_tokens'</span>: <span class="hljs-number">25</span>, <span class="hljs-string">'total_tokens'</span>: <span class="hljs-number">95</span>, <span class="hljs-string">'input_token_details'</span>: {<span class="hljs-string">'audio'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'cache_read'</span>: <span class="hljs-number">0</span>}, <span class="hljs-string">'output_token_details'</span>: {<span class="hljs-string">'audio'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'reasoning'</span>: <span class="hljs-number">0</span>}})]}})
----
((<span class="hljs-string">'rag:7c5458df-0abd-944a-27f7-b0bad49ccf3d'</span>,), {<span class="hljs-string">'tools'</span>: {<span class="hljs-string">'messages'</span>: [ToolMessage(content=<span class="hljs-string">'FutureSmart AI provides customized speech to text services, employing cutting-\nedge speech recognition technologies to cater to specific client needs. Ideal for \ncreating efficient documentation and enabling voice-driven commands, this \nsolution boosts productivity and accessibility.\n\nFutureSmart AI provides custom Natural Language Processing (NLP) \nsolutions for companies looking to get ahead of the future. Our \ndedicated team of Data Scientists and ML Engineers provides an end-\nto-end solution from data labeling to modeling and deploying an ML \nmodel tailored to your specific use case. \nFounder: Pradip Nichite \n \nServices: \nText Classification \nAt FutureSmart AI, we develop custom text classification solutions using \nadvanced NLP techniques tailored to your specific business requirements. \nLeveraging Python, Pytorch, and Hugging Face transformers, we enable precise \ndata categorization across applications such as intent detection, document \ncategorization, and sentiment analysis, enhancing your decision-making \nprocesses and operational efficiency. \n \nChatbots \nWe specialize in creating custom chatbots that integrate seamlessly with your \nbusiness environment. Using semantic search and large language models, our'</span>, name=<span class="hljs-string">'retriever_tool'</span>, id=<span class="hljs-string">'fe12dcaa-a380-437f-8c24-5a7cbf6ab031'</span>, tool_call_id=<span class="hljs-string">'call_fK9lMHGrtubenQ697xpd2ZZ2'</span>)]}})
----
((<span class="hljs-string">'rag:7c5458df-0abd-944a-27f7-b0bad49ccf3d'</span>,), {<span class="hljs-string">'agent'</span>: {<span class="hljs-string">'messages'</span>: [AIMessage(content=<span class="hljs-string">''</span>, additional_kwargs={<span class="hljs-string">'tool_calls'</span>: [{<span class="hljs-string">'id'</span>: <span class="hljs-string">'call_nvmRMsfWcg0YVC9xeTqxZO7z'</span>, <span class="hljs-string">'function'</span>: {<span class="hljs-string">'arguments'</span>: <span class="hljs-string">'{"question": "Who is Pradip Nichite?"}'</span>, <span class="hljs-string">'name'</span>: <span class="hljs-string">'retriever_tool'</span>}, <span class="hljs-string">'type'</span>: <span class="hljs-string">'function'</span>}, {<span class="hljs-string">'id'</span>: <span class="hljs-string">'call_IGzCvWkpkzlpwlhFR1MR80U4'</span>, <span class="hljs-string">'function'</span>: {<span class="hljs-string">'arguments'</span>: <span class="hljs-string">'{"question": "What is the professional background of Pradip Nichite?"}'</span>, <span class="hljs-string">'name'</span>: <span class="hljs-string">'retriever_tool'</span>}, <span class="hljs-string">'type'</span>: <span class="hljs-string">'function'</span>}], <span class="hljs-string">'refusal'</span>: <span class="hljs-literal">None</span>}, response_metadata={<span class="hljs-string">'token_usage'</span>: {<span class="hljs-string">'completion_tokens'</span>: <span class="hljs-number">67</span>, <span class="hljs-string">'prompt_tokens'</span>: <span class="hljs-number">322</span>, <span class="hljs-string">'total_tokens'</span>: <span class="hljs-number">389</span>, <span class="hljs-string">'completion_tokens_details'</span>: {<span class="hljs-string">'accepted_prediction_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'audio_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'reasoning_tokens'</span>: <span class="hljs-number">0</span>, <span 
class="hljs-string">'rejected_prediction_tokens'</span>: <span class="hljs-number">0</span>}, <span class="hljs-string">'prompt_tokens_details'</span>: {<span class="hljs-string">'audio_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'cached_tokens'</span>: <span class="hljs-number">0</span>}}, <span class="hljs-string">'model_name'</span>: <span class="hljs-string">'gpt-4o-2024-08-06'</span>, <span class="hljs-string">'system_fingerprint'</span>: <span class="hljs-string">'fp_d28bcae782'</span>, <span class="hljs-string">'finish_reason'</span>: <span class="hljs-string">'tool_calls'</span>, <span class="hljs-string">'logprobs'</span>: <span class="hljs-literal">None</span>}, id=<span class="hljs-string">'run-a4c6ec42-3aec-449c-ab39-ac029109eaad-0'</span>, tool_calls=[{<span class="hljs-string">'name'</span>: <span class="hljs-string">'retriever_tool'</span>, <span class="hljs-string">'args'</span>: {<span class="hljs-string">'question'</span>: <span class="hljs-string">'Who is Pradip Nichite?'</span>}, <span class="hljs-string">'id'</span>: <span class="hljs-string">'call_nvmRMsfWcg0YVC9xeTqxZO7z'</span>, <span class="hljs-string">'type'</span>: <span class="hljs-string">'tool_call'</span>}, {<span class="hljs-string">'name'</span>: <span class="hljs-string">'retriever_tool'</span>, <span class="hljs-string">'args'</span>: {<span class="hljs-string">'question'</span>: <span class="hljs-string">'What is the professional background of Pradip Nichite?'</span>}, <span class="hljs-string">'id'</span>: <span class="hljs-string">'call_IGzCvWkpkzlpwlhFR1MR80U4'</span>, <span class="hljs-string">'type'</span>: <span class="hljs-string">'tool_call'</span>}], usage_metadata={<span class="hljs-string">'input_tokens'</span>: <span class="hljs-number">322</span>, <span class="hljs-string">'output_tokens'</span>: <span class="hljs-number">67</span>, <span class="hljs-string">'total_tokens'</span>: <span class="hljs-number">389</span>, <span 
class="hljs-string">'input_token_details'</span>: {<span class="hljs-string">'audio'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'cache_read'</span>: <span class="hljs-number">0</span>}, <span class="hljs-string">'output_token_details'</span>: {<span class="hljs-string">'audio'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'reasoning'</span>: <span class="hljs-number">0</span>}})]}})
----
INSIDE RETRIEVER NODE
INSIDE RETRIEVER NODE
((<span class="hljs-string">'rag:7c5458df-0abd-944a-27f7-b0bad49ccf3d'</span>,), {<span class="hljs-string">'tools'</span>: {<span class="hljs-string">'messages'</span>: [ToolMessage(content=<span class="hljs-string">'FutureSmart AI provides customized speech to text services, employing cutting-\nedge speech recognition technologies to cater to specific client needs. Ideal for \ncreating efficient documentation and enabling voice-driven commands, this \nsolution boosts productivity and accessibility.\n\nFutureSmart AI provides custom Natural Language Processing (NLP) \nsolutions for companies looking to get ahead of the future. Our \ndedicated team of Data Scientists and ML Engineers provides an end-\nto-end solution from data labeling to modeling and deploying an ML \nmodel tailored to your specific use case. \nFounder: Pradip Nichite \n \nServices: \nText Classification \nAt FutureSmart AI, we develop custom text classification solutions using \nadvanced NLP techniques tailored to your specific business requirements. \nLeveraging Python, Pytorch, and Hugging Face transformers, we enable precise \ndata categorization across applications such as intent detection, document \ncategorization, and sentiment analysis, enhancing your decision-making \nprocesses and operational efficiency. \n \nChatbots \nWe specialize in creating custom chatbots that integrate seamlessly with your \nbusiness environment. Using semantic search and large language models, our'</span>, name=<span class="hljs-string">'retriever_tool'</span>, id=<span class="hljs-string">'57d1c6f3-b789-4ae1-84c4-c156ca34d3c1'</span>, tool_call_id=<span class="hljs-string">'call_nvmRMsfWcg0YVC9xeTqxZO7z'</span>), ToolMessage(content=<span class="hljs-string">'FutureSmart AI provides customized speech to text services, employing cutting-\nedge speech recognition technologies to cater to specific client needs. 
Ideal for \ncreating efficient documentation and enabling voice-driven commands, this \nsolution boosts productivity and accessibility.\n\nFutureSmart AI provides custom Natural Language Processing (NLP) \nsolutions for companies looking to get ahead of the future. Our \ndedicated team of Data Scientists and ML Engineers provides an end-\nto-end solution from data labeling to modeling and deploying an ML \nmodel tailored to your specific use case. \nFounder: Pradip Nichite \n \nServices: \nText Classification \nAt FutureSmart AI, we develop custom text classification solutions using \nadvanced NLP techniques tailored to your specific business requirements. \nLeveraging Python, Pytorch, and Hugging Face transformers, we enable precise \ndata categorization across applications such as intent detection, document \ncategorization, and sentiment analysis, enhancing your decision-making \nprocesses and operational efficiency. \n \nChatbots \nWe specialize in creating custom chatbots that integrate seamlessly with your \nbusiness environment. Using semantic search and large language models, our'</span>, name=<span class="hljs-string">'retriever_tool'</span>, id=<span class="hljs-string">'a498df53-f77a-4bfb-abbf-9153790295e5'</span>, tool_call_id=<span class="hljs-string">'call_IGzCvWkpkzlpwlhFR1MR80U4'</span>)]}})
----
((<span class="hljs-string">'rag:7c5458df-0abd-944a-27f7-b0bad49ccf3d'</span>,), {<span class="hljs-string">'agent'</span>: {<span class="hljs-string">'messages'</span>: [AIMessage(content=<span class="hljs-string">"The founder of FutureSmart AI is Pradip Nichite. Unfortunately, the current retrieval did not provide additional information specifically about Pradip Nichite's professional background or further personal details. For more comprehensive insights, you might consider conducting a more extensive web search or accessing professional networking sites like LinkedIn."</span>, additional_kwargs={<span class="hljs-string">'refusal'</span>: <span class="hljs-literal">None</span>}, response_metadata={<span class="hljs-string">'token_usage'</span>: {<span class="hljs-string">'completion_tokens'</span>: <span class="hljs-number">63</span>, <span class="hljs-string">'prompt_tokens'</span>: <span class="hljs-number">888</span>, <span class="hljs-string">'total_tokens'</span>: <span class="hljs-number">951</span>, <span class="hljs-string">'completion_tokens_details'</span>: {<span class="hljs-string">'accepted_prediction_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'audio_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'reasoning_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'rejected_prediction_tokens'</span>: <span class="hljs-number">0</span>}, <span class="hljs-string">'prompt_tokens_details'</span>: {<span class="hljs-string">'audio_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'cached_tokens'</span>: <span class="hljs-number">0</span>}}, <span class="hljs-string">'model_name'</span>: <span class="hljs-string">'gpt-4o-2024-08-06'</span>, <span class="hljs-string">'system_fingerprint'</span>: <span class="hljs-string">'fp_d28bcae782'</span>, <span class="hljs-string">'finish_reason'</span>: <span class="hljs-string">'stop'</span>, <span 
class="hljs-string">'logprobs'</span>: <span class="hljs-literal">None</span>}, id=<span class="hljs-string">'run-bc47c29c-0693-41c5-88be-b322e1fbb096-0'</span>, usage_metadata={<span class="hljs-string">'input_tokens'</span>: <span class="hljs-number">888</span>, <span class="hljs-string">'output_tokens'</span>: <span class="hljs-number">63</span>, <span class="hljs-string">'total_tokens'</span>: <span class="hljs-number">951</span>, <span class="hljs-string">'input_token_details'</span>: {<span class="hljs-string">'audio'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'cache_read'</span>: <span class="hljs-number">0</span>}, <span class="hljs-string">'output_token_details'</span>: {<span class="hljs-string">'audio'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'reasoning'</span>: <span class="hljs-number">0</span>}})]}})
----
((), {<span class="hljs-string">'rag'</span>: {<span class="hljs-string">'messages'</span>: [HumanMessage(content=<span class="hljs-string">"The founder of FutureSmart AI is Pradip Nichite. Unfortunately, the current retrieval did not provide additional information specifically about Pradip Nichite's professional background or further personal details. For more comprehensive insights, you might consider conducting a more extensive web search or accessing professional networking sites like LinkedIn."</span>, additional_kwargs={}, response_metadata={}, name=<span class="hljs-string">'rag'</span>)]}})
----
Next Worker: web_researcher
((), {<span class="hljs-string">'supervisor'</span>: <span class="hljs-literal">None</span>})
----
((<span class="hljs-string">'web_researcher:509bb5e2-bf9e-2c1d-5c65-978a73d5e94c'</span>,), {<span class="hljs-string">'agent'</span>: {<span class="hljs-string">'messages'</span>: [AIMessage(content=<span class="hljs-string">''</span>, additional_kwargs={<span class="hljs-string">'tool_calls'</span>: [{<span class="hljs-string">'id'</span>: <span class="hljs-string">'call_L5d4KhCSPsT5HmHTpHmnyryx'</span>, <span class="hljs-string">'function'</span>: {<span class="hljs-string">'arguments'</span>: <span class="hljs-string">'{"query":"Pradip Nichite"}'</span>, <span class="hljs-string">'name'</span>: <span class="hljs-string">'tavily_search_results_json'</span>}, <span class="hljs-string">'type'</span>: <span class="hljs-string">'function'</span>}], <span class="hljs-string">'refusal'</span>: <span class="hljs-literal">None</span>}, response_metadata={<span class="hljs-string">'token_usage'</span>: {<span class="hljs-string">'completion_tokens'</span>: <span class="hljs-number">23</span>, <span class="hljs-string">'prompt_tokens'</span>: <span class="hljs-number">161</span>, <span class="hljs-string">'total_tokens'</span>: <span class="hljs-number">184</span>, <span class="hljs-string">'completion_tokens_details'</span>: {<span class="hljs-string">'accepted_prediction_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'audio_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'reasoning_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'rejected_prediction_tokens'</span>: <span class="hljs-number">0</span>}, <span class="hljs-string">'prompt_tokens_details'</span>: {<span class="hljs-string">'audio_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'cached_tokens'</span>: <span class="hljs-number">0</span>}}, <span class="hljs-string">'model_name'</span>: <span class="hljs-string">'gpt-4o-2024-08-06'</span>, <span 
class="hljs-string">'system_fingerprint'</span>: <span class="hljs-string">'fp_e161c81bbd'</span>, <span class="hljs-string">'finish_reason'</span>: <span class="hljs-string">'tool_calls'</span>, <span class="hljs-string">'logprobs'</span>: <span class="hljs-literal">None</span>}, id=<span class="hljs-string">'run-35518542-a3b6-424b-b4c7-f8fbb56cffd6-0'</span>, tool_calls=[{<span class="hljs-string">'name'</span>: <span class="hljs-string">'tavily_search_results_json'</span>, <span class="hljs-string">'args'</span>: {<span class="hljs-string">'query'</span>: <span class="hljs-string">'Pradip Nichite'</span>}, <span class="hljs-string">'id'</span>: <span class="hljs-string">'call_L5d4KhCSPsT5HmHTpHmnyryx'</span>, <span class="hljs-string">'type'</span>: <span class="hljs-string">'tool_call'</span>}], usage_metadata={<span class="hljs-string">'input_tokens'</span>: <span class="hljs-number">161</span>, <span class="hljs-string">'output_tokens'</span>: <span class="hljs-number">23</span>, <span class="hljs-string">'total_tokens'</span>: <span class="hljs-number">184</span>, <span class="hljs-string">'input_token_details'</span>: {<span class="hljs-string">'audio'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'cache_read'</span>: <span class="hljs-number">0</span>}, <span class="hljs-string">'output_token_details'</span>: {<span class="hljs-string">'audio'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'reasoning'</span>: <span class="hljs-number">0</span>}})]}})
----
((<span class="hljs-string">'web_researcher:509bb5e2-bf9e-2c1d-5c65-978a73d5e94c'</span>,), {<span class="hljs-string">'tools'</span>: {<span class="hljs-string">'messages'</span>: [ToolMessage(content=<span class="hljs-string">'[{"url": "https://www.youtube.com/c/PradipNichiteAI", "content": "Hello, my name is Pradip Nichite. I am a 🚀 Top Rated Plus Data Science Freelancer with 8+ years of experience, specializing in NLP and Back-End Development. Founder of FutureSmart AI, helping"}, {"url": "https://www.youtube.com/channel/UCwpCmuWq_NPVLNyr8z1IGGQ", "content": "I\'m Pradip Nichite, a Top Rated Plus freelance Data Scientist on Upwork 💼, a successful digital nomad 🌍, and an entrepreneur. My journey in freelancing has led me to earn over $200K 💰"}]'</span>, name=<span class="hljs-string">'tavily_search_results_json'</span>, id=<span class="hljs-string">'5daeafe8-e673-425e-9d7e-35f49ccae710'</span>, tool_call_id=<span class="hljs-string">'call_L5d4KhCSPsT5HmHTpHmnyryx'</span>, artifact={<span class="hljs-string">'query'</span>: <span class="hljs-string">'Pradip Nichite'</span>, <span class="hljs-string">'follow_up_questions'</span>: <span class="hljs-literal">None</span>, <span class="hljs-string">'answer'</span>: <span class="hljs-literal">None</span>, <span class="hljs-string">'images'</span>: [], <span class="hljs-string">'results'</span>: [{<span class="hljs-string">'title'</span>: <span class="hljs-string">'Pradip Nichite - YouTube'</span>, <span class="hljs-string">'url'</span>: <span class="hljs-string">'https://www.youtube.com/c/PradipNichiteAI'</span>, <span class="hljs-string">'content'</span>: <span class="hljs-string">'Hello, my name is Pradip Nichite. I am a 🚀 Top Rated Plus Data Science Freelancer with 8+ years of experience, specializing in NLP and Back-End Development. 
Founder of FutureSmart AI, helping'</span>, <span class="hljs-string">'score'</span>: <span class="hljs-number">0.8080827</span>, <span class="hljs-string">'raw_content'</span>: <span class="hljs-literal">None</span>}, {<span class="hljs-string">'title'</span>: <span class="hljs-string">'Pradip Nichite - YouTube'</span>, <span class="hljs-string">'url'</span>: <span class="hljs-string">'https://www.youtube.com/channel/UCwpCmuWq_NPVLNyr8z1IGGQ'</span>, <span class="hljs-string">'content'</span>: <span class="hljs-string">"I'm Pradip Nichite, a Top Rated Plus freelance Data Scientist on Upwork 💼, a successful digital nomad 🌍, and an entrepreneur. My journey in freelancing has led me to earn over $200K 💰"</span>, <span class="hljs-string">'score'</span>: <span class="hljs-number">0.7636429</span>, <span class="hljs-string">'raw_content'</span>: <span class="hljs-literal">None</span>}], <span class="hljs-string">'response_time'</span>: <span class="hljs-number">1.73</span>})]}})
----
((<span class="hljs-string">'web_researcher:509bb5e2-bf9e-2c1d-5c65-978a73d5e94c'</span>,), {<span class="hljs-string">'agent'</span>: {<span class="hljs-string">'messages'</span>: [AIMessage(content=<span class="hljs-string">'Pradip Nichite is a Top Rated Plus Data Science Freelancer with over 8 years of experience, specializing in Natural Language Processing (NLP) and Back-End Development. He is the founder of FutureSmart AI. Additionally, Pradip is recognized as a successful digital nomad and entrepreneur, having earned over $200K through freelancing, primarily on platforms like Upwork. For more insights, you can explore his [YouTube channel](https://www.youtube.com/c/PradipNichiteAI), where he shares more about his experiences and expertise.'</span>, additional_kwargs={<span class="hljs-string">'refusal'</span>: <span class="hljs-literal">None</span>}, response_metadata={<span class="hljs-string">'token_usage'</span>: {<span class="hljs-string">'completion_tokens'</span>: <span class="hljs-number">116</span>, <span class="hljs-string">'prompt_tokens'</span>: <span class="hljs-number">346</span>, <span class="hljs-string">'total_tokens'</span>: <span class="hljs-number">462</span>, <span class="hljs-string">'completion_tokens_details'</span>: {<span class="hljs-string">'accepted_prediction_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'audio_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'reasoning_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'rejected_prediction_tokens'</span>: <span class="hljs-number">0</span>}, <span class="hljs-string">'prompt_tokens_details'</span>: {<span class="hljs-string">'audio_tokens'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'cached_tokens'</span>: <span class="hljs-number">0</span>}}, <span class="hljs-string">'model_name'</span>: <span class="hljs-string">'gpt-4o-2024-08-06'</span>, <span 
class="hljs-string">'system_fingerprint'</span>: <span class="hljs-string">'fp_e161c81bbd'</span>, <span class="hljs-string">'finish_reason'</span>: <span class="hljs-string">'stop'</span>, <span class="hljs-string">'logprobs'</span>: <span class="hljs-literal">None</span>}, id=<span class="hljs-string">'run-32d923b8-a3be-420a-a06f-35a7e27c68bb-0'</span>, usage_metadata={<span class="hljs-string">'input_tokens'</span>: <span class="hljs-number">346</span>, <span class="hljs-string">'output_tokens'</span>: <span class="hljs-number">116</span>, <span class="hljs-string">'total_tokens'</span>: <span class="hljs-number">462</span>, <span class="hljs-string">'input_token_details'</span>: {<span class="hljs-string">'audio'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'cache_read'</span>: <span class="hljs-number">0</span>}, <span class="hljs-string">'output_token_details'</span>: {<span class="hljs-string">'audio'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'reasoning'</span>: <span class="hljs-number">0</span>}})]}})
----
((), {<span class="hljs-string">'web_researcher'</span>: {<span class="hljs-string">'messages'</span>: [HumanMessage(content=<span class="hljs-string">'Pradip Nichite is a Top Rated Plus Data Science Freelancer with over 8 years of experience, specializing in Natural Language Processing (NLP) and Back-End Development. He is the founder of FutureSmart AI. Additionally, Pradip is recognized as a successful digital nomad and entrepreneur, having earned over $200K through freelancing, primarily on platforms like Upwork. For more insights, you can explore his [YouTube channel](https://www.youtube.com/c/PradipNichiteAI), where he shares more about his experiences and expertise.'</span>, additional_kwargs={}, response_metadata={}, name=<span class="hljs-string">'web_researcher'</span>)]}})
----
Next Worker: FINISH
((), {<span class="hljs-string">'supervisor'</span>: <span class="hljs-literal">None</span>})
----
</code></pre>
<blockquote>
<p>Don't be confused by the {'supervisor': None} entry in the output. Others have run into this as well, and the multi-agent system still works as intended.</p>
</blockquote>
<h2 id="heading-things-to-improve">Things to Improve</h2>
<ul>
<li><p>Add <a target="_blank" href="https://blog.futuresmart.ai/how-to-build-langgraph-agent-with-long-term-memory">memory support</a> to allow agents to remember previous interactions and maintain context</p>
</li>
<li><p>Improve the system prompt of the supervisor agent for better decision-making</p>
</li>
<li><p>Add more relevant tools to each individual agent</p>
</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this tutorial, we've built a sophisticated multi-agent system using LangGraph that demonstrates how to create specialized agents working together under centralized supervision. The system showcases advanced concepts in agent orchestration, tool integration, and state management - areas where we at FutureSmart AI have successfully delivered numerous enterprise solutions.</p>
<p>This architecture provides several advantages:</p>
<ol>
<li><p><strong>Modularity</strong>: Each agent has a specific role and can be modified independently</p>
</li>
<li><p><strong>Scalability</strong>: New agents can be added without changing existing ones</p>
</li>
<li><p><strong>Flexibility</strong>: The supervisor can dynamically choose the best agent for each task</p>
</li>
<li><p><strong>Control</strong>: The workflow is clearly defined and manageable</p>
</li>
</ol>
<p>If you found this guide useful and want to explore more, check out our YouTube video on the new <strong>multi-agent framework from OpenAI (Swarm)</strong>.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/g9E7VNeZItM?si=YzBARGIRZx7gW7bP">https://youtu.be/g9E7VNeZItM?si=YzBARGIRZx7gW7bP</a></div>
<h2 id="heading-more-learning-resources">📚 More Learning Resources</h2>
<p>If you prefer visual learning, check out these step-by-step tutorials:</p>
<ul>
<li><p><a target="_blank" href="https://youtu.be/8sSHg1034r0?si=WMdqqbALooCybCH7"><strong>Build RAG Applications from Scratch Without LangChain or LlamaIndex</strong></a> – Learn to build RAG applications without relying on frameworks for better customization and debugging.</p>
</li>
<li><p><a target="_blank" href="https://youtu.be/38aMTXY2usU?si=nTz0lqlcmqHhIoNs"><strong>LangChain RAG Course: From Basics to a Production-Ready RAG Chatbot</strong></a> – A comprehensive guide on implementing RAG using LangChain, covering everything from basics to production deployment.</p>
</li>
<li><p><a target="_blank" href="https://youtu.be/fss6CrmQU2Y?si=UQMT8HXzaSVXMYqq"><strong>Mastering Natural Language to SQL with LangChain and LangSmith | NL2SQL</strong></a> – A hands-on tutorial on converting natural language queries into SQL using LangChain and LangSmith.</p>
</li>
</ul>
<hr />
<p>At <strong>FutureSmart AI</strong>, we help businesses develop state-of-the-art AI solutions tailored to their needs. We've implemented similar multi-agent architectures for various industries, from customer service automation to complex AI Interview systems.</p>
<p>If you have inquiries, feel free to reach out to us at <a target="_blank" href="mailto:contact@futuresmart.ai"><strong>contact@futuresmart.ai</strong></a>. For real-world examples of our work, take a look at our <a target="_blank" href="https://www.futuresmart.ai/case-studies"><strong>case studies</strong></a>, where we showcase how our expertise in LangGraph and other AI technologies has delivered measurable business value.</p>
<p>Stay tuned for the next tutorial in this series, where we'll explore more advanced patterns and optimizations in multi-agent systems.</p>
<h2 id="heading-resources">Resources</h2>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><a target="_self" href="https://github.com/PradipNichite/FutureSmart-AI-Blog/tree/main/Multi%20Agent%20system%20with%20LangGraph">Get the Full Code in our GitHub</a></div>
</div>

<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><a target="_self" href="https://langchain-ai.github.io/langgraph/concepts/multi_agent/#multi-agent-architectures">Know more about different Multi-agent systems from LangGraph</a></div>
</div>]]></content:encoded></item><item><title><![CDATA[Building a Powerful HubSpot AI Agent: A Comprehensive Guide]]></title><description><![CDATA[Introduction
Managing the core objects of HubSpot CRM - Contacts, Companies, Tickets, and Deals - is crucial for business operations. However, manually tracking and updating these objects can be a difficult task, prone to errors and inefficiencies. T...]]></description><link>https://blog.futuresmart.ai/hubspot-ai-agent</link><guid isPermaLink="true">https://blog.futuresmart.ai/hubspot-ai-agent</guid><category><![CDATA[AI]]></category><category><![CDATA[HubSpot]]></category><category><![CDATA[ai-agent]]></category><dc:creator><![CDATA[Ruchir Tidke]]></dc:creator><pubDate>Thu, 12 Dec 2024 07:50:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1733816595490/fdeb72af-c6b2-460e-9275-1f7c77b877d7.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-introduction"><strong>Introduction</strong></h3>
<p>Managing the core objects of HubSpot CRM - <strong>Contacts</strong>, <strong>Companies</strong>, <strong>Tickets</strong>, and <strong>Deals</strong> - is crucial for business operations. However, manually tracking and updating these objects can be a difficult task, prone to errors and inefficiencies. The static workflows and manual processes can limit your team's productivity.</p>
<p>This is where the <strong>HubSpot AI Agent</strong> excels. But, building a custom HubSpot agent that can intelligently interact with your CRM data and perform tasks autonomously is a challenging task, requiring extensive coding and technical expertise.</p>
<p>Based on our work developing custom HubSpot AI Agents for clients, we’ll guide you through the process of creating your own HubSpot agent using powerful tools integrated within the <a target="_blank" href="https://blog.futuresmart.ai/langgraph-agent-with-rag-and-nl2sql"><strong>LangGraph framework</strong></a>. This HubSpot developer tutorial offers a practical, step-by-step approach to building a fully functional AI agent capable of performing complex tasks.</p>
<p>Our work has shown that the HubSpot Agent, when integrated with LLMs, can intelligently interact with CRM data. It processes natural language queries, identifies relevant CRM objects, and autonomously performs operations like <strong><em>creating deals, updating contacts, resolving tickets, tracking deals, or generating insights</em></strong>. This ability to interpret user intent, call APIs, and combine multiple tools makes it much more than a CRM assistant—it’s a game-changer.</p>
<h3 id="heading-how-the-hubspot-agent-works"><strong>How the Hubspot Agent Works</strong></h3>
<p>Let’s start with a high-level overview of the workflow. This LangGraph agent relies on a structured workflow and tool-calling capabilities.</p>
<ol>
<li><p><strong>Query Understanding</strong>: Our Main Agent uses a Large Language Model (LLM) to analyze user queries and identify their intent.</p>
</li>
<li><p><strong>Tool Calling</strong>: The agent intelligently determines the appropriate workflow and APIs or set of tools required to execute the task.</p>
</li>
<li><p><strong>Dynamic Execution</strong>: It seamlessly performs operations like creating a contact, updating a deal, or fetching CRM data.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733462113450/2902b468-4665-4975-b933-fd5addc478e7.png" alt class="image--center mx-auto" /></p>
</li>
</ol>
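<p>The three steps above boil down to a decide-then-execute loop. Here is a deliberately simplified sketch of that loop; the function and tool names are hypothetical, and the real agent built later in this post uses LangGraph and LLM tool-calling rather than a hand-written dispatcher:</p>

```python
# Simplified sketch of the agent loop: understand the query, pick a tool,
# execute it. `llm_decide` stands in for the LLM's tool-selection step.
def run_agent(query, llm_decide, tools):
    """Return a tool result, or a direct answer when no tool is needed."""
    decision = llm_decide(query)      # 1. query understanding
    if decision is None:
        return "direct answer"
    tool_name, args = decision        # 2. tool calling
    return tools[tool_name](**args)   # 3. dynamic execution

tools = {"dynamic_search": lambda object_type, **kw: f"searched {object_type}"}
answer = run_agent(
    "What is the status of deal ABC?",
    lambda q: ("dynamic_search", {"object_type": "deals"}),
    tools,
)
print(answer)  # searched deals
```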
<p>If you are not aware of LangGraph Agents, Nodes, Tools, and Edges, then we highly recommend reading our <a target="_blank" href="https://blog.futuresmart.ai/langgraph-tutorial-for-beginners">beginner-friendly LangGraph Tutorial</a></p>
<h3 id="heading-setting-up-your-langgraph-agent-for-hubspot"><strong>Setting Up Your LangGraph Agent for HubSpot</strong></h3>
<p>Before diving into the main coding part, we need to follow some key steps to set up our HubSpot Dashboard. If you have already configured it, then feel free to skip this section.</p>
<ul>
<li><p><strong>Creating a Private App in HubSpot</strong>: Create a private app to generate a private key.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733465291411/730377d5-dda6-4013-be7d-5475e0a757c9.png" alt class="image--center mx-auto" /></p>
</li>
<li><p><strong>Selecting Scopes</strong>: Define permissions to control access to CRM objects like Contacts or Deals.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733905544529/328c8260-a62d-41ac-9ee5-fcdb583d5b18.png" alt class="image--center mx-auto" /></p>
</li>
<li><p><strong>Generating Private Key</strong>: Generate an API key to securely connect your agent with HubSpot.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733905716114/327ee383-40ad-4ab8-8670-e45acf9b40aa.png" alt class="image--center mx-auto" /></p>
</li>
<li><p><strong>Installing the Python HubSpot API SDK</strong>: Install and configure the <code>hubspot-api-client</code> Python library for interacting with HubSpot’s CRM objects.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733905748644/a5b2cde0-a7be-4d46-ae6b-b76c2c3de7e2.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
<h2 id="heading-setting-up-the-environment"><strong>Setting up the Environment</strong></h2>
<p>To get started, install the dependencies below:</p>
<pre><code class="lang-python">!pip install -U hubspot-api-client langchain langgraph langchain-openai pydantic python-dotenv
</code></pre>
<h2 id="heading-setting-up-api-keys"><strong>Setting up API Keys</strong></h2>
<p>Before diving into building your HubSpot AI agent, it's crucial to set up your API keys. These keys allow your agent to interact securely with external services and tools like <strong>HubSpot</strong> and <a target="_blank" href="https://openai.com/index/openai-api/"><strong>OpenAI</strong> <strong>GPT</strong></a> models. Without them, the tools cannot function.</p>
<p>Here we are using the OpenAI model, but you can use any LLM of your choice.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> getpass
<span class="hljs-keyword">import</span> os

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_set_env</span>(<span class="hljs-params">var: str</span>):</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> os.environ.get(var):
        os.environ[var] = getpass.getpass(<span class="hljs-string">f"<span class="hljs-subst">{var}</span>: "</span>)

_set_env(<span class="hljs-string">"HUBSPOT_PRIVATE_KEY"</span>)
_set_env(<span class="hljs-string">"OPENAI_API_KEY"</span>)
</code></pre>
<h2 id="heading-creating-the-llm-object"><strong>Creating the LLM Object</strong></h2>
<p>Here’s how to initialize the LLM using LangChain <code>ChatOpenAI</code></p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI

llm = ChatOpenAI(model_name=<span class="hljs-string">"gpt-4o"</span>)
</code></pre>
<h2 id="heading-creating-tools-for-ai-agent">Creating Tools for AI Agent</h2>
<p>We will create a few custom tools that our LangGraph agent can use to interact with the HubSpot CRM.</p>
<h3 id="heading-dynamic-search-tool-integration"><strong>Dynamic Search Tool Integration</strong></h3>
<p>The <strong>Dynamic Search</strong> tool enables the agent to fetch relevant records from the CRM.</p>
<pre><code class="lang-python"><span class="hljs-comment"># Imports used below (exact paths may vary slightly across hubspot-api-client versions).</span>
<span class="hljs-comment"># FilterGroup is a small Pydantic model wrapping HubSpot search filters, and get_client()</span>
<span class="hljs-comment"># returns a configured hubspot.Client; both are defined earlier in the full notebook.</span>
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Optional
<span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel
<span class="hljs-keyword">from</span> langchain_core.tools <span class="hljs-keyword">import</span> tool
<span class="hljs-keyword">from</span> hubspot.crm.contacts <span class="hljs-keyword">import</span> PublicObjectSearchRequest
<span class="hljs-keyword">from</span> hubspot.crm.contacts.exceptions <span class="hljs-keyword">import</span> ApiException

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">SearchRequestSchema</span>(<span class="hljs-params">BaseModel</span>):</span>
    object_type: str
    filterGroups: List[FilterGroup]
    properties: Optional[List[str]] = <span class="hljs-literal">None</span>
    limit: Optional[int] = <span class="hljs-number">10</span>
    sorts: Optional[List[str]] = <span class="hljs-literal">None</span>
    after: Optional[str] = <span class="hljs-literal">None</span>
    propertiesWithHistory: Optional[List[str]] = <span class="hljs-literal">None</span>
    archived: Optional[bool] = <span class="hljs-literal">False</span>

<span class="hljs-meta">@tool(args_schema=SearchRequestSchema)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">dynamic_search</span>(<span class="hljs-params">
    object_type: str,
    filterGroups: List[FilterGroup],
    properties: Optional[List[str]] = None,
    limit: Optional[int] = <span class="hljs-number">10</span>
</span>):</span>
    <span class="hljs-string">"""Generic tool for searching objects on HubSpot Ex: contacts,deals,tickets,companies."""</span>
    object_to_search_method = {
        <span class="hljs-string">'contacts'</span>: <span class="hljs-string">'crm.contacts.search_api.do_search'</span>,
        <span class="hljs-string">'deals'</span>: <span class="hljs-string">'crm.deals.search_api.do_search'</span>,
        <span class="hljs-string">'tickets'</span>: <span class="hljs-string">'crm.tickets.search_api.do_search'</span>,
        <span class="hljs-string">'companies'</span>: <span class="hljs-string">'crm.companies.search_api.do_search'</span>,
        <span class="hljs-string">'line_items'</span>: <span class="hljs-string">'crm.line_items.search_api.do_search'</span>,
        <span class="hljs-string">'quotes'</span>: <span class="hljs-string">'crm.quotes.search_api.do_search'</span>
    }

    <span class="hljs-keyword">if</span> object_type <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> object_to_search_method:
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"Error"</span>: <span class="hljs-string">f"Unsupported object type: <span class="hljs-subst">{object_type}</span>"</span>}

    <span class="hljs-keyword">try</span>:
        api_client = get_client()
        search_method_path = object_to_search_method[object_type]
        search_method = getattr(
            api_client.crm, search_method_path.split(<span class="hljs-string">'.'</span>)[<span class="hljs-number">1</span>]
        ).search_api.do_search

        <span class="hljs-comment"># Prepare filter groups for the request</span>
        formatted_filter_groups = [
            {<span class="hljs-string">"filters"</span>: [filter.dict() <span class="hljs-keyword">for</span> filter <span class="hljs-keyword">in</span> group.filters]} <span class="hljs-keyword">for</span> group <span class="hljs-keyword">in</span> filterGroups
        ]

        <span class="hljs-comment"># Construct the search request</span>
        public_object_search_request = PublicObjectSearchRequest(
            filter_groups=formatted_filter_groups,
            properties=properties <span class="hljs-keyword">or</span> [],
            limit=limit
        )

        <span class="hljs-comment"># Send the request</span>
        api_response = search_method(public_object_search_request=public_object_search_request)
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"status"</span>: <span class="hljs-string">"success"</span>, <span class="hljs-string">"response_details"</span>: api_response}
    <span class="hljs-keyword">except</span> ApiException <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"Exception when retrieving object information"</span>: str(e)}
    <span class="hljs-keyword">except</span> ValueError <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"Error"</span>: str(e)}
</code></pre>
<p><strong>Output:</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733563895899/51d3553e-f9e5-4106-a198-0211d057fd96.png" alt /></p>
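<p>For reference, the <code>filterGroups</code> argument mirrors HubSpot's search API filter format. The <code>Filter</code> and <code>FilterGroup</code> Pydantic models below are an assumption about how they are defined in the full notebook:</p>

```python
from typing import List
from pydantic import BaseModel

# Assumed shape of the FilterGroup model used by dynamic_search; it mirrors
# HubSpot's search API, where filters within a group are ANDed together and
# separate groups are ORed.
class Filter(BaseModel):
    propertyName: str
    operator: str  # e.g. "EQ", "CONTAINS_TOKEN", "GT"
    value: str

class FilterGroup(BaseModel):
    filters: List[Filter]

group = FilterGroup(filters=[Filter(propertyName="dealname", operator="EQ", value="ABC")])

# The same formatting step the tool performs before building the request:
formatted = [{"filters": [f.dict() for f in g.filters]} for g in [group]]
print(formatted)
# [{'filters': [{'propertyName': 'dealname', 'operator': 'EQ', 'value': 'ABC'}]}]
```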
<h3 id="heading-create-object-tool-integration"><strong>Create Object Tool Integration</strong></h3>
<p>The <strong>Create Object</strong> tool creates different CRM objects (contacts, deals, tickets, and so on) based on the user's request.</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">CreateObjectSchema</span>(<span class="hljs-params">BaseModel</span>):</span>
    object_type: str
    properties: Dict[str, str]

<span class="hljs-meta">@tool(args_schema=CreateObjectSchema)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_object</span>(<span class="hljs-params">object_type: str, properties: Dict[str, str]</span>):</span>
    <span class="hljs-string">""" Tool for creating a CRM object (e.g., contact, deal, ticket) on HubSpot. """</span>
    <span class="hljs-keyword">try</span>:
        <span class="hljs-keyword">from</span> hubspot.crm.objects <span class="hljs-keyword">import</span> SimplePublicObjectInputForCreate

        <span class="hljs-comment"># Mapping of object types to their respective API creation methods</span>
        object_to_create_method = {
            <span class="hljs-string">'contacts'</span>: <span class="hljs-string">'crm.contacts.basic_api.create'</span>,
            <span class="hljs-string">'deals'</span>: <span class="hljs-string">'crm.deals.basic_api.create'</span>,
            <span class="hljs-string">'tickets'</span>: <span class="hljs-string">'crm.tickets.basic_api.create'</span>,
            <span class="hljs-string">'companies'</span>: <span class="hljs-string">'crm.companies.basic_api.create'</span>,
            <span class="hljs-string">'line_items'</span>: <span class="hljs-string">'crm.line_items.basic_api.create'</span>,
            <span class="hljs-string">'quotes'</span>: <span class="hljs-string">'crm.quotes.basic_api.create'</span>,
        }

        <span class="hljs-comment"># Check if the object type is supported</span>
        <span class="hljs-keyword">if</span> object_type <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> object_to_create_method:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Unsupported object type: <span class="hljs-subst">{object_type}</span>"</span>)

        <span class="hljs-comment"># Initialize the API client</span>
        api_client = get_client()

        <span class="hljs-comment"># Get the appropriate API creation method</span>
        create_method_path = object_to_create_method[object_type]
        create_method = getattr(api_client.crm, create_method_path.split(<span class="hljs-string">'.'</span>)[<span class="hljs-number">1</span>]).basic_api.create

        <span class="hljs-comment"># Prepare the object input for creation</span>
        simple_public_object_input_for_create = SimplePublicObjectInputForCreate(properties=properties)

        <span class="hljs-comment"># Call the create method</span>
        api_response = create_method(simple_public_object_input_for_create=simple_public_object_input_for_create)

        <span class="hljs-comment"># Return the API response</span>
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">"status"</span>: <span class="hljs-string">"success"</span>,
            <span class="hljs-string">"response_details"</span>: api_response,
            <span class="hljs-string">"object_id"</span>: api_response.id
        }

    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-comment"># Handle errors and return the error message</span>
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"error"</span>: str(e)}
</code></pre>
<p><strong>Output:</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733564160035/2b80cbb2-913d-4722-bd91-1f71e3b7a50a.png" alt /></p>
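<p>The key trick in this tool (and in <code>dynamic_search</code>) is resolving the dotted method path with <code>getattr</code>. A fake client built from <code>SimpleNamespace</code>, purely for illustration, shows the mechanics without touching the real API:</p>

```python
from types import SimpleNamespace

# Stand-in mimicking the attribute layout of hubspot.Client (illustrative only).
fake_client = SimpleNamespace(crm=SimpleNamespace(contacts=SimpleNamespace(
    basic_api=SimpleNamespace(
        create=lambda simple_public_object_input_for_create=None: {
            "id": "101", "input": simple_public_object_input_for_create}
    )
)))

path = "crm.contacts.basic_api.create"
# Same resolution step as in create_object: split out 'contacts', then walk
# down to basic_api.create on that sub-client.
create_method = getattr(fake_client.crm, path.split(".")[1]).basic_api.create
response = create_method(simple_public_object_input_for_create={"email": "jane@example.com"})
print(response["id"])  # 101
```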
<h3 id="heading-update-object-tool-integration"><strong>Update Object Tool Integration</strong></h3>
<p>The <strong>Update Object</strong> tool updates existing CRM objects by their ID.</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">UpdateObjectSchema</span>(<span class="hljs-params">BaseModel</span>):</span>
    object_type: str
    object_id: str
    properties: Dict[str, str]

<span class="hljs-meta">@tool(args_schema=UpdateObjectSchema)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">update_object</span>(<span class="hljs-params">object_type: str, object_id: str, properties: Dict[str, str]</span>):</span>
    <span class="hljs-string">""" Tool for updating a CRM object (e.g., contact, deal, ticket) on HubSpot using its ID. """</span>
    <span class="hljs-keyword">try</span>:
        <span class="hljs-keyword">from</span> hubspot.crm.objects <span class="hljs-keyword">import</span> SimplePublicObjectInput

        <span class="hljs-comment"># Mapping of object types to their respective API update methods</span>
        object_to_update_method = {
            <span class="hljs-string">'contacts'</span>: <span class="hljs-string">'crm.contacts.basic_api.update'</span>,
            <span class="hljs-string">'deals'</span>: <span class="hljs-string">'crm.deals.basic_api.update'</span>,
            <span class="hljs-string">'tickets'</span>: <span class="hljs-string">'crm.tickets.basic_api.update'</span>,
            <span class="hljs-string">'companies'</span>: <span class="hljs-string">'crm.companies.basic_api.update'</span>,
            <span class="hljs-string">'line_items'</span>: <span class="hljs-string">'crm.line_items.basic_api.update'</span>,
            <span class="hljs-string">'quotes'</span>: <span class="hljs-string">'crm.quotes.basic_api.update'</span>,
        }

        <span class="hljs-comment"># Check if the object type is supported</span>
        <span class="hljs-keyword">if</span> object_type <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> object_to_update_method:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Unsupported object type: <span class="hljs-subst">{object_type}</span>"</span>)

        <span class="hljs-comment"># Initialize the API client</span>
        api_client = get_client()

        <span class="hljs-comment"># Get the appropriate API update method</span>
        update_method_path = object_to_update_method[object_type]
        update_method = getattr(api_client.crm, update_method_path.split(<span class="hljs-string">'.'</span>)[<span class="hljs-number">1</span>]).basic_api.update

        <span class="hljs-comment"># Prepare the object input for updating</span>
        simple_public_object_input = SimplePublicObjectInput(properties=properties)

        <span class="hljs-comment"># Call the update method</span>
        api_response = update_method(object_id, simple_public_object_input)

        <span class="hljs-comment"># Return the API response</span>
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">"status"</span>: <span class="hljs-string">"success"</span>,
            <span class="hljs-string">"response_details"</span>: api_response,
            <span class="hljs-string">"object_id"</span>: api_response.id
        }

    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-comment"># Handle errors and return the error message</span>
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"error"</span>: str(e)}
</code></pre>
<p><strong>Output:</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733564250456/c55513f8-8012-48d0-946c-ea05f7d398ad.png" alt /></p>
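<p>A useful side effect of passing <code>args_schema</code> to the <code>@tool</code> decorator is that arguments produced by the LLM are validated by Pydantic before the function body runs. A quick illustration with the schema above:</p>

```python
from typing import Dict
from pydantic import BaseModel, ValidationError

class UpdateObjectSchema(BaseModel):
    object_type: str
    object_id: str
    properties: Dict[str, str]

# Well-formed arguments pass validation untouched.
ok = UpdateObjectSchema(
    object_type="deals", object_id="987654",
    properties={"dealstage": "closedwon"},
)

# A malformed tool call (missing object_id) is rejected before any API call.
try:
    UpdateObjectSchema(object_type="deals", properties={})
    rejected = False
except ValidationError:
    rejected = True
print(rejected)  # True
```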
<h2 id="heading-combining-the-tools"><strong>Combining the</strong> <strong>Tools</strong></h2>
<p>To create a fully functional HubSpot AI agent, you need to bind the Dynamic Search, Create Object, and Update Object tools to the LLM.</p>
<p>Here’s how to combine the tools:</p>
<pre><code class="lang-python">tools = [dynamic_search, create_object, update_object]
llm_with_tools = llm.bind_tools(tools)
</code></pre>
<p>With this configuration, your LLM can invoke the appropriate tool based on the user’s query.</p>
<h2 id="heading-building-the-langgraph"><strong>Building the LangGraph</strong></h2>
<p>LangGraph enables you to define a stateful workflow for your AI agent. By structuring nodes and edges, you can define how the agent processes user inputs and transitions between tools.</p>
<h3 id="heading-steps-to-build-the-langgraph"><strong>Steps to Build the LangGraph</strong></h3>
<ol>
<li><p><strong>Define the State</strong>: Create a <code>State</code> dictionary to manage the agent’s inputs and outputs.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Annotated, TypedDict
 <span class="hljs-keyword">from</span> langgraph.graph <span class="hljs-keyword">import</span> StateGraph
 <span class="hljs-keyword">from</span> langgraph.graph.message <span class="hljs-keyword">import</span> add_messages
 <span class="hljs-keyword">from</span> langgraph.checkpoint.memory <span class="hljs-keyword">import</span> MemorySaver

 memory = MemorySaver()

 <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">State</span>(<span class="hljs-params">TypedDict</span>):</span>
     messages: Annotated[list, add_messages]
</code></pre>
</li>
<li><p><strong>Add Nodes</strong>: Add nodes for the chatbot and tools to handle user queries and invoke the tools.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langgraph.prebuilt <span class="hljs-keyword">import</span> ToolNode

 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chatbot</span>(<span class="hljs-params">state: State</span>):</span>
      <span class="hljs-keyword">return</span> {<span class="hljs-string">"messages"</span>: [llm_with_tools.invoke(state[<span class="hljs-string">"messages"</span>])]}

  graph_builder = StateGraph(State)
  graph_builder.add_node(<span class="hljs-string">"chatbot"</span>, chatbot)

  tool_node = ToolNode(tools=[dynamic_search, create_object, update_object])
  graph_builder.add_node(<span class="hljs-string">"tools"</span>, tool_node)
</code></pre>
</li>
<li><p><strong>Define Edges</strong>: Use conditional edges to determine when the agent should switch between nodes.</p>
<pre><code class="lang-python">  <span class="hljs-keyword">from</span> langgraph.prebuilt <span class="hljs-keyword">import</span> tools_condition

  graph_builder.add_conditional_edges(<span class="hljs-string">"chatbot"</span>, tools_condition)
  graph_builder.add_edge(<span class="hljs-string">"tools"</span>, <span class="hljs-string">"chatbot"</span>)
  graph_builder.set_entry_point(<span class="hljs-string">"chatbot"</span>)
</code></pre>
</li>
<li><p><strong>Compile the Graph</strong>: Finalize the graph for execution.</p>
<pre><code class="lang-python"> graph = graph_builder.compile(checkpointer=memory)
</code></pre>
</li>
</ol>
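<p>The <code>tools_condition</code> edge added in step 3 routes on whether the model's latest message requested a tool. Conceptually it behaves like the simplified rule below (an approximation for intuition, not LangGraph's actual source):</p>

```python
from types import SimpleNamespace

# Approximate behavior of langgraph.prebuilt.tools_condition: go to the
# "tools" node when the last AI message carries tool calls, otherwise end.
def route(state):
    last = state["messages"][-1]
    return "tools" if getattr(last, "tool_calls", None) else "__end__"

wants_tool = SimpleNamespace(tool_calls=[{"name": "dynamic_search", "args": {}}])
plain_answer = SimpleNamespace(tool_calls=[])

print(route({"messages": [wants_tool]}))    # tools
print(route({"messages": [plain_answer]}))  # __end__
```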
<h2 id="heading-testing-the-ai-agent"><strong>Testing the AI Agent</strong></h2>
<p>Once the LangGraph is set up, you can test the agent by simulating user inputs. This ensures the tools and workflows are functioning as expected.</p>
<h3 id="heading-interactive-testing"><strong>Interactive Testing</strong></h3>
<p>Run the following code to test your AI agent interactively:</p>
<pre><code class="lang-python">config = {<span class="hljs-string">"configurable"</span>: {<span class="hljs-string">"thread_id"</span>: <span class="hljs-string">"1"</span>}}

<span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
    user_input = input(<span class="hljs-string">"User: "</span>)
    <span class="hljs-keyword">if</span> user_input.lower() <span class="hljs-keyword">in</span> [<span class="hljs-string">"quit"</span>, <span class="hljs-string">"exit"</span>, <span class="hljs-string">"q"</span>]:
        print(<span class="hljs-string">"Goodbye!"</span>)
        <span class="hljs-keyword">break</span>

    <span class="hljs-keyword">for</span> event <span class="hljs-keyword">in</span> graph.stream({<span class="hljs-string">"messages"</span>: [(<span class="hljs-string">"user"</span>, user_input)]}, config):
        <span class="hljs-keyword">for</span> value <span class="hljs-keyword">in</span> event.values():
            print(<span class="hljs-string">"Assistant:"</span>, value[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content)
</code></pre>
<p>You can ask questions based on your HubSpot records, for example:</p>
<ul>
<li><p><em>"What is the current status of ‘ABC’ Deal?"</em></p>
</li>
<li><p><em>"Tell me the names of any two contacts associated with company ‘XYZ’."</em> (triggers the Search tool)</p>
</li>
</ul>
<p>The AI agent will invoke the appropriate tool to generate responses.</p>
<h3 id="heading-visualizing-the-graph"><strong>Visualizing the Graph</strong></h3>
<p>We can easily plot a flowchart of our LangGraph Workflow.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> IPython.display <span class="hljs-keyword">import</span> Image, display

<span class="hljs-keyword">try</span>:
    display(Image(graph.get_graph().draw_mermaid_png()))
<span class="hljs-keyword">except</span> Exception:
    print(<span class="hljs-string">"Error generating graph visualization."</span>)
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1732629457981/85bce952-5850-45c4-86cf-20964c6ab41a.png?auto=compress,format&amp;format=webp" alt class="image--center mx-auto" /></p>
<p><strong>Steps in the Workflow:</strong></p>
<ol>
<li><p><strong>Receive Query</strong>: A user inputs a query, such as "<em>Update the status of Deal X to Closed Won.</em>"</p>
</li>
<li><p><strong>Interpretation</strong>: The agent uses LLMs to determine the query's intent and required action.</p>
</li>
<li><p><strong>Tool Selection</strong>: Based on the query, the agent selects the appropriate custom HubSpot AI tools.</p>
</li>
<li><p><strong>Execution</strong>: The tool executes the requested operation, and you can see real-time updates in the HubSpot CRM.</p>
</li>
<li><p><strong>Feedback</strong>: Results are shared with the user in an easily understandable format.</p>
</li>
</ol>
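<p>The steps above can be sketched in plain Python. This is a conceptual illustration only: <code>interpret</code>, <code>handle</code>, and the stub tools are hypothetical stand-ins for the real LLM and HubSpot tools, not the actual implementation.</p>

```python
# Hypothetical stand-ins for the real LLM and HubSpot tools (illustration only)
def interpret(query: str) -> str:
    # Step 2 (Interpretation): a stand-in for the LLM's intent detection
    return "update_deal" if "update" in query.lower() else "search"

TOOLS = {
    "update_deal": lambda q: "Deal X set to Closed Won.",
    "search": lambda q: "Found 2 matching contacts.",
}

def handle(query: str) -> str:
    intent = interpret(query)      # Step 2: Interpretation
    tool = TOOLS[intent]           # Step 3: Tool Selection
    result = tool(query)           # Step 4: Execution
    return f"Assistant: {result}"  # Step 5: Feedback

print(handle("Update the status of Deal X to Closed Won."))
# Assistant: Deal X set to Closed Won.
```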
<h2 id="heading-testing-and-refinement"><strong>Testing and Refinement</strong></h2>
<p>Leveraging the insights we've gained delivering HubSpot API integration services for our clients, we understand that thorough testing and refinement are essential:</p>
<ul>
<li><p><strong>Simulate Real-World Scenarios</strong>: Use varied queries to validate the agent’s ability to handle different tasks.</p>
</li>
<li><p><strong>Validate API Operations</strong>: Ensure operations are performed correctly and efficiently.</p>
</li>
<li><p><strong>Iterate and Improve</strong>: Gather user feedback and refine the agent’s workflows and capabilities over time.</p>
</li>
</ul>
<p>While building and improving this agent, you may face challenges similar to the ones we faced, such as handling complex queries or ambiguous questions. Robust error handling goes a long way toward keeping this type of AI application reliable.</p>
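<p>As a rough illustration of that kind of error handling, a wrapper can catch tool failures and turn them into a message the agent can relay. <code>safe_tool_call</code> and <code>update_deal</code> below are hypothetical helpers, not part of the HubSpot API:</p>

```python
# Hypothetical helpers for illustration; not part of the HubSpot API
def safe_tool_call(tool_fn, **kwargs):
    """Run a tool and convert failures into a user-facing message."""
    try:
        return tool_fn(**kwargs)
    except Exception as exc:
        return f"Tool failed: {exc}. Please rephrase or narrow your request."

def update_deal(deal_id: str) -> str:
    if not deal_id:
        raise ValueError("missing deal_id")
    return f"Deal {deal_id} updated."

print(safe_tool_call(update_deal, deal_id="ABC"))  # Deal ABC updated.
print(safe_tool_call(update_deal, deal_id=""))     # Tool failed: missing deal_id. ...
```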
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Drawing on the experiences we’ve gained delivering tailored AI solutions to businesses, we can confidently say that the <strong>Hubspot Agent</strong> is more than just a tool—it’s an intelligent assistant that simplifies CRM management, boosts productivity, and saves valuable time. Automating repetitive tasks and enabling intelligent decision-making allows businesses to focus on what matters most—building strong customer relationships and driving growth. Looking ahead, the principles and technologies used in building the HubSpot AI Agent can also be extended to other areas such as Marketing and CMS, enabling even broader automation and intelligent data interaction across your business operations.</p>
<p>If you’re ready to take your HubSpot experience to the next level, building a <strong>Hubspot</strong> Agent is the way forward. It’s not just a CRM upgrade—it’s a smarter, faster way to achieve your business goals.</p>
<p>If you found this guide useful and want to explore more advanced techniques, don’t forget to check out our other <a target="_blank" href="https://blog.futuresmart.ai/"><strong>tutorials</strong></a>. At FutureSmart AI, we help businesses develop state-of-the-art AI solutions tailored to their needs. To see a demo or discuss your requirements, contact us at <a target="_blank" href="mailto:contact@futuresmart.ai">contact@futuresmart.ai</a>.</p>
<p>For real-world examples of how we’ve helped businesses, take a look at our <a target="_blank" href="https://www.futuresmart.ai/case-studies"><strong>case studies</strong></a>, where we showcase the practical value of our expertise.</p>
]]></content:encoded></item><item><title><![CDATA[How to Build a Custom AI Agent Using LangGraph with RAG, NL2SQL and Web Search]]></title><description><![CDATA[Introduction
In this blog post, we will walk you through the process of creating a custom AI agent with three powerful tools: Web Search, Retrieval-Augmented Generation (RAG), and Natural Language to SQL (NL2SQL), all integrated within the LangGraph ...]]></description><link>https://blog.futuresmart.ai/langgraph-agent-with-rag-and-nl2sql</link><guid isPermaLink="true">https://blog.futuresmart.ai/langgraph-agent-with-rag-and-nl2sql</guid><category><![CDATA[langchain]]></category><category><![CDATA[AI]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[SQL]]></category><category><![CDATA[agentic rag]]></category><category><![CDATA[NL2SQL]]></category><category><![CDATA[langgraph]]></category><dc:creator><![CDATA[Rounak Show]]></dc:creator><pubDate>Fri, 29 Nov 2024 06:22:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1732644985329/c1897453-313c-4727-be99-916589de888f.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction"><strong>Introduction</strong></h2>
<p>In this blog post, we will walk you through the process of creating a custom AI agent with three powerful tools: Web Search, <a target="_blank" href="https://blog.futuresmart.ai/master-rag-with-langchain-a-practical-guide">Retrieval-Augmented Generation (RAG)</a>, and <a target="_blank" href="https://blog.futuresmart.ai/mastering-natural-language-to-sql-with-langchain-nl2sql">Natural Language to SQL (NL2SQL)</a>, all integrated within the <strong>LangGraph</strong> framework. This guide is designed to provide you with a practical, step-by-step approach to building a fully functional AI agent capable of performing complex tasks such as retrieving real-time data from the web, generating responses based on retrieved information from the knowledge base, and translating natural language queries into SQL database queries. By the end of this tutorial, you will have a working AI agent equipped to handle these diverse functionalities seamlessly.</p>
<p>In the <strong>previous</strong> <a target="_blank" href="https://blog.futuresmart.ai/langgraph-tutorial-for-beginners"><strong>LangGraph Tutorial blog</strong></a>, we explored the foundational concepts behind LangGraph, focusing on creating simple AI agents by defining nodes and edges. This time, we’re taking things a step further. We will incorporate additional tools that extend the agent’s capabilities, allowing it to interact with the web, retrieve specific information from a knowledge base, and query SQL databases through natural language commands. This comprehensive guide is aimed at developers looking to build customizable, task-oriented AI agents that can be deployed in various real-world applications.</p>
<p>We have already built a similar <a target="_blank" href="https://blog.futuresmart.ai/openai-swarm-a-hands-on-introduction">AI Agent using OpenAI Swarm</a>. This time, we are going to do this using LangGraph.</p>
<p>Let’s dive into the setup process and start by building the environment for our AI agent!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1732639865242/450250f2-b59c-4bab-af75-cf4367c2e61b.png" alt="LangGraph Agent with WebSearch, RAG, NL2SQL Tool" class="image--center mx-auto" /></p>
<h2 id="heading-setting-up-the-environment">Setting up the Environment</h2>
<p>To get started, install the dependencies below:</p>
<pre><code class="lang-python">!pip install -U langchain langchain-chroma pypdf sentence-transformers langgraph langchain_openai langchain_community
</code></pre>
<h2 id="heading-setting-up-api-keys">Setting up API Keys</h2>
<p>Before diving into building your LangGraph AI agent, it's crucial to set up your API keys. These keys allow your agent to interact with external tools like <a target="_blank" href="https://tavily.com/"><strong>Tavily Search</strong></a> and <a target="_blank" href="https://openai.com/index/openai-api/"><strong>OpenAI GPT</strong></a> models securely. Without them, the tools cannot function effectively.</p>
<p>Here we are using an OpenAI model, but you can use any LLM of your choice.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> getpass
<span class="hljs-keyword">import</span> os

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_set_env</span>(<span class="hljs-params">var: str</span>):</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> os.environ.get(var):
        os.environ[var] = getpass.getpass(<span class="hljs-string">f"<span class="hljs-subst">{var}</span>: "</span>)

_set_env(<span class="hljs-string">"TAVILY_API_KEY"</span>)
_set_env(<span class="hljs-string">"OPENAI_API_KEY"</span>)
</code></pre>
<h2 id="heading-creating-the-llm-object">Creating the LLM Object</h2>
<p>Here’s how to initialize the LLM using LangChain’s <code>ChatOpenAI</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI

llm = ChatOpenAI(model_name=<span class="hljs-string">"gpt-4o"</span>)
</code></pre>
<h2 id="heading-websearch-tool-integration">WebSearch Tool Integration</h2>
<p>Integrating a WebSearch tool allows your AI agent to fetch real-time information from the web. At <a target="_blank" href="https://www.futuresmart.ai/">FutureSmart AI</a>, we generally use the <code>TavilySearch</code> tool for quick and accurate web results, but you can use any web search tool of your choice.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_community.tools.tavily_search <span class="hljs-keyword">import</span> TavilySearchResults

web_search_tool = TavilySearchResults(max_results=<span class="hljs-number">2</span>)
</code></pre>
<h2 id="heading-agentic-rag-tool-integration"><strong>Agentic RAG Tool Integration</strong></h2>
<p>The <a target="_blank" href="https://blog.futuresmart.ai/langchain-rag-from-basics-to-production-ready-rag-chatbot">Retrieval-Augmented Generation (RAG)</a> tool enhances your AI agent by enabling it to fetch documents relevant to the user's questions. As a result, users get more accurate and context-rich responses.</p>
<h4 id="heading-steps-to-integrate-the-rag-tool"><strong>Steps to Integrate the RAG Tool</strong></h4>
<ol>
<li><p><strong>Load Documents</strong>: Use the <code>PyPDFLoader</code> and <code>Docx2txtLoader</code> to load documents from a folder.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List
 <span class="hljs-keyword">from</span> langchain_core.documents <span class="hljs-keyword">import</span> Document
 <span class="hljs-keyword">from</span> langchain_community.document_loaders <span class="hljs-keyword">import</span> PyPDFLoader, Docx2txtLoader

 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">load_documents</span>(<span class="hljs-params">folder_path: str</span>) -&gt; List[Document]:</span>
     documents = []
     <span class="hljs-keyword">for</span> filename <span class="hljs-keyword">in</span> os.listdir(folder_path):
         file_path = os.path.join(folder_path, filename)
         <span class="hljs-keyword">if</span> filename.endswith(<span class="hljs-string">'.pdf'</span>):
             loader = PyPDFLoader(file_path)
         <span class="hljs-keyword">elif</span> filename.endswith(<span class="hljs-string">'.docx'</span>):
             loader = Docx2txtLoader(file_path)
         <span class="hljs-keyword">else</span>:
             print(<span class="hljs-string">f"Unsupported file type: <span class="hljs-subst">{filename}</span>"</span>)
             <span class="hljs-keyword">continue</span>
         documents.extend(loader.load())
     <span class="hljs-keyword">return</span> documents

 folder_path = <span class="hljs-string">"/content/docs"</span>
 documents = load_documents(folder_path)
 print(<span class="hljs-string">f"Loaded <span class="hljs-subst">{len(documents)}</span> documents from the folder."</span>)
</code></pre>
<pre><code class="lang-python"> <span class="hljs-comment"># Output</span>
 Loaded <span class="hljs-number">2</span> documents <span class="hljs-keyword">from</span> the folder.
</code></pre>
</li>
<li><p><strong>Split Text into Chunks</strong>: Prepare documents for vectorization by splitting them into manageable chunks.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain_text_splitters <span class="hljs-keyword">import</span> RecursiveCharacterTextSplitter

 text_splitter = RecursiveCharacterTextSplitter(
     chunk_size=<span class="hljs-number">1000</span>,
     chunk_overlap=<span class="hljs-number">200</span>,
     length_function=len
 )

 splits = text_splitter.split_documents(documents)
 print(<span class="hljs-string">f"Split the documents into <span class="hljs-subst">{len(splits)}</span> chunks."</span>)
</code></pre>
<pre><code class="lang-python"> <span class="hljs-comment"># Output</span>
 Split the documents into <span class="hljs-number">4</span> chunks.
</code></pre>
</li>
<li><p><strong>Generate Embeddings</strong>: Use <a target="_blank" href="https://blog.futuresmart.ai/i-never-knew-sentence-transformers-could-be-so-useful">SentenceTransformers</a> to create embeddings for efficient similarity searches.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain_community.embeddings.sentence_transformer <span class="hljs-keyword">import</span> SentenceTransformerEmbeddings

 embedding_function = SentenceTransformerEmbeddings(model_name=<span class="hljs-string">"all-MiniLM-L6-v2"</span>)
</code></pre>
</li>
<li><p><strong>Create and Persist a Vector Store</strong>:</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain_chroma <span class="hljs-keyword">import</span> Chroma

 collection_name = <span class="hljs-string">"my_collection"</span>
 vectorstore = Chroma.from_documents(
     collection_name=collection_name,
     documents=splits,
     embedding=embedding_function,
     persist_directory=<span class="hljs-string">"./chroma_db"</span>
 )
</code></pre>
</li>
<li><p><strong>Build the Retriever Tool</strong>:</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain.tools <span class="hljs-keyword">import</span> tool
 <span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel

 <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">RagToolSchema</span>(<span class="hljs-params">BaseModel</span>):</span>
     question: str

<span class="hljs-meta"> @tool(args_schema=RagToolSchema)</span>
 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">retriever_tool</span>(<span class="hljs-params">question</span>):</span>
   <span class="hljs-string">"""Tool to Retrieve Semantically Similar documents to answer User Questions related to FutureSmart AI"""</span>
   print(<span class="hljs-string">"INSIDE RETRIEVER NODE"</span>)
   retriever = vectorstore.as_retriever(search_kwargs={<span class="hljs-string">"k"</span>: <span class="hljs-number">2</span>})
   retriever_results = retriever.invoke(question)
   <span class="hljs-keyword">return</span> <span class="hljs-string">"\n\n"</span>.join(doc.page_content <span class="hljs-keyword">for</span> doc <span class="hljs-keyword">in</span> retriever_results)
</code></pre>
</li>
</ol>
<p>This tool allows your AI agent to retrieve relevant chunks of information from your document database, making it highly effective for knowledge-based tasks.</p>
<p>You can test the <code>retriever</code> right now to see how it performs:</p>
<pre><code class="lang-python">retriever = vectorstore.as_retriever(search_kwargs={<span class="hljs-string">"k"</span>: <span class="hljs-number">2</span>})
<span class="hljs-comment"># pass question</span>
retriever_results = retriever.invoke(<span class="hljs-string">"Who is the founder of Futuresmart AI?"</span>)
print(retriever_results)
</code></pre>
<pre><code class="lang-python"><span class="hljs-comment"># Output</span>
[Document(metadata={<span class="hljs-string">'page'</span>: <span class="hljs-number">1</span>, <span class="hljs-string">'source'</span>: <span class="hljs-string">'/content/docs/FutureSmart AI .pdf'</span>}, page_content=<span class="hljs-string">'FutureSmart AI provides customized speech to text services, employing cutting-\nedge speech recognition technologies to cater to specific client needs. Ideal for \ncreating efficient documentation and enabling voice-driven commands, this \nsolution boosts productivity and accessibility.'</span>), 
Document(metadata={<span class="hljs-string">'page'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'source'</span>: <span class="hljs-string">'/content/docs/FutureSmart AI .pdf'</span>}, page_content=<span class="hljs-string">'FutureSmart AI provides custom Natural Language Processing (NLP) \nsolutions for companies looking to get ahead of the future. Our \ndedicated team of Data Scientists and ML Engineers provides an end-\nto-end solution from data labeling to modeling and deploying an ML \nmodel tailored to your specific use case. \nFounder: Pradip Nichite \n \nServices: \nText Classification \nAt FutureSmart AI, we develop custom text classification solutions using \nadvanced NLP techniques tailored to your specific business requirements. \nLeveraging Python, Pytorch, and Hugging Face transformers, we enable precise \ndata categorization across applications such as intent detection, document \ncategorization, and sentiment analysis, enhancing your decision-making \nprocesses and operational efficiency. \n \nChatbots \nWe specialize in creating custom chatbots that integrate seamlessly with your \nbusiness environment. Using semantic search and large language models, our'</span>)]
</code></pre>
<p>If you're interested in building RAG from scratch, <a target="_blank" href="https://youtu.be/8sSHg1034r0">this video</a> is for you.</p>
<h2 id="heading-nl2sql-tool-integration"><strong>NL2SQL Tool Integration</strong></h2>
<p>Now that the Web Search and RAG tools are ready for the AI agent, we only need to build the NL2SQL tool.</p>
<p>The <strong>SQL Agent</strong> bridges the gap between human language and SQL databases by generating and executing SQL queries from natural language questions. It enables your AI agent to answer database-related queries efficiently.</p>
<h4 id="heading-steps-to-integrate-the-nl2sql-tool"><strong>Steps to Integrate the NL2SQL Tool</strong></h4>
<ol>
<li><p><strong>Set Up the Database</strong><br /> We are using the Chinook SQLite database as a sample dataset for testing SQL queries. Check this blog to learn <a target="_blank" href="https://blog.futuresmart.ai/mastering-natural-language-to-sql-with-langchain-nl2sql">how to set up your own database</a> for testing.</p>
<pre><code class="lang-bash"> !wget https://github.com/lerocha/chinook-database/raw/master/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite
 !mv Chinook_Sqlite.sqlite Chinook.db
</code></pre>
</li>
<li><p><strong>Initialize the Database Connection</strong><br /> Connect to the SQLite database using the LangChain <code>SQLDatabase</code> utility:</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain_community.utilities <span class="hljs-keyword">import</span> SQLDatabase

 db = SQLDatabase.from_uri(<span class="hljs-string">"sqlite:///Chinook.db"</span>)
</code></pre>
</li>
<li><p><strong>Clean SQL Queries</strong><br /> This function is very important. In many of our clients’ projects, the SQL query generated by the LLM contains unnecessary symbols, text, backticks, and similar artifacts, which cause an error when the query is executed. We therefore need a function to strip out this extra text.</p>
</li>
</ol>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> re

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">clean_sql_query</span>(<span class="hljs-params">text: str</span>) -&gt; str:</span>
    <span class="hljs-string">"""
    Clean SQL query by removing code block syntax, various SQL tags, backticks,
    prefixes, and unnecessary whitespace while preserving the core SQL query.

    Args:
        text (str): Raw SQL query text that may contain code blocks, tags, and backticks

    Returns:
        str: Cleaned SQL query
    """</span>
    <span class="hljs-comment"># Step 1: Remove code block syntax and any SQL-related tags</span>
    <span class="hljs-comment"># This handles variations like ```sql, ```SQL, ```SQLQuery, etc.</span>
    block_pattern = <span class="hljs-string">r"```(?:sql|SQL|SQLQuery|mysql|postgresql)?\s*(.*?)\s*```"</span>
    text = re.sub(block_pattern, <span class="hljs-string">r"\1"</span>, text, flags=re.DOTALL)

    <span class="hljs-comment"># Step 2: Handle "SQLQuery:" prefix and similar variations</span>
    <span class="hljs-comment"># This will match patterns like "SQLQuery:", "SQL Query:", "MySQL:", etc.</span>
    prefix_pattern = <span class="hljs-string">r"^(?:SQL\s*Query|SQLQuery|MySQL|PostgreSQL|SQL)\s*:\s*"</span>
    text = re.sub(prefix_pattern, <span class="hljs-string">""</span>, text, flags=re.IGNORECASE)

    <span class="hljs-comment"># Step 3: Extract the first SQL statement if there's random text after it</span>
    <span class="hljs-comment"># Look for a complete SQL statement ending with semicolon</span>
    sql_statement_pattern = <span class="hljs-string">r"(SELECT.*?;)"</span>
    sql_match = re.search(sql_statement_pattern, text, flags=re.IGNORECASE | re.DOTALL)
    <span class="hljs-keyword">if</span> sql_match:
        text = sql_match.group(<span class="hljs-number">1</span>)

    <span class="hljs-comment"># Step 4: Remove backticks around identifiers</span>
    text = re.sub(<span class="hljs-string">r'`([^`]*)`'</span>, <span class="hljs-string">r'\1'</span>, text)

    <span class="hljs-comment"># Step 5: Normalize whitespace</span>
    <span class="hljs-comment"># Replace multiple spaces with single space</span>
    text = re.sub(<span class="hljs-string">r'\s+'</span>, <span class="hljs-string">' '</span>, text)

    <span class="hljs-comment"># Step 6: Preserve newlines for main SQL keywords to maintain readability</span>
    keywords = [<span class="hljs-string">'SELECT'</span>, <span class="hljs-string">'FROM'</span>, <span class="hljs-string">'WHERE'</span>, <span class="hljs-string">'GROUP BY'</span>, <span class="hljs-string">'HAVING'</span>, <span class="hljs-string">'ORDER BY'</span>,
               <span class="hljs-string">'LIMIT'</span>, <span class="hljs-string">'JOIN'</span>, <span class="hljs-string">'LEFT JOIN'</span>, <span class="hljs-string">'RIGHT JOIN'</span>, <span class="hljs-string">'INNER JOIN'</span>,
               <span class="hljs-string">'OUTER JOIN'</span>, <span class="hljs-string">'UNION'</span>, <span class="hljs-string">'VALUES'</span>, <span class="hljs-string">'INSERT'</span>, <span class="hljs-string">'UPDATE'</span>, <span class="hljs-string">'DELETE'</span>]

    <span class="hljs-comment"># Case-insensitive replacement for keywords</span>
    pattern = <span class="hljs-string">'|'</span>.join(<span class="hljs-string">r'\b{}\b'</span>.format(k) <span class="hljs-keyword">for</span> k <span class="hljs-keyword">in</span> keywords)
    text = re.sub(<span class="hljs-string">f'(<span class="hljs-subst">{pattern}</span>)'</span>, <span class="hljs-string">r'\n\1'</span>, text, flags=re.IGNORECASE)

    <span class="hljs-comment"># Step 7: Final cleanup</span>
    <span class="hljs-comment"># Remove leading/trailing whitespace and extra newlines</span>
    text = text.strip()
    text = re.sub(<span class="hljs-string">r'\n\s*\n'</span>, <span class="hljs-string">'\n'</span>, text)

    <span class="hljs-keyword">return</span> text
</code></pre>
<blockquote>
<p>We created this function to clean up the LLM-generated SQL query output. If you have a better one, feel free to share it with everyone in the comments 🙂</p>
</blockquote>
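<p>To see the first two cleanup steps in action, here is a small standalone demo on a hypothetical raw LLM output (the example string is made up; the regexes mirror Steps 1 and 2 of <code>clean_sql_query</code>):</p>

```python
import re

# Hypothetical raw LLM output: a fenced code block plus a "SQLQuery:" prefix
raw = "SQLQuery: ```sql\nSELECT COUNT(*) FROM Employee;\n```"

# Step 1: strip the fenced code block, keeping only its contents
text = re.sub(r"```(?:sql|SQL|SQLQuery|mysql|postgresql)?\s*(.*?)\s*```",
              r"\1", raw, flags=re.DOTALL)
# Step 2: strip the leading "SQLQuery:"-style prefix
text = re.sub(r"^(?:SQL\s*Query|SQLQuery|MySQL|PostgreSQL|SQL)\s*:\s*",
              "", text, flags=re.IGNORECASE)
print(text)  # SELECT COUNT(*) FROM Employee;
```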
<ol start="4">
<li><p><strong>Create the NL2SQL Tool</strong><br /> Define a tool that generates and executes SQL queries:</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain.chains <span class="hljs-keyword">import</span> create_sql_query_chain
 <span class="hljs-keyword">from</span> langchain_community.tools.sql_database.tool <span class="hljs-keyword">import</span> QuerySQLDataBaseTool
 <span class="hljs-keyword">from</span> operator <span class="hljs-keyword">import</span> itemgetter
 <span class="hljs-keyword">import</span> re
 <span class="hljs-keyword">from</span> langchain_core.output_parsers <span class="hljs-keyword">import</span> StrOutputParser
 <span class="hljs-keyword">from</span> langchain_core.prompts <span class="hljs-keyword">import</span> PromptTemplate
 <span class="hljs-keyword">from</span> langchain_core.runnables <span class="hljs-keyword">import</span> RunnablePassthrough, RunnableLambda

 <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">SQLToolSchema</span>(<span class="hljs-params">BaseModel</span>):</span>
     question: str

<span class="hljs-meta"> @tool(args_schema=SQLToolSchema)</span>
 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">nl2sql_tool</span>(<span class="hljs-params">question</span>):</span>
   <span class="hljs-string">"""Tool to Generate and Execute SQL Query to answer User Questions related to chinook DB"""</span>
   print(<span class="hljs-string">"INSIDE NL2SQL TOOL"</span>)
   execute_query = QuerySQLDataBaseTool(db=db)
   write_query = create_sql_query_chain(llm, db)

   chain = (
       RunnablePassthrough.assign(query=write_query | RunnableLambda(clean_sql_query)).assign(
           result=itemgetter(<span class="hljs-string">"query"</span>) | execute_query
       )
   )

   response = chain.invoke({<span class="hljs-string">"question"</span>: question})
   <span class="hljs-keyword">return</span> response[<span class="hljs-string">'result'</span>]
</code></pre>
</li>
<li><p><strong>Test the Tool</strong><br /> Use a sample query to verify functionality:</p>
<pre><code class="lang-python"> question = <span class="hljs-string">"How many employees are there?"</span>
 result = nl2sql_tool.invoke({<span class="hljs-string">"question"</span>: question})
 print(<span class="hljs-string">f"Answer: <span class="hljs-subst">{result}</span>"</span>)
</code></pre>
<pre><code class="lang-python"> <span class="hljs-comment"># output</span>
 INSIDE NL2SQL TOOL
 Question: How many employees are there?
 Answer: [(<span class="hljs-number">8</span>,)]
</code></pre>
</li>
</ol>
<h2 id="heading-combining-the-tools"><strong>Combining the Tools</strong></h2>
<p>To create a fully functional LangGraph AI agent, you need to bind the WebSearch, RAG, and NL2SQL tools to the LLM. You then get a search agent, a RAG agent, and an SQL agent, all in one.</p>
<p>Here’s how to combine the tools:</p>
<pre><code class="lang-python">tools = [web_search_tool, retriever_tool, nl2sql_tool]
llm_with_tools = llm.bind_tools(tools)
</code></pre>
<p>With this configuration, your LLM can invoke the appropriate tool based on the user’s query, ensuring dynamic and context-aware responses.</p>
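<p>Under the hood, a tools-bound LLM responds with a tool name and arguments, and the runtime dispatches to the matching function. Here is a rough pure-Python sketch of that dispatch; the stub tools and the hard-coded <code>tool_call</code> are illustrative, not real model output:</p>

```python
# Stub tools standing in for the real web_search_tool, retriever_tool, nl2sql_tool
def web_search_tool(query: str) -> str:
    return f"web results for {query!r}"

def retriever_tool(question: str) -> str:
    return f"docs about {question!r}"

def nl2sql_tool(question: str) -> str:
    return "[(8,)]"

TOOLS = {f.__name__: f for f in (web_search_tool, retriever_tool, nl2sql_tool)}

# Pretend the LLM emitted this tool call for a database question
tool_call = {"name": "nl2sql_tool", "args": {"question": "How many employees are there?"}}
result = TOOLS[tool_call["name"]](**tool_call["args"])
print(result)  # [(8,)]
```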
<h2 id="heading-building-the-langgraph"><strong>Building the LangGraph</strong></h2>
<p>LangGraph enables you to define a stateful workflow for your AI agent. By structuring nodes and edges, you can define how the agent processes user inputs and transitions between tools.</p>
<h3 id="heading-steps-to-build-the-langgraph"><strong>Steps to Build the LangGraph</strong></h3>
<ol>
<li><p><strong>Define the State</strong>: Create a <code>State</code> dictionary to manage the agent’s inputs and outputs.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Annotated
 <span class="hljs-keyword">from</span> typing_extensions <span class="hljs-keyword">import</span> TypedDict
 <span class="hljs-keyword">from</span> langgraph.graph <span class="hljs-keyword">import</span> StateGraph
 <span class="hljs-keyword">from</span> langgraph.graph.message <span class="hljs-keyword">import</span> add_messages

 <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">State</span>(<span class="hljs-params">TypedDict</span>):</span>
     messages: Annotated[list, add_messages]
</code></pre>
</li>
<li><p><strong>Add Nodes</strong>: Add nodes for the chatbot and tools to handle user queries and invoke the tools.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langgraph.prebuilt <span class="hljs-keyword">import</span> ToolNode

 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chatbot</span>(<span class="hljs-params">state: State</span>):</span>
     <span class="hljs-keyword">return</span> {<span class="hljs-string">"messages"</span>: [llm_with_tools.invoke(state[<span class="hljs-string">"messages"</span>])]}

 graph_builder = StateGraph(State)
 graph_builder.add_node(<span class="hljs-string">"chatbot"</span>, chatbot)

 tool_node = ToolNode(tools=[web_search_tool, retriever_tool, nl2sql_tool])
 graph_builder.add_node(<span class="hljs-string">"tools"</span>, tool_node)
</code></pre>
</li>
<li><p><strong>Define Edges</strong>: Use conditional edges to determine when the agent should switch between nodes.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langgraph.prebuilt <span class="hljs-keyword">import</span> tools_condition

 graph_builder.add_conditional_edges(<span class="hljs-string">"chatbot"</span>, tools_condition)
 graph_builder.add_edge(<span class="hljs-string">"tools"</span>, <span class="hljs-string">"chatbot"</span>)
 graph_builder.set_entry_point(<span class="hljs-string">"chatbot"</span>)
</code></pre>
</li>
<li><p><strong>Compile the Graph</strong>: Finalize the graph for execution.</p>
<pre><code class="lang-python"> graph = graph_builder.compile()
</code></pre>
</li>
</ol>
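<p>Conceptually, the compiled graph runs a chatbot → tools → chatbot loop until the model stops requesting tools. Here is a pure-Python sketch of that control flow under simplifying assumptions (<code>fake_llm</code> and <code>fake_tool</code> are made-up stand-ins; in LangGraph the routing is handled by <code>tools_condition</code>):</p>

```python
def fake_llm(messages):
    # Stand-in for the model: request a tool first, answer once a tool result exists
    if not any(role == "tool" for role, _ in messages):
        return ("assistant", {"tool_call": "nl2sql_tool", "args": "count employees"})
    return ("assistant", "There are 8 employees.")

def fake_tool(args):
    return ("tool", "[(8,)]")

def run(user_input):
    messages = [("user", user_input)]
    while True:
        role, content = fake_llm(messages)               # "chatbot" node
        messages.append((role, content))
        if isinstance(content, dict):                    # tools_condition: tool requested?
            messages.append(fake_tool(content["args"]))  # "tools" node
            continue                                     # edge back to "chatbot"
        return content                                   # END

print(run("How many employees are there?"))  # There are 8 employees.
```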
<h2 id="heading-testing-the-ai-agent"><strong>Testing the AI Agent</strong></h2>
<p>Once the LangGraph is set up, you can test the agent by simulating user inputs. This ensures the tools and workflows are functioning as expected.</p>
<h3 id="heading-interactive-testing"><strong>Interactive Testing</strong></h3>
<p>Run the following code to test your AI agent interactively:</p>
<pre><code class="lang-python">config = {<span class="hljs-string">"configurable"</span>: {<span class="hljs-string">"thread_id"</span>: <span class="hljs-string">"1"</span>}}

<span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
    user_input = input(<span class="hljs-string">"User: "</span>)
    <span class="hljs-keyword">if</span> user_input.lower() <span class="hljs-keyword">in</span> [<span class="hljs-string">"quit"</span>, <span class="hljs-string">"exit"</span>, <span class="hljs-string">"q"</span>]:
        print(<span class="hljs-string">"Goodbye!"</span>)
        <span class="hljs-keyword">break</span>

    <span class="hljs-keyword">for</span> event <span class="hljs-keyword">in</span> graph.stream({<span class="hljs-string">"messages"</span>: [(<span class="hljs-string">"user"</span>, user_input)]}, config):
        <span class="hljs-keyword">for</span> value <span class="hljs-keyword">in</span> event.values():
            print(<span class="hljs-string">"Assistant:"</span>, value[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].content)
</code></pre>
<p>You can provide queries like:</p>
<ul>
<li><p>You can ask questions based on your knowledge base (Trigger Retriever or RAG tool)</p>
</li>
<li><p>"What is the current weather in Delhi?" (Trigger Websearch tool)</p>
</li>
<li><p>"Tell me the name of any two employees from the database" (Trigger NL2SQL tool)</p>
</li>
</ul>
<p>The AI agent will invoke the appropriate tool to generate responses.</p>
<h3 id="heading-visualizing-the-langgraph"><strong>Visualizing the LangGraph</strong></h3>
<p>Visualization helps you understand the workflow of your AI agent and how it transitions between nodes and tools.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> IPython.display <span class="hljs-keyword">import</span> Image, display

<span class="hljs-keyword">try</span>:
    display(Image(graph.get_graph().draw_mermaid_png()))
<span class="hljs-keyword">except</span> Exception:
    print(<span class="hljs-string">"Error generating graph visualization."</span>)
</code></pre>
<p>The generated diagram will showcase nodes (<code>chatbot</code>, <code>tools</code>) and the transitions between them, providing a clear overview of your AI agent’s workflow.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1732629457981/85bce952-5850-45c4-86cf-20964c6ab41a.png" alt class="image--center mx-auto" /></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Check out the full code on our <a target="_self" href="https://github.com/PradipNichite/FutureSmart-AI-Blog/tree/main/Langgraph%20Agent%20with%20websearch%2C%20RAG%2C%20NL2SQL">GitHub</a></div>
</div>

<h2 id="heading-improvements">Improvements</h2>
<ol>
<li><p>You can implement methods like contextual retrieval and reranking to improve the performance of the Agentic RAG system.</p>
</li>
<li><p>You can try implementing few-shot examples, dynamic table selection, a custom SQL query generation chain, and similar techniques to improve the SQL Agent tool. We recommend watching this <a target="_blank" href="https://www.youtube.com/watch?v=fss6CrmQU2Y">Complete Natural Language to SQL video</a> for a better understanding.</p>
</li>
<li><p>Moreover, you can always improve the agent by refining the graph structure, writing better prompts, or adding human-in-the-loop review.</p>
</li>
</ol>
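<p>The reranking mentioned above can be sketched with a deliberately simple keyword-overlap score. This is our own toy heuristic for illustration only; a production system would typically use a cross-encoder or a dedicated reranking model instead:</p>

```python
def keyword_overlap_score(query, doc):
    """Toy relevance score: fraction of query words that appear in the document."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0

def rerank(query, docs, top_k=2):
    """Re-order retrieved documents by the toy score and keep the top_k."""
    return sorted(docs, key=lambda d: keyword_overlap_score(query, d), reverse=True)[:top_k]

# Invented sample documents
docs = [
    "Our pricing page lists all subscription tiers.",
    "LangGraph agents route queries to tools.",
    "Weather data comes from the web search tool.",
]
print(rerank("which tools do LangGraph agents use", docs, top_k=1))
```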
<h2 id="heading-conclusion">Conclusion</h2>
<p>This concludes our step-by-step guide to creating a custom AI agent using LangGraph with <strong>Web Search, RAG, and NL2SQL</strong> tools. By following this tutorial, you've built an AI agent capable of performing diverse tasks such as retrieving real-time data, answering questions based on document-based knowledge, and executing SQL queries directly from natural language commands. This hands-on approach has equipped you with practical knowledge of integrating multiple tools into a stateful AI workflow using LangGraph.</p>
<p>If you found this guide useful and want to explore more advanced techniques, don’t forget to check out our other <a target="_blank" href="https://blog.futuresmart.ai/">tutorials</a>. At FutureSmart AI, we help businesses develop state-of-the-art AI solutions tailored to their needs. For inquiries, feel free to reach out to us at <a target="_blank" href="mailto:contact@futuresmart.ai">contact@futuresmart.ai</a>.</p>
<p>For real-world examples of our work, take a look at our <a target="_blank" href="https://www.futuresmart.ai/case-studies"><strong>case studies</strong></a>, where we showcase the practical value of our expertise.</p>
]]></content:encoded></item><item><title><![CDATA[Building RAG Applications Without LangChain or LlamaIndex]]></title><description><![CDATA[Introduction
In recent months, Retrieval Augmented Generation (RAG) has emerged as a powerful pattern for enhancing Large Language Models (LLMs) with private or domain-specific knowledge. While frameworks like LangChain and LlamaIndex have made RAG i...]]></description><link>https://blog.futuresmart.ai/building-rag-applications-without-langchain-or-llamaindex</link><guid isPermaLink="true">https://blog.futuresmart.ai/building-rag-applications-without-langchain-or-llamaindex</guid><category><![CDATA[vanila rag]]></category><category><![CDATA[RAG ]]></category><dc:creator><![CDATA[Sakalya Mitra]]></dc:creator><pubDate>Tue, 29 Oct 2024 11:07:59 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1730201915871/908b2207-aee1-4689-9c61-90f2c4075470.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In recent months, Retrieval Augmented Generation (RAG) has emerged as a powerful pattern for enhancing Large Language Models (LLMs) with private or domain-specific knowledge. While frameworks like LangChain and LlamaIndex have made RAG implementation more accessible, they can sometimes feel like black boxes, making debugging and customization challenging.</p>
<h3 id="heading-why-build-rag-from-scratch">Why Build RAG From Scratch?</h3>
<p>When I created a tutorial on building RAG applications with LangChain, many developers reached out with a common challenge: debugging LangChain applications was becoming increasingly difficult. The layers of abstraction, while convenient, were obscuring the underlying mechanics of how RAG actually works.</p>
<p>Consider these common pain points with framework-based RAG implementations:</p>
<ul>
<li><p><strong>Debugging Complexity</strong>: When something goes wrong, tracing the issue through multiple layers of framework abstraction can be time-consuming</p>
</li>
<li><p><strong>Documentation Overhead</strong>: Understanding framework-specific concepts often requires navigating extensive documentation, taking focus away from core RAG principles</p>
</li>
<li><p><strong>Limited Control</strong>: Framework abstractions can make it harder to customize specific components or optimize for your use case</p>
</li>
<li><p><strong>Version Dependencies</strong>: Framework updates can introduce breaking changes or compatibility issues</p>
</li>
</ul>
<h3 id="heading-a-framework-free-approach">A Framework-Free Approach</h3>
<p>In this guide, we'll build a complete RAG application from scratch using only fundamental libraries:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Core dependencies</span>
chromadb        <span class="hljs-comment"># Vector database for document storage</span>
openai          <span class="hljs-comment"># LLM API access</span>
pypdf2          <span class="hljs-comment"># PDF document processing</span>
python-docx     <span class="hljs-comment"># Word document processing</span>
sentence-transformers  <span class="hljs-comment"># Text embeddings</span>
</code></pre>
<p>Our implementation will include all essential RAG components:</p>
<ol>
<li><p><strong>Document Processing</strong>: Handle multiple document formats (PDF, DOCX, TXT)</p>
</li>
<li><p><strong>Vector Storage</strong>: Implement semantic search using ChromaDB</p>
</li>
<li><p><strong>LLM Integration</strong>: Direct interaction with OpenAI's API</p>
</li>
<li><p><strong>Conversational Memory</strong>: Support for follow-up questions and context</p>
</li>
</ol>
<h3 id="heading-what-makes-this-approach-different">What Makes This Approach Different?</h3>
<p>By building without frameworks, you'll gain:</p>
<ul>
<li><p><strong>Deep Understanding</strong>: See how each component of RAG works and interacts</p>
</li>
<li><p><strong>Complete Control</strong>: Customize any part of the pipeline to suit your needs</p>
</li>
<li><p><strong>Simplified Debugging</strong>: Trace issues directly to their source</p>
</li>
<li><p><strong>Easy Maintenance</strong>: No framework-specific knowledge required</p>
</li>
</ul>
<p>Here's a high-level overview of our architecture:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1730201310565/29bf7af6-9e58-46c1-b384-6ce8dd3e175f.png" alt class="image--center mx-auto" /></p>
<p>This architecture shows how we'll:</p>
<ul>
<li><p>Process and index documents into our knowledge base</p>
</li>
<li><p>Handle user queries through semantic search</p>
</li>
<li><p>Maintain conversation history</p>
</li>
<li><p>Generate contextual responses</p>
</li>
</ul>
<h2 id="heading-document-processing-and-indexing">Document Processing and Indexing</h2>
<p>The foundation of any RAG system is its ability to process and index documents effectively. In this section, we'll build a robust document processing pipeline that can handle multiple file formats and prepare documents for semantic search.</p>
<h3 id="heading-document-loading-utilities">Document Loading Utilities</h3>
<p>First, let's create utilities to handle different document formats. Our implementation supports PDF, DOCX, and plain text files:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> docx
<span class="hljs-keyword">import</span> PyPDF2
<span class="hljs-keyword">import</span> os

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">read_text_file</span>(<span class="hljs-params">file_path: str</span>):</span>
    <span class="hljs-string">"""Read content from a text file"""</span>
    <span class="hljs-keyword">with</span> open(file_path, <span class="hljs-string">'r'</span>, encoding=<span class="hljs-string">'utf-8'</span>) <span class="hljs-keyword">as</span> file:
        <span class="hljs-keyword">return</span> file.read()

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">read_pdf_file</span>(<span class="hljs-params">file_path: str</span>):</span>
    <span class="hljs-string">"""Read content from a PDF file"""</span>
    text = <span class="hljs-string">""</span>
    <span class="hljs-keyword">with</span> open(file_path, <span class="hljs-string">'rb'</span>) <span class="hljs-keyword">as</span> file:
        pdf_reader = PyPDF2.PdfReader(file)
        <span class="hljs-keyword">for</span> page <span class="hljs-keyword">in</span> pdf_reader.pages:
            text += page.extract_text() + <span class="hljs-string">"\n"</span>
    <span class="hljs-keyword">return</span> text

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">read_docx_file</span>(<span class="hljs-params">file_path: str</span>):</span>
    <span class="hljs-string">"""Read content from a Word document"""</span>
    doc = docx.Document(file_path)
    <span class="hljs-keyword">return</span> <span class="hljs-string">"\n"</span>.join([paragraph.text <span class="hljs-keyword">for</span> paragraph <span class="hljs-keyword">in</span> doc.paragraphs])
</code></pre>
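<p>For the plain-text path, here is a quick self-contained check (it repeats <code>read_text_file</code> so the snippet runs on its own, and writes a throwaway temp file as invented input):</p>

```python
import os
import tempfile

def read_text_file(file_path):
    """Read content from a text file (same logic as above)."""
    with open(file_path, "r", encoding="utf-8") as f:
        return f.read()

# Write a throwaway file, read it back, then clean up
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("hello rag")
    path = tmp.name

content = read_text_file(path)
print(content)  # hello rag
os.remove(path)
```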
<p>We then create a unified interface for document reading:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">read_document</span>(<span class="hljs-params">file_path: str</span>):</span>
    <span class="hljs-string">"""Read document content based on file extension"""</span>
    _, file_extension = os.path.splitext(file_path)
    file_extension = file_extension.lower()

    <span class="hljs-keyword">if</span> file_extension == <span class="hljs-string">'.txt'</span>:
        <span class="hljs-keyword">return</span> read_text_file(file_path)
    <span class="hljs-keyword">elif</span> file_extension == <span class="hljs-string">'.pdf'</span>:
        <span class="hljs-keyword">return</span> read_pdf_file(file_path)
    <span class="hljs-keyword">elif</span> file_extension == <span class="hljs-string">'.docx'</span>:
        <span class="hljs-keyword">return</span> read_docx_file(file_path)
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Unsupported file format: <span class="hljs-subst">{file_extension}</span>"</span>)
</code></pre>
<h3 id="heading-text-chunking-strategy">Text Chunking Strategy</h3>
<p>Once we have the raw text, we need to split it into manageable chunks. This is crucial for two reasons:</p>
<ol>
<li><p>More precise semantic search results</p>
</li>
<li><p>Staying within LLM context windows</p>
</li>
</ol>
<p>Here's our implementation of a sentence-aware text chunker:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">split_text</span>(<span class="hljs-params">text: str, chunk_size: int = <span class="hljs-number">500</span></span>):</span>
    <span class="hljs-string">"""Split text into chunks while preserving sentence boundaries"""</span>
    sentences = text.replace(<span class="hljs-string">'\n'</span>, <span class="hljs-string">' '</span>).split(<span class="hljs-string">'. '</span>)
    chunks = []
    current_chunk = []
    current_size = <span class="hljs-number">0</span>

    <span class="hljs-keyword">for</span> sentence <span class="hljs-keyword">in</span> sentences:
        sentence = sentence.strip()
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> sentence:
            <span class="hljs-keyword">continue</span>

        <span class="hljs-comment"># Ensure proper sentence ending</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> sentence.endswith(<span class="hljs-string">'.'</span>):
            sentence += <span class="hljs-string">'.'</span>

        sentence_size = len(sentence)

        <span class="hljs-comment"># Check if adding this sentence would exceed chunk size</span>
        <span class="hljs-keyword">if</span> current_size + sentence_size &gt; chunk_size <span class="hljs-keyword">and</span> current_chunk:
            chunks.append(<span class="hljs-string">' '</span>.join(current_chunk))
            current_chunk = [sentence]
            current_size = sentence_size
        <span class="hljs-keyword">else</span>:
            current_chunk.append(sentence)
            current_size += sentence_size

    <span class="hljs-comment"># Add the last chunk if it exists</span>
    <span class="hljs-keyword">if</span> current_chunk:
        chunks.append(<span class="hljs-string">' '</span>.join(current_chunk))

    <span class="hljs-keyword">return</span> chunks
</code></pre>
<p>Key features of our chunking strategy:</p>
<ul>
<li><p>Preserves sentence boundaries</p>
</li>
<li><p>Configurable chunk size (default 500 characters)</p>
</li>
<li><p>Handles varying sentence lengths</p>
</li>
<li><p>Maintains readability of chunks</p>
</li>
</ul>
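<p>A common alternative to sentence-aware chunking is fixed-size chunking with overlap, so that information near a boundary appears in both neighboring chunks. A minimal sketch (<code>split_text_with_overlap</code> and its parameters are our own addition, not part of the pipeline above):</p>

```python
def split_text_with_overlap(text, chunk_size=500, overlap=50):
    """Fixed-size character chunks; consecutive chunks share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Invented input: a 200-character string
chunks = split_text_with_overlap("abcdefghij" * 20, chunk_size=100, overlap=20)
print(len(chunks), len(chunks[0]))  # 3 100
```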
<h2 id="heading-setting-up-chromadb">Setting Up ChromaDB</h2>
<p>First, let's initialize ChromaDB with sentence transformers for embedding:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> chromadb
<span class="hljs-keyword">from</span> chromadb.utils <span class="hljs-keyword">import</span> embedding_functions

<span class="hljs-comment"># Initialize ChromaDB client with persistence</span>
client = chromadb.PersistentClient(path=<span class="hljs-string">"chroma_db"</span>)

<span class="hljs-comment"># Configure sentence transformer embeddings</span>
sentence_transformer_ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name=<span class="hljs-string">"all-MiniLM-L6-v2"</span>
)

<span class="hljs-comment"># Create or get existing collection</span>
collection = client.get_or_create_collection(
    name=<span class="hljs-string">"documents_collection"</span>,
    embedding_function=sentence_transformer_ef
)
</code></pre>
<p>Key components explained:</p>
<ul>
<li><p><code>PersistentClient</code>: Stores embeddings on disk for data persistence</p>
</li>
<li><p><code>all-MiniLM-L6-v2</code>: A lightweight but effective embedding model</p>
</li>
<li><p><code>get_or_create_collection</code>: Handles both initial creation and subsequent access</p>
</li>
</ul>
<h2 id="heading-inserting-data-into-chromadb">Inserting Data into ChromaDB</h2>
<p>Now we'll create a complete pipeline that processes documents and prepares them for insertion into ChromaDB:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">process_document</span>(<span class="hljs-params">file_path: str</span>):</span>
    <span class="hljs-string">"""Process a single document and prepare it for ChromaDB"""</span>
    <span class="hljs-keyword">try</span>:
        <span class="hljs-comment"># Read the document</span>
        content = read_document(file_path)

        <span class="hljs-comment"># Split into chunks</span>
        chunks = split_text(content)

        <span class="hljs-comment"># Prepare metadata</span>
        file_name = os.path.basename(file_path)
        metadatas = [{<span class="hljs-string">"source"</span>: file_name, <span class="hljs-string">"chunk"</span>: i} <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(len(chunks))]
        ids = [<span class="hljs-string">f"<span class="hljs-subst">{file_name}</span>_chunk_<span class="hljs-subst">{i}</span>"</span> <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(len(chunks))]

        <span class="hljs-keyword">return</span> ids, chunks, metadatas
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        print(<span class="hljs-string">f"Error processing <span class="hljs-subst">{file_path}</span>: <span class="hljs-subst">{str(e)}</span>"</span>)
        <span class="hljs-keyword">return</span> [], [], []
</code></pre>
<p>The <code>process_document</code> function:</p>
<ol>
<li><p>Reads the document content</p>
</li>
<li><p>Splits it into chunks</p>
</li>
<li><p>Generates unique IDs for each chunk</p>
</li>
<li><p>Creates metadata including source file and chunk number</p>
</li>
</ol>
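<p>Run on invented sample data, the ID and metadata scheme looks like this (the file path and chunk texts are made up):</p>

```python
import os

# Hypothetical inputs standing in for a processed document
file_path = "docs/company_history.docx"
chunks = ["GreenGrow was founded in 2010.", "It started in Portland, Oregon."]

# Same ID/metadata scheme as process_document above
file_name = os.path.basename(file_path)
metadatas = [{"source": file_name, "chunk": i} for i in range(len(chunks))]
ids = [f"{file_name}_chunk_{i}" for i in range(len(chunks))]

print(ids)  # ['company_history.docx_chunk_0', 'company_history.docx_chunk_1']
```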
<h3 id="heading-batch-processing-for-multiple-documents">Batch Processing for Multiple Documents</h3>
<p>To handle multiple documents efficiently:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">add_to_collection</span>(<span class="hljs-params">collection, ids, texts, metadatas</span>):</span>
    <span class="hljs-string">"""Add documents to collection in batches"""</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> texts:
        <span class="hljs-keyword">return</span>

    batch_size = <span class="hljs-number">100</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(<span class="hljs-number">0</span>, len(texts), batch_size):
        end_idx = min(i + batch_size, len(texts))
        collection.add(
            documents=texts[i:end_idx],
            metadatas=metadatas[i:end_idx],
            ids=ids[i:end_idx]
        )

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">process_and_add_documents</span>(<span class="hljs-params">collection, folder_path: str</span>):</span>
    <span class="hljs-string">"""Process all documents in a folder and add to collection"""</span>
    files = [os.path.join(folder_path, file) 
             <span class="hljs-keyword">for</span> file <span class="hljs-keyword">in</span> os.listdir(folder_path) 
             <span class="hljs-keyword">if</span> os.path.isfile(os.path.join(folder_path, file))]

    <span class="hljs-keyword">for</span> file_path <span class="hljs-keyword">in</span> files:
        print(<span class="hljs-string">f"Processing <span class="hljs-subst">{os.path.basename(file_path)}</span>..."</span>)
        ids, texts, metadatas = process_document(file_path)
        add_to_collection(collection, ids, texts, metadatas)
        print(<span class="hljs-string">f"Added <span class="hljs-subst">{len(texts)}</span> chunks to collection"</span>)
</code></pre>
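<p>In isolation, the batching pattern produces index windows like these (the list length of 250 is invented):</p>

```python
# Stand-alone illustration of the slicing windows used by add_to_collection
texts = [f"chunk-{i}" for i in range(250)]
batch_size = 100

batches = []
for i in range(0, len(texts), batch_size):
    end_idx = min(i + batch_size, len(texts))
    batches.append((i, end_idx))

print(batches)  # [(0, 100), (100, 200), (200, 250)]
```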
<h3 id="heading-usage-example">Usage Example</h3>
<p>Here's how to use the complete document processing pipeline:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Initialize ChromaDB collection (we'll cover this in detail in the next section)</span>
collection = client.get_or_create_collection(
    name=<span class="hljs-string">"documents_collection"</span>,
    embedding_function=sentence_transformer_ef
)

<span class="hljs-comment"># Process and add documents from a folder</span>
folder_path = <span class="hljs-string">"/docs"</span>
process_and_add_documents(collection, folder_path)
</code></pre>
<p>Example output:</p>
<pre><code class="lang-plaintext">Processing GreenGrow's EcoHarvest System_ A Revolution in Farming.pdf...
Added 6 chunks to collection
Processing Company_ QuantumNext Systems.docx...
Added 2 chunks to collection
Processing GreenGrow Innovations_ Company History.docx...
Added 5 chunks to collection
Processing Company_ TechWave Innovations.docx...
Added 1 chunks to collection
Processing Company_ GreenFields BioTech.docx...
Added 2 chunks to collection
</code></pre>
<h2 id="heading-implementing-semantic-search">Implementing Semantic Search</h2>
<p>Now for the core functionality - semantic search to retrieve relevant documents:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">semantic_search</span>(<span class="hljs-params">collection, query: str, n_results: int = <span class="hljs-number">2</span></span>):</span>
    <span class="hljs-string">"""Perform semantic search on the collection"""</span>
    results = collection.query(
        query_texts=[query],
        n_results=n_results
    )
    <span class="hljs-keyword">return</span> results

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_context_with_sources</span>(<span class="hljs-params">results</span>):</span>
    <span class="hljs-string">"""Extract context and source information from search results"""</span>
    <span class="hljs-comment"># Combine document chunks into a single context</span>
    context = <span class="hljs-string">"\n\n"</span>.join(results[<span class="hljs-string">'documents'</span>][<span class="hljs-number">0</span>])

    <span class="hljs-comment"># Format sources with metadata</span>
    sources = [
        <span class="hljs-string">f"<span class="hljs-subst">{meta[<span class="hljs-string">'source'</span>]}</span> (chunk <span class="hljs-subst">{meta[<span class="hljs-string">'chunk'</span>]}</span>)"</span> 
        <span class="hljs-keyword">for</span> meta <span class="hljs-keyword">in</span> results[<span class="hljs-string">'metadatas'</span>][<span class="hljs-number">0</span>]
    ]

    <span class="hljs-keyword">return</span> context, sources
</code></pre>
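<p>To see what these helpers extract, you can mock a ChromaDB-style results dict (the sample data is invented) and apply the same logic inline:</p>

```python
# Mocked results in the shape ChromaDB returns: outer lists are per-query
results = {
    "documents": [["GreenGrow was founded in 2010.", "It remains mission-driven."]],
    "metadatas": [[
        {"source": "history.docx", "chunk": 0},
        {"source": "history.docx", "chunk": 4},
    ]],
}

# Same extraction logic as get_context_with_sources
context = "\n\n".join(results["documents"][0])
sources = [f"{m['source']} (chunk {m['chunk']})" for m in results["metadatas"][0]]

print(sources)  # ['history.docx (chunk 0)', 'history.docx (chunk 4)']
```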
<p>Example search usage:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Perform a search</span>
query = <span class="hljs-string">"When was GreenGrow Innovations founded?"</span>
results = semantic_search(collection, query)
results
</code></pre>
<pre><code class="lang-plaintext">{'ids': [['GreenGrow Innovations_ Company History.docx_chunk_0',
   'GreenGrow Innovations_ Company History.docx_chunk_4']],
 'embeddings': None,
 'documents': [['GreenGrow Innovations was founded in 2010 by Sarah Chen...',
   'Despite its growth, GreenGrow remains committed to its original mission of ...']],
 'uris': None,
 'data': None,
 'metadatas': [[{'chunk': 0,
    'source': 'GreenGrow Innovations_ Company History.docx'},
   {'chunk': 4, 'source': 'GreenGrow Innovations_ Company History.docx'}]],
 'distances': [[0.3241303612288011, 0.5925477286632909]],
 'included': [&lt;IncludeEnum.distances: 'distances'&gt;,
  &lt;IncludeEnum.documents: 'documents'&gt;,
  &lt;IncludeEnum.metadatas: 'metadatas'&gt;]}
</code></pre>
<h3 id="heading-understanding-search-results">Understanding Search Results</h3>
<p>ChromaDB search results contain several key components:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">print_search_results</span>(<span class="hljs-params">results</span>):</span>
    <span class="hljs-string">"""Print formatted search results"""</span>
    print(<span class="hljs-string">"\nSearch Results:\n"</span> + <span class="hljs-string">"-"</span> * <span class="hljs-number">50</span>)

    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(len(results[<span class="hljs-string">'documents'</span>][<span class="hljs-number">0</span>])):
        doc = results[<span class="hljs-string">'documents'</span>][<span class="hljs-number">0</span>][i]
        meta = results[<span class="hljs-string">'metadatas'</span>][<span class="hljs-number">0</span>][i]
        distance = results[<span class="hljs-string">'distances'</span>][<span class="hljs-number">0</span>][i]

        print(<span class="hljs-string">f"\nResult <span class="hljs-subst">{i + <span class="hljs-number">1</span>}</span>"</span>)
        print(<span class="hljs-string">f"Source: <span class="hljs-subst">{meta[<span class="hljs-string">'source'</span>]}</span>, Chunk <span class="hljs-subst">{meta[<span class="hljs-string">'chunk'</span>]}</span>"</span>)
        print(<span class="hljs-string">f"Distance: <span class="hljs-subst">{distance}</span>"</span>)
        print(<span class="hljs-string">f"Content: <span class="hljs-subst">{doc}</span>\n"</span>)
</code></pre>
<pre><code class="lang-plaintext">Search Results:
--------------------------------------------------

Result 1:
Source: GreenGrow Innovations_ Company History.docx, Chunk 0
Content: GreenGrow Innovations was founded in 2010 by Sarah Chen and Michael Rodriguez,
two agricultural engineers with a passion for sustainable farming. The company
started in a small garage in Portland, Oregon, with a simple mission: to make
farming more environmentally friendly and efficient. In its early days,
GreenGrow focused on developing smart irrigation systems that could
significantly reduce water usage in agriculture.

Result 2:
Source: GreenGrow Innovations_ Company History.docx, Chunk 4
Content: Despite its growth, GreenGrow remains committed to its original mission of
promoting sustainable farming practices. The company regularly partners with
universities and research institutions to advance the field of agricultural
technology and hosts annual conferences to share knowledge with farmers and
other industry professionals.

Result 3:
Source: GreenGrow Innovations_ Company History.docx, Chunk 3
Content: This system caught the attention of large-scale farmers across the United
States, propelling GreenGrow to national prominence. Today, GreenGrow
Innovations employs over 200 people and has expanded its operations to include
offices in California and Iowa. The company continues to focus on developing
sustainable agricultural technologies, with ongoing projects in vertical
farming, drought-resistant crop development, and AI-powered farm management
systems.
</code></pre>
<p>The results include:</p>
<ul>
<li><p><code>documents</code>: The actual text chunks</p>
</li>
<li><p><code>metadatas</code>: Associated metadata (source, chunk number)</p>
</li>
<li><p><code>distances</code>: Similarity scores (lower is better)</p>
</li>
</ul>
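<p>A lower distance means the chunk's embedding sits closer to the query's embedding. As a toy illustration with made-up 3-dimensional vectors (real embeddings have hundreds of dimensions), Euclidean distance, the family ChromaDB collections use by default, behaves like this:</p>

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Invented embeddings: close_doc points roughly the same way as the query
query_vec = [1.0, 0.0, 0.0]
close_doc = [0.9, 0.1, 0.0]
far_doc = [0.0, 0.2, 1.0]

print(l2_distance(query_vec, close_doc) < l2_distance(query_vec, far_doc))  # True
```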
<h2 id="heading-openai-integration">OpenAI Integration</h2>
<p>Now that we have our document retrieval system in place, let's integrate OpenAI's API to create the generation part of our RAG system. We'll focus on crafting effective prompts and managing the conversation flow.</p>
<h3 id="heading-setting-up-openai">Setting Up OpenAI</h3>
<p>First, let's configure the OpenAI client:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI

<span class="hljs-comment"># Set your API key first: OpenAI() reads OPENAI_API_KEY when the client is created</span>
os.environ[<span class="hljs-string">"OPENAI_API_KEY"</span>] = <span class="hljs-string">"your-api-key-here"</span>

<span class="hljs-comment"># Initialize OpenAI client (note: this reuses the name `client`,</span>
<span class="hljs-comment"># shadowing the ChromaDB client created earlier)</span>
client = OpenAI()
</code></pre>
<h3 id="heading-prompt-engineering">Prompt Engineering</h3>
<p>The effectiveness of RAG heavily depends on how we structure our prompts. Let's create a modular prompt system:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_prompt</span>(<span class="hljs-params">context: str, conversation_history: str, query: str</span>):</span>
    <span class="hljs-string">"""Generate a prompt combining context, history, and query"""</span>
    prompt = <span class="hljs-string">f"""Based on the following context and conversation history, 
    please provide a relevant and contextual response. If the answer cannot 
    be derived from the context, only use the conversation history or say 
    "I cannot answer this based on the provided information."

    Context from documents:
    <span class="hljs-subst">{context}</span>

    Previous conversation:
    <span class="hljs-subst">{conversation_history}</span>

    Human: <span class="hljs-subst">{query}</span>

    Assistant:"""</span>

    <span class="hljs-keyword">return</span> prompt
</code></pre>
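<p><code>get_prompt</code> expects <code>conversation_history</code> as plain text. One possible way to flatten a list of stored message dicts into that string (<code>format_history</code> is our own helper name, not part of the article's code):</p>

```python
def format_history(messages, max_messages=5):
    """Render the last `max_messages` messages as 'Role: content' lines."""
    recent = messages[-max_messages:]
    return "\n".join(f"{m['role'].capitalize()}: {m['content']}" for m in recent)

# Invented message history in the same shape the memory section stores
history = [
    {"role": "user", "content": "When was GreenGrow founded?"},
    {"role": "assistant", "content": "GreenGrow was founded in 2010."},
]
print(format_history(history))
```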
<h3 id="heading-response-generation">Response Generation</h3>
<p>Let's implement the core response generation function:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_response</span>(<span class="hljs-params">query: str, context: str, conversation_history: str = <span class="hljs-string">""</span></span>):</span>
    <span class="hljs-string">"""Generate a response using OpenAI with conversation history"""</span>
    prompt = get_prompt(context, conversation_history, query)

    <span class="hljs-keyword">try</span>:
        response = client.chat.completions.create(
            model=<span class="hljs-string">"gpt-4"</span>,  <span class="hljs-comment"># or gpt-3.5-turbo for lower cost</span>
            messages=[
                {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"You are a helpful assistant that answers questions based on the provided context."</span>},
                {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: prompt}
            ],
            temperature=<span class="hljs-number">0</span>,  <span class="hljs-comment"># Lower temperature for more focused responses</span>
            max_tokens=<span class="hljs-number">500</span>
        )
        <span class="hljs-keyword">return</span> response.choices[<span class="hljs-number">0</span>].message.content
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"Error generating response: <span class="hljs-subst">{str(e)}</span>"</span>
</code></pre>
<h2 id="heading-perform-rag-query">Perform RAG query</h2>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">rag_query</span>(<span class="hljs-params">collection, query: str, n_chunks: int = <span class="hljs-number">2</span></span>):</span>
    <span class="hljs-string">"""Perform RAG query: retrieve relevant chunks and generate answer"""</span>
    <span class="hljs-comment"># Get relevant chunks</span>
    results = semantic_search(collection, query, n_chunks)
    context, sources = get_context_with_sources(results)

    <span class="hljs-comment"># Generate response</span>
    response = generate_response(query, context)

    <span class="hljs-keyword">return</span> response, sources
</code></pre>
<pre><code class="lang-python">query = <span class="hljs-string">"When was GreenGrow Innovations founded?"</span>
response, sources = rag_query(collection, query)

<span class="hljs-comment"># Print results</span>
print(<span class="hljs-string">"\nQuery:"</span>, query)
print(<span class="hljs-string">"\nAnswer:"</span>, response)
print(<span class="hljs-string">"\nSources used:"</span>)
<span class="hljs-keyword">for</span> source <span class="hljs-keyword">in</span> sources:
    print(<span class="hljs-string">f"- <span class="hljs-subst">{source}</span>"</span>)
</code></pre>
<pre><code class="lang-plaintext">Query: When was GreenGrow Innovations founded?

Answer: GreenGrow Innovations was founded in 2010.

Sources used:
- GreenGrow Innovations_ Company History.docx (chunk 0)
- GreenGrow Innovations_ Company History.docx (chunk 4)
</code></pre>
<h2 id="heading-building-conversational-memory">Building Conversational Memory</h2>
<p>Conversational memory is crucial for RAG applications to handle follow-up questions and maintain context across interactions. We'll implement a robust system that manages conversation history and enables natural dialogue.</p>
<h3 id="heading-session-management">Session Management</h3>
<p>First, let's implement basic session management to handle multiple conversations:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> uuid
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime
<span class="hljs-keyword">import</span> json

<span class="hljs-comment"># In-memory conversation store</span>
conversations = {}

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_session</span>():</span>
    <span class="hljs-string">"""Create a new conversation session"""</span>
    session_id = str(uuid.uuid4())
    conversations[session_id] = []
    <span class="hljs-keyword">return</span> session_id
</code></pre>
<h3 id="heading-message-management">Message Management</h3>
<p>Let's implement functions to add and retrieve messages:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">add_message</span>(<span class="hljs-params">session_id: str, role: str, content: str</span>):</span>
    <span class="hljs-string">"""Add a message to the conversation history"""</span>
    <span class="hljs-keyword">if</span> session_id <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> conversations:
        conversations[session_id] = []

    conversations[session_id].append({
        <span class="hljs-string">"role"</span>: role,
        <span class="hljs-string">"content"</span>: content,
        <span class="hljs-string">"timestamp"</span>: datetime.now().isoformat()
    })

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_conversation_history</span>(<span class="hljs-params">session_id: str, max_messages: int = None</span>):</span>
    <span class="hljs-string">"""Get conversation history for a session"""</span>
    <span class="hljs-keyword">if</span> session_id <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> conversations:
        <span class="hljs-keyword">return</span> []

    history = conversations[session_id]
    <span class="hljs-keyword">if</span> max_messages:
        history = history[-max_messages:]

    <span class="hljs-keyword">return</span> history
</code></pre>
<h3 id="heading-formatting-conversation-history">Formatting Conversation History</h3>
<p>Create a formatter for the conversation history:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">format_history_for_prompt</span>(<span class="hljs-params">session_id: str, max_messages: int = <span class="hljs-number">5</span></span>):</span>
    <span class="hljs-string">"""Format conversation history for inclusion in prompts"""</span>
    history = get_conversation_history(session_id, max_messages)
    formatted_history = <span class="hljs-string">""</span>

    <span class="hljs-keyword">for</span> msg <span class="hljs-keyword">in</span> history:
        role = <span class="hljs-string">"Human"</span> <span class="hljs-keyword">if</span> msg[<span class="hljs-string">"role"</span>] == <span class="hljs-string">"user"</span> <span class="hljs-keyword">else</span> <span class="hljs-string">"Assistant"</span>
        formatted_history += <span class="hljs-string">f"<span class="hljs-subst">{role}</span>: <span class="hljs-subst">{msg[<span class="hljs-string">'content'</span>]}</span>\n\n"</span>

    <span class="hljs-keyword">return</span> formatted_history.strip()
</code></pre>
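<p>Putting the three helpers together, here is a quick standalone walk-through of a session. The helper definitions are repeated (slightly condensed) so the snippet runs on its own:</p>

```python
import uuid
from datetime import datetime

# In-memory conversation store, same structure as above
conversations = {}

def create_session():
    """Create a new conversation session and return its id."""
    session_id = str(uuid.uuid4())
    conversations[session_id] = []
    return session_id

def add_message(session_id: str, role: str, content: str):
    """Append a timestamped message to the session history."""
    conversations.setdefault(session_id, []).append({
        "role": role,
        "content": content,
        "timestamp": datetime.now().isoformat(),
    })

def format_history_for_prompt(session_id: str, max_messages: int = 5):
    """Render the last max_messages as Human/Assistant turns."""
    history = conversations.get(session_id, [])[-max_messages:]
    return "\n\n".join(
        f"{'Human' if m['role'] == 'user' else 'Assistant'}: {m['content']}"
        for m in history
    )

sid = create_session()
add_message(sid, "user", "When was GreenGrow Innovations founded?")
add_message(sid, "assistant", "GreenGrow Innovations was founded in 2010.")
print(format_history_for_prompt(sid))
# Human: When was GreenGrow Innovations founded?
#
# Assistant: GreenGrow Innovations was founded in 2010.
```

<p>Note that <code>max_messages</code> acts as a sliding window, so very long conversations do not blow up the prompt size.</p>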
<h3 id="heading-query-contextualization">Query Contextualization</h3>
<p>For follow-up questions, we need to contextualize queries based on conversation history:</p>
<p>First question: When was GreenGrow Innovations founded?</p>
<p>Follow-up: Where is it headquartered?</p>
<p>Contextualized query: Where is GreenGrow Innovations headquartered?</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">contextualize_query</span>(<span class="hljs-params">query: str, conversation_history: str, client: OpenAI</span>):</span>
    <span class="hljs-string">"""Convert follow-up questions into standalone queries"""</span>
    contextualize_prompt = <span class="hljs-string">"""Given a chat history and the latest user question 
    which might reference context in the chat history, formulate a standalone 
    question which can be understood without the chat history. Do NOT answer 
    the question, just reformulate it if needed and otherwise return it as is."""</span>

    <span class="hljs-keyword">try</span>:
        completion = client.chat.completions.create(
            model=<span class="hljs-string">"gpt-4"</span>,
            messages=[
                {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: contextualize_prompt},
                {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">f"Chat history:\n<span class="hljs-subst">{conversation_history}</span>\n\nQuestion:\n<span class="hljs-subst">{query}</span>"</span>}
            ]
        )
        <span class="hljs-keyword">return</span> completion.choices[<span class="hljs-number">0</span>].message.content
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        print(<span class="hljs-string">f"Error contextualizing query: <span class="hljs-subst">{str(e)}</span>"</span>)
        <span class="hljs-keyword">return</span> query  <span class="hljs-comment"># Fallback to original query</span>
</code></pre>
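<p>To see what the rewrite step is asked to do without making an API call, here is a toy stand-in that substitutes the pronoun "it" with the most recent multi-word capitalized entity from the history. The <code>toy_contextualize</code> helper is purely illustrative; in practice the reformulation is delegated to the LLM prompt above:</p>

```python
import re

def toy_contextualize(query: str, conversation_history: str) -> str:
    """Crude pronoun resolution: replace a standalone 'it' with the last
    entity mentioned in the history. Illustration only -- the real
    implementation hands this reasoning to the LLM."""
    # Multi-word capitalized spans, e.g. "GreenGrow Innovations"
    entities = re.findall(r"[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)+", conversation_history)
    if entities and re.search(r"\bit\b", query, re.IGNORECASE):
        return re.sub(r"\bit\b", entities[-1], query, count=1, flags=re.IGNORECASE)
    return query

history = ("Human: When was GreenGrow Innovations founded?\n\n"
           "Assistant: GreenGrow Innovations was founded in 2010.")
print(toy_contextualize("Where is it headquartered?", history))
# Where is GreenGrow Innovations headquartered?
```

<p>The LLM version handles far more than pronouns (ellipsis, implicit topics, paraphrase), which is why the production code uses a model call rather than string matching.</p>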
<h2 id="heading-combining-rag-components">Combining RAG Components</h2>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_prompt</span>(<span class="hljs-params">context, conversation_history, query</span>):</span>
  prompt = <span class="hljs-string">f"""Based on the following context and conversation history, please provide a relevant and contextual response.
    If the answer cannot be derived from the context, use only the conversation history, or say "I cannot answer this based on the provided information."

    Context from documents:
    <span class="hljs-subst">{context}</span>

    Previous conversation:
    <span class="hljs-subst">{conversation_history}</span>

    Human: <span class="hljs-subst">{query}</span>

    Assistant:"""</span>
  <span class="hljs-keyword">return</span> prompt
</code></pre>
<pre><code class="lang-python"><span class="hljs-comment"># Updated generate response function with conversation history also passed for Chatbot Memory</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_response</span>(<span class="hljs-params">query: str, context: str, conversation_history: str = <span class="hljs-string">""</span></span>):</span>
    <span class="hljs-string">"""Generate a response using OpenAI with conversation history"""</span>
    prompt = get_prompt(context, conversation_history, query)
    <span class="hljs-comment"># print(prompt)</span>

    response = client.chat.completions.create(
        model=<span class="hljs-string">"gpt-4o"</span>,
        messages=[
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: prompt}
        ],
        temperature=<span class="hljs-number">0</span>,
        max_tokens=<span class="hljs-number">500</span>
    )

    <span class="hljs-keyword">return</span> response.choices[<span class="hljs-number">0</span>].message.content
</code></pre>
<p>Now let's create the main RAG query function that combines everything:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">conversational_rag_query</span>(<span class="hljs-params">
    collection,
    query: str,
    session_id: str,
    n_chunks: int = <span class="hljs-number">3</span>
</span>):</span>
    <span class="hljs-string">"""Perform RAG query with conversation history"""</span>
    <span class="hljs-comment"># Get conversation history</span>
    conversation_history = format_history_for_prompt(session_id)

    <span class="hljs-comment"># Handle follow-up questions</span>
    query = contextualize_query(query, conversation_history, client)
    print(<span class="hljs-string">"Contextualized Query:"</span>, query)

    <span class="hljs-comment"># Get relevant chunks</span>
    context, sources = get_context_with_sources(
        semantic_search(collection, query, n_chunks)
    )
    print(<span class="hljs-string">"Context:"</span>, context)
    print(<span class="hljs-string">"Sources:"</span>, sources)


    response = generate_response(query, context, conversation_history)

    <span class="hljs-comment"># Add to conversation history</span>
    add_message(session_id, <span class="hljs-string">"user"</span>, query)
    add_message(session_id, <span class="hljs-string">"assistant"</span>, response)

    <span class="hljs-keyword">return</span> response, sources
</code></pre>
<h3 id="heading-example-usage">Example Usage</h3>
<p>Here's how to use the complete system:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Create a new conversation session</span>
session_id = create_session()

<span class="hljs-comment"># First question</span>
query = <span class="hljs-string">"When was GreenGrow Innovations founded?"</span>
response, sources = conversational_rag_query(
            collection,
            query,
            session_id
)
print(response)
</code></pre>
<p>Example output:</p>
<pre><code class="lang-plaintext">Contextualized Query: When did GreenGrow Innovations come into existence?
Context: GreenGrow Innovations was founded in 2010 by .....
Sources: ['GreenGrow Innovations_ Company History.docx (chunk 0)', 'GreenGrow Innovations_ Company History.docx (chunk 3)', 'GreenGrow Innovations_ Company History.docx (chunk 4)']
GreenGrow Innovations was founded in 2010.
</code></pre>
<pre><code class="lang-python">query = <span class="hljs-string">"Where is it located?"</span>
response, sources = conversational_rag_query(
            collection,
            query,
            session_id
)
print(response)
</code></pre>
<pre><code class="lang-plaintext">Contextualized Query: What is the location of GreenGrow Innovations?
Context: GreenGrow Innovations was founded in 2010 by Sarah Chen and Michael Rodriguez, two agricultural engineers with a passion for sustainable farming. The company started in a small garage in Portland, Oregon, with a simple mission......
Sources: ['GreenGrow Innovations_ Company History.docx (chunk 0)', 'GreenGrow Innovations_ Company History.docx (chunk 3)', 'GreenGrow Innovations_ Company History.docx (chunk 4)']
GreenGrow Innovations started in Portland, Oregon, and has since expanded its operations to include offices in California and Iowa.
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>This tutorial showed how to build RAG without LangChain or LlamaIndex when you need direct control over your implementation: processing documents, performing semantic search, and handling multi-turn conversations using just ChromaDB and OpenAI's API. All the code is production-ready and debuggable – no black boxes.</p>
<h3 id="heading-explore-how-we-can-help-your-business"><strong>Explore How We Can Help Your Business:</strong></h3>
<ul>
<li><p><strong>Case Studies</strong>: Visit <a target="_blank" href="https://www.futuresmart.ai/case-studies"><strong>FutureSmart AI Case Studies</strong></a> to see real-world applications of our NLP solutions</p>
</li>
<li><p><strong>Contact Us</strong>: Have a project in mind? Reach out at <a target="_blank" href="mailto:contact@futuresmart.ai"><strong>contact@futuresmart.ai</strong></a></p>
</li>
</ul>
<p>Let us help you transform your business with custom AI solutions tailored to your specific needs.</p>
<p>Code: <a target="_blank" href="https://github.com/PradipNichite/Youtube-Tutorials/tree/main/Vanilla%20RAG">https://github.com/PradipNichite/Youtube-Tutorials/tree/main/Vanilla%20RAG</a></p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/8sSHg1034r0">https://youtu.be/8sSHg1034r0</a></div>
]]></content:encoded></item><item><title><![CDATA[OpenAI Swarm: A Hands-On Introduction to Multi-Agent Orchestration]]></title><description><![CDATA[Introduction to OpenAI Swarm
Multi-agent orchestration is a trending topic in the field of large language models (LLMs). Instead of relying on a single, all-encompassing LLM, multi-agent systems employ a team of specialized agents, each designed to e...]]></description><link>https://blog.futuresmart.ai/openai-swarm-a-hands-on-introduction</link><guid isPermaLink="true">https://blog.futuresmart.ai/openai-swarm-a-hands-on-introduction</guid><category><![CDATA[open ai swarm]]></category><category><![CDATA[AI]]></category><category><![CDATA[openai]]></category><category><![CDATA[ai agents]]></category><dc:creator><![CDATA[Rounak Show]]></dc:creator><pubDate>Sat, 26 Oct 2024 06:17:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1729582347844/2f26aa27-33b2-4f32-bcc2-2e7ab00860eb.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-introduction-to-openai-swarm">Introduction to OpenAI Swarm</h3>
<p>Multi-agent orchestration is a trending topic in the field of large language models (LLMs). <strong>Instead of relying on a single, all-encompassing LLM, multi-agent systems employ a team of specialized agents, each designed to excel at a particular task.</strong> This approach allows for more complex and nuanced problem-solving, as agents can collaborate, share information, and leverage their individual strengths. <a target="_blank" href="https://github.com/openai/swarm">OpenAI Swarm</a> is an experimental framework designed to make multi-agent orchestration more accessible and user-friendly. <strong>Swarm is built on a practical, lightweight approach, prioritizing ease of use and clear, intuitive interactions between agents.</strong></p>
<p>This blog post will provide a hands-on introduction to OpenAI Swarm, with a strong emphasis on practical examples using Python code. You will learn how to set up Swarm, create agents, implement handoffs, and build a simple <strong>multi-agent system</strong> along with <a target="_blank" href="https://blog.futuresmart.ai/harnessing-langchain-and-google-apis">LangChain</a> integration. By the end, you'll have a solid understanding of how to leverage Swarm to build your own multi-agent LLM applications.</p>
<h3 id="heading-setting-up-openai-swarm">Setting Up OpenAI Swarm</h3>
<p>To get started with OpenAI Swarm, you first need to install it. Make sure you have Python 3.10 or later installed on your system. You can install Swarm directly from the GitHub repository using the following command:</p>
<pre><code class="lang-bash">pip install git+ssh://git@github.com/openai/swarm.git
</code></pre>
<p>Or:</p>
<pre><code class="lang-bash">pip install git+https://github.com/openai/swarm.git
</code></pre>
<p>Next, you need to set up your OpenAI API key. This key allows Swarm to access OpenAI’s language models and other services. You can find your API key on the <a target="_blank" href="https://platform.openai.com/docs/api-reference/authentication">OpenAI website</a>. Once you have your key, you can set it as an environment variable:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
os.environ[<span class="hljs-string">'OPENAI_API_KEY'</span>] = <span class="hljs-string">'YOUR_OPENAI_API_KEY'</span>
</code></pre>
<p>Replace ‘YOUR_OPENAI_API_KEY’ with your actual API key. With Swarm installed and your API key set up, you are ready to start building multi-agent systems!</p>
<h3 id="heading-understanding-agents-and-handoffs-in-openai-swarm">Understanding Agents and Handoffs in OpenAI Swarm</h3>
<p>In OpenAI Swarm, <strong>agents are the core building blocks of a multi-agent system</strong>. They encapsulate a set of instructions, functions, and the ability to hand off execution to other agents. <strong>Think of agents as specialized units, each responsible for a specific aspect of a larger task</strong>. For example, in a travel planning system, you could have separate agents for booking flights, hotels, and transportation.</p>
<h3 id="heading-creating-agents-and-handoffs-in-python">Creating Agents and Handoffs in Python</h3>
<p>You can create agents in Python using the <code>Agent</code> class from the <code>swarm</code> module. Here’s a basic example:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> swarm <span class="hljs-keyword">import</span> Swarm, Agent

<span class="hljs-comment"># creating handoffs functions</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handoff_to_weather_agent</span>():</span>
    <span class="hljs-string">"""Transfer to the weather agent for weather queries."""</span>
    print(<span class="hljs-string">"Handing off to Weather Agent"</span>)
    <span class="hljs-keyword">return</span> weather_agent

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handoff_to_math_agent</span>():</span>
    <span class="hljs-string">"""Transfer to the math agent for mathematical queries."""</span>
    print(<span class="hljs-string">"Handing off to Math Agent"</span>)
    <span class="hljs-keyword">return</span> math_agent

<span class="hljs-comment"># Initialize the agents with specific roles</span>
math_agent = Agent(
    name=<span class="hljs-string">"Math Agent"</span>,
    instructions=<span class="hljs-string">"You handle only mathematical queries."</span>,
    functions=[handoff_to_weather_agent]
)

weather_agent = Agent(
    name=<span class="hljs-string">"Weather Agent"</span>,
    instructions=<span class="hljs-string">"You handle only weather-related queries."</span>,
    functions=[handoff_to_math_agent]
)
</code></pre>
<p>In this example, two agents (<code>math_agent</code> and <code>weather_agent</code>) are created. Each agent has instructions specifying the types of queries they handle. Additionally, each agent has a handoff function, allowing them to delegate queries outside their area of expertise.</p>
<h3 id="heading-handoffs-between-agents">Handoffs Between Agents</h3>
<p>Let’s see how these agents transfer control using the handoff mechanism. Here’s how the process works for mathematical and weather-related queries:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Initialize the Swarm client</span>
client = Swarm()

<span class="hljs-comment"># Test handoff by asking a math question to the weather agent</span>
messages = [{<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"What is 2+2?"</span>}]
handoff_response = client.run(agent=weather_agent, messages=messages)
print(handoff_response.messages[<span class="hljs-number">-1</span>][<span class="hljs-string">"content"</span>])

<span class="hljs-comment"># Response:</span>
<span class="hljs-comment"># Handing off to Math Agent</span>
<span class="hljs-comment"># The answer to 2 + 2 is 4.</span>
</code></pre>
<p>In this example, we intentionally send a math query (<code>"What is 2+2?"</code>) to the <code>weather_agent</code>. The <code>weather_agent</code> detects this and hands off control to the <code>math_agent</code>, which provides the correct answer.</p>
<h3 id="heading-handling-a-weather-query">Handling a Weather Query</h3>
<p>Now, let’s send a weather-related query to the <code>math_agent</code> and observe the handoff:</p>
<pre><code class="lang-python">messages = [{<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"How is the weather in Canada in December?"</span>}]
response = client.run(agent=math_agent, messages=messages)
print(response.messages[<span class="hljs-number">-1</span>][<span class="hljs-string">"content"</span>])

<span class="hljs-comment"># Response:</span>
<span class="hljs-comment"># Handing off to Weather Agent</span>
<span class="hljs-comment"># Canada generally experiences cold weather during December, with varying conditions depending on the region. Coastal areas like Vancouver may have milder temperatures, while areas like Toronto, Ottawa, and Montreal see cooler, snowy conditions. Northern regions experience extreme cold. Checking a reliable weather service will provide the most accurate information.</span>
</code></pre>
<p>Here, the <code>math_agent</code> receives a weather-related query and hands it off to the <code>weather_agent</code>, which provides an overview of Canada’s typical December weather.</p>
<p>This example illustrates how agents can seamlessly collaborate to handle different types of queries, ensuring a flexible, adaptive multi-agent system.</p>
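<p>Conceptually, the handoff pattern that Swarm automates can be sketched in a few lines of plain Python. The names below are toy ones and there are no LLM calls; in Swarm itself, the model decides when to invoke the handoff function:</p>

```python
class ToyAgent:
    """Minimal illustration of agent handoff: a function returning
    another agent causes the runner to re-dispatch the query."""

    def __init__(self, name, can_handle, answer, handoff_to=None):
        self.name = name
        self.can_handle = can_handle   # predicate: is this query in my domain?
        self.answer = answer           # how this agent replies
        self.handoff_to = handoff_to   # callable returning the other agent

    def run(self, query):
        if self.can_handle(query):
            return f"[{self.name}] {self.answer(query)}"
        # Not my domain: hand off, like returning an Agent from a Swarm function
        print(f"Handing off from {self.name}")
        return self.handoff_to().run(query)

math_agent = ToyAgent(
    "Math Agent",
    can_handle=lambda q: any(ch.isdigit() for ch in q),
    answer=lambda q: "computing the result...",
    handoff_to=lambda: weather_agent,   # lazy reference, defined below
)
weather_agent = ToyAgent(
    "Weather Agent",
    can_handle=lambda q: "weather" in q.lower(),
    answer=lambda q: "check a forecast service",
    handoff_to=lambda: math_agent,
)

print(weather_agent.run("What is 2+2?"))
# Handing off from Weather Agent
# [Math Agent] computing the result...
```

<p>The difference in Swarm is that routing is not a hard-coded predicate: the LLM reads each agent's instructions and chooses whether to answer or to call a handoff function.</p>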
<h3 id="heading-creating-and-orchestrating-rag-and-nl2sql-agents-with-openai-swarm">Creating and Orchestrating RAG and NL2SQL Agents with OpenAI Swarm</h3>
<p>In this section, we will demonstrate how to set up two key agents: a <strong>RAG Agent</strong> for retrieval-augmented generation (RAG) and an <strong>NL2SQL Agent</strong> for querying a SQL database. These agents will collaborate to handle user queries based on their nature, either by retrieving relevant information from documents or by generating SQL queries to fetch results from a database. Finally, we will orchestrate the agents using a <strong>Central Agent</strong> that determines which agent should handle the query.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1730350299698/b08da49f-c529-45ed-a458-d301eec66fb8.png" alt class="image--center mx-auto" /></p>
<h4 id="heading-step-1-installing-the-required-packages">Step 1: Installing the Required Packages</h4>
<p>Before getting started, make sure you have installed the necessary Python packages. You can install them using the following command:</p>
<pre><code class="lang-bash">pip install langchain langchain-chroma langchain-openai langchain-community pypdf sentence-transformers
</code></pre>
<h4 id="heading-step-2-loading-and-splitting-documents-for-the-rag-agent">Step 2: Loading and Splitting Documents for the RAG Agent</h4>
<p>To create a RAG agent, you first need to load documents that contain the knowledge base. In this example, we load <code>.pdf</code> and <code>.docx</code> files from a folder and split them into smaller chunks for efficient retrieval.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_community.document_loaders <span class="hljs-keyword">import</span> PyPDFLoader, Docx2txtLoader
<span class="hljs-keyword">from</span> langchain_text_splitters <span class="hljs-keyword">import</span> RecursiveCharacterTextSplitter
<span class="hljs-keyword">from</span> langchain_core.documents <span class="hljs-keyword">import</span> Document
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">load_documents</span>(<span class="hljs-params">folder_path: str</span>) -&gt; List[Document]:</span>
    documents = []
    <span class="hljs-keyword">for</span> filename <span class="hljs-keyword">in</span> os.listdir(folder_path):
        file_path = os.path.join(folder_path, filename)
        <span class="hljs-keyword">if</span> filename.endswith(<span class="hljs-string">'.pdf'</span>):
            loader = PyPDFLoader(file_path)
        <span class="hljs-keyword">elif</span> filename.endswith(<span class="hljs-string">'.docx'</span>):
            loader = Docx2txtLoader(file_path)
        <span class="hljs-keyword">else</span>:
            print(<span class="hljs-string">f"Unsupported file type: <span class="hljs-subst">{filename}</span>"</span>)
            <span class="hljs-keyword">continue</span>
        documents.extend(loader.load())
    <span class="hljs-keyword">return</span> documents

<span class="hljs-comment"># Load documents from a folder</span>
folder_path = <span class="hljs-string">"/content/docs"</span>
documents = load_documents(folder_path)
print(<span class="hljs-string">f"Loaded <span class="hljs-subst">{len(documents)}</span> documents from the folder."</span>)

<span class="hljs-comment"># Split documents into chunks</span>
text_splitter = RecursiveCharacterTextSplitter(chunk_size=<span class="hljs-number">1000</span>, chunk_overlap=<span class="hljs-number">200</span>)
splits = text_splitter.split_documents(documents)
print(<span class="hljs-string">f"Split the documents into <span class="hljs-subst">{len(splits)}</span> chunks."</span>)
</code></pre>
<h4 id="heading-step-3-creating-and-persisting-a-vector-store">Step 3: Creating and Persisting a Vector Store</h4>
<p>Next, we create a vector store using the <code>SentenceTransformerEmbeddings</code> to embed the document chunks for efficient retrieval.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_community.embeddings.sentence_transformer <span class="hljs-keyword">import</span> SentenceTransformerEmbeddings
<span class="hljs-keyword">from</span> langchain_chroma <span class="hljs-keyword">import</span> Chroma

embedding_function = SentenceTransformerEmbeddings(model_name=<span class="hljs-string">"all-MiniLM-L6-v2"</span>)
collection_name = <span class="hljs-string">"my_collection"</span>

vectorstore = Chroma.from_documents(
    collection_name=collection_name,
    documents=splits,
    embedding=embedding_function,
    persist_directory=<span class="hljs-string">"./chroma_db"</span>
)
print(<span class="hljs-string">"Vector store created and persisted to './chroma_db'"</span>)
</code></pre>
<h4 id="heading-step-4-creating-the-rag-agent">Step 4: Creating the RAG Agent</h4>
<p>We now set up the <strong>RAG Agent</strong>, which retrieves relevant document chunks and generates answers to user queries based on the retrieved context. Here's how the retrieval and generation are done:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_core.prompts <span class="hljs-keyword">import</span> ChatPromptTemplate
<span class="hljs-keyword">from</span> langchain.schema.runnable <span class="hljs-keyword">import</span> RunnablePassthrough
<span class="hljs-keyword">from</span> langchain_core.output_parsers <span class="hljs-keyword">import</span> StrOutputParser
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI

llm = ChatOpenAI(model=<span class="hljs-string">"gpt-4o-mini"</span>)
<span class="hljs-comment"># Expose the vector store as a retriever (used in rag_chain below)</span>
retriever = vectorstore.as_retriever()

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">retrieve_and_generate</span>(<span class="hljs-params">question</span>):</span>
    print(<span class="hljs-string">"Calling retrieve_and_generate"</span>)
    template = <span class="hljs-string">"""Answer the question based only on the following context:
    {context}
    Question: {question}
    Answer: """</span>

    prompt = ChatPromptTemplate.from_template(template)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">docs2str</span>(<span class="hljs-params">docs</span>):</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">"\n\n"</span>.join(doc.page_content <span class="hljs-keyword">for</span> doc <span class="hljs-keyword">in</span> docs)

    rag_chain = (
        {<span class="hljs-string">"context"</span>: retriever | docs2str, <span class="hljs-string">"question"</span>: RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )

    response = rag_chain.invoke(question)
    <span class="hljs-keyword">return</span> response
</code></pre>
<blockquote>
<p>If you want to dive even deeper into practical implementations of Langchain and retrieval-augmented generation (RAG), don’t miss our detailed video tutorial on <strong>Langchain RAG Course: From Basics to Production-Ready RAG Chatbot</strong>. It provides step-by-step guidance for taking a Langchain project from concept to production.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/38aMTXY2usU?si=PF1ltIPCXeNzn3Nv">https://youtu.be/38aMTXY2usU?si=PF1ltIPCXeNzn3Nv</a></div>
<p> </p>
</blockquote>
<h4 id="heading-step-5-setting-up-the-nl2sql-agent">Step 5: Setting up the NL2SQL Agent</h4>
<p>Next, we create an <strong>NL2SQL Agent</strong> that handles natural language queries and converts them into SQL queries, executing them on a database.</p>
<ol>
<li><p>Download and load the Chinook SQLite database:</p>
<pre><code class="lang-bash"> !wget https://github.com/lerocha/chinook-database/raw/master/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite
 !mv Chinook_Sqlite.sqlite Chinook.db
</code></pre>
</li>
<li><p><strong>Connect to the Database</strong>: Use the <code>SQLDatabase</code> utility from <code>langchain_community</code> to create a connection to <code>Chinook.db</code>.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain_community.utilities <span class="hljs-keyword">import</span> SQLDatabase
 db = SQLDatabase.from_uri(<span class="hljs-string">"sqlite:///Chinook.db"</span>)
</code></pre>
</li>
<li><p><strong>Setup Environment Variables</strong> for <a target="_blank" href="https://blog.futuresmart.ai/guide-to-langsmith">Langsmith Tracing</a></p>
<pre><code class="lang-python"> <span class="hljs-keyword">import</span> os
 os.environ[<span class="hljs-string">"LANGCHAIN_TRACING_V2"</span>] = <span class="hljs-string">"true"</span>
 os.environ[<span class="hljs-string">"LANGCHAIN_API_KEY"</span>] = <span class="hljs-string">"your_langchain_api_key"</span>
 os.environ[<span class="hljs-string">"LANGCHAIN_PROJECT"</span>] = <span class="hljs-string">"openai-swarm"</span>
</code></pre>
</li>
<li><p><strong>Clean SQL Query Function:</strong> Define a helper function to clean up SQL queries by removing unnecessary markdown syntax. This is useful when you need to process queries embedded in code blocks.</p>
<pre><code class="lang-python"> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">clean_sql_query</span>(<span class="hljs-params">markdown_query</span>):</span>
     <span class="hljs-comment"># Split the query into lines</span>
     lines = markdown_query.strip().split(<span class="hljs-string">'\n'</span>)

     <span class="hljs-comment"># Remove markdown syntax lines</span>
     cleaned_lines = []
     <span class="hljs-keyword">for</span> line <span class="hljs-keyword">in</span> lines:
         <span class="hljs-comment"># Skip lines that only contain backticks and optional language identifier</span>
         <span class="hljs-keyword">if</span> line.strip().startswith(<span class="hljs-string">'```'</span>) <span class="hljs-keyword">or</span> line.strip() == <span class="hljs-string">'sql'</span>:
             <span class="hljs-keyword">continue</span>
         cleaned_lines.append(line)

     <span class="hljs-comment"># Join the remaining lines and clean up extra whitespace</span>
     cleaned_query = <span class="hljs-string">' '</span>.join(cleaned_lines).strip()

     <span class="hljs-comment"># Remove any remaining backticks</span>
     cleaned_query = cleaned_query.replace(<span class="hljs-string">'`'</span>, <span class="hljs-string">''</span>)

     <span class="hljs-comment"># Ensure semicolon at the end if not present</span>
     <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> cleaned_query.strip().endswith(<span class="hljs-string">';'</span>):
         cleaned_query += <span class="hljs-string">';'</span>

     <span class="hljs-keyword">return</span> cleaned_query

 <span class="hljs-comment"># Example usage</span>
 markdown_query = <span class="hljs-string">'''```sql
 SELECT * FROM table;
 ```'''</span>

 cleaned_query = clean_sql_query(markdown_query)
 print(cleaned_query)

 <span class="hljs-comment"># Cleaned Query</span>
 <span class="hljs-comment"># SELECT * FROM table;</span>
</code></pre>
</li>
<li><p><strong>Set up the SQL Query Chain:</strong> Now, define the core logic of your SQL generation and execution process. This involves connecting LangChain components and defining a template to format the response based on the query results.</p>
<pre><code class="lang-python"> <span class="hljs-keyword">from</span> langchain_core.prompts <span class="hljs-keyword">import</span> ChatPromptTemplate
 <span class="hljs-keyword">from</span> langchain.chains <span class="hljs-keyword">import</span> create_sql_query_chain
 <span class="hljs-keyword">from</span> langchain_community.tools.sql_database.tool <span class="hljs-keyword">import</span> QuerySQLDataBaseTool
 <span class="hljs-keyword">from</span> operator <span class="hljs-keyword">import</span> itemgetter
 <span class="hljs-keyword">import</span> re
 <span class="hljs-keyword">from</span> langchain_core.output_parsers <span class="hljs-keyword">import</span> StrOutputParser
 <span class="hljs-keyword">from</span> langchain_core.prompts <span class="hljs-keyword">import</span> PromptTemplate
 <span class="hljs-keyword">from</span> langchain_core.runnables <span class="hljs-keyword">import</span> RunnablePassthrough, RunnableLambda
 <span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI

 sql_prompt = ChatPromptTemplate.from_messages(
     [
        (<span class="hljs-string">"system"</span>, <span class="hljs-string">"You are a SQLite expert. Given an input question, create a syntactically correct SQL query to run, unless otherwise specified.\n\nHere is the relevant table info: {table_info}\n\nUse at most {top_k} rows"</span>),
         (<span class="hljs-string">"human"</span>, <span class="hljs-string">"{input}"</span>),
     ]
 )
 llm = ChatOpenAI(model=<span class="hljs-string">"gpt-4o"</span>)

 <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sql_response_gen</span>(<span class="hljs-params">question</span>):</span>
   print(<span class="hljs-string">"Calling sql_response_gen"</span>)
   <span class="hljs-comment"># remove_code_block_syntax = lambda text: re.sub(r"```(sql|)\s*(.*?)\s*```", r"\2", text, flags=re.DOTALL)</span>
   execute_query = QuerySQLDataBaseTool(db=db)
   write_query = create_sql_query_chain(llm, db,sql_prompt)

   answer_prompt = PromptTemplate.from_template(
       <span class="hljs-string">"""Given the following user question, corresponding SQL query, and SQL result, answer the user question.

   Question: {question}
   SQL Query: {query}
   SQL Result: {result}
   Answer: """</span>
   )

   chain = (
       RunnablePassthrough.assign(query=write_query | RunnableLambda(clean_sql_query)).assign(
           result=itemgetter(<span class="hljs-string">"query"</span>) | execute_query
       )
       | answer_prompt
       | llm
       | StrOutputParser()
   )

   response = chain.invoke({<span class="hljs-string">"question"</span>: question})
   <span class="hljs-keyword">return</span> response
</code></pre>
</li>
</ol>
<p>Let’s try the NL2SQL function</p>
<pre><code class="lang-python">question = <span class="hljs-string">"How many customers are there?"</span>
result = sql_response_gen(question)

print(<span class="hljs-string">f"Question: <span class="hljs-subst">{question}</span>"</span>)
print(<span class="hljs-string">f"Answer: <span class="hljs-subst">{result}</span>"</span>)

<span class="hljs-comment">## Calling sql_response_gen</span>
<span class="hljs-comment">## Question: How many customers are there?</span>
<span class="hljs-comment">## Answer: There are 59 customers.</span>
</code></pre>
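<p>As an aside, the commented-out regex inside <code>sql_response_gen</code> hints at a more compact way to strip the markdown fences. The sketch below is my own variant (not from the original notebook) and assumes the model returns at most one fenced block:</p>

```python
import re

def clean_sql_query_re(markdown_query: str) -> str:
    """Regex-based alternative to clean_sql_query (illustrative sketch)."""
    # Drop the triple-backtick fences (with optional "sql" tag), keep the query inside
    query = re.sub(r"`{3}(?:sql)?\s*(.*?)\s*`{3}", r"\1", markdown_query, flags=re.DOTALL)
    # Remove stray backticks and collapse all whitespace onto one line
    query = " ".join(query.replace("`", "").split())
    # Ensure a trailing semicolon, matching the original helper's behaviour
    return query if query.endswith(";") else query + ";"

print(clean_sql_query_re("```sql\nSELECT * FROM table;\n```"))
# SELECT * FROM table;
```

<p>Either helper can be plugged into the chain via <code>RunnableLambda</code>; the line-based version is easier to extend, while the regex version is shorter.</p>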
<blockquote>
<p>If you want to watch a detailed video on building NL2SQL chatbots, then watch this video:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=fss6CrmQU2Y&amp;t=1s">https://www.youtube.com/watch?v=fss6CrmQU2Y&amp;t=1s</a></div>
<p> </p>
</blockquote>
<h4 id="heading-step-6-orchestrating-the-agents-with-a-central-agent">Step 6: Orchestrating the Agents with a Central Agent</h4>
<p>Now that we have both the RAG and NL2SQL tools, we can create the RAG and NL2SQL agents, as well as the <strong>Central Agent</strong> that determines which agent should handle the user's query:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> swarm <span class="hljs-keyword">import</span> Swarm, Agent

<span class="hljs-comment"># Define the RAG and NL2SQL agents</span>
rag_agent = Agent(
    name=<span class="hljs-string">"RAG Agent"</span>,
    instructions=<span class="hljs-string">"You retrieve relevant information from the company's knowledge base and generate responses to general queries about the company."</span>,
    functions=[retrieve_and_generate]
)

nl2sql_agent = Agent(
    name=<span class="hljs-string">"NL2SQL Agent"</span>,
    instructions=<span class="hljs-string">"You handle database queries."</span>,
    functions=[sql_response_gen]
)

<span class="hljs-comment"># Define the Central Agent</span>
central_agent = Agent(
    name=<span class="hljs-string">"Central Agent"</span>,
    instructions=<span class="hljs-string">"Determine if the query is about general company information (RAG) or a database query (NL2SQL), and route the query accordingly."</span>
)

<span class="hljs-comment"># Define handoff functions</span>
<span class="hljs-comment"># Note: the docstring must be the first statement in the function body</span>
<span class="hljs-comment"># so that Swarm can expose it as the tool description.</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">transfer_to_nl2sql</span>():</span>
    <span class="hljs-string">"""Transfer the task to the NL2SQL Agent for database queries."""</span>
    print(<span class="hljs-string">"Handing off to the NL2SQL Agent."</span>)
    <span class="hljs-keyword">return</span> nl2sql_agent

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">transfer_to_rag</span>():</span>
    <span class="hljs-string">"""Transfer the task to the RAG Agent for general queries."""</span>
    print(<span class="hljs-string">"Handing off to the RAG agent."</span>)
    <span class="hljs-keyword">return</span> rag_agent

<span class="hljs-comment"># Attach the handoff functions to the central agent</span>
central_agent.functions = [transfer_to_nl2sql, transfer_to_rag]
</code></pre>
<h4 id="heading-step-7-running-the-central-agent">Step 7: Running the Central Agent</h4>
<p>Finally, we test the <strong>Central Agent</strong> with different user queries to see how it delegates tasks to the appropriate agent:</p>
<pre><code class="lang-python">client = Swarm()

<span class="hljs-comment"># Example 1: Asking about the company</span>
print(<span class="hljs-string">"\n--- Example 1: Asking about the company ---"</span>)
messages = [{<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"What does Futuresmart AI offer?"</span>}]
response = client.run(agent=central_agent, messages=messages)
<span class="hljs-keyword">if</span> isinstance(response, Agent):
    selected_agent = response
    result = selected_agent.functions
    print(result)
<span class="hljs-keyword">else</span>:
    print(response.messages[<span class="hljs-number">-1</span>][<span class="hljs-string">"content"</span>])

<span class="hljs-comment"># Example 2: SQL query about employees</span>
print(<span class="hljs-string">"\n--- Example 2: Asking from the SQL DB ---"</span>)
messages = [{<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"How many employees are there in the database?"</span>}]
response = client.run(agent=central_agent, messages=messages)
<span class="hljs-keyword">if</span> isinstance(response, Agent):
    selected_agent = response
    result = selected_agent.functions
    print(result)
<span class="hljs-keyword">else</span>:
    print(response.messages[<span class="hljs-number">-1</span>][<span class="hljs-string">"content"</span>])
</code></pre>
<p>Output:</p>
<pre><code class="lang-python">--- Example <span class="hljs-number">1</span>: Asking about the company ---
Handing off to the RAG agent.
Calling retrieve_and_generate
FutureSmart AI offers a range of services including customized speech-to-text services, Natural Language Processing (NLP) solutions, text classification, <span class="hljs-keyword">and</span> the creation of custom chatbots. These services are designed to enhance productivity, accessibility, decision-making processes, <span class="hljs-keyword">and</span> operational efficiency through advanced technologies <span class="hljs-keyword">and</span> tailored solutions.

--- Example <span class="hljs-number">2</span>: Asking <span class="hljs-keyword">from</span> the SQL DB ---
Handing off to the NL2SQL Agent.
Calling sql_response_gen
There are <span class="hljs-number">8</span> employees <span class="hljs-keyword">in</span> the database.
</code></pre>
<p>Find all the Python Code Here: <a target="_blank" href="https://github.com/PradipNichite/Youtube-Tutorials/tree/main/OpenAI%20Swarm%20Tutorial">OpenAI Swarm Notebook</a></p>
<h3 id="heading-conclusion">Conclusion</h3>
<p><strong>OpenAI Swarm</strong> represents a valuable contribution to the growing field of multi-agent LLM frameworks. Its lightweight design and focus on user-friendliness make it a great starting point for developers exploring multi-agent orchestration. Swarm simplifies the process of building multi-agent systems with its straightforward approach to defining agents, their capabilities, and their interactions. The handoff mechanism, as illustrated in our previous examples, enables seamless transitions between agents, facilitating complex workflows.</p>
<p>However, it's crucial to recognize that Swarm is still in its experimental stages and primarily serves educational purposes. This means it may not possess the robust features or extensive functionalities of more established frameworks like LangChain or CrewAI. For instance, Swarm's reliance solely on the OpenAI API may limit its flexibility and integration capabilities. Moreover, the absence of built-in memory management can make it harder to build personalized agents.</p>
<p>If you found this guide helpful and you're looking to learn more then don’t forget to <a target="_blank" href="https://www.youtube.com/@AIDemosVideos">follow us</a>.</p>
<p>If you're looking to further enhance your skills in multi-agent app development with a more robust framework, I recommend checking out our <a target="_blank" href="https://blog.futuresmart.ai/langgraph-tutorial-for-beginners"><strong>LangGraph Tutorial for Beginners</strong></a></p>
<p>At <strong>FutureSmart AI</strong>, we specialize in helping companies build cutting-edge AI solutions similar to the ones discussed in this blog. To explore how we can assist your business, feel free to reach out to us at <a target="_blank" href="mailto:contact@futuresmart.ai"><strong>contact@futuresmart.ai</strong></a>.</p>
<p>For real-world examples of our work, take a look at our <a target="_blank" href="https://www.futuresmart.ai/case-studies">case studies</a>, where we showcase the practical value of our expertise.</p>
]]></content:encoded></item><item><title><![CDATA[How to Build LangGraph Agent with Long-term Memory]]></title><description><![CDATA[Introduction
The future of AI depends on solving a key issue with large language models (LLMs): their inability to retain long-term information during extended conversations. This limitation can lead to inaccurate responses in fields like healthcare ...]]></description><link>https://blog.futuresmart.ai/how-to-build-langgraph-agent-with-long-term-memory</link><guid isPermaLink="true">https://blog.futuresmart.ai/how-to-build-langgraph-agent-with-long-term-memory</guid><category><![CDATA[aimemory]]></category><category><![CDATA[langgraph]]></category><category><![CDATA[memory-management]]></category><category><![CDATA[conversational-ai]]></category><category><![CDATA[Machine Learning]]></category><dc:creator><![CDATA[Shreyas Dhaware]]></dc:creator><pubDate>Thu, 24 Oct 2024 09:48:49 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1729571068722/7f896b46-e7d0-41a9-af66-20d06844ab1d.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-introduction"><strong>Introduction</strong></h1>
<p>The future of AI depends on solving a key issue with large language models (LLMs): their inability to retain long-term information during extended conversations. This limitation can lead to inaccurate responses in fields like healthcare and customer support, where continuous context is crucial. Stateful or "memory-augmented" models offer a solution by storing and recalling information over time, enabling more seamless, context-aware interactions. These models provide richer insights, making them ideal for complex, data-driven environments requiring sustained attention and accuracy.</p>
<p>In the context of AI systems, <strong>short-term memory</strong> and <strong>long-term memory</strong> refer to how information is stored and retrieved across different interactions:</p>
<ol>
<li><p><strong>Short-term memory</strong>: It captures information from a single conversation or interaction thread. This memory is typically used to maintain context within a session and allows AI to recall recent actions, queries, and responses. In LangGraph, short-term memory is managed using a state <code>checkpointer</code>. The <code>checkpointer</code> saves a checkpoint of the graph state at every super-step within a Thread.</p>
</li>
<li><p><strong>Long-term memory</strong>: Long-Term Memory is used to retain information across multiple sessions or conversations. Long-term memory enables an AI to remember details about previous interactions even after a session ends. In LangGraph, long-term memory is persisted using a Memory Store. We can define a store to add Personalized Information about a User across threads.</p>
</li>
</ol>
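<p>Before diving into the real API, the two scopes can be pictured with a plain-Python toy model (this only mimics the idea, not LangGraph's actual classes): the checkpointer keys state by thread, while the store keys memories by a cross-thread namespace:</p>

```python
# Toy model of the two memory scopes (illustration only, not LangGraph's API)
checkpoints = {}  # short-term: conversation state, one entry per thread_id
store = {}        # long-term: facts keyed by (user_id, "memories"), shared across threads

def save_turn(thread_id, message):
    # Checkpointed state is scoped to a single conversation thread
    checkpoints.setdefault(thread_id, []).append(message)

def save_memory(user_id, key, value):
    # Long-term memories live in a namespace that outlives any one thread
    store.setdefault((user_id, "memories"), {})[key] = value

save_turn("thread-1", "Hi! My name is Bob")
save_memory("user-1", "name", "Bob")

# A brand-new thread starts with no short-term context...
assert "thread-2" not in checkpoints
# ...but the long-term store still remembers the user
print(store[("user-1", "memories")]["name"])  # Bob
```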
<p>If you're interested in learning how to build and customize LangGraph agents for specific tasks or workflows, I recommend checking out <a target="_blank" href="https://blog.futuresmart.ai/langgraph-tutorial-for-beginners">LangGraph Tutorial: A Comprehensive Guide for Beginners</a></p>
<h1 id="heading-what-is-long-term-memory-in-langchain">What is Long-Term Memory in LangGraph?</h1>
<p>Long Term Memory persists across different threads, allowing the AI to recall user preferences, instructions, or other important data. LangGraph handles long-term memory by saving it in custom "namespaces," which essentially reference specific sets of data stored as JSON documents. Each memory type is a Python class. This kind of memory can be useful for creating more personalized and adaptive user experiences.</p>
<p>A memory type is associated with a list of attributes such as a namespace, key, and value. The <strong>namespace</strong> helps organize and categorize the memory; namespaces are flexible and can represent anything, not just user-specific information. The <strong>key</strong> is the memory ID, acting as a unique identifier, while the <strong>value</strong> holds the memory content itself.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> uuid
<span class="hljs-keyword">from</span> langgraph.store.memory <span class="hljs-keyword">import</span> InMemoryStore

in_memory_store = InMemoryStore()

user_id = <span class="hljs-string">"1"</span>
namespace_for_memory = (user_id, <span class="hljs-string">"memories"</span>)

memory_id = str(uuid.uuid4())
memory = {<span class="hljs-string">"food_preference"</span> : <span class="hljs-string">"I like pizza"</span>}
in_memory_store.put(namespace_for_memory, memory_id, memory)
</code></pre>
<pre><code class="lang-python">memories = in_memory_store.search(namespace_for_memory)
memories[<span class="hljs-number">-1</span>].dict()
{<span class="hljs-string">'value'</span>: {<span class="hljs-string">'food_preference'</span>: <span class="hljs-string">'I like pizza'</span>},
 <span class="hljs-string">'key'</span>: <span class="hljs-string">'07e0caf4-1631-47b7-b15f-65515d4c1843'</span>,
 <span class="hljs-string">'namespace'</span>: [<span class="hljs-string">'1'</span>, <span class="hljs-string">'memories'</span>],
 <span class="hljs-string">'created_at'</span>: <span class="hljs-string">'2024-10-02T17:22:31.590602+00:00'</span>,
 <span class="hljs-string">'updated_at'</span>: <span class="hljs-string">'2024-10-02T17:22:31.590605+00:00'</span>}
</code></pre>
<p>Explore how to enhance your <a target="_blank" href="https://blog.futuresmart.ai/integrating-mem0-with-langchain">Langchain applications with <strong>Mem0</strong></a>, a memory management system that personalizes AI interactions. This blog highlights Mem0's integration, showcasing its <strong>similarity search feature</strong></p>
<h1 id="heading-setting-up-your-environment">Setting Up Your Environment</h1>
<p>Before you begin, ensure you have the necessary prerequisites:</p>
<ul>
<li><p>Python environment with required libraries.</p>
</li>
<li><p>Access to <a target="_blank" href="https://platform.openai.com/docs/overview"><strong>OpenAI</strong></a> API keys.</p>
</li>
</ul>
<h2 id="heading-install-required-packages"><strong>Install Required Packages</strong></h2>
<p>Before setting up your environment, make sure to install the necessary packages. If you're using a Google Colab notebook, you can run the following command in a cell:</p>
<pre><code class="lang-python">pip install -U langgraph langchain_openai
</code></pre>
<h2 id="heading-setting-up-environment-variables">Setting up environment variables</h2>
<p>Once the packages are installed, set up your environment variables:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> getpass
<span class="hljs-keyword">import</span> os

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_set_env</span>(<span class="hljs-params">var: str</span>):</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> os.environ.get(var):
        os.environ[var] = getpass.getpass(<span class="hljs-string">f"<span class="hljs-subst">{var}</span>: "</span>)

_set_env(<span class="hljs-string">"OPENAI_API_KEY"</span>)
</code></pre>
<h1 id="heading-building-long-term-memory-example">Building Long-Term Memory Example</h1>
<h2 id="heading-loading-checkpointer-and-storage-long-term-storage">Loading Checkpointer and Storage Long-Term Storage</h2>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langgraph.store.memory <span class="hljs-keyword">import</span> InMemoryStore

in_memory_store = InMemoryStore()
</code></pre>
<p>LangGraph provides an <code>InMemoryStore</code> to manage long-term memory within conversational agents or workflows. Note that it lives in process memory only: it is not persistent storage, so its contents are lost when the application restarts.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> uuid
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI
<span class="hljs-keyword">from</span> langchain_core.runnables <span class="hljs-keyword">import</span> RunnableConfig
<span class="hljs-keyword">from</span> langgraph.graph <span class="hljs-keyword">import</span> StateGraph, MessagesState, START, END
<span class="hljs-keyword">from</span> langgraph.store.base <span class="hljs-keyword">import</span> BaseStore
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Annotated, Optional
<span class="hljs-keyword">from</span> langchain_core.tools <span class="hljs-keyword">import</span> InjectedToolArg, tool
</code></pre>
<ul>
<li><p><code>import uuid</code>:</p>
<ul>
<li>This generates universally unique identifiers (UUIDs), often used to create unique references.</li>
</ul>
</li>
<li><p><code>from langgraph.graph import StateGraph, MessagesState, START, END</code>:</p>
<ul>
<li><p><code>StateGraph</code>: Represents a graph that tracks the state of a conversational agent. It allows for storing and handling the flow of state and data.</p>
</li>
<li><p><code>MessagesState</code>: Handles conversation history as part of the agent's memory and automatically appends every interaction to the state.</p>
</li>
</ul>
</li>
<li><p><code>from langchain_core.tools import InjectedToolArg, tool</code>:</p>
<ul>
<li><p>The <code>InjectedToolArg</code> annotation lets us hide certain parameters, like <code>user_id</code>, <code>config</code> from the model, ensuring they're injected automatically at runtime. This keeps sensitive data secure while simplifying the model's responsibilities.</p>
</li>
<li><p><code>tool</code>: A decorator used to define tools or functions that the agent can call as part of its process.</p>
</li>
</ul>
</li>
</ul>
<h2 id="heading-constructing-a-tool">Constructing a Tool</h2>
<pre><code class="lang-python"><span class="hljs-meta">@tool</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">upsert_memory</span>(<span class="hljs-params">
    content: str,
    context: str,
    memory_id: Optional[str] = None,
    *,
    config: Annotated[RunnableConfig, InjectedToolArg],
    store: Annotated[BaseStore, InjectedToolArg],
</span>):</span>
    <span class="hljs-string">"""Upsert a memory in the database.

    If a memory conflicts with an existing one, then just UPDATE the
    existing one by passing in memory_id - don't create two memories
    that are the same. If the user corrects a memory, UPDATE it.

    Args:
        content: The main content of the memory. For example:
            "User expressed interest in learning about French."
        context: Additional context for the memory. For example:
            "This was mentioned while discussing career options in Europe."
        memory_id: ONLY PROVIDE IF UPDATING AN EXISTING MEMORY.
        The memory to overwrite.
    """</span>
    mem_id = memory_id <span class="hljs-keyword">or</span> uuid.uuid4()
    user_id = config[<span class="hljs-string">"configurable"</span>][<span class="hljs-string">"user_id"</span>]
    store.put(
        (<span class="hljs-string">"memories"</span>, user_id),
        key=str(mem_id),
        value={<span class="hljs-string">"content"</span>: content, <span class="hljs-string">"context"</span>: context},
    )
    <span class="hljs-keyword">return</span> <span class="hljs-string">f"Stored memory <span class="hljs-subst">{content}</span>"</span>
</code></pre>
<p>The <code>upsert_memory</code> function either creates a new memory or updates an existing one if it already exists. It allows the user to store memories with two key attributes: <code>content</code> (the main memory) and <code>context</code> (additional information to help frame the memory).</p>
<p>Storing memories as a collection of documents simplifies certain tasks, making each memory more focused and easier to generate while improving recall, since creating new objects for new information is often more effective than trying to update one profile.</p>
<p>For more on profile-based and document-based memory organization, refer to the <a target="_blank" href="https://langchain-ai.github.io/langgraph/concepts/memory/?ref=blog.langchain.dev#representing-memories">official LangGraph documentation</a>. It provides insights into managing complexity in long-term memory updates and balancing efficient recall with accurate information storage.</p>
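<p>To make the trade-off concrete, here is a small illustration (toy data of my own, not LangGraph objects) contrasting a single evolving profile with a collection of focused memory documents:</p>

```python
# 1) Profile style: one document that must be edited in place.
#    Updates get riskier as the profile grows (merge/overwrite mistakes).
profile = {"name": "Bob", "interests": ["tech"]}
profile["interests"].append("French")

# 2) Collection style: each fact is its own small document, simply appended.
#    Recall can then search over many focused records instead of one blob.
collection = [
    {"content": "User expressed interest in learning about French.",
     "context": "Mentioned while discussing career options in Europe."},
]
collection.append({"content": "User's name is Bob.",
                   "context": "Introduced themselves at the start of the chat."})

print(len(collection))  # 2
```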
<h2 id="heading-setting-nodes-for-memory-storage">Setting Nodes for Memory Storage</h2>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">store_memory</span>(<span class="hljs-params">state: MessagesState, config: RunnableConfig, store: BaseStore</span>):</span>
    <span class="hljs-comment"># Extract tool calls from the last message</span>
    tool_calls = state[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].tool_calls
    saved_memories=[]
    <span class="hljs-keyword">for</span> tc <span class="hljs-keyword">in</span> tool_calls:
        content = tc[<span class="hljs-string">'args'</span>][<span class="hljs-string">'content'</span>]
        context = tc[<span class="hljs-string">'args'</span>][<span class="hljs-string">'context'</span>]
        saved_memories.append([
            upsert_memory.invoke({<span class="hljs-string">'content'</span>: content, <span class="hljs-string">'context'</span>: context, <span class="hljs-string">'config'</span>:config, <span class="hljs-string">'store'</span>:store})  
        ])
    print(<span class="hljs-string">"saved_memories: "</span>, saved_memories)

    results = [
        {
            <span class="hljs-string">"role"</span>: <span class="hljs-string">"tool"</span>,
            <span class="hljs-string">"content"</span>: mem[<span class="hljs-number">0</span>],
            <span class="hljs-string">"tool_call_id"</span>: tc[<span class="hljs-string">"id"</span>],
        }
        <span class="hljs-keyword">for</span> tc, mem <span class="hljs-keyword">in</span> zip(tool_calls, saved_memories)
    ]
    print(results)
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"messages"</span>: results[<span class="hljs-number">0</span>]}
</code></pre>
<p>In LangGraph, nodes are the key components of a workflow that perform distinct tasks, such as communicating with an LLM, interacting with external tools, or processing data. The <code>store_memory</code> function extracts tool calls from the latest message and uses them to save memories in <code>store</code>. Each memory is saved by invoking the <code>upsert_memory</code> function, which takes <code>content</code> and <code>context</code> as inputs. The confirmation returned by the tool is then wrapped in a tool message, with the role <code>"tool"</code> and the corresponding tool call ID, before being returned to the graph. Returning the stored content this way lets the user see exactly what memory was saved.</p>
<p>Writing memories in the hot path involves saving them while the application is running, offering benefits like real-time updates—making memory immediately available—and greater transparency. This builds trust, since the system explicitly decides what to save. However, there are downsides. The system is multitasking, balancing memory management with its primary task.</p>
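<p>The alternative to hot-path writes is writing memories in the background: answer first, persist later. The sketch below uses a simple queue to illustrate the idea (hypothetical helper names of my own; the graph in this tutorial writes memories in the hot path):</p>

```python
from queue import Queue

memory_queue = Queue()  # candidate memories wait here instead of blocking the reply
saved = []

def respond(user_message: str) -> str:
    # Enqueue the candidate memory rather than writing it synchronously
    memory_queue.put({"content": user_message, "context": "chat"})
    return "answer"  # the user gets a reply without waiting on memory writes

def flush_memories():
    # Runs off the hot path, e.g. after the response or on a background worker
    while not memory_queue.empty():
        saved.append(memory_queue.get())

reply = respond("My name is Bob")
flush_memories()
print(reply, len(saved))  # answer 1
```

<p>The cost of this design is the mirror image of the hot-path approach: the reply stays fast, but the memory is not yet available within the same turn and the system is less transparent about what it decided to save.</p>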
<h3 id="heading-setting-nodes-for-agent">Setting Nodes for Agent</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">call_model</span>(<span class="hljs-params">state: MessagesState, config: RunnableConfig, *, store: BaseStore</span>):</span>
    user_id = config[<span class="hljs-string">"configurable"</span>][<span class="hljs-string">"user_id"</span>]
    namespace = (<span class="hljs-string">"memories"</span>, user_id)
    memories = store.search(namespace)
    info = <span class="hljs-string">"\n"</span>.join(<span class="hljs-string">f"[<span class="hljs-subst">{mem.key}</span>]: <span class="hljs-subst">{mem.value}</span>"</span> <span class="hljs-keyword">for</span> mem <span class="hljs-keyword">in</span> memories)
    <span class="hljs-keyword">if</span> info:
        info = <span class="hljs-string">f"""
    &lt;memories&gt;
    <span class="hljs-subst">{info}</span>
    &lt;/memories&gt;"""</span>

    system_msg = <span class="hljs-string">f'''You are a helpful assistant talking to the user. You must decide whether to store information as memory from list of messages and then answer the user query or directly answer the user query
        User context info: <span class="hljs-subst">{info}</span>'''</span>
    print(<span class="hljs-string">"system_msg:"</span>, system_msg)
    <span class="hljs-comment"># Store new memories if the user asks the model to remember</span>
    last_message = state[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>]
    print( [{<span class="hljs-string">"type"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: system_msg}] + state[<span class="hljs-string">"messages"</span>])
    response = model.bind_tools([upsert_memory]).invoke(
        [{<span class="hljs-string">"type"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: system_msg}] + state[<span class="hljs-string">"messages"</span>]
    )
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"messages"</span>: response}
</code></pre>
<p>The <code>call_model</code> function retrieves stored memories for a specific user from a storage (<code>store</code>), formats these memories as part of a system message, and uses that information to provide context for generating a response to the user's query. The memories are fetched based on the user ID and are then formatted into a structured XML-like block (<code>&lt;memories&gt;</code> tag) to be included in the system message.</p>
<p>The system message is intended to inform the model of any relevant context from prior interactions to help it determine whether to update the memory or simply respond to the current user query. Finally, the function returns the generated response.</p>
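<p>The prompt-assembly step can be tried in isolation with stand-in data (the <code>FakeItem</code> class below is my own stub that only mimics the shape of items returned by <code>store.search</code>):</p>

```python
from dataclasses import dataclass

@dataclass
class FakeItem:
    # Stub mimicking the key/value shape of items from store.search(namespace)
    key: str
    value: dict

memories = [FakeItem("mem-1", {"content": "User's name is Bob", "context": "intro"})]

# Same formatting logic as call_model: one "[key]: value" line per memory,
# wrapped in a <memories> block when any memories exist
info = "\n".join(f"[{m.key}]: {m.value}" for m in memories)
if info:
    info = f"<memories>\n{info}\n</memories>"

system_msg = f"You are a helpful assistant talking to the user.\nUser context info: {info}"
print(system_msg)
```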
<h2 id="heading-conditional-edge-logic">Conditional Edge Logic</h2>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">route_message</span>(<span class="hljs-params">state: MessagesState</span>):</span>
    <span class="hljs-string">"""Determine the next step based on the presence of tool calls."""</span>
    msg = state[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>]
    <span class="hljs-keyword">if</span> msg.tool_calls:
        <span class="hljs-comment"># If there are tool calls, we need to store memories</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">"store_memory"</span>
    <span class="hljs-comment"># Otherwise, finish; user can send the next message</span>
    <span class="hljs-keyword">return</span> END
</code></pre>
<p>The <code>route_message</code> function determines the next action based on the presence of tool calls in the latest message. If tool calls exist, it directs the workflow to store memories by returning "store_memory"; otherwise, it concludes the current process by returning <code>END</code>.</p>
<h2 id="heading-loading-and-compiling-graph">Loading and Compiling Graph</h2>
<pre><code class="lang-python">builder = StateGraph(MessagesState)

builder.add_node(<span class="hljs-string">"call_model"</span>, call_model)
builder.add_node(store_memory)

builder.add_edge(START, <span class="hljs-string">"call_model"</span>)
builder.add_conditional_edges(<span class="hljs-string">"call_model"</span>, route_message, [<span class="hljs-string">"store_memory"</span>, END])
builder.add_edge(<span class="hljs-string">"store_memory"</span>, <span class="hljs-string">"call_model"</span>)

graph = builder.compile(store=in_memory_store)
</code></pre>
<p><code>StateGraph</code> is constructed by adding nodes representing different actions like <code>call_model</code> and <code>store_memory</code>, with conditional routing between them based on the <code>route_message</code> function. After defining the workflow, the graph is compiled into a runnable instance using the provided <code>in_memory_store</code>, enabling the execution of the defined conversational flow. Optionally, a checkpointer <code>MemorySaver</code> can be used for short-term memory handling.</p>
<h2 id="heading-visualize-the-graph"><strong>Visualize the Graph:</strong></h2>
<p>With a few lines of Python, you can render the graph's nodes and edges as a diagram.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1729364267835/e44e3859-4f70-4f43-a55e-1781d8295474.jpeg" alt class="image--center mx-auto" /></p>
<blockquote>
<p>Source: Author</p>
</blockquote>
<h2 id="heading-run-the-graph">Run the graph!</h2>
<pre><code class="lang-python">config = {<span class="hljs-string">"configurable"</span>: {<span class="hljs-string">"thread_id"</span>: <span class="hljs-string">"1"</span>, <span class="hljs-string">"user_id"</span>: <span class="hljs-string">"1"</span>}}
input_message = {<span class="hljs-string">"type"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"Hi! My name is Bob. I love keep updated on Latest Tech"</span>}
<span class="hljs-keyword">for</span> chunk <span class="hljs-keyword">in</span> graph.stream({<span class="hljs-string">"messages"</span>: [input_message]}, config, stream_mode=<span class="hljs-string">"values"</span>):
    chunk[<span class="hljs-string">"messages"</span>][<span class="hljs-number">-1</span>].pretty_print()
</code></pre>
<p>This code initiates a conversation by passing the user's input (<code>input_message</code>) and configuration to the compiled <code>graph</code>. The conversation flow is executed in streaming mode, with the chatbot processing the message and outputting a response, which is printed. To improve this, a loop can be added to continuously interact with the user, feeding their inputs into the graph and displaying the chatbot's responses, enabling an ongoing conversation instead of a single exchange.</p>
<p><strong>Output</strong></p>
<pre><code class="lang-yaml"><span class="hljs-string">================================</span> <span class="hljs-string">Human</span> <span class="hljs-string">Message</span> <span class="hljs-string">=================================</span>

<span class="hljs-string">Hi!</span> <span class="hljs-string">My</span> <span class="hljs-string">name</span> <span class="hljs-string">is</span> <span class="hljs-string">Bob.</span> <span class="hljs-string">I</span> <span class="hljs-string">love</span> <span class="hljs-string">keep</span> <span class="hljs-string">updated</span> <span class="hljs-string">on</span> <span class="hljs-string">Latest</span> <span class="hljs-string">Tech</span>
<span class="hljs-attr">system_msg:</span> <span class="hljs-string">You</span> <span class="hljs-string">are</span> <span class="hljs-string">a</span> <span class="hljs-string">helpful</span> <span class="hljs-string">assistant</span> <span class="hljs-string">talking</span> <span class="hljs-string">to</span> <span class="hljs-string">the</span> <span class="hljs-string">user.</span> <span class="hljs-string">You</span> <span class="hljs-string">must</span> <span class="hljs-string">decide</span> <span class="hljs-string">whether</span> <span class="hljs-string">to</span> <span class="hljs-string">store</span> <span class="hljs-string">information</span> <span class="hljs-string">as</span> <span class="hljs-string">memory</span> <span class="hljs-string">from</span> <span class="hljs-string">list</span> <span class="hljs-string">of</span> <span class="hljs-string">messages</span> <span class="hljs-string">and</span> <span class="hljs-string">then</span> <span class="hljs-string">answer</span> <span class="hljs-string">the</span> <span class="hljs-string">user</span> <span class="hljs-string">query</span> <span class="hljs-string">or</span> <span class="hljs-string">directly</span> <span class="hljs-string">answer</span> <span class="hljs-string">the</span> <span class="hljs-string">user</span> <span class="hljs-string">query</span>
        <span class="hljs-attr">User context info:</span> 
<span class="hljs-string">==================================</span> <span class="hljs-string">Ai</span> <span class="hljs-string">Message</span> <span class="hljs-string">==================================</span>
<span class="hljs-attr">Tool Calls:</span>
  <span class="hljs-string">upsert_memory</span> <span class="hljs-string">(call_DEstE5sEOltuEE4lbfcgZPar)</span>
 <span class="hljs-attr">Call ID:</span> <span class="hljs-string">call_DEstE5sEOltuEE4lbfcgZPar</span>
  <span class="hljs-attr">Args:</span>
    <span class="hljs-attr">content:</span> <span class="hljs-string">User's</span> <span class="hljs-string">name</span> <span class="hljs-string">is</span> <span class="hljs-string">Bob</span> <span class="hljs-string">and</span> <span class="hljs-string">he</span> <span class="hljs-string">loves</span> <span class="hljs-string">keeping</span> <span class="hljs-string">updated</span> <span class="hljs-string">on</span> <span class="hljs-string">the</span> <span class="hljs-string">latest</span> <span class="hljs-string">tech.</span>
    <span class="hljs-attr">context:</span> <span class="hljs-string">Bob</span> <span class="hljs-string">introduced</span> <span class="hljs-string">himself</span> <span class="hljs-string">and</span> <span class="hljs-string">shared</span> <span class="hljs-string">his</span> <span class="hljs-string">interest</span> <span class="hljs-string">in</span> <span class="hljs-string">technology.</span>
<span class="hljs-string">=================================</span> <span class="hljs-string">Tool</span> <span class="hljs-string">Message</span> <span class="hljs-string">=================================</span>

<span class="hljs-string">Stored</span> <span class="hljs-string">memory</span> <span class="hljs-string">User's</span> <span class="hljs-string">name</span> <span class="hljs-string">is</span> <span class="hljs-string">Bob</span> <span class="hljs-string">and</span> <span class="hljs-string">he</span> <span class="hljs-string">loves</span> <span class="hljs-string">keeping</span> <span class="hljs-string">updated</span> <span class="hljs-string">on</span> <span class="hljs-string">the</span> <span class="hljs-string">latest</span> <span class="hljs-string">tech.</span>
<span class="hljs-attr">system_msg:</span> <span class="hljs-string">You</span> <span class="hljs-string">are</span> <span class="hljs-string">a</span> <span class="hljs-string">helpful</span> <span class="hljs-string">assistant</span> <span class="hljs-string">talking</span> <span class="hljs-string">to</span> <span class="hljs-string">the</span> <span class="hljs-string">user.</span> <span class="hljs-string">You</span> <span class="hljs-string">must</span> <span class="hljs-string">decide</span> <span class="hljs-string">whether</span> <span class="hljs-string">to</span> <span class="hljs-string">store</span> <span class="hljs-string">information</span> <span class="hljs-string">as</span> <span class="hljs-string">memory</span> <span class="hljs-string">from</span> <span class="hljs-string">list</span> <span class="hljs-string">of</span> <span class="hljs-string">messages</span> <span class="hljs-string">and</span> <span class="hljs-string">then</span> <span class="hljs-string">answer</span> <span class="hljs-string">the</span> <span class="hljs-string">user</span> <span class="hljs-string">query</span> <span class="hljs-string">or</span> <span class="hljs-string">directly</span> <span class="hljs-string">answer</span> <span class="hljs-string">the</span> <span class="hljs-string">user</span> <span class="hljs-string">query</span>
        <span class="hljs-attr">User context info:</span> 
    <span class="hljs-string">&lt;memories&gt;</span>
    [<span class="hljs-string">7a5a3425-4617-4794-9207-0dd7ffe28f05</span>]<span class="hljs-string">:</span> {<span class="hljs-attr">'content':</span> <span class="hljs-string">"User's name is Bob and he loves keeping updated on the latest tech."</span>, <span class="hljs-attr">'context':</span> <span class="hljs-string">'Bob introduced himself and shared his interest in technology.'</span>}
    <span class="hljs-string">&lt;/memories&gt;</span>
<span class="hljs-string">==================================</span> <span class="hljs-string">Ai</span> <span class="hljs-string">Message</span> <span class="hljs-string">==================================</span>

<span class="hljs-string">Hi</span> <span class="hljs-string">Bob!</span> <span class="hljs-string">It's</span> <span class="hljs-string">great</span> <span class="hljs-string">to</span> <span class="hljs-string">meet</span> <span class="hljs-string">someone</span> <span class="hljs-string">who</span> <span class="hljs-string">loves</span> <span class="hljs-string">staying</span> <span class="hljs-string">updated</span> <span class="hljs-string">on</span> <span class="hljs-string">the</span> <span class="hljs-string">latest</span> <span class="hljs-string">tech.</span> <span class="hljs-string">How</span> <span class="hljs-string">can</span> <span class="hljs-string">I</span> <span class="hljs-string">assist</span> <span class="hljs-string">you</span> <span class="hljs-string">today?</span>
</code></pre>
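<p>The snippet above handles a single exchange. To keep the conversation going, the streaming call can be wrapped in a loop. Below is a minimal sketch: it assumes the compiled <code>graph</code> and <code>config</code> objects from earlier, and takes the input function as a parameter so the loop is easy to test or swap out.</p>
<pre><code class="lang-python">def chat_loop(graph, config, get_input=input):
    """Feed user messages into the compiled graph until the user types 'quit'."""
    while True:
        user_text = get_input("You: ")
        if user_text.strip().lower() in {"quit", "exit"}:
            return
        message = {"type": "user", "content": user_text}
        # Stream the graph's state after each step and print the latest message
        for chunk in graph.stream({"messages": [message]}, config, stream_mode="values"):
            chunk["messages"][-1].pretty_print()
</code></pre>
<p>Calling <code>chat_loop(graph, config)</code> then gives an ongoing chat session in the terminal instead of a one-off exchange.</p>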
<h1 id="heading-conclusion">Conclusion</h1>
<p>Integrating long-term memory into AI systems marks a major step forward in enhancing large language model (LLM) applications. Managing long-term memory—whether in real time (hot path) or in the background—balances performance with data retention. Techniques like profile-based and document-based memory organization also improve the system's ability to store and recall relevant information.</p>
<p>By integrating nodes for memory storage, retrieval, and conversational processing, developers can build more context-aware agents capable of updating and refining user information in real-time. The use of conditional edge logic ensures that memory storage and retrieval only occur when needed, balancing performance with accuracy.</p>
<p>To build a reliable LLM agent, leveraging both short-term and long-term memory is crucial. Short-term memory helps maintain session context, while long-term memory allows the agent to recall past interactions, leading to more personalized and accurate responses over time. Combining these ensures more intelligent, responsive, and user-friendly applications.</p>
<h1 id="heading-resources-and-references"><strong>Resources and References</strong></h1>
<ol>
<li><p><a target="_blank" href="https://langchain-ai.github.io/langgraph/">Langgraph Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/langchain-ai/memory-agent">Langgraph Github</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/PradipNichite/FutureSmart-AI-Blog/blob/main/Langgraph%20Long-Term%20Memory%20Blog/Langgraph%20Long-Term%20Memory.ipynb">Code implementation</a></p>
</li>
</ol>
<p>Follow FutureSmart AI to stay up-to-date with the latest and most fascinating AI-related <a target="_blank" href="https://blog.futuresmart.ai/">blogs</a>.</p>
<p>Looking to catch up on the latest AI tools and applications? Look no further than <a target="_blank" href="https://www.aidemos.com/"><strong>AI Demos</strong></a>. This directory features a wide range of video demonstrations showcasing the latest and most innovative AI technologies.</p>
<h1 id="heading-next-steps-bringing-ai-into-your-business"><strong>Next Steps: Bringing AI into Your Business</strong></h1>
<p>Whether you're looking to integrate cutting-edge NLP models or deploy multimodal AI systems, we're here to support your journey. Reach out to us at <strong>contact@futuresmart.ai</strong> to learn more about how we can help.</p>
<p>Don't forget to check out our <a target="_blank" href="https://www.futuresmart.ai/case-studies">futuresmart.ai/case-studies</a> to see how we've successfully partnered with companies to implement transformative AI solutions.</p>
<p>Let us help you take the next step in your AI journey.</p>
]]></content:encoded></item><item><title><![CDATA[How to Fine-Tune OpenAI GPT-4o Model: A Step-by-Step Guide]]></title><description><![CDATA[Fine-tuning OpenAI models like GPT-4 has become essential for businesses looking to create specialized AI applications. In this comprehensive guide, you'll learn exactly how to fine-tune OpenAI models using Python, complete with working code examples...]]></description><link>https://blog.futuresmart.ai/how-to-fine-tune-openai-gpt-4o-model-a-step-by-step-guide</link><guid isPermaLink="true">https://blog.futuresmart.ai/how-to-fine-tune-openai-gpt-4o-model-a-step-by-step-guide</guid><category><![CDATA[GPT-4o]]></category><category><![CDATA[finetuning]]></category><category><![CDATA[openai]]></category><dc:creator><![CDATA[Pradip Nichite]]></dc:creator><pubDate>Wed, 23 Oct 2024 18:19:57 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1729706880732/c65d263c-49ea-40f3-b0d8-f381778e5739.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Fine-tuning OpenAI models like GPT-4 has become essential for businesses looking to create specialized AI applications. In this comprehensive guide, you'll learn exactly how to fine-tune OpenAI models using Python, complete with working code examples and real-world applications.</p>
<p>Fine-tuning allows you to improve model performance on specific tasks by training it with your own examples. Instead of relying on complex prompts, you can teach the model to understand your exact requirements through demonstration.</p>
<h2 id="heading-why-consider-fine-tuning">Why Consider Fine-Tuning?</h2>
<p>Three main benefits make fine-tuning particularly valuable:</p>
<ol>
<li><p><strong>Higher Quality Results</strong> When basic prompting isn't enough to get consistent results, fine-tuning helps the model adapt to your specific business use case.</p>
</li>
<li><p><strong>Handling Multiple Examples</strong> Sometimes your use case requires many examples that won't fit in a standard prompt. Fine-tuning lets you incorporate all these examples into the model itself.</p>
</li>
<li><p><strong>Simplified Prompts</strong> After fine-tuning, you can achieve better results with shorter prompts, leading to faster execution times.</p>
</li>
</ol>
<h3 id="heading-real-world-applications">Real-World Applications</h3>
<p>Fine-tuning particularly shines in two common scenarios:</p>
<ol>
<li><p><strong>Style and Format Consistency</strong></p>
<ul>
<li><p>Generating product descriptions with specific tones</p>
</li>
<li><p>Maintaining consistent writing styles</p>
</li>
<li><p>Formatting outputs in particular ways</p>
</li>
</ul>
</li>
<li><p><strong>Reliable Structured Output</strong></p>
<ul>
<li><p>Converting unstructured text into JSON</p>
</li>
<li><p>Extracting specific fields consistently</p>
</li>
<li><p>Ensuring standardized response formats</p>
</li>
</ul>
</li>
</ol>
<p>For example, in our medical data extraction implementation, we'll turn medical reports into structured data:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Example of the structured output we want to achieve</span>
{
    <span class="hljs-string">"patient name"</span>: <span class="hljs-string">"Sarah Johnson"</span>,
    <span class="hljs-string">"age"</span>: <span class="hljs-number">32</span>,
    <span class="hljs-string">"diagnosis"</span>: <span class="hljs-string">"migraine headaches"</span>,
    <span class="hljs-string">"prescribed medication"</span>: <span class="hljs-string">"Sumatriptan"</span>
}
</code></pre>
<h2 id="heading-understanding-the-fine-tuning-process">Understanding the Fine-Tuning Process</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1729703534186/550e238b-a0e1-4135-9ba1-7787c04c608e.png" alt class="image--center mx-auto" /></p>
<p>The process involves several stages:</p>
<ol>
<li><p>File validation</p>
</li>
<li><p>Job queuing</p>
</li>
<li><p>Training initialization</p>
</li>
<li><p>Training progress</p>
</li>
<li><p>Completion or failure</p>
</li>
</ol>
<h2 id="heading-preparing-data-for-fine-tuning-openai-models">Preparing Data for Fine-Tuning OpenAI Models</h2>
<h3 id="heading-understanding-the-data-format">Understanding the Data Format</h3>
<p>Before we start fine-tuning, we need to prepare our data in the correct format. For chat models like GPT-4, the training data should be in JSONL format (JSON Lines), where each line represents a conversation with messages between the system, user, and assistant.</p>
<p>Here's what a single training example looks like:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"messages"</span>: [
    {
      <span class="hljs-attr">"role"</span>: <span class="hljs-string">"system"</span>,
      <span class="hljs-attr">"content"</span>: <span class="hljs-string">"Extract Details from medical report"</span>
    },
    {
      <span class="hljs-attr">"role"</span>: <span class="hljs-string">"user"</span>, 
      <span class="hljs-attr">"content"</span>: <span class="hljs-string">"Sarah Johnson, a 32-year-old female, presented to the clinic with complaints of severe migraine headaches occurring 3-4 times per week for the past 3 months. Patient reports throbbing pain on the right side of head, accompanied by photophobia and nausea. No previous history of migraines. Family history positive for migraines (mother). Physical examination revealed normal neurological findings. Blood pressure 118/76 mmHg. Prescribed Sumatriptan 50mg for acute episodes, with instructions to take at onset of symptoms."</span>
    },
    {
      <span class="hljs-attr">"role"</span>: <span class="hljs-string">"assistant"</span>,
      <span class="hljs-attr">"content"</span>: <span class="hljs-string">"{\"patient name\": \"Sarah Johnson\", \"age\": 32, \"diagnosis\": \"migraine headaches\", \"prescribed medication\": \"Sumatriptan\"}"</span>
    }
  ]
}
</code></pre>
<h3 id="heading-converting-csv-to-jsonl-format">Converting CSV to JSONL Format</h3>
<p>Let's write a Python function to convert our medical records CSV into the required JSONL format:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> csv
<span class="hljs-keyword">import</span> json

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">convert_csv_to_training_format</span>(<span class="hljs-params">input_csv, output_file</span>):</span>
    system_message = {
        <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
        <span class="hljs-string">"content"</span>: <span class="hljs-string">"Extract Details from medical report"</span>
    }

    <span class="hljs-keyword">with</span> open(input_csv, <span class="hljs-string">'r'</span>, encoding=<span class="hljs-string">'utf-8'</span>) <span class="hljs-keyword">as</span> csvfile, \
         open(output_file, <span class="hljs-string">'w'</span>, encoding=<span class="hljs-string">'utf-8'</span>) <span class="hljs-keyword">as</span> outfile:
        reader = csv.reader(csvfile)
        next(reader)  <span class="hljs-comment"># Skip header</span>

        <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> reader:
            medical_report = row[<span class="hljs-number">0</span>]
            extracted_json = row[<span class="hljs-number">1</span>]

            training_example = {
                <span class="hljs-string">"messages"</span>: [
                    system_message,
                    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: medical_report},
                    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"content"</span>: extracted_json}
                ]
            }
            outfile.write(json.dumps(training_example) + <span class="hljs-string">'\n'</span>)
</code></pre>
<h2 id="heading-creating-training-and-validation-sets">Creating Training and Validation Sets</h2>
<p>It's good practice to split your data into training and validation sets. Here's how to prepare both:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Prepare training data</span>
convert_csv_to_training_format(<span class="hljs-string">"medical-records.csv"</span>, <span class="hljs-string">"training_data.jsonl"</span>)

<span class="hljs-comment"># Prepare validation data</span>
convert_csv_to_training_format(<span class="hljs-string">"validation-medical-records.csv"</span>, <span class="hljs-string">"validation_data.jsonl"</span>)
</code></pre>
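<p>Before uploading, it helps to sanity-check the generated files: every line must be valid JSON with the system/user/assistant message structure, or OpenAI's file validation step will reject the job. A small checker (a sketch; the function name is illustrative):</p>
<pre><code class="lang-python">import json

def validate_jsonl(path):
    """Verify each line parses as JSON and has the expected chat roles."""
    with open(path, "r", encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            example = json.loads(line)  # raises an error on a malformed line
            roles = [m["role"] for m in example["messages"]]
            assert roles == ["system", "user", "assistant"], f"line {i}: bad roles {roles}"
    return True
</code></pre>
<p>Run it on both <code>training_data.jsonl</code> and <code>validation_data.jsonl</code> before uploading.</p>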
<h3 id="heading-key-components-of-the-training-data">Key Components of the Training Data</h3>
<ol>
<li><p><strong>System Message</strong>: A consistent instruction that sets the context for all examples</p>
</li>
<li><p><strong>User Messages</strong>: The input medical reports</p>
</li>
<li><p><strong>Assistant Messages</strong>: The expected structured JSON output</p>
</li>
<li><p><strong>JSONL Format</strong>: Each line is a complete, valid JSON object</p>
</li>
</ol>
<h3 id="heading-best-practices-for-data-preparation">Best Practices for Data Preparation</h3>
<ol>
<li><p>Keep the system message consistent across all examples</p>
</li>
<li><p>Ensure your JSON is properly formatted in the assistant responses</p>
</li>
<li><p>Include diverse examples to help the model generalize</p>
</li>
<li><p>Use validation data to monitor training progress</p>
</li>
</ol>
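<p>It is also worth checking the size of your dataset before training: OpenAI requires a minimum number of training examples (currently at least 10), and more diverse examples generally help the model generalize. A quick check (a sketch; the function name and threshold are illustrative):</p>
<pre><code class="lang-python">def dataset_stats(path, min_examples=10):
    """Count non-empty JSONL lines and flag datasets below the minimum size."""
    with open(path, "r", encoding="utf-8") as f:
        n = sum(1 for line in f if line.strip())
    return {"examples": n, "large_enough": n >= min_examples}
</code></pre>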
<h2 id="heading-setting-up-and-starting-the-fine-tuning-process">Setting Up and Starting the Fine-Tuning Process</h2>
<h3 id="heading-initial-setup-with-openai">Initial Setup with OpenAI</h3>
<p>First, let's set up the OpenAI client and necessary imports:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">from</span> time <span class="hljs-keyword">import</span> sleep

<span class="hljs-comment"># Initialize OpenAI client (your_api_key is a placeholder for your API key string)</span>
client = OpenAI(api_key=your_api_key)
</code></pre>
<h3 id="heading-step-1-uploading-training-files">Step 1: Uploading Training Files</h3>
<p>Before we can start fine-tuning, we need to upload our prepared data files to OpenAI. Here's the function to handle file uploads:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">upload_training_file</span>(<span class="hljs-params">file_path</span>):</span>
    <span class="hljs-string">"""Upload training file to OpenAI"""</span>
    <span class="hljs-keyword">with</span> open(file_path, <span class="hljs-string">"rb"</span>) <span class="hljs-keyword">as</span> file:
        response = client.files.create(
            file=file,
            purpose=<span class="hljs-string">"fine-tune"</span>
        )
        <span class="hljs-keyword">return</span> response.id

<span class="hljs-comment"># Upload both training and validation files</span>
training_file_id = upload_training_file(<span class="hljs-string">"training_data.jsonl"</span>)
validation_file_id = upload_training_file(<span class="hljs-string">"validation_data.jsonl"</span>)
</code></pre>
<h3 id="heading-step-2-creating-a-fine-tuning-job">Step 2: Creating a Fine-Tuning Job</h3>
<p>Once our files are uploaded, we can create a fine-tuning job:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_fine_tuning_job</span>(<span class="hljs-params">training_file_id, validation_file_id=None, model=<span class="hljs-string">"gpt-4o-mini-2024-07-18"</span></span>):</span>
    <span class="hljs-string">"""Create a fine-tuning job"""</span>
    response = client.fine_tuning.jobs.create(
        training_file=training_file_id,
        validation_file=validation_file_id,
        model=model
    )
    <span class="hljs-keyword">return</span> response.id

<span class="hljs-comment"># Start the fine-tuning job</span>
job_id = create_fine_tuning_job(training_file_id, validation_file_id)
</code></pre>
<h3 id="heading-step-3-monitoring-training-progress">Step 3: Monitoring Training Progress</h3>
<p>Fine-tuning isn't instant - we need to monitor the job's progress. Here's a function to track the status:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">monitor_job</span>(<span class="hljs-params">job_id</span>):</span>
    <span class="hljs-string">"""Monitor fine-tuning job progress"""</span>
    <span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
        job = client.fine_tuning.jobs.retrieve(job_id)
        print(<span class="hljs-string">f"Status: <span class="hljs-subst">{job.status}</span>"</span>)

        <span class="hljs-keyword">if</span> job.status <span class="hljs-keyword">in</span> [<span class="hljs-string">"succeeded"</span>, <span class="hljs-string">"failed"</span>]:
            <span class="hljs-keyword">return</span> job

        <span class="hljs-comment"># List latest events</span>
        events = client.fine_tuning.jobs.list_events(
            fine_tuning_job_id=job_id,
            limit=<span class="hljs-number">5</span>
        )
        <span class="hljs-keyword">for</span> event <span class="hljs-keyword">in</span> events.data:
            print(<span class="hljs-string">f"Event: <span class="hljs-subst">{event.message}</span>"</span>)

        sleep(<span class="hljs-number">30</span>)  <span class="hljs-comment"># Check every 30 seconds</span>

<span class="hljs-comment"># Monitor the job until completion</span>
job = monitor_job(job_id)
<span class="hljs-keyword">if</span> job.status == <span class="hljs-string">"succeeded"</span>:
    fine_tuned_model = job.fine_tuned_model
    print(<span class="hljs-string">f"Fine-tuned model ID: <span class="hljs-subst">{fine_tuned_model}</span>"</span>)
<span class="hljs-keyword">else</span>:
    print(<span class="hljs-string">"Fine-tuning failed."</span>)
</code></pre>
<h3 id="heading-important-notes">Important Notes:</h3>
<ul>
<li><p>The process typically takes several minutes to complete</p>
</li>
<li><p>Keep track of your fine-tuned model ID for later use</p>
</li>
<li><p>The status updates help monitor training progress</p>
</li>
<li><p>The job either succeeds or fails - no partial completions</p>
</li>
</ul>
<h2 id="heading-testing-and-using-your-fine-tuned-model">Testing and Using Your Fine-Tuned Model</h2>
<h3 id="heading-making-predictions-with-your-model">Making Predictions with Your Model</h3>
<p>Let's test our fine-tuned model with a new medical report. Here's how to use it:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_model</span>(<span class="hljs-params">model_id, test_input</span>):</span>
    <span class="hljs-string">"""Test the fine-tuned model"""</span>
    completion = client.chat.completions.create(
        model=model_id,
        messages=[
            {
                <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
                <span class="hljs-string">"content"</span>: <span class="hljs-string">"Extract Details from medical report"</span>
            },
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: test_input}
        ]
    )
    <span class="hljs-keyword">return</span> completion.choices[<span class="hljs-number">0</span>].message
</code></pre>
<p>Let's try it with a new medical report:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Test input</span>
test_report = <span class="hljs-string">"""Marcus Wong, a 19-year-old male, presents with severe acne 
on face and upper back present for 1 year. Multiple inflammatory papules 
and nodules noted on examination. Previous trials of over-the-counter 
treatments ineffective. Started on Isotretinoin 40mg daily with monthly 
liver function monitoring."""</span>

<span class="hljs-comment"># Get prediction</span>
result = test_model(fine_tuned_model, test_report)

<span class="hljs-comment"># Parse the JSON response</span>
<span class="hljs-keyword">import</span> json
extracted_data = json.loads(result.content)
print(json.dumps(extracted_data, indent=<span class="hljs-number">2</span>))
</code></pre>
<p>Output:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"patient name"</span>: <span class="hljs-string">"Marcus Wong"</span>,
  <span class="hljs-attr">"age"</span>: <span class="hljs-number">19</span>,
  <span class="hljs-attr">"diagnosis"</span>: <span class="hljs-string">"severe acne"</span>,
  <span class="hljs-attr">"prescribed medication"</span>: <span class="hljs-string">"Isotretinoin"</span>
}
</code></pre>
<h3 id="heading-improved-performance-benefits">Improved Performance Benefits</h3>
<p>The fine-tuned model shows several improvements:</p>
<ol>
<li><p><strong>Simpler Prompts</strong>: Notice we don't need complex instructions anymore</p>
</li>
<li><p><strong>Consistent Output</strong>: The model maintains the exact JSON structure we trained it for</p>
</li>
<li><p><strong>Faster Processing</strong>: Shorter prompts mean quicker responses</p>
</li>
<li><p><strong>Reliable Extraction</strong>: Accurately pulls out key information even from complex medical text</p>
</li>
</ol>
<h3 id="heading-best-practices-for-using-the-model">Best Practices for Using the Model</h3>
<ol>
<li><strong>Input Consistency</strong>:</li>
</ol>
<pre><code class="lang-python"><span class="hljs-comment"># Keep system message consistent with training</span>
messages=[
    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"Extract Details from medical report"</span>},
    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: medical_report}
]
</code></pre>
<ol start="2">
<li><strong>Error Handling</strong>:</li>
</ol>
<pre><code class="lang-python"><span class="hljs-keyword">try</span>:
    extracted_data = json.loads(result.content)
<span class="hljs-keyword">except</span> json.JSONDecodeError:
    print(<span class="hljs-string">"Error: Invalid JSON response"</span>)
</code></pre>
<ol start="3">
<li><strong>Validation</strong>:</li>
</ol>
<pre><code class="lang-python">required_fields = [<span class="hljs-string">"patient name"</span>, <span class="hljs-string">"age"</span>, <span class="hljs-string">"diagnosis"</span>, <span class="hljs-string">"prescribed medication"</span>]
missing_fields = [field <span class="hljs-keyword">for</span> field <span class="hljs-keyword">in</span> required_fields <span class="hljs-keyword">if</span> field <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> extracted_data]
<span class="hljs-keyword">if</span> missing_fields:
    print(<span class="hljs-string">f"Warning: Missing fields: <span class="hljs-subst">{missing_fields}</span>"</span>)
</code></pre>
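<p>The error handling and field validation above can be combined into one helper that either returns the parsed record or reports what went wrong (a sketch; the function name is illustrative and the field list matches the training format used earlier):</p>
<pre><code class="lang-python">import json

REQUIRED_FIELDS = ["patient name", "age", "diagnosis", "prescribed medication"]

def parse_extraction(raw_content):
    """Parse the model's reply; return (data, missing_fields)."""
    try:
        data = json.loads(raw_content)
    except json.JSONDecodeError:
        return None, list(REQUIRED_FIELDS)  # reply was not valid JSON
    if not isinstance(data, dict):
        return None, list(REQUIRED_FIELDS)  # e.g. the model returned a bare string
    missing = [f for f in REQUIRED_FIELDS if f not in data]
    return data, missing
</code></pre>
<p>For example, <code>parse_extraction(result.content)</code> returns the extracted dictionary together with an empty list when every required field is present.</p>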
<h2 id="heading-conclusion-building-custom-ai-solutions-with-fine-tuning">Conclusion: Building Custom AI Solutions with Fine-Tuning</h2>
<h3 id="heading-real-success-stories-from-futuresmart-ai">Real Success Stories from FutureSmart AI</h3>
<p>At FutureSmart AI, we've successfully implemented fine-tuned models for various client requirements. Here are some of our proven use cases:</p>
<h3 id="heading-1-product-description-generation">1. Product Description Generation</h3>
<p>We fine-tuned a model for precision product description generation, achieving significant improvements in:</p>
<ul>
<li><p>Maintaining consistent brand tone</p>
</li>
<li><p>Following specific style guidelines</p>
</li>
<li><p>Generating market-ready content</p>
</li>
</ul>
<h3 id="heading-2-structured-data-extraction">2. Structured Data Extraction</h3>
<p>Our team developed a reliable JSON output model that:</p>
<ul>
<li><p>Consistently extracts structured data</p>
</li>
<li><p>Reduces the need for fallback mechanisms</p>
</li>
<li><p>Maintains high accuracy across diverse inputs</p>
</li>
</ul>
<h2 id="heading-interested-in-learning-more">Interested in Learning More?</h2>
<p><strong>Explore How We Can Help Your Business:</strong></p>
<ul>
<li><p><strong>Case Studies</strong>: Visit <a target="_blank" href="https://www.futuresmart.ai/case-studies">FutureSmart AI Case Studies</a> to see real-world applications of our NLP solutions</p>
</li>
<li><p><strong>Contact Us</strong>: Have a project in mind? Reach out at <a target="_blank" href="mailto:contact@futuresmart.ai">contact@futuresmart.ai</a></p>
</li>
</ul>
<p>Let us help you transform your business with custom AI solutions tailored to your specific needs.</p>
<p><a target="_blank" href="https://github.com/PradipNichite/Youtube-Tutorials/tree/main/Finetune%20Open%20AI%20GPT-4o">Code and Data</a></p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/jiYqbEDPw7A">https://youtu.be/jiYqbEDPw7A</a></div>
]]></content:encoded></item><item><title><![CDATA[Building a User-Friendly Interface with Streamlit for Our RAG Chatbot]]></title><description><![CDATA[In this section, we'll explore how we've used Streamlit to create an intuitive front-end for our RAG chatbot. This interface allows users to interact with the chatbot, upload documents, and manage their document collection, all while communicating se...]]></description><link>https://blog.futuresmart.ai/building-a-user-friendly-interface-with-streamlit-for-our-rag-chatbot</link><guid isPermaLink="true">https://blog.futuresmart.ai/building-a-user-friendly-interface-with-streamlit-for-our-rag-chatbot</guid><category><![CDATA[streamlit chatbot]]></category><category><![CDATA[streamlit]]></category><category><![CDATA[rag chatbot]]></category><dc:creator><![CDATA[Pradip Nichite]]></dc:creator><pubDate>Mon, 21 Oct 2024 10:36:32 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1729507030210/cdbfd8c8-6162-479c-8f97-8ffd81e753d1.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this section, we'll explore how we've used Streamlit to create an intuitive front-end for our RAG chatbot. This interface allows users to interact with the chatbot, upload documents, and manage their document collection, all while communicating seamlessly with our FastAPI backend.</p>
<h3 id="heading-1-main-application-structure-streamlitapppyhttpapppy">1. Main Application Structure (streamlit_app.py)</h3>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> streamlit <span class="hljs-keyword">as</span> st
<span class="hljs-keyword">from</span> sidebar <span class="hljs-keyword">import</span> display_sidebar
<span class="hljs-keyword">from</span> chat_interface <span class="hljs-keyword">import</span> display_chat_interface

st.title(<span class="hljs-string">"Langchain RAG Chatbot"</span>)

<span class="hljs-comment"># Initialize session state variables</span>
<span class="hljs-keyword">if</span> <span class="hljs-string">"messages"</span> <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> st.session_state:
    st.session_state.messages = []

<span class="hljs-keyword">if</span> <span class="hljs-string">"session_id"</span> <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> st.session_state:
    st.session_state.session_id = <span class="hljs-literal">None</span>

<span class="hljs-comment"># Display the sidebar</span>
display_sidebar()

<span class="hljs-comment"># Display the chat interface</span>
display_chat_interface()
</code></pre>
<p>This file serves as the entry point for our Streamlit application. Here's what it does:</p>
<ul>
<li><p>Sets up the main title of the application.</p>
</li>
<li><p>Initializes session state variables:</p>
<ul>
<li><p><code>messages</code>: Stores the chat history.</p>
</li>
<li><p><code>session_id</code>: Keeps track of the current chat session.</p>
</li>
</ul>
</li>
<li><p>Calls functions to display the sidebar and chat interface.</p>
</li>
</ul>
<p>The use of <code>st.session_state</code> is crucial here. It allows us to persist data across reruns of the Streamlit app, ensuring that chat history and session information are maintained.</p>
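<p>To see why this matters, here is a plain-Python sketch of Streamlit's execution model (no Streamlit APIs involved; <code>run_script</code> is a hypothetical stand-in for one rerun of the app):</p>

```python
def run_script(session_state):
    """Stand-in for one Streamlit rerun: the whole script executes from
    scratch, so ordinary variables are recreated on every interaction."""
    messages = []  # plain variable: empty again on every rerun
    if "messages" not in session_state:
        session_state["messages"] = []  # initialized only on the first run
    session_state["messages"].append("hello")
    messages.append("hello")
    return len(session_state["messages"]), len(messages)

state = {}  # Streamlit keeps this dict alive between reruns
for _ in range(3):
    persisted, local = run_script(state)

# persisted == 3 (history accumulated), local == 1 (reset every rerun)
```

Without <code>st.session_state</code>, the chat history would behave like <code>messages</code> above and vanish on every button click or message send.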
<h3 id="heading-2-sidebar-functionality-sidebarpyhttpsidebarpy">2. Sidebar Functionality (sidebar.py)</h3>
<p>The sidebar handles document management and model selection:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> streamlit <span class="hljs-keyword">as</span> st
<span class="hljs-keyword">from</span> api_utils <span class="hljs-keyword">import</span> upload_document, list_documents, delete_document

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_sidebar</span>():</span>
    <span class="hljs-comment"># Model selection</span>
    model_options = [<span class="hljs-string">"gpt-4o"</span>, <span class="hljs-string">"gpt-4o-mini"</span>]
    st.sidebar.selectbox(<span class="hljs-string">"Select Model"</span>, options=model_options, key=<span class="hljs-string">"model"</span>)

    <span class="hljs-comment"># Document upload</span>
    uploaded_file = st.sidebar.file_uploader(<span class="hljs-string">"Choose a file"</span>, type=[<span class="hljs-string">"pdf"</span>, <span class="hljs-string">"docx"</span>, <span class="hljs-string">"html"</span>])
    <span class="hljs-keyword">if</span> uploaded_file <span class="hljs-keyword">and</span> st.sidebar.button(<span class="hljs-string">"Upload"</span>):
        <span class="hljs-keyword">with</span> st.spinner(<span class="hljs-string">"Uploading..."</span>):
            upload_response = upload_document(uploaded_file)
            <span class="hljs-keyword">if</span> upload_response:
                st.sidebar.success(<span class="hljs-string">f"File uploaded successfully with ID <span class="hljs-subst">{upload_response[<span class="hljs-string">'file_id'</span>]}</span>."</span>)
                st.session_state.documents = list_documents()

    <span class="hljs-comment"># List and delete documents</span>
    st.sidebar.header(<span class="hljs-string">"Uploaded Documents"</span>)
    <span class="hljs-keyword">if</span> st.sidebar.button(<span class="hljs-string">"Refresh Document List"</span>):
        st.session_state.documents = list_documents()

    <span class="hljs-comment"># Display document list and delete functionality</span>
    <span class="hljs-keyword">if</span> <span class="hljs-string">"documents"</span> <span class="hljs-keyword">in</span> st.session_state <span class="hljs-keyword">and</span> st.session_state.documents:
        <span class="hljs-keyword">for</span> doc <span class="hljs-keyword">in</span> st.session_state.documents:
            st.sidebar.text(<span class="hljs-string">f"<span class="hljs-subst">{doc[<span class="hljs-string">'filename'</span>]}</span> (ID: <span class="hljs-subst">{doc[<span class="hljs-string">'id'</span>]}</span>)"</span>)

        selected_file_id = st.sidebar.selectbox(<span class="hljs-string">"Select a document to delete"</span>, 
                                                options=[doc[<span class="hljs-string">'id'</span>] <span class="hljs-keyword">for</span> doc <span class="hljs-keyword">in</span> st.session_state.documents])
        <span class="hljs-keyword">if</span> st.sidebar.button(<span class="hljs-string">"Delete Selected Document"</span>):
            delete_response = delete_document(selected_file_id)
            <span class="hljs-keyword">if</span> delete_response:
                st.sidebar.success(<span class="hljs-string">"Document deleted successfully."</span>)
                st.session_state.documents = list_documents()
</code></pre>
<p>Key features:</p>
<ul>
<li><p><strong>Model Selection</strong>: Users can choose between different language models.</p>
</li>
<li><p><strong>Document Upload</strong>: Allows users to upload PDF, DOCX, or HTML files.</p>
</li>
<li><p><strong>Document Listing</strong>: Displays all uploaded documents with their IDs.</p>
</li>
<li><p><strong>Document Deletion</strong>: Provides an option to delete selected documents.</p>
</li>
</ul>
<p>Each action (upload, list, delete) interacts with our FastAPI backend through functions in <code>api_utils.py</code>.</p>
<h3 id="heading-3-chat-interface-chatinterfacepyhttpinterfacepy">3. Chat Interface (chat_interface.py)</h3>
<p>This component handles the main chat interaction:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> streamlit <span class="hljs-keyword">as</span> st
<span class="hljs-keyword">from</span> api_utils <span class="hljs-keyword">import</span> get_api_response

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_chat_interface</span>():</span>
    <span class="hljs-comment"># Display chat history</span>
    <span class="hljs-keyword">for</span> message <span class="hljs-keyword">in</span> st.session_state.messages:
        <span class="hljs-keyword">with</span> st.chat_message(message[<span class="hljs-string">"role"</span>]):
            st.markdown(message[<span class="hljs-string">"content"</span>])

    <span class="hljs-comment"># Handle new user input</span>
    <span class="hljs-keyword">if</span> prompt := st.chat_input(<span class="hljs-string">"Query:"</span>):
        st.session_state.messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: prompt})
        <span class="hljs-keyword">with</span> st.chat_message(<span class="hljs-string">"user"</span>):
            st.markdown(prompt)

        <span class="hljs-comment"># Get API response</span>
        <span class="hljs-keyword">with</span> st.spinner(<span class="hljs-string">"Generating response..."</span>):
            response = get_api_response(prompt, st.session_state.session_id, st.session_state.model)

            <span class="hljs-keyword">if</span> response:
                st.session_state.session_id = response.get(<span class="hljs-string">'session_id'</span>)
                st.session_state.messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"content"</span>: response[<span class="hljs-string">'answer'</span>]})

                <span class="hljs-keyword">with</span> st.chat_message(<span class="hljs-string">"assistant"</span>):
                    st.markdown(response[<span class="hljs-string">'answer'</span>])

                <span class="hljs-keyword">with</span> st.expander(<span class="hljs-string">"Details"</span>):
                    st.subheader(<span class="hljs-string">"Generated Answer"</span>)
                    st.code(response[<span class="hljs-string">'answer'</span>])
                    st.subheader(<span class="hljs-string">"Model Used"</span>)
                    st.code(response[<span class="hljs-string">'model'</span>])
                    st.subheader(<span class="hljs-string">"Session ID"</span>)
                    st.code(response[<span class="hljs-string">'session_id'</span>])
            <span class="hljs-keyword">else</span>:
                st.error(<span class="hljs-string">"Failed to get a response from the API. Please try again."</span>)
</code></pre>
<p>Key features:</p>
<ul>
<li><p>Displays the entire chat history.</p>
</li>
<li><p>Handles new user inputs and sends them to the API.</p>
</li>
<li><p>Shows the chatbot's responses in a chat-like interface.</p>
</li>
<li><p>Provides an expandable section with additional details about each response.</p>
</li>
</ul>
<p>The chat interface maintains continuity by using the <code>session_id</code> returned from the API, allowing for context-aware conversations.</p>
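<p>The mechanics of that continuity can be sketched without running a server. Here <code>fake_chat_endpoint</code> is a hypothetical stand-in for the <code>/chat</code> endpoint: it mints a session ID on the first call and echoes it back on later ones:</p>

```python
import uuid
from typing import Optional

def fake_chat_endpoint(question, session_id):
    # Stand-in for POST /chat: mint an id on the first request of a session.
    if session_id is None:
        session_id = str(uuid.uuid4())
    return {"answer": f"echo: {question}", "session_id": session_id}

# Client side, mirroring display_chat_interface():
session_id = None                         # st.session_state.session_id starts as None
first = fake_chat_endpoint("hello", session_id)
session_id = first["session_id"]          # store the id the API returns
second = fake_chat_endpoint("and then?", session_id)

# Both turns now share one session id, so the backend can load the same history.
```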
<h3 id="heading-4-api-utilities-apiutilspyhttputilspy">4. API Utilities (api_utils.py)</h3>
<p>This file contains functions for interacting with our FastAPI backend:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">import</span> streamlit <span class="hljs-keyword">as</span> st

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_api_response</span>(<span class="hljs-params">question, session_id, model</span>):</span>
    headers = {<span class="hljs-string">'accept'</span>: <span class="hljs-string">'application/json'</span>, <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/json'</span>}
    data = {<span class="hljs-string">"question"</span>: question, <span class="hljs-string">"model"</span>: model}
    <span class="hljs-keyword">if</span> session_id:
        data[<span class="hljs-string">"session_id"</span>] = session_id

    <span class="hljs-keyword">try</span>:
        response = requests.post(<span class="hljs-string">"http://localhost:8000/chat"</span>, headers=headers, json=data)
        <span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
            <span class="hljs-keyword">return</span> response.json()
        <span class="hljs-keyword">else</span>:
            st.error(<span class="hljs-string">f"API request failed with status code <span class="hljs-subst">{response.status_code}</span>: <span class="hljs-subst">{response.text}</span>"</span>)
            <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        st.error(<span class="hljs-string">f"An error occurred: <span class="hljs-subst">{str(e)}</span>"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">upload_document</span>(<span class="hljs-params">file</span>):</span>
    <span class="hljs-keyword">try</span>:
        files = {<span class="hljs-string">"file"</span>: (file.name, file, file.type)}
        response = requests.post(<span class="hljs-string">"http://localhost:8000/upload-doc"</span>, files=files)
        <span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
            <span class="hljs-keyword">return</span> response.json()
        <span class="hljs-keyword">else</span>:
            st.error(<span class="hljs-string">f"Failed to upload file. Error: <span class="hljs-subst">{response.status_code}</span> - <span class="hljs-subst">{response.text}</span>"</span>)
            <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        st.error(<span class="hljs-string">f"An error occurred while uploading the file: <span class="hljs-subst">{str(e)}</span>"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">list_documents</span>():</span>
    <span class="hljs-keyword">try</span>:
        response = requests.get(<span class="hljs-string">"http://localhost:8000/list-docs"</span>)
        <span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
            <span class="hljs-keyword">return</span> response.json()
        <span class="hljs-keyword">else</span>:
            st.error(<span class="hljs-string">f"Failed to fetch document list. Error: <span class="hljs-subst">{response.status_code}</span> - <span class="hljs-subst">{response.text}</span>"</span>)
            <span class="hljs-keyword">return</span> []
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        st.error(<span class="hljs-string">f"An error occurred while fetching the document list: <span class="hljs-subst">{str(e)}</span>"</span>)
        <span class="hljs-keyword">return</span> []

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">delete_document</span>(<span class="hljs-params">file_id</span>):</span>
    headers = {<span class="hljs-string">'accept'</span>: <span class="hljs-string">'application/json'</span>, <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/json'</span>}
    data = {<span class="hljs-string">"file_id"</span>: file_id}

    <span class="hljs-keyword">try</span>:
        response = requests.post(<span class="hljs-string">"http://localhost:8000/delete-doc"</span>, headers=headers, json=data)
        <span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>:
            <span class="hljs-keyword">return</span> response.json()
        <span class="hljs-keyword">else</span>:
            st.error(<span class="hljs-string">f"Failed to delete document. Error: <span class="hljs-subst">{response.status_code}</span> - <span class="hljs-subst">{response.text}</span>"</span>)
            <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        st.error(<span class="hljs-string">f"An error occurred while deleting the document: <span class="hljs-subst">{str(e)}</span>"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
</code></pre>
<p>These functions handle all communication with our FastAPI backend:</p>
<ul>
<li><p><code>get_api_response</code>: Sends chat queries and receives responses.</p>
</li>
<li><p><code>upload_document</code>: Handles file uploads to the backend.</p>
</li>
<li><p><code>list_documents</code>: Retrieves the list of uploaded documents.</p>
</li>
<li><p><code>delete_document</code>: Sends requests to delete specific documents.</p>
</li>
</ul>
<p>Each function includes error handling and user feedback through Streamlit's <code>st.error()</code> function.</p>
<h3 id="heading-integration-and-data-flow">Integration and Data Flow</h3>
<ol>
<li><p><strong>User Interaction</strong>:</p>
<ul>
<li>Users interact with the Streamlit interface, entering queries or managing documents.</li>
</ul>
</li>
<li><p><strong>Streamlit to FastAPI</strong>:</p>
<ul>
<li><p>User actions trigger API calls to our FastAPI backend.</p>
</li>
<li><p>For example, when a user sends a chat message, <code>get_api_response()</code> is called, which sends a POST request to the <code>/chat</code> endpoint.</p>
</li>
</ul>
</li>
<li><p><strong>FastAPI Processing</strong>:</p>
<ul>
<li>The backend processes these requests, interacting with the RAG system, database, and vector store as needed.</li>
</ul>
</li>
<li><p><strong>Response Handling</strong>:</p>
<ul>
<li><p>Responses from the API are processed and displayed in the Streamlit interface.</p>
</li>
<li><p>For chat, responses are added to the chat history and displayed.</p>
</li>
<li><p>For document operations, success messages or error notifications are shown.</p>
</li>
</ul>
</li>
<li><p><strong>State Management</strong>:</p>
<ul>
<li>Streamlit's session state (<code>st.session_state</code>) is used to maintain chat history and current session information across interactions.</li>
</ul>
</li>
</ol>
<h3 id="heading-key-benefits-of-this-implementation">Key Benefits of This Implementation</h3>
<ol>
<li><p><strong>User-Friendly Interface</strong>: Streamlit provides an intuitive, interactive interface for our RAG chatbot.</p>
</li>
<li><p><strong>Real-Time Interaction</strong>: Users can chat, upload documents, and manage their document collection in real-time.</p>
</li>
<li><p><strong>Seamless Integration</strong>: The Streamlit frontend integrates smoothly with our FastAPI backend.</p>
</li>
<li><p><strong>Stateful Conversations</strong>: The use of session IDs allows for context-aware, stateful conversations.</p>
</li>
<li><p><strong>Flexible Document Management</strong>: Users can easily upload, view, and delete documents, enhancing the RAG system's utility.</p>
</li>
</ol>
<h3 id="heading-conclusion">Conclusion</h3>
<p>This Streamlit implementation creates a user-friendly front-end for our RAG chatbot, effectively bridging the gap between the sophisticated backend we built and the end-user. It demonstrates how Streamlit can be used to rapidly develop interactive web applications that interface with complex AI systems.</p>
<p>The modular structure and clear separation of concerns between the frontend and backend allow for easy maintenance and future enhancements. As the application grows, you can easily add new features, improve the user interface, and scale the backend to handle more users and larger document collections.</p>
<p>This implementation completes our RAG chatbot project, providing an accessible interface for users to leverage the power of retrieval-augmented generation in their interactions with the AI model.</p>
<hr />
<p>In this series, we've walked through the process of building a production-ready Retrieval-Augmented Generation (RAG) chatbot using FastAPI, LangChain, and Streamlit. We've covered everything from the core concepts of RAG systems to implementing a robust backend API and creating an intuitive user interface.</p>
<p>Here's a quick recap of what we've accomplished:</p>
<ol>
<li><p>Built a RAG system using LangChain, integrating language models with document retrieval.</p>
</li>
<li><p>Developed a FastAPI backend to handle chat interactions and document management.</p>
</li>
<li><p>Created a user-friendly Streamlit frontend for seamless interaction with our RAG chatbot.</p>
</li>
</ol>
<p>This project demonstrates the power of combining modern AI technologies with web development frameworks to create sophisticated, yet accessible applications.</p>
<h3 id="heading-additional-resources">Additional Resources</h3>
<p>To help you further understand and implement this RAG chatbot system, I've prepared some additional resources:</p>
<ol>
<li><p><strong>Video Tutorial</strong>: For a comprehensive walkthrough of this entire project, including live coding and explanations, check out my YouTube video:</p>
<p> <a target="_blank" href="https://youtu.be/38aMTXY2usU?si=9XsjXHvtAHbSeP32">Watch the Full RAG Chatbot Tutorial</a></p>
<p> In this video, I cover all three parts of our blog series, demonstrating the implementation details and providing additional insights.</p>
</li>
<li><p><strong>GitHub Repository</strong>: The complete source code for this project is available on GitHub. You can clone, fork, or download the repository to explore the code in detail or use it as a starting point for your own projects:</p>
<p> <a target="_blank" href="https://github.com/PradipNichite/Youtube-Tutorials/tree/main/Langchain%20RAG%20Course%202024">RAG Chatbot GitHub Repository</a></p>
<p> The repository includes all the components we've discussed: the FastAPI backend, Streamlit frontend, and associated utilities.</p>
</li>
</ol>
<p>Thank you for following along with this series. I hope you've found it informative and inspiring for your own AI development journey. Happy coding!</p>
<h3 id="heading-futuresmart-ai-your-partner-in-custom-nlp-solutions">FutureSmart AI: Your Partner in Custom NLP Solutions</h3>
<p>At FutureSmart AI, we specialize in building custom Natural Language Processing (NLP) solutions tailored to your specific needs. Our expertise extends beyond RAG systems to include:</p>
<ul>
<li><p>Natural Language to SQL (NL2SQL) interfaces</p>
</li>
<li><p>Advanced document parsing and analysis</p>
</li>
<li><p>Custom chatbots and conversational AI</p>
</li>
<li><p>And much more in the realm of NLP and AI</p>
</li>
</ul>
<p>We've successfully implemented these technologies for various industries, helping businesses leverage the power of AI to enhance their operations and user experiences.</p>
<h4 id="heading-interested-in-learning-more">Interested in Learning More?</h4>
<ul>
<li><p><strong>Check out our case studies</strong>: <a target="_blank" href="https://www.futuresmart.ai/case-studies">FutureSmart AI Case Studies</a> Explore real-world applications of our NLP solutions and see how they've transformed businesses across different sectors.</p>
</li>
<li><p><strong>Get in touch</strong>: Have a project in mind or want to discuss how NLP can benefit your business? We'd love to hear from you! Contact us at <a target="_blank" href="mailto:contact@futuresmart.ai">contact@futuresmart.ai</a></p>
</li>
</ul>
<p>Whether you're looking to implement a RAG system like the one we've built in this tutorial, or you have more specific NLP needs, our team at FutureSmart AI is here to help turn your AI aspirations into reality.</p>
]]></content:encoded></item><item><title><![CDATA[Building a Production-Ready RAG Chatbot with FastAPI and LangChain]]></title><description><![CDATA[Introduction
In our previous article, we delved into the fundamentals of Retrieval-Augmented Generation (RAG) using LangChain. We explored the core concepts, built a basic RAG system, and demonstrated its capabilities in a Jupyter notebook environmen...]]></description><link>https://blog.futuresmart.ai/building-a-production-ready-rag-chatbot-with-fastapi-and-langchain</link><guid isPermaLink="true">https://blog.futuresmart.ai/building-a-production-ready-rag-chatbot-with-fastapi-and-langchain</guid><category><![CDATA[langchain fastapi]]></category><category><![CDATA[rag chatbot]]></category><category><![CDATA[langchain rag]]></category><category><![CDATA[langchain]]></category><dc:creator><![CDATA[Pradip Nichite]]></dc:creator><pubDate>Mon, 21 Oct 2024 10:34:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1729506787165/fd17ead0-cc1d-42fb-ba0e-67274929b421.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In our previous article, we delved into the fundamentals of Retrieval-Augmented Generation (RAG) using LangChain. We explored the core concepts, built a basic RAG system, and demonstrated its capabilities in a Jupyter notebook environment. While this approach is excellent for prototyping and understanding the underlying mechanics, it's not quite ready for real-world applications.</p>
<p>Today, we're taking the next crucial step: transforming our RAG prototype into a production-ready API. We'll be using FastAPI, a modern, fast (high-performance) web framework for building APIs with Python. FastAPI is particularly well-suited for our needs due to its speed, ease of use, and built-in support for asynchronous programming.</p>
<h3 id="heading-what-well-build">What We'll Build</h3>
<p>In this tutorial, we'll create a robust API that offers the following functionalities:</p>
<ol>
<li><p>A chat endpoint that processes queries using our RAG system</p>
</li>
<li><p>Document upload and indexing capabilities</p>
</li>
<li><p>Ability to list and delete indexed documents</p>
</li>
<li><p>Proper error handling and logging</p>
</li>
</ol>
<p>We'll structure our application in a modular, maintainable way, making it easy to extend and deploy in a production environment.</p>
<h3 id="heading-prerequisites">Prerequisites</h3>
<p>Before we dive in, make sure you have the following:</p>
<ul>
<li><p>Basic understanding of Python and asynchronous programming</p>
</li>
<li><p>Familiarity with RESTful APIs</p>
</li>
<li><p>Knowledge of RAG systems and LangChain (covered in Part 1 of this series)</p>
</li>
<li><p>Python 3.8+ installed on your system</p>
</li>
<li><p>pip for installing required packages</p>
</li>
</ul>
<h2 id="heading-project-setup">Project Setup</h2>
<p>First, let's set up our project environment. Create a new directory for your project and navigate into it:</p>
<pre><code class="lang-bash">mkdir rag-fastapi-project
<span class="hljs-built_in">cd</span> rag-fastapi-project
</code></pre>
<p>Now, let's install the necessary packages. Create a <code>requirements.txt</code> file with the following content:</p>
<pre><code class="lang-plaintext">langchain
langchain-openai
langchain-core
langchain_community
docx2txt
pypdf
langchain_chroma
python-multipart
fastapi
uvicorn
</code></pre>
<p>Install these packages using pip:</p>
<pre><code class="lang-bash">pip install -r requirements.txt
</code></pre>
<p>With our environment set up, we're ready to start building our production-ready RAG chatbot API. In the next section, we'll dive into the project structure and begin implementing our FastAPI application.</p>
<hr />
<h2 id="heading-project-structure-overview">Project Structure Overview</h2>
<p>When transitioning from a prototype to a production-ready application, proper code organization becomes crucial. A well-structured project is easier to maintain, test, and extend. For our RAG chatbot API, we'll use a modular structure that separates concerns and promotes code reusability.</p>
<p>Here's an overview of our project structure:</p>
<pre><code class="lang-plaintext">rag-fastapi-project/
│
├── main.py
├── chroma_utils.py
├── db_utils.py
├── langchain_utils.py
├── pydantic_models.py
├── requirements.txt
└── chroma_db/  (directory for Chroma persistence)
</code></pre>
<p>Let's break down the purpose of each file:</p>
<ol>
<li><p><code>main.py</code>: This is the entry point of our FastAPI application. It defines the API routes and orchestrates the different components of our system.</p>
</li>
<li><p><code>chroma_utils.py</code>: Contains utilities for interacting with the Chroma vector store, including functions for indexing documents and performing similarity searches.</p>
</li>
<li><p><code>db_utils.py</code>: Handles database operations, including storing and retrieving chat history and document metadata.</p>
</li>
<li><p><code>langchain_utils.py</code>: Encapsulates the LangChain-specific logic, such as creating the RAG chain and configuring the language model.</p>
</li>
<li><p><code>pydantic_models.py</code>: Defines Pydantic models for request and response validation, ensuring type safety and clear API contracts.</p>
</li>
<li><p><code>requirements.txt</code>: Lists all the Python packages required for the project.</p>
</li>
</ol>
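<p>To make the role of <code>pydantic_models.py</code> concrete, here is one plausible shape for the models that <code>main.py</code> imports. The field names are inferred from the request payloads the Streamlit frontend sends; the actual definitions in the repository may differ:</p>

```python
from typing import Optional
from pydantic import BaseModel

class QueryInput(BaseModel):
    question: str
    session_id: Optional[str] = None  # omitted on the first turn of a chat
    model: str = "gpt-4o-mini"

class QueryResponse(BaseModel):
    answer: str
    session_id: str
    model: str

# Validation happens at construction time: a missing required field raises
# a ValidationError instead of failing deep inside the RAG chain.
query = QueryInput(question="What does FutureSmart AI build?")
```

FastAPI uses these models to validate incoming JSON automatically and to document the API contract in the generated OpenAPI schema.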
<h3 id="heading-benefits-of-this-structure">Benefits of This Structure</h3>
<ol>
<li><p><strong>Separation of Concerns</strong>: Each file has a specific responsibility, making the code easier to understand and maintain.</p>
</li>
<li><p><strong>Modularity</strong>: Components can be developed and tested independently, facilitating collaboration and reducing the risk of conflicts.</p>
</li>
<li><p><strong>Scalability</strong>: As the project grows, new functionalities can be added by introducing new modules without significantly altering existing code.</p>
</li>
<li><p><strong>Reusability</strong>: Utility functions and models can be easily reused across different parts of the application.</p>
</li>
<li><p><strong>Readability</strong>: With clear file names and separated concerns, new developers can quickly understand the project structure and locate specific functionalities.</p>
</li>
</ol>
<p>This structure follows best practices for FastAPI applications and provides a solid foundation for building our RAG chatbot API. As we progress through the tutorial, we'll dive into each of these files, explaining their contents and how they work together to create our production-ready system.</p>
<h2 id="heading-setting-up-the-fastapi-application">Setting Up the FastAPI Application</h2>
<p>The <code>main.py</code> file is the core of our FastAPI application. It defines our API endpoints and orchestrates the interaction between different components of our system. Let's break down the key elements of this file:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI, File, UploadFile, HTTPException
<span class="hljs-keyword">from</span> pydantic_models <span class="hljs-keyword">import</span> QueryInput, QueryResponse, DocumentInfo, DeleteFileRequest
<span class="hljs-keyword">from</span> langchain_utils <span class="hljs-keyword">import</span> get_rag_chain
<span class="hljs-keyword">from</span> db_utils <span class="hljs-keyword">import</span> insert_application_logs, get_chat_history, get_all_documents, insert_document_record, delete_document_record
<span class="hljs-keyword">from</span> chroma_utils <span class="hljs-keyword">import</span> index_document_to_chroma, delete_doc_from_chroma
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> uuid
<span class="hljs-keyword">import</span> logging
<span class="hljs-keyword">import</span> shutil

<span class="hljs-comment"># Set up logging</span>
logging.basicConfig(filename=<span class="hljs-string">'app.log'</span>, level=logging.INFO)

<span class="hljs-comment"># Initialize FastAPI app</span>
app = FastAPI()
</code></pre>
<p>Here, we import necessary modules and initialize our FastAPI application. We've also set up basic logging to keep track of important events in our application.</p>
<h3 id="heading-defining-api-endpoints">Defining API Endpoints</h3>
<p>Now, let's look at our main API endpoints:</p>
<ol>
<li><strong>Chat Endpoint</strong>:</li>
</ol>
<pre><code class="lang-python"><span class="hljs-meta">@app.post("/chat", response_model=QueryResponse)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chat</span>(<span class="hljs-params">query_input: QueryInput</span>):</span>
    session_id = query_input.session_id <span class="hljs-keyword">or</span> str(uuid.uuid4())
    logging.info(<span class="hljs-string">f"Session ID: <span class="hljs-subst">{session_id}</span>, User Query: <span class="hljs-subst">{query_input.question}</span>, Model: <span class="hljs-subst">{query_input.model.value}</span>"</span>)

    chat_history = get_chat_history(session_id)
    rag_chain = get_rag_chain(query_input.model.value)
    answer = rag_chain.invoke({
        <span class="hljs-string">"input"</span>: query_input.question,
        <span class="hljs-string">"chat_history"</span>: chat_history
    })[<span class="hljs-string">'answer'</span>]

    insert_application_logs(session_id, query_input.question, answer, query_input.model.value)
    logging.info(<span class="hljs-string">f"Session ID: <span class="hljs-subst">{session_id}</span>, AI Response: <span class="hljs-subst">{answer}</span>"</span>)
    <span class="hljs-keyword">return</span> QueryResponse(answer=answer, session_id=session_id, model=query_input.model)
</code></pre>
<p>This endpoint handles chat interactions. It generates a session ID if not provided, retrieves chat history, invokes the RAG chain to generate a response, logs the interaction, and returns the response.</p>
<ol start="2">
<li><strong>Document Upload Endpoint:</strong></li>
</ol>
<pre><code class="lang-python"><span class="hljs-meta">@app.post("/upload-doc")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">upload_and_index_document</span>(<span class="hljs-params">file: UploadFile = File(<span class="hljs-params">...</span>)</span>):</span>
    allowed_extensions = [<span class="hljs-string">'.pdf'</span>, <span class="hljs-string">'.docx'</span>, <span class="hljs-string">'.html'</span>]
    file_extension = os.path.splitext(file.filename)[<span class="hljs-number">1</span>].lower()

    <span class="hljs-keyword">if</span> file_extension <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> allowed_extensions:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">400</span>, detail=<span class="hljs-string">f"Unsupported file type. Allowed types are: <span class="hljs-subst">{<span class="hljs-string">', '</span>.join(allowed_extensions)}</span>"</span>)

    temp_file_path = <span class="hljs-string">f"temp_<span class="hljs-subst">{file.filename}</span>"</span>

    <span class="hljs-keyword">try</span>:
        <span class="hljs-comment"># Save the uploaded file to a temporary file</span>
        <span class="hljs-keyword">with</span> open(temp_file_path, <span class="hljs-string">"wb"</span>) <span class="hljs-keyword">as</span> buffer:
            shutil.copyfileobj(file.file, buffer)

        file_id = insert_document_record(file.filename)
        success = index_document_to_chroma(temp_file_path, file_id)

        <span class="hljs-keyword">if</span> success:
            <span class="hljs-keyword">return</span> {<span class="hljs-string">"message"</span>: <span class="hljs-string">f"File <span class="hljs-subst">{file.filename}</span> has been successfully uploaded and indexed."</span>, <span class="hljs-string">"file_id"</span>: file_id}
        <span class="hljs-keyword">else</span>:
            delete_document_record(file_id)
            <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">500</span>, detail=<span class="hljs-string">f"Failed to index <span class="hljs-subst">{file.filename}</span>."</span>)
    <span class="hljs-keyword">finally</span>:
        <span class="hljs-keyword">if</span> os.path.exists(temp_file_path):
            os.remove(temp_file_path)
</code></pre>
<p>This endpoint handles document uploads. It validates the file type, saves the file to a temporary location, records it in the database, and indexes it in Chroma; if indexing fails, the database record is rolled back and a 500 error is returned. The temporary file is removed in the <code>finally</code> block either way.</p>
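<p>The extension check can be exercised on its own. Below is a standalone sketch that mirrors the endpoint's validation logic (the function name is ours, not part of the codebase):</p>

```python
import os

ALLOWED_EXTENSIONS = ['.pdf', '.docx', '.html']

def validate_extension(filename: str) -> bool:
    # Mirror the endpoint's check: compare the lowercased extension to the allow-list
    file_extension = os.path.splitext(filename)[1].lower()
    return file_extension in ALLOWED_EXTENSIONS

print(validate_extension("report.PDF"))   # True: the comparison is case-insensitive
print(validate_extension("notes.txt"))    # False: .txt is not in the allow-list
```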
<ol start="3">
<li><strong>List Documents Endpoint:</strong></li>
</ol>
<pre><code class="lang-python"><span class="hljs-meta">@app.get("/list-docs", response_model=list[DocumentInfo])</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">list_documents</span>():</span>
    <span class="hljs-keyword">return</span> get_all_documents()
</code></pre>
<p>This simple endpoint returns a list of all indexed documents.</p>
<ol start="4">
<li><strong>Delete Document Endpoint:</strong></li>
</ol>
<pre><code class="lang-python"><span class="hljs-meta">@app.post("/delete-doc")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">delete_document</span>(<span class="hljs-params">request: DeleteFileRequest</span>):</span>
    chroma_delete_success = delete_doc_from_chroma(request.file_id)

    <span class="hljs-keyword">if</span> chroma_delete_success:
        db_delete_success = delete_document_record(request.file_id)
        <span class="hljs-keyword">if</span> db_delete_success:
            <span class="hljs-keyword">return</span> {<span class="hljs-string">"message"</span>: <span class="hljs-string">f"Successfully deleted document with file_id <span class="hljs-subst">{request.file_id}</span> from the system."</span>}
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">return</span> {<span class="hljs-string">"error"</span>: <span class="hljs-string">f"Deleted from Chroma but failed to delete document with file_id <span class="hljs-subst">{request.file_id}</span> from the database."</span>}
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"error"</span>: <span class="hljs-string">f"Failed to delete document with file_id <span class="hljs-subst">{request.file_id}</span> from Chroma."</span>}
</code></pre>
<p>This endpoint handles document deletion, removing the document from both Chroma and the database.</p>
<h2 id="heading-data-models-with-pydantic">Data Models with Pydantic</h2>
<p>Pydantic is a data validation library that uses Python type annotations to define data schemas. In our FastAPI application, we use Pydantic models to define the structure of our request and response data. Let's break down the models defined in <code>pydantic_models.py</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel, Field
<span class="hljs-keyword">from</span> enum <span class="hljs-keyword">import</span> Enum
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ModelName</span>(<span class="hljs-params">str, Enum</span>):</span>
    GPT4_O = <span class="hljs-string">"gpt-4o"</span>
    GPT4_O_MINI = <span class="hljs-string">"gpt-4o-mini"</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">QueryInput</span>(<span class="hljs-params">BaseModel</span>):</span>
    question: str
    session_id: str | <span class="hljs-literal">None</span> = Field(default=<span class="hljs-literal">None</span>)
    model: ModelName = Field(default=ModelName.GPT4_O_MINI)

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">QueryResponse</span>(<span class="hljs-params">BaseModel</span>):</span>
    answer: str
    session_id: str
    model: ModelName

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">DocumentInfo</span>(<span class="hljs-params">BaseModel</span>):</span>
    id: int
    filename: str
    upload_timestamp: datetime

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">DeleteFileRequest</span>(<span class="hljs-params">BaseModel</span>):</span>
    file_id: int
</code></pre>
<p>Let's examine each model and its purpose:</p>
<ol>
<li><p><code>ModelName</code> (Enum):</p>
<ul>
<li><p>This enum defines the available language models for our RAG system.</p>
</li>
<li><p>Using an enum ensures that only valid model names can be used.</p>
</li>
</ul>
</li>
<li><p><code>QueryInput</code>:</p>
<ul>
<li><p>Represents the input for a chat query.</p>
</li>
<li><p><code>question</code>: The user's question (required).</p>
</li>
<li><p><code>session_id</code>: Optional session ID. If not provided, one will be generated.</p>
</li>
<li><p><code>model</code>: The language model to use, defaulting to GPT4_O_MINI.</p>
</li>
</ul>
</li>
<li><p><code>QueryResponse</code>:</p>
<ul>
<li><p>Represents the response to a chat query.</p>
</li>
<li><p><code>answer</code>: The generated answer.</p>
</li>
<li><p><code>session_id</code>: The session ID (useful for continuing conversations).</p>
</li>
<li><p><code>model</code>: The model used to generate the response.</p>
</li>
</ul>
</li>
<li><p><code>DocumentInfo</code>:</p>
<ul>
<li><p>Represents metadata about an indexed document.</p>
</li>
<li><p><code>id</code>: Unique identifier for the document.</p>
</li>
<li><p><code>filename</code>: Name of the uploaded file.</p>
</li>
<li><p><code>upload_timestamp</code>: When the document was uploaded and indexed.</p>
</li>
</ul>
</li>
<li><p><code>DeleteFileRequest</code>:</p>
<ul>
<li><p>Represents a request to delete a document.</p>
</li>
<li><p><code>file_id</code>: The ID of the document to be deleted.</p>
</li>
</ul>
</li>
</ol>
<h3 id="heading-using-pydantic-models-in-fastapi">Using Pydantic Models in FastAPI</h3>
<p>In our <code>main.py</code>, we use these models to define the shape of our request and response data. For example:</p>
<pre><code class="lang-python"><span class="hljs-meta">@app.post("/chat", response_model=QueryResponse)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chat</span>(<span class="hljs-params">query_input: QueryInput</span>):</span>
    <span class="hljs-comment"># Function implementation</span>
</code></pre>
<p>Here, FastAPI uses <code>QueryInput</code> to validate the incoming request data and <code>QueryResponse</code> to validate and serialize the response. This ensures that our API behaves consistently and provides clear error messages when invalid data is provided.</p>
<h3 id="heading-extensibility">Extensibility</h3>
<p>As our API grows, we can easily extend these models. For instance, if we want to add more metadata to our document info, we can simply add fields to the <code>DocumentInfo</code> model:</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">DocumentInfo</span>(<span class="hljs-params">BaseModel</span>):</span>
    id: int
    filename: str
    upload_timestamp: datetime
    file_size: int  <span class="hljs-comment"># New field</span>
    content_type: str  <span class="hljs-comment"># New field</span>
</code></pre>
<p>FastAPI and Pydantic will automatically handle the new fields, providing validation and documentation without any changes to our endpoint logic.</p>
<p>By using Pydantic models, we've created a robust foundation for our API, ensuring data integrity and providing clear contracts for our endpoints. This approach significantly reduces the amount of manual validation code we need to write and helps prevent bugs related to incorrect data handling.</p>
<h2 id="heading-managing-documents-and-chat-history">Managing Documents and Chat History</h2>
<p>The <code>db_utils.py</code> file contains functions for interacting with our SQLite database. We use SQLite for its simplicity and ease of setup, making it perfect for prototyping and small to medium-scale applications. Let's break down the key components of this file:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> sqlite3
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime

DB_NAME = <span class="hljs-string">"rag_app.db"</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_db_connection</span>():</span>
    conn = sqlite3.connect(DB_NAME)
    conn.row_factory = sqlite3.Row
    <span class="hljs-keyword">return</span> conn
</code></pre>
<p>We start by importing the necessary modules and defining our database name. The <code>get_db_connection()</code> function creates a connection to our SQLite database, setting the row factory to <code>sqlite3.Row</code> for easier data access.</p>
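<p>Setting the row factory is what lets the functions further down index rows by column name (e.g. <code>row['user_query']</code>) rather than by position. A quick self-contained demonstration against an in-memory database:</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows now behave like dicts keyed by column name

conn.execute("CREATE TABLE logs (session_id TEXT, user_query TEXT)")
conn.execute("INSERT INTO logs VALUES ('s1', 'What is RAG?')")

row = conn.execute("SELECT session_id, user_query FROM logs").fetchone()
print(row["user_query"])  # access by name instead of row[1]: What is RAG?
conn.close()
```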
<h3 id="heading-creating-database-tables">Creating Database Tables</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_application_logs</span>():</span>
    conn = get_db_connection()
    conn.execute(<span class="hljs-string">'''CREATE TABLE IF NOT EXISTS application_logs
                    (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     session_id TEXT,
                     user_query TEXT,
                     gpt_response TEXT,
                     model TEXT,
                     created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)'''</span>)
    conn.close()

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_document_store</span>():</span>
    conn = get_db_connection()
    conn.execute(<span class="hljs-string">'''CREATE TABLE IF NOT EXISTS document_store
                    (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     filename TEXT,
                     upload_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP)'''</span>)
    conn.close()
</code></pre>
<p>These functions create our two main tables:</p>
<ol>
<li><p><code>application_logs</code>: Stores chat history and model responses.</p>
</li>
<li><p><code>document_store</code>: Keeps track of uploaded documents.</p>
</li>
</ol>
<h3 id="heading-managing-chat-logs">Managing Chat Logs</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">insert_application_logs</span>(<span class="hljs-params">session_id, user_query, gpt_response, model</span>):</span>
    conn = get_db_connection()
    conn.execute(<span class="hljs-string">'INSERT INTO application_logs (session_id, user_query, gpt_response, model) VALUES (?, ?, ?, ?)'</span>,
                 (session_id, user_query, gpt_response, model))
    conn.commit()
    conn.close()

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_chat_history</span>(<span class="hljs-params">session_id</span>):</span>
    conn = get_db_connection()
    cursor = conn.cursor()
    cursor.execute(<span class="hljs-string">'SELECT user_query, gpt_response FROM application_logs WHERE session_id = ? ORDER BY created_at'</span>, (session_id,))
    messages = []
    <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> cursor.fetchall():
        messages.extend([
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"human"</span>, <span class="hljs-string">"content"</span>: row[<span class="hljs-string">'user_query'</span>]},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"ai"</span>, <span class="hljs-string">"content"</span>: row[<span class="hljs-string">'gpt_response'</span>]}
        ])
    conn.close()
    <span class="hljs-keyword">return</span> messages
</code></pre>
<p>These functions handle inserting new chat logs and retrieving chat history for a given session. The chat history is formatted to be easily usable by our RAG system.</p>
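<p>The interleaving step is worth seeing in isolation: each stored (query, response) pair becomes two messages, so the returned history alternates human/ai turns. A standalone sketch with hard-coded stand-in rows:</p>

```python
# Hard-coded stand-ins for rows fetched from the application_logs table
rows = [
    ("What is RAG?", "Retrieval-Augmented Generation combines retrieval with an LLM."),
    ("Which vector store do we use?", "Chroma."),
]

messages = []
for user_query, gpt_response in rows:
    # Each stored exchange expands into one human message and one ai message
    messages.extend([
        {"role": "human", "content": user_query},
        {"role": "ai", "content": gpt_response},
    ])

print(len(messages))        # 4: two messages per stored row
print(messages[0]["role"])  # human
print(messages[1]["role"])  # ai
```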
<h3 id="heading-managing-document-records">Managing Document Records</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">insert_document_record</span>(<span class="hljs-params">filename</span>):</span>
    conn = get_db_connection()
    cursor = conn.cursor()
    cursor.execute(<span class="hljs-string">'INSERT INTO document_store (filename) VALUES (?)'</span>, (filename,))
    file_id = cursor.lastrowid
    conn.commit()
    conn.close()
    <span class="hljs-keyword">return</span> file_id

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">delete_document_record</span>(<span class="hljs-params">file_id</span>):</span>
    conn = get_db_connection()
    conn.execute(<span class="hljs-string">'DELETE FROM document_store WHERE id = ?'</span>, (file_id,))
    conn.commit()
    conn.close()
    <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_all_documents</span>():</span>
    conn = get_db_connection()
    cursor = conn.cursor()
    cursor.execute(<span class="hljs-string">'SELECT id, filename, upload_timestamp FROM document_store ORDER BY upload_timestamp DESC'</span>)
    documents = cursor.fetchall()
    conn.close()
    <span class="hljs-keyword">return</span> [dict(doc) <span class="hljs-keyword">for</span> doc <span class="hljs-keyword">in</span> documents]
</code></pre>
<p>These functions handle CRUD operations for document records:</p>
<ul>
<li><p>Inserting new document records</p>
</li>
<li><p>Deleting document records</p>
</li>
<li><p>Retrieving all document records</p>
</li>
</ul>
<h3 id="heading-initialization">Initialization</h3>
<p>At the end of the file, we initialize our database tables:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Initialize the database tables</span>
create_application_logs()
create_document_store()
</code></pre>
<p>This ensures that our tables are created when the application starts, if they don't already exist.</p>
<p>By centralizing our database operations in <code>db_utils.py</code>, we maintain a clean separation of concerns. Our main application logic doesn't need to worry about the details of database interactions, making the code more modular and easier to maintain.</p>
<p>In a production environment, you might consider using an ORM (Object-Relational Mapping) library like SQLAlchemy for more complex database operations and better scalability. However, for our current needs, this straightforward SQLite implementation serves well.</p>
<h2 id="heading-vector-store-integration">Vector Store Integration</h2>
<p>The <code>chroma_utils.py</code> file contains functions for interacting with the Chroma vector store, which is essential for our RAG system's retrieval capabilities. Let's break down the key components of this file:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_community.document_loaders <span class="hljs-keyword">import</span> PyPDFLoader, Docx2txtLoader, UnstructuredHTMLLoader
<span class="hljs-keyword">from</span> langchain_text_splitters <span class="hljs-keyword">import</span> RecursiveCharacterTextSplitter
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> OpenAIEmbeddings
<span class="hljs-keyword">from</span> langchain_chroma <span class="hljs-keyword">import</span> Chroma
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List
<span class="hljs-keyword">from</span> langchain_core.documents <span class="hljs-keyword">import</span> Document
<span class="hljs-keyword">import</span> os

<span class="hljs-comment"># Initialize text splitter and embedding function</span>
text_splitter = RecursiveCharacterTextSplitter(chunk_size=<span class="hljs-number">1000</span>, chunk_overlap=<span class="hljs-number">200</span>, length_function=len)
embedding_function = OpenAIEmbeddings()

<span class="hljs-comment"># Initialize Chroma vector store</span>
vectorstore = Chroma(persist_directory=<span class="hljs-string">"./chroma_db"</span>, embedding_function=embedding_function)
</code></pre>
<p>Here, we import necessary modules and initialize our text splitter, embedding function, and Chroma vector store. The <code>RecursiveCharacterTextSplitter</code> is used to split documents into manageable chunks, while <code>OpenAIEmbeddings</code> provides the embedding function for our documents.</p>
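<p>To build intuition for <code>chunk_size</code> and <code>chunk_overlap</code>, here is a deliberately simplified sliding-window splitter. Note that this is <em>not</em> how <code>RecursiveCharacterTextSplitter</code> actually works (it splits hierarchically on separators like paragraphs and sentences before falling back to characters); it only illustrates what the two parameters mean:</p>

```python
def naive_split(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    # Each chunk starts (chunk_size - chunk_overlap) characters after the
    # previous one, so consecutive chunks share chunk_overlap characters.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

digits = "".join(str(i % 10) for i in range(2500))
chunks = naive_split(digits)
print(len(chunks))                          # 4 chunks: starts at 0, 800, 1600, 2400
print(chunks[0][-200:] == chunks[1][:200])  # True: adjacent chunks overlap
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which helps retrieval quality.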
<h3 id="heading-document-loading-and-splitting">Document Loading and Splitting</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">load_and_split_document</span>(<span class="hljs-params">file_path: str</span>) -&gt; List[Document]:</span>
    <span class="hljs-keyword">if</span> file_path.endswith(<span class="hljs-string">'.pdf'</span>):
        loader = PyPDFLoader(file_path)
    <span class="hljs-keyword">elif</span> file_path.endswith(<span class="hljs-string">'.docx'</span>):
        loader = Docx2txtLoader(file_path)
    <span class="hljs-keyword">elif</span> file_path.endswith(<span class="hljs-string">'.html'</span>):
        loader = UnstructuredHTMLLoader(file_path)
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Unsupported file type: <span class="hljs-subst">{file_path}</span>"</span>)

    documents = loader.load()
    <span class="hljs-keyword">return</span> text_splitter.split_documents(documents)
</code></pre>
<p>This function handles loading different document types (PDF, DOCX, HTML) and splitting them into chunks. It uses the appropriate loader based on the file extension and then applies our text splitter to create manageable document chunks.</p>
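<p>The same dispatch can also be expressed as a lookup table, which makes adding a new file type a one-line change. In this sketch the loader class names stand in as strings, since instantiating the real LangChain loaders requires the packages and actual files:</p>

```python
from pathlib import Path

# Extension-to-loader mapping (loader class names as string stand-ins)
LOADER_BY_EXTENSION = {
    ".pdf": "PyPDFLoader",
    ".docx": "Docx2txtLoader",
    ".html": "UnstructuredHTMLLoader",
}

def pick_loader(file_path: str) -> str:
    # Path.suffix handles the extension; lower() makes the lookup case-insensitive
    extension = Path(file_path).suffix.lower()
    if extension not in LOADER_BY_EXTENSION:
        raise ValueError(f"Unsupported file type: {file_path}")
    return LOADER_BY_EXTENSION[extension]

print(pick_loader("report.pdf"))  # PyPDFLoader
```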
<h3 id="heading-indexing-documents">Indexing Documents</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">index_document_to_chroma</span>(<span class="hljs-params">file_path: str, file_id: int</span>) -&gt; bool:</span>
    <span class="hljs-keyword">try</span>:
        splits = load_and_split_document(file_path)

        <span class="hljs-comment"># Add metadata to each split</span>
        <span class="hljs-keyword">for</span> split <span class="hljs-keyword">in</span> splits:
            split.metadata[<span class="hljs-string">'file_id'</span>] = file_id

        vectorstore.add_documents(splits)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        print(<span class="hljs-string">f"Error indexing document: <span class="hljs-subst">{e}</span>"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
</code></pre>
<p>This function takes a file path and a file ID, loads and splits the document, adds metadata (file ID) to each split, and then adds these document chunks to our Chroma vector store. The metadata allows us to link vector store entries back to our database records.</p>
<h3 id="heading-deleting-documents">Deleting Documents</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">delete_doc_from_chroma</span>(<span class="hljs-params">file_id: int</span>):</span>
    <span class="hljs-keyword">try</span>:
        docs = vectorstore.get(where={<span class="hljs-string">"file_id"</span>: file_id})
        print(<span class="hljs-string">f"Found <span class="hljs-subst">{len(docs[<span class="hljs-string">'ids'</span>])}</span> document chunks for file_id <span class="hljs-subst">{file_id}</span>"</span>)

        vectorstore._collection.delete(where={<span class="hljs-string">"file_id"</span>: file_id})
        print(<span class="hljs-string">f"Deleted all documents with file_id <span class="hljs-subst">{file_id}</span>"</span>)

        <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        print(<span class="hljs-string">f"Error deleting document with file_id <span class="hljs-subst">{file_id}</span> from Chroma: <span class="hljs-subst">{str(e)}</span>"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
</code></pre>
<p>This function deletes all document chunks associated with a given file ID from the Chroma vector store. It first retrieves the documents to confirm their existence, then performs the deletion.</p>
<h3 id="heading-integration-with-rag-system">Integration with RAG System</h3>
<p>While not explicitly shown in this file, the Chroma vector store is crucial for our RAG system's retrieval step. In <code>langchain_utils.py</code>, we use this vector store to create a retriever:</p>
<pre><code class="lang-python">retriever = vectorstore.as_retriever(search_kwargs={<span class="hljs-string">"k"</span>: <span class="hljs-number">2</span>})
</code></pre>
<p>This retriever is then used in our RAG chain to fetch relevant document chunks based on the user's query.</p>
<p>By centralizing our vector store operations in <code>chroma_utils.py</code>, we maintain a clean separation of concerns and make it easier to swap out or upgrade our vector store implementation in the future if needed.</p>
<h2 id="heading-langchain-rag-implementation">LangChain RAG Implementation</h2>
<p>The <code>langchain_utils.py</code> file is where we implement the core of our Retrieval-Augmented Generation (RAG) system using LangChain. This file sets up the language model, retriever, and the RAG chain. Let's break down its key components:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI
<span class="hljs-keyword">from</span> langchain_core.output_parsers <span class="hljs-keyword">import</span> StrOutputParser
<span class="hljs-keyword">from</span> langchain_core.prompts <span class="hljs-keyword">import</span> ChatPromptTemplate, MessagesPlaceholder
<span class="hljs-keyword">from</span> langchain.chains <span class="hljs-keyword">import</span> create_history_aware_retriever, create_retrieval_chain
<span class="hljs-keyword">from</span> langchain.chains.combine_documents <span class="hljs-keyword">import</span> create_stuff_documents_chain
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List
<span class="hljs-keyword">from</span> langchain_core.documents <span class="hljs-keyword">import</span> Document
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> chroma_utils <span class="hljs-keyword">import</span> vectorstore

retriever = vectorstore.as_retriever(search_kwargs={<span class="hljs-string">"k"</span>: <span class="hljs-number">2</span>})

output_parser = StrOutputParser()
</code></pre>
<p>Here, we import necessary LangChain components and set up our retriever using the Chroma vectorstore we created earlier. We also initialize a string output parser for processing the language model's output.</p>
<h3 id="heading-setting-up-prompts">Setting Up Prompts</h3>
<pre><code class="lang-python">contextualize_q_system_prompt = (
    <span class="hljs-string">"Given a chat history and the latest user question "</span>
    <span class="hljs-string">"which might reference context in the chat history, "</span>
    <span class="hljs-string">"formulate a standalone question which can be understood "</span>
    <span class="hljs-string">"without the chat history. Do NOT answer the question, "</span>
    <span class="hljs-string">"just reformulate it if needed and otherwise return it as is."</span>
)

contextualize_q_prompt = ChatPromptTemplate.from_messages([
    (<span class="hljs-string">"system"</span>, contextualize_q_system_prompt),
    MessagesPlaceholder(<span class="hljs-string">"chat_history"</span>),
    (<span class="hljs-string">"human"</span>, <span class="hljs-string">"{input}"</span>),
])

qa_prompt = ChatPromptTemplate.from_messages([
    (<span class="hljs-string">"system"</span>, <span class="hljs-string">"You are a helpful AI assistant. Use the following context to answer the user's question."</span>),
    (<span class="hljs-string">"system"</span>, <span class="hljs-string">"Context: {context}"</span>),
    MessagesPlaceholder(variable_name=<span class="hljs-string">"chat_history"</span>),
    (<span class="hljs-string">"human"</span>, <span class="hljs-string">"{input}"</span>)
])
</code></pre>
<p>We define two main prompts:</p>
<ol>
<li><p><code>contextualize_q_prompt</code>: Used to reformulate the user's question based on chat history.</p>
</li>
<li><p><code>qa_prompt</code>: Used to generate the final answer based on the retrieved context and chat history.</p>
</li>
</ol>
<h3 id="heading-creating-the-rag-chain">Creating the RAG Chain</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_rag_chain</span>(<span class="hljs-params">model=<span class="hljs-string">"gpt-4o-mini"</span></span>):</span>
    llm = ChatOpenAI(model=model)
    history_aware_retriever = create_history_aware_retriever(llm, retriever, contextualize_q_prompt)
    question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)
    rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)    
    <span class="hljs-keyword">return</span> rag_chain
</code></pre>
<p>This function creates our RAG chain:</p>
<ol>
<li><p>It initializes the language model (<code>ChatOpenAI</code>) with the specified model name.</p>
</li>
<li><p>Creates a history-aware retriever that reformulates the user's question using the chat history before querying the vector store.</p>
</li>
<li><p>Sets up a question-answering chain that stuffs the retrieved documents into the prompt to generate an answer.</p>
</li>
<li><p>Finally, it creates the full RAG chain by combining the retriever and question-answering chain.</p>
</li>
</ol>
<h3 id="heading-integration-with-main-application">Integration with Main Application</h3>
<p>In our <code>main.py</code>, we use this RAG chain in the chat endpoint:</p>
<pre><code class="lang-python"><span class="hljs-meta">@app.post("/chat", response_model=QueryResponse)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chat</span>(<span class="hljs-params">query_input: QueryInput</span>):</span>
    <span class="hljs-comment"># ... (other code)</span>
    rag_chain = get_rag_chain(query_input.model.value)
    answer = rag_chain.invoke({
        <span class="hljs-string">"input"</span>: query_input.question,
        <span class="hljs-string">"chat_history"</span>: chat_history
    })[<span class="hljs-string">'answer'</span>]
    <span class="hljs-comment"># ... (rest of the function)</span>
</code></pre>
<p>This shows how the RAG chain is instantiated with the user-specified model and invoked with the user's question and chat history.</p>
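<p>The <code>chat_history</code> passed to <code>rag_chain.invoke</code> has to be a list of prior messages. As a hypothetical sketch (the real application loads these rows through its database utilities, not shown here), logged question/answer pairs can be flattened into the <code>("human", ...)</code> / <code>("ai", ...)</code> tuple form that LangChain accepts for message history:</p>

```python
# Illustrative helper: turn logged (question, answer) rows into the
# alternating human/ai message list that MessagesPlaceholder("chat_history")
# expands into. Names are assumptions, not the tutorial's actual code.
def build_chat_history(logged_turns):
    history = []
    for question, answer in logged_turns:
        history.append(("human", question))
        history.append(("ai", answer))
    return history
```

<p>Each prior turn contributes one human and one AI message, so the prompt sees the conversation in order.</p>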
<p>By centralizing our LangChain logic in <code>langchain_utils.py</code>, we maintain a clean separation of concerns and make it easier to modify or extend our RAG system in the future. This modular approach allows us to easily experiment with different models, retrievers, or chain structures without affecting the rest of the application.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Throughout this tutorial, we've walked through the process of building a production-ready Retrieval-Augmented Generation (RAG) chatbot using FastAPI and LangChain. Let's recap what we've accomplished and discuss some key takeaways and potential next steps.</p>
<h3 id="heading-what-weve-built">What We've Built</h3>
<ol>
<li><p><strong>FastAPI Application</strong> (<code>main.py</code>): We created a robust API with endpoints for chat interactions, document management, and system information.</p>
</li>
<li><p><strong>Data Models</strong> (<code>pydantic_models.py</code>): We defined clear, type-safe models for our API's requests and responses.</p>
</li>
<li><p><strong>Database Utilities</strong> (<code>db_utils.py</code>): We implemented SQLite database operations for managing chat logs and document metadata.</p>
</li>
<li><p><strong>Vector Store Integration</strong> (<code>chroma_utils.py</code>): We set up document indexing and retrieval using the Chroma vector store.</p>
</li>
<li><p><strong>LangChain RAG Implementation</strong> (<code>langchain_utils.py</code>): We created a flexible, history-aware RAG chain using LangChain components.</p>
</li>
</ol>
<p>This architecture allows for a scalable, maintainable, and extensible RAG system that can be deployed in a production environment.</p>
<h3 id="heading-potential-improvements-and-extensions">Potential Improvements and Extensions</h3>
<ol>
<li><p><strong>Authentication and Authorization</strong>: Implement user authentication to secure the API and enable user-specific document access.</p>
</li>
<li><p><strong>Asynchronous Processing</strong>: Convert synchronous operations to asynchronous for better performance, especially for document processing.</p>
</li>
<li><p><strong>Advanced Retrieval Techniques</strong>: Experiment with techniques like hybrid search or re-ranking to improve retrieval quality.</p>
</li>
<li><p><strong>Monitoring and Logging</strong>: Implement comprehensive logging and monitoring for better observability in production.</p>
</li>
<li><p><strong>Scalability</strong>: Consider distributed architectures for handling larger document collections and higher request volumes.</p>
</li>
<li><p><strong>Fine-tuning</strong>: Explore fine-tuning the language model on domain-specific data for improved performance.</p>
</li>
<li><p><strong>UI Integration</strong>: Develop a user interface (e.g., a web application or chat interface) to interact with the API.</p>
</li>
<li><p><strong>Containerization</strong>: Package the application using Docker for easier deployment and scaling.</p>
</li>
<li><p><strong>Testing</strong>: Implement comprehensive unit and integration tests to ensure system reliability.</p>
</li>
<li><p><strong>Caching</strong>: Introduce caching mechanisms to improve response times for frequent queries.</p>
</li>
</ol>
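<p>As one way to prototype the caching idea above, here is a minimal in-memory TTL cache. This is a sketch, not part of the tutorial's codebase: a real deployment would likely use Redis or similar, and because answers depend on the chat history, a production cache would need a richer key than the <code>(model, question)</code> pair assumed here.</p>

```python
import time

# Minimal time-to-live cache: each entry expires ttl_seconds after it is set.
class TTLCache:
    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (expiry_timestamp, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        expiry, value = entry
        if time.time() > expiry:
            # Entry is stale; drop it and report a miss.
            del self.store[key]
            return None
        return value

    def set(self, key, value):
        self.store[key] = (time.time() + self.ttl, value)
```

<p>In the chat endpoint, a hit on <code>cache.get((model, question))</code> could then skip the LLM call entirely for repeated stateless queries.</p>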
<h3 id="heading-final-thoughts">Final Thoughts</h3>
<p>Building a production-ready RAG chatbot involves more than just connecting a language model to a document store. It requires careful consideration of data flow, error handling, scalability, and user experience. The system we've built provides a solid foundation that can be adapted and extended to meet specific business needs.</p>
<p>As AI and natural language processing technologies continue to evolve, systems like this will become increasingly important for creating intelligent, context-aware applications. By understanding the principles and components of RAG systems, you're well-equipped to build and improve upon this technology in your own projects.</p>
<p>Remember, the key to a successful RAG system lies not just in the individual components, but in how they work together to create a seamless, intelligent interaction. Continual testing, monitoring, and refinement based on real-world usage will be crucial to ensuring the long-term success and effectiveness of your RAG chatbot.</p>
<h3 id="heading-additional-resources">Additional Resources</h3>
<p>To help you further understand and implement this RAG chatbot system, I've prepared some additional resources:</p>
<ol>
<li><p><strong>Video Tutorial</strong>: For a comprehensive walkthrough of this entire project, including live coding and explanations, check out my YouTube video:</p>
<p> <a target="_blank" href="https://youtu.be/38aMTXY2usU?si=9XsjXHvtAHbSeP32">Watch the Full RAG Chatbot Tutorial</a></p>
<p> In this video, I cover all three parts of our blog series, demonstrating the implementation details and providing additional insights.</p>
</li>
<li><p><strong>GitHub Repository</strong>: The complete source code for this project is available on GitHub. You can clone, fork, or download the repository to explore the code in detail or use it as a starting point for your own projects:</p>
<p> <a target="_blank" href="https://github.com/PradipNichite/Youtube-Tutorials/tree/main/Langchain%20RAG%20Course%202024">RAG Chatbot GitHub Repository</a></p>
<p> The repository includes all the components we've discussed: the FastAPI backend, Streamlit frontend, and associated utilities.</p>
</li>
</ol>
<h3 id="heading-futuresmart-ai-your-partner-in-custom-nlp-solutions">FutureSmart AI: Your Partner in Custom NLP Solutions</h3>
<p>At FutureSmart AI, we specialize in building custom Natural Language Processing (NLP) solutions tailored to your specific needs. Our expertise extends beyond RAG systems to include:</p>
<ul>
<li><p>Natural Language to SQL (NL2SQL) interfaces</p>
</li>
<li><p>Advanced document parsing and analysis</p>
</li>
<li><p>Custom chatbots and conversational AI</p>
</li>
<li><p>And much more in the realm of NLP and AI</p>
</li>
</ul>
<p>We've successfully implemented these technologies for various industries, helping businesses leverage the power of AI to enhance their operations and user experiences.</p>
<h4 id="heading-interested-in-learning-more">Interested in Learning More?</h4>
<ul>
<li><p><strong>Check out our case studies</strong>: <a target="_blank" href="https://www.futuresmart.ai/case-studies">FutureSmart AI Case Studies</a>. Explore real-world applications of our NLP solutions and see how they've transformed businesses across different sectors.</p>
</li>
<li><p><strong>Get in touch</strong>: Have a project in mind or want to discuss how NLP can benefit your business? We'd love to hear from you! Contact us at <a target="_blank" href="mailto:contact@futuresmart.ai">contact@futuresmart.ai</a></p>
</li>
</ul>
<p>Whether you're looking to implement a RAG system like the one we've built in this tutorial, or you have more specific NLP needs, our team at FutureSmart AI is here to help turn your AI aspirations into reality.</p>
]]></content:encoded></item></channel></rss>