Imagine building an AI assistant that can actually read your documents, edit files, and remember context between conversations. Not just another chatbot, but an AI that has real capabilities. Today, we're building exactly that using the Model Context Protocol (MCP), LangChain, and Google's Gemini models. Here's what we'll build:

  • Interactive CLI chat application
  • AI that reads and manages documents
  • Auto-completion and intelligent suggestions
  • Real-time streaming responses
  • Complete MCP implementation from scratch

Understanding MCP: The Foundation

Before diving into implementation, let’s understand what makes MCP special. The Model Context Protocol isn’t just another API - it’s a standardized way for AI models to interact with external systems, data, and tools in a structured, secure, and scalable manner.

Core Components

Resources: Read-Only Data Access

Resources are your AI’s window to external data. Think of them as read-only endpoints that provide structured access to information:

  • Documents: PDFs, text files, databases
  • APIs: External services, web endpoints
  • File Systems: Directory listings, file contents
  • Real-time Data: Sensors, live feeds, system metrics

Resources use URI patterns like file:///documents/{id} or api://weather/{location} to provide consistent, discoverable access to data.
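As a taste of what's coming, here's how a resource might be declared with FastMCP, the framework we use later in this tutorial; the URI scheme and the document store are illustrative, not the article's exact code:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("DocumentMCP")
docs = {"report.pdf": "Quarterly performance summary..."}  # placeholder data


@mcp.resource("docs://documents/{doc_id}")
def fetch_document(doc_id: str) -> str:
    """Read-only access to a document's contents by ID."""
    return docs[doc_id]
```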

Tools: Functions AI Can Execute

Tools are the AI’s hands - functions it can call to perform actions:

  • File Operations: Create, edit, delete files
  • API Calls: Send requests, process responses
  • System Commands: Execute shell commands, manage processes
  • Data Processing: Transform, analyze, visualize data

Each tool has a clear schema defining inputs, outputs, and behavior, making them reliable and predictable.
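Continuing the sketch above, FastMCP derives that schema from the function's type hints and uses the docstring as the tool's description; this edit tool is illustrative:

```python
@mcp.tool()
def edit_document(doc_id: str, old_text: str, new_text: str) -> str:
    """Replace old_text with new_text inside the given document."""
    docs[doc_id] = docs[doc_id].replace(old_text, new_text)
    return f"Updated {doc_id}"
```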

Prompts: Templates for Consistency

Prompts provide reusable templates that ensure consistent AI behavior:

  • System Prompts: Define AI personality and behavior
  • Task Templates: Structured approaches to common tasks
  • Output Formats: Ensure consistent response formatting
  • Context Injection: Add relevant context to conversations
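Continuing the same sketch, a prompt template is just another decorated function; the template text here is illustrative:

```python
@mcp.prompt()
def summarize_document(doc_id: str) -> str:
    """Ask the model to summarize a document in a consistent format."""
    return f"Summarize the document <{doc_id}> in exactly three bullet points."
```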

Advanced Features

Sampling and Progress Notifications

MCP supports advanced interaction patterns:

  • Sampling: AI can request specific data samples for analysis
  • Progress Tracking: Long-running operations can report status
  • Notifications: Real-time updates about system changes
  • Cancellation: Graceful handling of interrupted operations

Roots: Security and Access Control

Roots define the boundaries of what AI can access:

  • File System Roots: Limit access to specific directories
  • API Scope: Control which endpoints are available
  • Permission Levels: Define read/write access patterns
  • Security Policies: Implement access control and auditing

Transport and Communication

JSON Message Types

MCP uses structured JSON messages for all communication:
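For example, a tool invocation travels as a JSON-RPC 2.0 request/response pair. This is a representative exchange; the tool name and payload are illustrative:

```json
{"jsonrpc": "2.0", "id": 1, "method": "tools/call",
 "params": {"name": "read_document", "arguments": {"doc_id": "report.pdf"}}}
```

```json
{"jsonrpc": "2.0", "id": 1,
 "result": {"content": [{"type": "text", "text": "This report outlines..."}]}}
```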

This ensures type safety, versioning, and clear error handling across all interactions.

Stdio Transport: Simple and Reliable

The stdio transport uses standard input/output streams:

  • Process-based: Each MCP server runs as a separate process
  • Isolation: Servers can’t interfere with each other
  • Reliability: Built-in process management and error recovery
  • Simplicity: No network configuration or port management needed

Streamable HTTP Transport: Solving HTTP’s Shortcomings

While HTTP is ubiquitous, it has limitations for AI interactions:

Traditional HTTP Issues:

  • Request/response only - no server-initiated communication
  • No built-in streaming for long-running operations
  • Complex state management across multiple requests
  • Timeout issues with lengthy AI processing

MCP’s HTTP Solution:

  • Server-Sent Events (SSE): Real-time progress updates
  • Persistent Connections: Maintain state across interactions
  • Chunked Responses: Stream results as they’re generated
  • Bidirectional Communication: Servers can notify clients of changes

This creates a more natural interaction pattern for AI applications, where operations might take time and progress needs to be communicated.

State Management

MCP handles complex state scenarios:

  • Session State: Maintain context across multiple interactions
  • Resource State: Track changes to external data sources
  • Tool State: Manage ongoing operations and their progress
  • Error Recovery: Graceful handling of connection issues and failures

Prerequisites

Before we start, make sure you have:

  • Python 3.10+ installed
  • uv package manager (install here)
  • Google API key for Gemini (get one here)
  • Google Application Credentials (get one here)
  • Node.js 18+ for the MCP Inspector tool

We’ll use uv throughout this tutorial as it provides fast, reliable package management and makes our MCP setup much smoother.

Creating the Core Features

Let’s start by building the heart of our application - the chat system that will eventually connect to our MCP server. We’ll begin with a simple version and progressively add more capabilities.

Make sure to enable the Google Generative AI API in your Google Cloud Console for Gemini access, and export GOOGLE_API_KEY and GOOGLE_APPLICATION_CREDENTIALS as environment variables.

First, let’s set up our project:
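A minimal setup, using the packages explained in the breakdown below:

```bash
uv init .
uv add langchain langchain-mcp-adapters langgraph langchain-google-genai mcp python-dotenv
```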

What’s happening here:

  • uv init .: Initializes a new Python project with uv, creating a pyproject.toml file and basic project structure. We use uv because it’s significantly faster than pip, provides better dependency resolution, and handles virtual environments automatically.

  • langchain: The core library for building LLM applications. It provides abstractions for messages, agents, and chains.

  • langchain-mcp-adapters: This is the bridge that connects MCP servers to LangChain. It automatically converts MCP tools and resources into LangChain-compatible tools that can be used by agents.

  • langgraph: A library for building stateful, multi-actor applications with LLMs. We use it to create a ReAct (Reasoning and Acting) agent that can think, use tools, and respond intelligently.

  • langchain-google-genai: Official LangChain integration for Google’s Gemini models. This provides the core language model capabilities with streaming support.

  • mcp: The core Model Context Protocol library that handles client-server communication, message serialization, and transport management.

  • python-dotenv: Loads environment variables from .env files, keeping API keys and configuration separate from code.

Now let’s create our basic chat system:
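A condensed sketch of the chat class, matching the breakdown that follows; the model name ("gemini-2.0-flash") and overall layout are assumptions, not the article's exact code:

```python
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_mcp_adapters.tools import load_mcp_tools
from langgraph.prebuilt import create_react_agent


class Chat:
    def __init__(self, clients: dict):
        self.clients = clients  # MCP clients keyed by server name (docs, APIs, ...)
        self.messages = []      # conversation history as LangChain messages
        self.agent = None       # ReAct agent, created lazily to save resources

    async def initialize_agent(self):
        # Turn every connected MCP server's tools into LangChain tools.
        tools = []
        for client in self.clients.values():
            tools.extend(await load_mcp_tools(client.session()))
        model = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0)
        self.agent = create_react_agent(model, tools)

    async def run(self, query: str):
        if self.agent is None:
            await self.initialize_agent()  # lazy start
        self.messages.append(HumanMessage(content=query))
        # Stream intermediate events: tool calls, results, text chunks.
        async for event in self.agent.astream_events(
            {"messages": self.messages}, version="v2"
        ):
            yield event
```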

Breaking Down the Chat Class

Initialization (__init__)

  • self.clients: Holds MCP clients for different servers (docs, APIs, etc.).
  • self.messages: Stores conversation history using LangChain messages.
  • self.agent: The ReAct agent, created only when needed to save resources.

Agent Setup (initialize_agent)

  • Tools: load_mcp_tools() turns MCP server features into LangChain tools the agent can call like functions.
  • Model: Spins up a Gemini model with streaming and config options.
  • ReAct Agent: Combines reasoning + acting. The agent:
    1. Figures out which tool to use
    2. Calls it
    3. Looks at the result
    4. Repeats until done

Query Processing (run)

  • Lazy Start: The agent only loads when you actually use it.
  • Conversation Flow: Each user query becomes a HumanMessage in history.
  • Streaming: astream_events() shows the agent’s step-by-step process — tool calls, results, reasoning, responses, and even errors.

This architecture creates a conversational agent that can maintain context, use tools intelligently, and provide real-time feedback about its reasoning process.

Simple CLI Interface

Let’s create a basic CLI that we can test our chat system with, including detailed explanations of how real-time streaming works:
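A minimal CLI sketch matching the event processing described below; the chat module layout is an assumption carried over from the previous step:

```python
import asyncio

from dotenv import load_dotenv

from chat import Chat  # hypothetical module holding the Chat class above


async def main():
    load_dotenv()            # pulls GOOGLE_API_KEY etc. from .env
    chat = Chat(clients={})  # no MCP servers connected yet
    while True:
        try:
            user_input = input("> ").strip()
        except (KeyboardInterrupt, EOFError):
            break  # clean shutdown on Ctrl+C / Ctrl+D
        if user_input.lower() in {"quit", "exit"}:
            break
        if not user_input:
            continue  # skip blanks to avoid wasted API calls
        async for event in chat.run(user_input):
            if event["event"] == "on_chat_model_stream":
                # Print text chunks as they arrive instead of waiting.
                print(event["data"]["chunk"].content, end="", flush=True)
        print()


if __name__ == "__main__":
    asyncio.run(main())
```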

Understanding CLI Event Processing

Async Event Loop (asyncio.run(main()))

  • Runs Python’s async runtime for concurrent operations.
  • Keeps the interface responsive while the AI processes in the background.
  • Enables real-time streaming without blocking.

Input Processing Loop

  • Handles input: Validates commands and user input.
  • Exit conditions: Clean shutdown on quit or Ctrl+C.
  • Filters blanks: Skips empty inputs to avoid wasted API calls.

Real-time Streaming (async for event in chat.run(user_input))

Instead of waiting for a full response, events stream in as they happen:

  • on_chat_model_stream: Text chunks from the model.
  • on_tool_start: Tool begins execution.
  • on_tool_end: Tool finishes execution.
  • on_agent_action: Agent decides on the next move.

Chunk Processing Example
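A sketch of how the event types above might be routed to the terminal (event names follow LangChain's astream_events API):

```python
async for event in chat.run(user_input):
    kind = event["event"]
    if kind == "on_chat_model_stream":
        # Incremental text from the model.
        print(event["data"]["chunk"].content, end="", flush=True)
    elif kind == "on_tool_start":
        print(f"\n[calling tool: {event['name']}]")
    elif kind == "on_tool_end":
        print(f"[tool {event['name']} finished]")
```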

Create a .env file with your Google API key:
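For example (placeholder values, matching the variables exported earlier):

```
GOOGLE_API_KEY=your-google-api-key
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```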

Now you can test the basic chat:
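Assuming your entry point is main.py (adjust to your file name):

```bash
uv run main.py
```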

Basic chat with no tools

Great! We have a working chat, but it’s just a regular AI - no special capabilities yet. Let’s add MCP to give our AI some superpowers.

Building the MCP Server

Now we’ll create an MCP server that manages documents and provides tools for our AI to use. This is where the magic happens - we’re creating a service that exposes capabilities to AI agents:
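A condensed sketch of such a server, matching the architecture breakdown below; the sample documents are placeholders:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("DocumentMCP", log_level="ERROR")

# In-memory store standing in for real business data; swap for a database later.
docs = {
    "report.pdf": "This report outlines quarterly performance...",
    "financials.docx": "Revenue and expense summary for the fiscal year...",
}


@mcp.tool()
def read_document(doc_id: str) -> str:
    """Read the contents of a document by its ID."""
    if doc_id not in docs:
        raise ValueError(f"Document with id {doc_id} not found")
    return docs[doc_id]


@mcp.tool()
def edit_document(doc_id: str, old_text: str, new_text: str) -> str:
    """Replace old_text with new_text in the given document."""
    if doc_id not in docs:
        raise ValueError(f"Document with id {doc_id} not found")
    docs[doc_id] = docs[doc_id].replace(old_text, new_text)
    return f"Updated {doc_id}"


if __name__ == "__main__":
    mcp.run(transport="stdio")
```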

Understanding the MCP Server Architecture

FastMCP Framework (FastMCP("DocumentMCP", log_level="ERROR"))

  • Identifies the server as "DocumentMCP".
  • Logs only errors to cut noise.
  • Auto-discovers decorated functions as MCP tools.
  • Works with stdio, HTTP, or other transports.

Document Storage (docs dictionary)

  • Uses a simple in-memory dict for demo purposes.
  • Keys = document IDs, values = content.
  • Mimics real business data.
  • Can be swapped for a database or external storage.

Tool Decorators (@mcp.tool())

Turns Python functions into AI tools:

  • Generates schemas automatically from type hints.
  • Enforces type safety and input validation.
  • Uses docstrings as tool descriptions.
  • Surfaces exceptions back to the AI cleanly.

Read Tool (read_document)

The read tool takes a document ID, returns that document's content, and raises a clear error for unknown IDs so the agent can see exactly what went wrong (see the server sketch above).

Understanding the MCP Client Architecture

The server is only half the picture: we also need a client that launches it and talks to it. We'll write that client shortly, in "Creating the MCP Client" below, but first let's walk through how it's structured.

Client Initialization (__init__)

  • Stores the command used to launch the MCP server.
  • Supports optional environment variables for isolation.
  • Uses AsyncExitStack to manage resources and cleanup.
  • Delays connection setup until actually needed.

Stdio Transport (connect method)

Process Setup

  • Defines how the server process should start.
  • Passes arguments for configuration.
  • Runs with controlled environment variables.

Transport Layer

  • Launches and manages the server process.
  • Sets up bidirectional stdin/stdout pipes.
  • Recovers from crashes automatically.
  • Cleans up processes when done.

Session Management (ClientSession)

  • _stdio for reading, _write for writing.
  • Handles JSON-RPC serialization and parsing.
  • Initializes handshake to sync state.
  • Confirms the server is ready before use.

Async Context Manager

  • Entry (__aenter__): Connects automatically when used with async with, returns self, handles errors cleanly.
  • Exit (__aexit__): Always closes connections, kills stray processes, and frees resources.

Session Access (session method)

  • Validates the connection before returning.
  • Provides clear errors if not connected.
  • Returns a typed ClientSession for IDE support.

Tool Discovery (list_tools)

  • Queries the server for available tools.
  • Retrieves schemas and documentation.
  • Adapts to servers whose capabilities change over time.

Communication Flow

  1. Client launches server as a subprocess.
  2. Sets up stdin/stdout streams.
  3. Performs protocol handshake.
  4. Server advertises tools, resources, and prompts.
  5. Client sends JSON-RPC requests, server responds.
  6. Both handle malformed messages and failures.
  7. Client shuts down the server cleanly on exit.

This design hides the messy parts of process and protocol management, giving you a reliable, efficient client–server bridge.

Creating the MCP Client

Now we need a client to connect to our MCP server. This client handles all the complex communication details and provides a clean interface for our application:
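A sketch of the client, following the architecture walkthrough above; attribute and method names (_stdio, _write, session, list_tools) mirror that walkthrough, while the rest is illustrative:

```python
from contextlib import AsyncExitStack
from typing import Any, Optional

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


class MCPClient:
    def __init__(
        self,
        command: str,
        args: list[str],
        env: Optional[dict[str, str]] = None,
    ):
        self._command = command              # how to launch the server
        self._args = args                    # arguments for configuration
        self._env = env                      # optional env vars for isolation
        self._session: Optional[ClientSession] = None
        self._exit_stack = AsyncExitStack()  # guarantees cleanup on exit

    async def connect(self) -> None:
        params = StdioServerParameters(
            command=self._command, args=self._args, env=self._env
        )
        # Launch the server subprocess and wire up the stdio pipes.
        self._stdio, self._write = await self._exit_stack.enter_async_context(
            stdio_client(params)
        )
        self._session = await self._exit_stack.enter_async_context(
            ClientSession(self._stdio, self._write)
        )
        await self._session.initialize()  # protocol handshake

    def session(self) -> ClientSession:
        if self._session is None:
            raise RuntimeError("Client not connected; call connect() first")
        return self._session

    async def list_tools(self) -> list[Any]:
        result = await self.session().list_tools()
        return result.tools

    async def __aenter__(self) -> "MCPClient":
        await self.connect()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Close the session and terminate the server process.
        await self._exit_stack.aclose()
```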

Let’s update our main application to use the MCP client:
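Something like the following, building on the Chat and MCPClient sketches above; the server file name (mcp_server.py) is an assumption:

```python
import asyncio

from chat import Chat              # hypothetical modules from earlier steps
from mcp_client import MCPClient


async def main():
    async with MCPClient(command="uv", args=["run", "mcp_server.py"]) as doc_client:
        chat = Chat(clients={"docs": doc_client})
        while True:
            user_input = input("> ").strip()
            if user_input.lower() in {"quit", "exit"}:
                break
            if not user_input:
                continue
            async for event in chat.run(user_input):
                if event["event"] == "on_chat_model_stream":
                    print(event["data"]["chunk"].content, end="", flush=True)
            print()


if __name__ == "__main__":
    asyncio.run(main())
```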

MCP Client Architecture: Quick Recap

Init (__init__)

  • Stores server command + env vars.
  • Uses AsyncExitStack for cleanup.
  • Connects lazily.

Connect (connect)

  • Launches server process with StdioServerParameters.
  • Sets up stdin/stdout transport.
  • Manages crashes + cleanup automatically.

Session (ClientSession)

  • Handles JSON-RPC read/write.
  • Runs handshake to sync state.
  • Ensures server is ready.

Context Manager

  • __aenter__: auto-connects.
  • __aexit__: closes + kills processes safely.

Extras

  • session(): validates + returns session.
  • list_tools(): discovers tools + schemas.

Now test it out:

Basic chat with tools

Amazing! Your AI can now read and edit documents. But we can make it even better by adding Resources for consistent, efficient data access.

Enriching the Chat Interface with Resources

Now that we have the bare bones working, let's wrap up by implementing auto-suggestions, a feature you'll find in most AI tools. To keep things simple while still enhancing the user experience, we'll make it so that when the user types @, they get auto-suggestions for the available documents.

Let's import the necessary libraries:
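Assuming prompt_toolkit for the interactive prompt; its completer API matches the behavior described in this section:

```python
from prompt_toolkit import PromptSession
from prompt_toolkit.completion import Completer, Completion
```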

Now we'll create a custom completer class that fetches the document names from the MCP server and offers them as suggestions when the user types @:
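A sketch of a completer that triggers on "@" and filters as you type; note that `document` here is prompt_toolkit's input buffer, not one of our documents:

```python
class DocumentCompleter(Completer):
    def __init__(self, documents: list[str]):
        self.documents = documents  # document IDs fetched from the MCP server

    def get_completions(self, document, complete_event):
        text = document.text_before_cursor
        at = text.rfind("@")
        if at == -1:
            return  # only complete after an "@"
        prefix = text[at + 1:].lower()
        for doc_id in self.documents:
            if prefix in doc_id.lower():  # simple substring/fuzzy match
                yield Completion(
                    doc_id,
                    start_position=-len(prefix),
                    display_meta="Document",
                )
```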

Now let's update the main function to use this completer and fetch the documents from the MCP server:
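A sketch building on the pieces above; the resource URI (docs://documents) and the fetch_document_ids helper are hypothetical, assuming the server exposes a resource that lists document IDs as a JSON array:

```python
import asyncio
import json


async def fetch_document_ids(client: MCPClient) -> list[str]:
    # Hypothetical helper: reads a resource that returns document IDs as JSON.
    result = await client.session().read_resource("docs://documents")
    return json.loads(result.contents[0].text)


async def main():
    async with MCPClient(command="uv", args=["run", "mcp_server.py"]) as doc_client:
        chat = Chat(clients={"docs": doc_client})
        completer = DocumentCompleter(await fetch_document_ids(doc_client))
        session = PromptSession("> ", completer=completer)
        while True:
            user_input = (await session.prompt_async()).strip()
            if user_input.lower() in {"quit", "exit"}:
                break
            if not user_input:
                continue
            async for event in chat.run(user_input):
                if event["event"] == "on_chat_model_stream":
                    print(event["data"]["chunk"].content, end="", flush=True)
            print()


if __name__ == "__main__":
    asyncio.run(main())
```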

Auto-completion in action

Key features of this implementation:

  • Automatic triggering: When you type @, it immediately shows completions
  • Fuzzy matching: Type @fin and it will suggest financials.docx
  • Real-time filtering: As you type more characters, the suggestions narrow down
  • Visual feedback: Shows “Document” as metadata for each suggestion

Testing with MCP Inspector

The MCP Inspector is a fantastic tool for debugging and exploring your MCP server. Let’s use it to see what we’ve built:
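Assuming the server lives in mcp_server.py, you can point the Inspector at it like this:

```bash
npx @modelcontextprotocol/inspector uv run mcp_server.py
```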

The MCP Inspector interface

This opens a web interface where you can:

  • See all your tools, resources, and prompts
  • Test tool calls interactively
  • Explore resource URIs
  • Debug your MCP implementation

The Inspector shows you exactly what your AI can see and do - it’s invaluable for development and debugging.

Your Complete MCP Application

Congratulations! You’ve built a complete MCP-powered AI application. Here’s what you’ve accomplished:

  • Core chat system with LangChain and Gemini integration
  • MCP server with tools for document management
  • MCP client for seamless communication
  • Resources for efficient data access
  • Prompts for consistent AI behavior
  • Inspector integration for development and debugging

🚀 Complete Implementation with Advanced Features

The full example includes additional features such as prompts, logging and reporting, a richer CLI interface, and error handling. You now have everything you need to build sophisticated AI applications that can actually interact with real-world data and systems.

Other helpful resources:

Learn and get certified by Anthropic for free!
