{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Memory and RAG\n",
    "\n",
    "There are several use cases where it is valuable to maintain a _store_ of useful facts that can be intelligently added to the context of the agent just before a specific step. The typically use case here is a RAG pattern where a query is used to retrieve relevant information from a database that is then added to the agent's context.\n",
    "\n",
    "\n",
    "AgentChat provides a {py:class}`~autogen_core.memory.Memory` protocol that can be extended to provide this functionality.  The key methods are `query`, `update_context`,  `add`, `clear`, and `close`. \n",
    "\n",
    "- `add`: add new entries to the memory store\n",
    "- `query`: retrieve relevant information from the memory store \n",
    "- `update_context`: mutate an agent's internal `model_context` by adding the retrieved information (used in the {py:class}`~autogen_agentchat.agents.AssistantAgent` class) \n",
    "- `clear`: clear all entries from the memory store\n",
    "- `close`: clean up any resources used by the memory store  \n",
    "\n",
    "\n",
    "## ListMemory Example\n",
    "\n",
    "{py:class}`~autogen_core.memory.ListMemory` is provided as an example implementation of the {py:class}`~autogen_core.memory.Memory` protocol. It is a simple list-based memory implementation that maintains memories in chronological order, appending the most recent memories to the model's context. The implementation is designed to be straightforward and predictable, making it easy to understand and debug.\n",
    "In the following example, we will use ListMemory to maintain a memory bank of user preferences and demonstrate how it can be used to provide consistent context for agent responses over time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from autogen_agentchat.agents import AssistantAgent\n",
    "from autogen_agentchat.ui import Console\n",
    "from autogen_core.memory import ListMemory, MemoryContent, MemoryMimeType\n",
    "from autogen_ext.models.openai import OpenAIChatCompletionClient"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Initialize user memory\n",
    "user_memory = ListMemory()\n",
    "\n",
    "# Add user preferences to memory\n",
    "await user_memory.add(MemoryContent(content=\"The weather should be in metric units\", mime_type=MemoryMimeType.TEXT))\n",
    "\n",
    "await user_memory.add(MemoryContent(content=\"Meal recipe must be vegan\", mime_type=MemoryMimeType.TEXT))\n",
    "\n",
    "\n",
    "async def get_weather(city: str, units: str = \"imperial\") -> str:\n",
    "    if units == \"imperial\":\n",
    "        return f\"The weather in {city} is 73 °F and Sunny.\"\n",
    "    elif units == \"metric\":\n",
    "        return f\"The weather in {city} is 23 °C and Sunny.\"\n",
    "    else:\n",
    "        return f\"Sorry, I don't know the weather in {city}.\"\n",
    "\n",
    "\n",
    "assistant_agent = AssistantAgent(\n",
    "    name=\"assistant_agent\",\n",
    "    model_client=OpenAIChatCompletionClient(\n",
    "        model=\"gpt-4o-2024-08-06\",\n",
    "    ),\n",
    "    tools=[get_weather],\n",
    "    memory=[user_memory],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---------- TextMessage (user) ----------\n",
      "What is the weather in New York?\n",
      "---------- MemoryQueryEvent (assistant_agent) ----------\n",
      "[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None), MemoryContent(content='Meal recipe must be vegan', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None)]\n",
      "---------- ToolCallRequestEvent (assistant_agent) ----------\n",
      "[FunctionCall(id='call_33uMqZO6hwOfEpJavP9GW9LI', arguments='{\"city\":\"New York\",\"units\":\"metric\"}', name='get_weather')]\n",
      "---------- ToolCallExecutionEvent (assistant_agent) ----------\n",
      "[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', name='get_weather', call_id='call_33uMqZO6hwOfEpJavP9GW9LI', is_error=False)]\n",
      "---------- ToolCallSummaryMessage (assistant_agent) ----------\n",
      "The weather in New York is 23 °C and Sunny.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "TaskResult(messages=[TextMessage(source='user', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 7, 1, 23, 53, 8, 867845, tzinfo=datetime.timezone.utc), content='What is the weather in New York?', type='TextMessage'), MemoryQueryEvent(source='assistant_agent', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 7, 1, 23, 53, 8, 869589, tzinfo=datetime.timezone.utc), content=[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None), MemoryContent(content='Meal recipe must be vegan', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None)], type='MemoryQueryEvent'), ToolCallRequestEvent(source='assistant_agent', models_usage=RequestUsage(prompt_tokens=123, completion_tokens=19), metadata={}, created_at=datetime.datetime(2025, 7, 1, 23, 53, 10, 240626, tzinfo=datetime.timezone.utc), content=[FunctionCall(id='call_33uMqZO6hwOfEpJavP9GW9LI', arguments='{\"city\":\"New York\",\"units\":\"metric\"}', name='get_weather')], type='ToolCallRequestEvent'), ToolCallExecutionEvent(source='assistant_agent', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 7, 1, 23, 53, 10, 242633, tzinfo=datetime.timezone.utc), content=[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', name='get_weather', call_id='call_33uMqZO6hwOfEpJavP9GW9LI', is_error=False)], type='ToolCallExecutionEvent'), ToolCallSummaryMessage(source='assistant_agent', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 7, 1, 23, 53, 10, 243722, tzinfo=datetime.timezone.utc), content='The weather in New York is 23 °C and Sunny.', type='ToolCallSummaryMessage')], stop_reason=None)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Run the agent with a task.\n",
    "stream = assistant_agent.run_stream(task=\"What is the weather in New York?\")\n",
    "await Console(stream)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can inspect that the `assistant_agent` model_context is actually updated with the retrieved memory entries.  The `transform` method is used to format the retrieved memory entries into a string that can be used by the agent.  In this case, we simply concatenate the content of each memory entry into a single string."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[UserMessage(content='What is the weather in New York?', source='user', type='UserMessage'),\n",
       " SystemMessage(content='\\nRelevant memory content (in chronological order):\\n1. The weather should be in metric units\\n2. Meal recipe must be vegan\\n', type='SystemMessage'),\n",
       " AssistantMessage(content=[FunctionCall(id='call_33uMqZO6hwOfEpJavP9GW9LI', arguments='{\"city\":\"New York\",\"units\":\"metric\"}', name='get_weather')], thought=None, source='assistant_agent', type='AssistantMessage'),\n",
       " FunctionExecutionResultMessage(content=[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', name='get_weather', call_id='call_33uMqZO6hwOfEpJavP9GW9LI', is_error=False)], type='FunctionExecutionResultMessage')]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "await assistant_agent._model_context.get_messages()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We see above that the weather is returned in Centigrade as stated in the user preferences. \n",
    "\n",
    "Similarly, assuming we ask a separate question about generating a meal plan, the agent is able to retrieve relevant information from the memory store and provide a personalized (vegan) response."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---------- TextMessage (user) ----------\n",
      "Write brief meal recipe with broth\n",
      "---------- MemoryQueryEvent (assistant_agent) ----------\n",
      "[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None), MemoryContent(content='Meal recipe must be vegan', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None)]\n",
      "---------- TextMessage (assistant_agent) ----------\n",
      "Here's a brief vegan meal recipe using broth:\n",
      "\n",
      "**Vegan Vegetable Broth Soup**\n",
      "\n",
      "**Ingredients:**\n",
      "- 1 tablespoon olive oil\n",
      "- 1 onion, chopped\n",
      "- 3 cloves garlic, minced\n",
      "- 2 carrots, sliced\n",
      "- 2 celery stalks, sliced\n",
      "- 1 zucchini, chopped\n",
      "- 1 cup mushrooms, sliced\n",
      "- 1 cup kale or spinach, chopped\n",
      "- 1 can (400g) diced tomatoes\n",
      "- 4 cups vegetable broth\n",
      "- 1 teaspoon dried thyme\n",
      "- Salt and pepper to taste\n",
      "- Fresh parsley, chopped (for garnish)\n",
      "\n",
      "**Instructions:**\n",
      "1. Heat olive oil in a large pot over medium heat. Add the onion and garlic, and sauté until soft.\n",
      "2. Add the carrots, celery, zucchini, and mushrooms. Cook for about 5 minutes until the vegetables begin to soften.\n",
      "3. Add the diced tomatoes, vegetable broth, and dried thyme. Bring to a boil.\n",
      "4. Reduce heat and let it simmer for about 20 minutes, or until the vegetables are tender.\n",
      "5. Stir in the chopped kale or spinach and cook for another 5 minutes.\n",
      "6. Season with salt and pepper to taste.\n",
      "7. Serve hot, garnished with fresh parsley.\n",
      "\n",
      "Enjoy your comforting vegan vegetable broth soup!\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "TaskResult(messages=[TextMessage(source='user', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 7, 1, 23, 53, 10, 256897, tzinfo=datetime.timezone.utc), content='Write brief meal recipe with broth', type='TextMessage'), MemoryQueryEvent(source='assistant_agent', models_usage=None, metadata={}, created_at=datetime.datetime(2025, 7, 1, 23, 53, 10, 258468, tzinfo=datetime.timezone.utc), content=[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None), MemoryContent(content='Meal recipe must be vegan', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None)], type='MemoryQueryEvent'), TextMessage(source='assistant_agent', models_usage=RequestUsage(prompt_tokens=205, completion_tokens=266), metadata={}, created_at=datetime.datetime(2025, 7, 1, 23, 53, 14, 67151, tzinfo=datetime.timezone.utc), content=\"Here's a brief vegan meal recipe using broth:\\n\\n**Vegan Vegetable Broth Soup**\\n\\n**Ingredients:**\\n- 1 tablespoon olive oil\\n- 1 onion, chopped\\n- 3 cloves garlic, minced\\n- 2 carrots, sliced\\n- 2 celery stalks, sliced\\n- 1 zucchini, chopped\\n- 1 cup mushrooms, sliced\\n- 1 cup kale or spinach, chopped\\n- 1 can (400g) diced tomatoes\\n- 4 cups vegetable broth\\n- 1 teaspoon dried thyme\\n- Salt and pepper to taste\\n- Fresh parsley, chopped (for garnish)\\n\\n**Instructions:**\\n1. Heat olive oil in a large pot over medium heat. Add the onion and garlic, and sauté until soft.\\n2. Add the carrots, celery, zucchini, and mushrooms. Cook for about 5 minutes until the vegetables begin to soften.\\n3. Add the diced tomatoes, vegetable broth, and dried thyme. Bring to a boil.\\n4. Reduce heat and let it simmer for about 20 minutes, or until the vegetables are tender.\\n5. Stir in the chopped kale or spinach and cook for another 5 minutes.\\n6. Season with salt and pepper to taste.\\n7. Serve hot, garnished with fresh parsley.\\n\\nEnjoy your comforting vegan vegetable broth soup!\", type='TextMessage')], stop_reason=None)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stream = assistant_agent.run_stream(task=\"Write brief meal recipe with broth\")\n",
    "await Console(stream)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Custom Memory Stores (Vector DBs, etc.)\n",
    "\n",
    "You can build on the `Memory` protocol to implement more complex memory stores. For example, you could implement a custom memory store that uses a vector database to store and retrieve information, or a memory store that uses a machine learning model to generate personalized responses based on the user's preferences etc.\n",
    "\n",
    "Specifically, you will need to overload the `add`, `query` and `update_context`  methods to implement the desired functionality and pass the memory store to your agent.\n",
    "\n",
    "\n",
    "Currently the following example memory stores are available as part of the {py:class}`~autogen_ext` extensions package.\n",
    "\n",
    "- `autogen_ext.memory.chromadb.ChromaDBVectorMemory`: A memory store that uses a vector database to store and retrieve information.\n",
    "\n",
    "- `autogen_ext.memory.chromadb.SentenceTransformerEmbeddingFunctionConfig`: A configuration class for the SentenceTransformer embedding function used by the `ChromaDBVectorMemory` store. Note that other embedding functions such as `autogen_ext.memory.openai.OpenAIEmbeddingFunctionConfig` can also be used with the `ChromaDBVectorMemory` store.\n",
    "\n",
    "- `autogen_ext.memory.redis.RedisMemory`: A memory store that uses a Redis vector database to store and retrieve information.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---------- TextMessage (user) ----------\n",
      "What is the weather in New York?\n",
      "---------- MemoryQueryEvent (assistant_agent) ----------\n",
      "[MemoryContent(content='The weather should be in metric units', mime_type='MemoryMimeType.TEXT', metadata={'category': 'preferences', 'mime_type': 'MemoryMimeType.TEXT', 'type': 'units', 'score': 0.4342913031578064, 'id': 'b8a70e90-a39f-47ed-ab7b-5a274009d9f0'}), MemoryContent(content='The weather should be in metric units', mime_type='MemoryMimeType.TEXT', metadata={'mime_type': 'MemoryMimeType.TEXT', 'type': 'units', 'category': 'preferences', 'score': 0.4342913031578064, 'id': 'b240f12a-1440-42d1-8f5e-3d8a388363f2'})]\n",
      "---------- ToolCallRequestEvent (assistant_agent) ----------\n",
      "[FunctionCall(id='call_YmKqq1nWXgAkAAyXWWk9YpFW', arguments='{\"city\":\"New York\",\"units\":\"metric\"}', name='get_weather')]\n",
      "---------- ToolCallExecutionEvent (assistant_agent) ----------\n",
      "[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', name='get_weather', call_id='call_YmKqq1nWXgAkAAyXWWk9YpFW', is_error=False)]\n",
      "---------- ToolCallSummaryMessage (assistant_agent) ----------\n",
      "The weather in New York is 23 °C and Sunny.\n"
     ]
    }
   ],
   "source": [
    "import tempfile\n",
    "\n",
    "from autogen_agentchat.agents import AssistantAgent\n",
    "from autogen_agentchat.ui import Console\n",
    "from autogen_core.memory import MemoryContent, MemoryMimeType\n",
    "from autogen_ext.memory.chromadb import (\n",
    "    ChromaDBVectorMemory,\n",
    "    PersistentChromaDBVectorMemoryConfig,\n",
    "    SentenceTransformerEmbeddingFunctionConfig,\n",
    ")\n",
    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
    "\n",
    "# Use a temporary directory for ChromaDB persistence\n",
    "with tempfile.TemporaryDirectory() as tmpdir:\n",
    "    chroma_user_memory = ChromaDBVectorMemory(\n",
    "        config=PersistentChromaDBVectorMemoryConfig(\n",
    "            collection_name=\"preferences\",\n",
    "            persistence_path=tmpdir,  # Use the temp directory here\n",
    "            k=2,  # Return top k results\n",
    "            score_threshold=0.4,  # Minimum similarity score\n",
    "            embedding_function_config=SentenceTransformerEmbeddingFunctionConfig(\n",
    "                model_name=\"all-MiniLM-L6-v2\"  # Use default model for testing\n",
    "            ),\n",
    "        )\n",
    "    )\n",
    "    # Add user preferences to memory\n",
    "    await chroma_user_memory.add(\n",
    "        MemoryContent(\n",
    "            content=\"The weather should be in metric units\",\n",
    "            mime_type=MemoryMimeType.TEXT,\n",
    "            metadata={\"category\": \"preferences\", \"type\": \"units\"},\n",
    "        )\n",
    "    )\n",
    "\n",
    "    await chroma_user_memory.add(\n",
    "        MemoryContent(\n",
    "            content=\"Meal recipe must be vegan\",\n",
    "            mime_type=MemoryMimeType.TEXT,\n",
    "            metadata={\"category\": \"preferences\", \"type\": \"dietary\"},\n",
    "        )\n",
    "    )\n",
    "\n",
    "    model_client = OpenAIChatCompletionClient(\n",
    "        model=\"gpt-4o\",\n",
    "    )\n",
    "\n",
    "    # Create assistant agent with ChromaDB memory\n",
    "    assistant_agent = AssistantAgent(\n",
    "        name=\"assistant_agent\",\n",
    "        model_client=model_client,\n",
    "        tools=[get_weather],\n",
    "        memory=[chroma_user_memory],\n",
    "    )\n",
    "\n",
    "    stream = assistant_agent.run_stream(task=\"What is the weather in New York?\")\n",
    "    await Console(stream)\n",
    "\n",
    "    await model_client.close()\n",
    "    await chroma_user_memory.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that you can also serialize the ChromaDBVectorMemory and save it to disk."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'{\"provider\":\"autogen_ext.memory.chromadb.ChromaDBVectorMemory\",\"component_type\":\"memory\",\"version\":1,\"component_version\":1,\"description\":\"Store and retrieve memory using vector similarity search powered by ChromaDB.\",\"label\":\"ChromaDBVectorMemory\",\"config\":{\"client_type\":\"persistent\",\"collection_name\":\"preferences\",\"distance_metric\":\"cosine\",\"k\":2,\"score_threshold\":0.4,\"allow_reset\":false,\"tenant\":\"default_tenant\",\"database\":\"default_database\",\"persistence_path\":\"/Users/justin.cechmanek/.chromadb_autogen\"}}'"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chroma_user_memory.dump_component().model_dump_json()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Redis Memory\n",
    "You can perform the same persistent memory storage using Redis. Note, you will need to have a running Redis instance to connect to.\n",
    "\n",
    "See {py:class}`~autogen_ext.memory.redis.RedisMemory` for instructions to run Redis locally or via Docker."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---------- TextMessage (user) ----------\n",
      "What is the weather in New York?\n",
      "---------- MemoryQueryEvent (assistant_agent) ----------\n",
      "[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata={'category': 'preferences', 'type': 'units'})]\n",
      "---------- ToolCallRequestEvent (assistant_agent) ----------\n",
      "[FunctionCall(id='call_1R6wV3uDOK8mGK2Vh2t0h4ld', arguments='{\"city\":\"New York\",\"units\":\"metric\"}', name='get_weather')]\n",
      "---------- ToolCallExecutionEvent (assistant_agent) ----------\n",
      "[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', name='get_weather', call_id='call_1R6wV3uDOK8mGK2Vh2t0h4ld', is_error=False)]\n",
      "---------- ToolCallSummaryMessage (assistant_agent) ----------\n",
      "The weather in New York is 23 °C and Sunny.\n"
     ]
    }
   ],
   "source": [
    "from logging import WARNING, getLogger\n",
    "\n",
    "from autogen_agentchat.agents import AssistantAgent\n",
    "from autogen_agentchat.ui import Console\n",
    "from autogen_core.memory import MemoryContent, MemoryMimeType\n",
    "from autogen_ext.memory.redis import RedisMemory, RedisMemoryConfig\n",
    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
    "\n",
    "logger = getLogger()\n",
    "logger.setLevel(WARNING)\n",
    "\n",
    "# Initailize Redis memory\n",
    "redis_memory = RedisMemory(\n",
    "    config=RedisMemoryConfig(\n",
    "        redis_url=\"redis://localhost:6379\",\n",
    "        index_name=\"chat_history\",\n",
    "        prefix=\"memory\",\n",
    "    )\n",
    ")\n",
    "\n",
    "# Add user preferences to memory\n",
    "await redis_memory.add(\n",
    "    MemoryContent(\n",
    "        content=\"The weather should be in metric units\",\n",
    "        mime_type=MemoryMimeType.TEXT,\n",
    "        metadata={\"category\": \"preferences\", \"type\": \"units\"},\n",
    "    )\n",
    ")\n",
    "\n",
    "await redis_memory.add(\n",
    "    MemoryContent(\n",
    "        content=\"Meal recipe must be vegan\",\n",
    "        mime_type=MemoryMimeType.TEXT,\n",
    "        metadata={\"category\": \"preferences\", \"type\": \"dietary\"},\n",
    "    )\n",
    ")\n",
    "\n",
    "model_client = OpenAIChatCompletionClient(\n",
    "    model=\"gpt-4o\",\n",
    ")\n",
    "\n",
    "# Create assistant agent with ChromaDB memory\n",
    "assistant_agent = AssistantAgent(\n",
    "    name=\"assistant_agent\",\n",
    "    model_client=model_client,\n",
    "    tools=[get_weather],\n",
    "    memory=[redis_memory],\n",
    ")\n",
    "\n",
    "stream = assistant_agent.run_stream(task=\"What is the weather in New York?\")\n",
    "await Console(stream)\n",
    "\n",
    "await model_client.close()\n",
    "await redis_memory.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## RAG Agent: Putting It All Together\n",
    "\n",
    "The RAG (Retrieval Augmented Generation) pattern which is common in building AI systems encompasses two distinct phases:\n",
    "\n",
    "1. **Indexing**: Loading documents, chunking them, and storing them in a vector database\n",
    "2. **Retrieval**: Finding and using relevant chunks during conversation runtime\n",
    "\n",
    "In our previous examples, we manually added items to memory and passed them to our agents. In practice, the indexing process is usually automated and based on much larger document sources like product documentation, internal files, or knowledge bases.\n",
    "\n",
    "> Note: The quality of a RAG system is dependent on the quality of the chunking and retrieval process (models, embeddings, etc.). You may need to experiement with more advanced chunking and retrieval models to get the best results.\n",
    "\n",
    "### Building a Simple RAG Agent\n",
    "\n",
    "To begin, let's create a simple document indexer that we will used to load documents, chunk them, and store them in a `ChromaDBVectorMemory` memory store. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "from typing import List\n",
    "\n",
    "import aiofiles\n",
    "import aiohttp\n",
    "from autogen_core.memory import Memory, MemoryContent, MemoryMimeType\n",
    "\n",
    "\n",
    "class SimpleDocumentIndexer:\n",
    "    \"\"\"Basic document indexer for AutoGen Memory.\"\"\"\n",
    "\n",
    "    def __init__(self, memory: Memory, chunk_size: int = 1500) -> None:\n",
    "        self.memory = memory\n",
    "        self.chunk_size = chunk_size\n",
    "\n",
    "    async def _fetch_content(self, source: str) -> str:\n",
    "        \"\"\"Fetch content from URL or file.\"\"\"\n",
    "        if source.startswith((\"http://\", \"https://\")):\n",
    "            async with aiohttp.ClientSession() as session:\n",
    "                async with session.get(source) as response:\n",
    "                    return await response.text()\n",
    "        else:\n",
    "            async with aiofiles.open(source, \"r\", encoding=\"utf-8\") as f:\n",
    "                return await f.read()\n",
    "\n",
    "    def _strip_html(self, text: str) -> str:\n",
    "        \"\"\"Remove HTML tags and normalize whitespace.\"\"\"\n",
    "        text = re.sub(r\"<[^>]*>\", \" \", text)\n",
    "        text = re.sub(r\"\\s+\", \" \", text)\n",
    "        return text.strip()\n",
    "\n",
    "    def _split_text(self, text: str) -> List[str]:\n",
    "        \"\"\"Split text into fixed-size chunks.\"\"\"\n",
    "        chunks: list[str] = []\n",
    "        # Just split text into fixed-size chunks\n",
    "        for i in range(0, len(text), self.chunk_size):\n",
    "            chunk = text[i : i + self.chunk_size]\n",
    "            chunks.append(chunk.strip())\n",
    "        return chunks\n",
    "\n",
    "    async def index_documents(self, sources: List[str]) -> int:\n",
    "        \"\"\"Index documents into memory.\"\"\"\n",
    "        total_chunks = 0\n",
    "\n",
    "        for source in sources:\n",
    "            try:\n",
    "                content = await self._fetch_content(source)\n",
    "\n",
    "                # Strip HTML if content appears to be HTML\n",
    "                if \"<\" in content and \">\" in content:\n",
    "                    content = self._strip_html(content)\n",
    "\n",
    "                chunks = self._split_text(content)\n",
    "\n",
    "                for i, chunk in enumerate(chunks):\n",
    "                    await self.memory.add(\n",
    "                        MemoryContent(\n",
    "                            content=chunk, mime_type=MemoryMimeType.TEXT, metadata={\"source\": source, \"chunk_index\": i}\n",
    "                        )\n",
    "                    )\n",
    "\n",
    "                total_chunks += len(chunks)\n",
    "\n",
    "            except Exception as e:\n",
    "                print(f\"Error indexing {source}: {str(e)}\")\n",
    "\n",
    "        return total_chunks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    " \n",
    "Now let's use our indexer with ChromaDBVectorMemory to build a complete RAG agent:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Indexed 70 chunks from 4 AutoGen documents\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "from pathlib import Path\n",
    "\n",
    "from autogen_agentchat.agents import AssistantAgent\n",
    "from autogen_agentchat.ui import Console\n",
    "from autogen_ext.memory.chromadb import ChromaDBVectorMemory, PersistentChromaDBVectorMemoryConfig\n",
    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
    "\n",
    "# Initialize vector memory\n",
    "\n",
    "rag_memory = ChromaDBVectorMemory(\n",
    "    config=PersistentChromaDBVectorMemoryConfig(\n",
    "        collection_name=\"autogen_docs\",\n",
    "        persistence_path=os.path.join(str(Path.home()), \".chromadb_autogen\"),\n",
    "        k=3,  # Return top 3 results\n",
    "        score_threshold=0.4,  # Minimum similarity score\n",
    "    )\n",
    ")\n",
    "\n",
    "await rag_memory.clear()  # Clear existing memory\n",
    "\n",
    "\n",
    "# Index AutoGen documentation\n",
    "async def index_autogen_docs() -> None:\n",
    "    indexer = SimpleDocumentIndexer(memory=rag_memory)\n",
    "    sources = [\n",
    "        \"https://raw.githubusercontent.com/microsoft/autogen/main/README.md\",\n",
    "        \"https://microsoft.github.io/autogen/dev/user-guide/agentchat-user-guide/tutorial/agents.html\",\n",
    "        \"https://microsoft.github.io/autogen/dev/user-guide/agentchat-user-guide/tutorial/teams.html\",\n",
    "        \"https://microsoft.github.io/autogen/dev/user-guide/agentchat-user-guide/tutorial/termination.html\",\n",
    "    ]\n",
    "    chunks: int = await indexer.index_documents(sources)\n",
    "    print(f\"Indexed {chunks} chunks from {len(sources)} AutoGen documents\")\n",
    "\n",
    "\n",
    "await index_autogen_docs()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---------- TextMessage (user) ----------\n",
      "What is AgentChat?\n",
      "---------- MemoryQueryEvent (rag_assistant) ----------\n",
      "[MemoryContent(content='e of the AssistantAgent , we can now proceed to the next section to learn about the teams feature in AgentChat. previous Messages next Teams On this page Assistant Agent Getting Result Multi-Modal Input Streaming Messages Using Tools and Workbench Built-in Tools and Workbench Function Tool Model Context Protocol (MCP) Workbench Agent as a Tool Parallel Tool Calls Tool Iterations Structured Output Streaming Tokens Using Model Context Other Preset Agents Next Step Edit on GitHub Show Source so the DOM is not blocked --> © Copyright 2024, Microsoft. Privacy Policy | Consumer Health Privacy Built with the PyData Sphinx Theme 0.16.0.', mime_type='MemoryMimeType.TEXT', metadata={'chunk_index': 16, 'mime_type': 'MemoryMimeType.TEXT', 'source': 'https://microsoft.github.io/autogen/dev/user-guide/agentchat-user-guide/tutorial/agents.html', 'score': 0.6237251460552216, 'id': '6457da13-1c25-44f0-bea3-158e5c0c5bb4'}), MemoryContent(content='h Literature Review API Reference PyPi Source AgentChat Agents Agents # AutoGen AgentChat provides a set of preset Agents, each with variations in how an agent might respond to messages. All agents share the following attributes and methods: name : The unique name of the agent. description : The description of the agent in text. run : The method that runs the agent given a task as a string or a list of messages, and returns a TaskResult . Agents are expected to be stateful and this method is expected to be called with new messages, not complete history . run_stream : Same as run() but returns an iterator of messages that subclass BaseAgentEvent or BaseChatMessage followed by a TaskResult as the last item. See autogen_agentchat.messages for more information on AgentChat message types. Assistant Agent # AssistantAgent is a built-in agent that uses a language model and has the ability to use tools. Warning AssistantAgent is a “kitchen sink” agent for prototyping and educational purpose – it is very general. Make sure you read the documentation and implementation to understand the design choices. Once you fully understand the design, you may want to implement your own agent. See Custom Agent . from autogen_agentchat.agents import AssistantAgent from autogen_agentchat.messages import StructuredMessage from autogen_agentchat.ui import Console from autogen_ext.models.openai import OpenAIChatCompletionClient # Define a tool that searches the web for information. # For simplicity, we', mime_type='MemoryMimeType.TEXT', metadata={'chunk_index': 1, 'mime_type': 'MemoryMimeType.TEXT', 'source': 'https://microsoft.github.io/autogen/dev/user-guide/agentchat-user-guide/tutorial/agents.html', 'score': 0.6212755441665649, 'id': 'ab3a553f-bb69-41ff-b6a9-8397b4cb3cb1'}), MemoryContent(content='Literature Review API Reference PyPi Source AgentChat Teams Teams # In this section you’ll learn how to create a multi-agent team (or simply team) using AutoGen. A team is a group of agents that work together to achieve a common goal. We’ll first show you how to create and run a team. We’ll then explain how to observe the team’s behavior, which is crucial for debugging and understanding the team’s performance, and common operations to control the team’s behavior. AgentChat supports several team presets: RoundRobinGroupChat : A team that runs a group chat with participants taking turns in a round-robin fashion (covered on this page). Tutorial SelectorGroupChat : A team that selects the next speaker using a ChatCompletion model after each message. Tutorial MagenticOneGroupChat : A generalist multi-agent system for solving open-ended web and file-based tasks across a variety of domains. Tutorial Swarm : A team that uses HandoffMessage to signal transitions between agents. Tutorial Note When should you use a team? Teams are for complex tasks that require collaboration and diverse expertise. However, they also demand more scaffolding to steer compared to single agents. While AutoGen simplifies the process of working with teams, start with a single agent for simpler tasks, and transition to a multi-agent team when a single agent proves inadequate. Ensure that you have optimized your single agent with the appropriate tools and instructions before moving to a team-based approach. Cre', mime_type='MemoryMimeType.TEXT', metadata={'mime_type': 'MemoryMimeType.TEXT', 'chunk_index': 1, 'source': 'https://microsoft.github.io/autogen/dev/user-guide/agentchat-user-guide/tutorial/teams.html', 'score': 0.5267025232315063, 'id': '554b20a9-e041-4ac6-b2f1-11261336861c'})]\n",
      "---------- TextMessage (rag_assistant) ----------\n",
      "AgentChat is a framework that provides a set of preset agents designed to handle conversations and tasks using a variety of response strategies. It includes features for managing individual agents as well as creating teams of agents that can work collaboratively on complex goals. These agents are stateful, meaning they can manage and track ongoing conversations. AgentChat also includes agents that can utilize tools to enhance their capabilities.\n",
      "\n",
      "Key features of AgentChat include:\n",
      "- **Preset Agents**: These agents are pre-configured with specific behavior patterns for handling tasks and messages.\n",
      "- **Agent Attributes and Methods**: Each agent has a unique name and description, and methods like `run` and `run_stream` to execute tasks and handle messages.\n",
      "- **AssistantAgent**: A built-in general-purpose agent used primarily for prototyping and educational purposes.\n",
      "- **Team Configurations**: AgentChat allows for the creation of multi-agent teams for tasks that are too complex for a single agent. Teams run in preset formats like RoundRobinGroupChat or Swarm, providing structured interaction among agents.\n",
      "\n",
      "Overall, AgentChat is designed for flexible deployment of conversational agents, either singly or in groups, across a variety of tasks. \n",
      "\n",
      "TERMINATE\n"
     ]
    }
   ],
   "source": [
    "# Create our RAG assistant agent\n",
    "rag_assistant = AssistantAgent(\n",
    "    name=\"rag_assistant\", model_client=OpenAIChatCompletionClient(model=\"gpt-4o\"), memory=[rag_memory]\n",
    ")\n",
    "\n",
    "# Ask questions about AutoGen\n",
    "stream = rag_assistant.run_stream(task=\"What is AgentChat?\")\n",
    "await Console(stream)\n",
    "\n",
    "# Remember to close the memory when done\n",
    "await rag_memory.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This implementation provides a RAG agent that can answer questions based on AutoGen documentation. When a question is asked, the Memory system  retrieves relevant chunks and adds them to the context, enabling the assistant to generate informed responses.\n",
    "\n",
    "For production systems, you might want to:\n",
    "1. Implement more sophisticated chunking strategies\n",
    "2. Add metadata filtering capabilities\n",
    "3. Customize the retrieval scoring\n",
    "4. Optimize embedding models for your specific domain\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Mem0Memory Example\n",
    "\n",
    "`autogen_ext.memory.mem0.Mem0Memory` provides integration with `Mem0.ai`'s memory system. It supports both cloud-based and local backends, offering advanced memory capabilities for agents. The implementation handles proper retrieval and context updating, making it suitable for production environments.\n",
    "\n",
    "In the following example, we'll demonstrate how to use `Mem0Memory` to maintain persistent memories across conversations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "from autogen_agentchat.agents import AssistantAgent\n",
    "from autogen_agentchat.ui import Console\n",
    "from autogen_core.memory import MemoryContent, MemoryMimeType\n",
    "from autogen_ext.memory.mem0 import Mem0Memory\n",
    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
    "\n",
    "# Initialize Mem0 cloud memory (requires API key)\n",
    "# For local deployment, use is_cloud=False with appropriate config\n",
    "mem0_memory = Mem0Memory(\n",
    "    is_cloud=True,\n",
    "    limit=5,  # Maximum number of memories to retrieve\n",
    ")\n",
    "\n",
    "# Add user preferences to memory\n",
    "await mem0_memory.add(\n",
    "    MemoryContent(\n",
    "        content=\"The weather should be in metric units\",\n",
    "        mime_type=MemoryMimeType.TEXT,\n",
    "        metadata={\"category\": \"preferences\", \"type\": \"units\"},\n",
    "    )\n",
    ")\n",
    "\n",
    "await mem0_memory.add(\n",
    "    MemoryContent(\n",
    "        content=\"Meal recipe must be vegan\",\n",
    "        mime_type=MemoryMimeType.TEXT,\n",
    "        metadata={\"category\": \"preferences\", \"type\": \"dietary\"},\n",
    "    )\n",
    ")\n",
    "\n",
    "# Create assistant with mem0 memory\n",
    "assistant_agent = AssistantAgent(\n",
    "    name=\"assistant_agent\",\n",
    "    model_client=OpenAIChatCompletionClient(\n",
    "        model=\"gpt-4o-2024-08-06\",\n",
    "    ),\n",
    "    tools=[get_weather],\n",
    "    memory=[mem0_memory],\n",
    ")\n",
    "\n",
    "# Ask about the weather\n",
    "stream = assistant_agent.run_stream(task=\"What are my dietary preferences?\")\n",
    "await Console(stream)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The example above demonstrates how Mem0Memory can be used with an assistant agent. The memory integration ensures that:\n",
    "\n",
    "1. All agent interactions are stored in Mem0 for future reference\n",
    "2. Relevant memories (like user preferences) are automatically retrieved and added to the context\n",
    "3. The agent can maintain consistent behavior based on stored memories\n",
    "\n",
    "Mem0Memory is particularly useful for:\n",
    "- Long-running agent deployments that need persistent memory\n",
    "- Applications requiring enhanced privacy controls\n",
    "- Teams wanting unified memory management across agents\n",
    "- Use cases needing advanced memory filtering and analytics\n",
    "\n",
    "Just like ChromaDBVectorMemory, you can serialize Mem0Memory configurations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Serialize the memory configuration\n",
    "config_json = mem0_memory.dump_component().model_dump_json()\n",
    "print(f\"Memory config JSON: {config_json[:100]}...\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "python",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}