Long-Term Memory
What is Memory?โ
AI memory refers to the ability to process, store, and effectively recall past interactions. This allows intelligent systems to learn from feedback and adapt to user preferences. Memory can be categorized into two types: short-term memory and long-term memory.
Short-Term Memory (memory within the scope of a session thread): The system can access short-term memory within a single session thread. This memory is managed as part of the agent's state and is persisted in a database through a checkpoint mechanism, allowing sessions to be resumed at any time. Short-term memory updates during process execution or after step completion and is read at the start of each step.
Long-Term Memory (memory shared across threads): Long-term memory applies across multiple session threads and can be accessed in any thread at any time. It is not restricted to a single thread ID but can be stored and retrieved using custom namespaces. Xpert AI provides storage functionality to save and retrieve this long-term memory.
Currently, most AI applications are like "goldfish," unable to retain information between conversations. This short-sightedness is not only inefficient but also limits the capabilities of AI.
Cross-thread Memoryโ
Cross-session memory enables AI agents to store and retrieve information across different conversation threads. This allows the agent to retain context, user preferences, and historical information across multiple interactions, delivering smarter and more personalized experiences.
Summarizing Conversations to Reduce Memory Load and Enhance Focusโ
Xpert uses a conversation summarization feature to extract key information from dialogues, reducing memory load and improving focus.
- Maximum Tolerance: Sets the threshold for the number of new messages allowed before generating a new summary. For example, in the image, this is set to 22. Even if the conversation exceeds 16 turns, the system continues recording until 22 turns before triggering a new summary. This parameter helps avoid frequent summarization and reduces computational costs.
- Number of Messages to Summarize: Specifies the number of messages to include in the summary. In the example, 16 indicates the system will summarize the first 16 of the 22 messages in the conversation.
- Number to Retain: Specifies the number of messages retained after summarization. For example, 4 means 4 messages are retained, while the rest are replaced with a summary in the system prompt.
- Prompt: Guides the system on how to summarize historical messages in the conversation. If no prompt is provided, the system uses a default one.
Configuring Long-Term Memoryโ
This configuration controls how information is extracted from conversations and stored in long-term memory.
Memory Types include User Profiles, Q&A, and Custom.
User Profileโ
- Interval (seconds): Sets the delay in seconds before extracting information after a conversation ends. The default is 10 seconds, meaning the system will create a background task for information extraction 10 seconds after the user and Xpert stop interacting.
- Prompt: Guides the system on how to extract information from conversations. If no prompt is provided, the system uses a default one.
Q&Aโ
In Q&A mode, summarized memories are created by extracting questions and answers when the user likes a response.
Custom Memoryโ
Custom memory allows users to define their own rules for extracting information.
Under Development
Managing Memoryโ
The management page serves as a centralized interface for managing and maintaining Xpert memory data.
- Score: Represents the degree of similarity during semantic searches (0โ1).
- Creator: Indicates the owner of the memory.
- Value: Stores the memory content in JSON format.
On this page, users can delete individual memories or clear all memories for a specific expert.
Semantic Search Testing allows users to test the retrieval of memories related to specific queries.