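Microsoft.Extensions.AI ships two built-in chat reducers. Without any custom development, they let an Agent automatically trim a long conversation history, keeping the key context while staying within the model's limit. Combined with session persistence, this is what makes a long-running Agent viable in production.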
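The hidden threshold of long conversations: the context window limit
Once an Agent supports session persistence, users may hold many consecutive turns (customer support, code debugging, casual chat). Every large model, however, has a fixed context window (for example, about 4k tokens for the original GPT-3.5 and 128k tokens for GPT-4o):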
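- When messages accumulate and the total token count exceeds the limit, the model call fails outright.
- Even below the limit, a large amount of redundant history adds to the model's workload and noticeably slows responses.
- History in the third-party store grows without bound, occupying storage long term and raising operating costs.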
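To make the first bullet concrete, here is a minimal sketch of a budget check that decides whether a history still fits the window. It is an illustration only: `ContextBudget`, `Fits`, `countTokens`, and `reservedForReply` are hypothetical names that do not come from Microsoft.Extensions.AI, and the token counter is deliberately left pluggable because real counts depend on the model's tokenizer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContextBudget
{
    // Returns true when the history plus a reserve for the model's reply
    // still fits inside the context window.
    public static bool Fits(
        IEnumerable<string> messages,
        Func<string, int> countTokens,   // hypothetical hook: plug in a real tokenizer here
        int contextWindow,
        int reservedForReply)
    {
        int used = messages.Sum(countTokens);
        return used + reservedForReply <= contextWindow;
    }
}
```

A check like this only detects the problem; the reducers described next are what actually shrink the history.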
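What is needed at this point is a tool that trims intelligently: a chat history reducer (Chat Reducer). Microsoft anticipated this scenario and built two core reducers into Microsoft.Extensions.AI. They work out of the box, so there is no need to reinvent the wheel.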
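Core concepts: two built-in reducers, chosen by need
The two reducers in Microsoft.Extensions.AI cover different long-conversation scenarios. My demo code supports switching between them by commenting and uncommenting. The sections below combine the official definitions with practical usage: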
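| Reducer | What it does | Key parameters | Best suited to |
|---|---|---|---|
| MessageCountingChatReducer | Keeps the most recent non-system messages plus the first system message; no semantic processing | maxMessageCount | Simple conversations where speed matters and no extra tokens should be spent |
| SummarizingChatReducer | Uses the model to condense older messages into a summary | chatClient; maxMessageCountBeforeSummarization: threshold that triggers summarization; maxSummaryCount: maximum number of summaries | Very long or complex conversations where older context must be preserved |

The parameter names above are the ones used in this article. Confirm them against the constructor signatures of the package version you install; Microsoft.Extensions.AI declares these constructor parameters as targetCount and threshold, so passing the arguments positionally is the safer choice.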
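Key supplement: official core characteristics of the two reducers
1. MessageCountingChatReducer
Essence of the official definition:
It limits the number of non-system messages in a conversation, preserving the most recent messages and the first system message (if present). Messages containing function calls or function results are excluded. It suits scenarios that need to constrain chat history size, such as fitting a model's context limit.
In short: it is a precise trimming tool. It keeps only the latest messages, does no semantic processing, runs fast, and consumes no extra tokens.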
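The behavior described above can be sketched in a few lines. This is a simplified illustration of count-based trimming, not the library's implementation: `Role`, `Msg`, `CountingTrimmer`, and the `IsFunctionRelated` flag are hypothetical stand-ins for the real `ChatMessage` types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Role { System, User, Assistant, Tool }

// Hypothetical stand-in for a chat message.
public sealed record Msg(Role Role, string Text, bool IsFunctionRelated = false);

public static class CountingTrimmer
{
    // Keeps the first system message (if any) plus the last `targetCount`
    // non-system messages, skipping function-call/result messages.
    public static List<Msg> Reduce(IEnumerable<Msg> history, int targetCount)
    {
        if (targetCount <= 0) throw new ArgumentOutOfRangeException(nameof(targetCount));

        var all = history.ToList();
        var firstSystem = all.FirstOrDefault(m => m.Role == Role.System);
        var recent = all
            .Where(m => m.Role != Role.System && !m.IsFunctionRelated)
            .TakeLast(targetCount);

        var result = new List<Msg>();
        if (firstSystem is not null) result.Add(firstSystem);
        result.AddRange(recent);
        return result;
    }
}
```

Because nothing here calls a model, the cost is a single pass over the list, which is why this style of reduction adds no token spend.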
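2. SummarizingChatReducer
Essence of the official definition:
It reduces a collection of chat messages to summary form. When the conversation exceeds the specified length, older messages are summarized automatically, preserving context while reducing the message count. System messages are kept, and function-related messages are excluded from the summary.
In short: it is an intelligent compression tool. It uses the model to condense older messages into a summary, shrinking the history while keeping its core meaning, which suits very long and complex conversations.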
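The summarize-then-keep-the-tail idea can be sketched as follows. Again this is an assumption-laden illustration rather than the library's code: `SummarizingTrimmer`, `keepRecent`, `threshold`, and the `summarizeAsync` delegate (which in a real setup would call the model) are all hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class SummarizingTrimmer
{
    // When the history grows beyond keepRecent + threshold messages, everything
    // except the last `keepRecent` messages is replaced by one summary entry.
    public static async Task<List<string>> ReduceAsync(
        IReadOnlyList<string> history,
        int keepRecent,
        int threshold,
        Func<IReadOnlyList<string>, Task<string>> summarizeAsync) // hypothetical model call
    {
        if (history.Count <= keepRecent + threshold)
        {
            return history.ToList(); // still short enough, nothing to do
        }

        var older = history.Take(history.Count - keepRecent).ToList();
        var recent = history.Skip(history.Count - keepRecent);

        string summary = await summarizeAsync(older);

        var result = new List<string> { "[summary] " + summary };
        result.AddRange(recent);
        return result;
    }
}
```

The trade-off against count-based trimming is visible in the signature: every reduction that crosses the threshold costs one extra model call.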
Key question: why define IChatClient explicitly, unlike in the previous post?
This line in the code is the key adaptation:
IChatClient chatClient = new OpenAIClient(...)!.GetChatClient(modelName).AsIChatClient();
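The reasons are straightforward:
- SummarizingChatReducer has to call the model to generate summaries, so it must be given an IChatClient instance.
- OpenAIClient.GetChatClient() returns the OpenAI SDK's own ChatClient type, which AsIChatClient() adapts to the general IChatClient interface.
- Even if you only use MessageCountingChatReducer, defining the client explicitly keeps the code tidy and means switching reducers later needs no major changes.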
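Demo in practice: third-party storage plus both reducers, integrated seamlessly
This demo builds on the optimized code and implements "session persistence + a choice of two reducers". Its goals:
- Store the conversation history in an external vector store (persisted, recoverable after a restart).
- Support switching between the two built-in reducers, with storage updated automatically after each reduction.
- Verify the reduction on exit by printing the history that was kept.
The walkthrough has two steps: preparing dependencies, then breaking down the core code.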
1. Core dependencies: confirm versions and install the required packages
Make sure the project references the following packages:

    <PackageReference Include="Microsoft.Agents.AI.OpenAI" Version="1.0.0-preview.251110.2" />
    <PackageReference Include="Microsoft.SemanticKernel.Connectors.InMemory" Version="1.67.1-preview" />
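2. Core code breakdown: the full path from storage to Agent
(1) VectorChatMessageStore: the bridge between third-party storage and the reducers
This class is the core. It handles session persistence and integrates the reducer logic, working with either built-in reducer: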
    internal sealed class VectorChatMessageStore : ChatMessageStore
    {
        private readonly VectorStore _vectorStore;                   // external storage backend

        public string? ThreadDbKey { get; private set; }              // unique conversation id
        public IChatReducer? ChatReducer { get; }                     // reducer instance (either built-in type)
        public ChatReducerTriggerEvent ReducerTriggerEvent { get; }   // when reduction runs

        // Constructor without a reducer (keeps the earlier, reducer-free usage working).
        public VectorChatMessageStore(
            VectorStore vectorStore,
            JsonElement serializedStoreState,
            JsonSerializerOptions? jsonSerializerOptions = null)
        {
            this._vectorStore = vectorStore ?? throw new ArgumentNullException(nameof(vectorStore));

            // Restore the conversation id when resuming a serialized thread.
            if (serializedStoreState.ValueKind is JsonValueKind.String)
            {
                this.ThreadDbKey = serializedStoreState.Deserialize<string>();
            }
        }

        // Constructor with a reducer (the core adaptation).
        public VectorChatMessageStore(
            IChatReducer chatReducer,
            VectorStore vectorStore,
            JsonElement serializedStoreState,
            JsonSerializerOptions? jsonSerializerOptions = null,
            ChatReducerTriggerEvent reducerTriggerEvent = ChatReducerTriggerEvent.BeforeMessagesRetrieval)
            : this(vectorStore, serializedStoreState, jsonSerializerOptions)
        {
            this.ChatReducer = chatReducer;
            this.ReducerTriggerEvent = reducerTriggerEvent;
        }

        // Core method: reduce on add (if configured) and sync the result to storage.
        public override async Task AddMessagesAsync(IEnumerable<ChatMessage> messages, CancellationToken cancellationToken = default)
        {
            this.ThreadDbKey ??= Guid.NewGuid().ToString("N"); // generate the id on first write
            var collection = this._vectorStore.GetCollection<string, ChatHistoryItem>("ChatHistory");
            await collection.EnsureCollectionExistsAsync(cancellationToken);

            #region Chat history reduction
            // 1. Load this thread's existing history (returned newest first).
            var chatHistoryItems = collection.GetAsync(
                x => x.ThreadId == this.ThreadDbKey,
                int.MaxValue,
                new() { OrderBy = x => x.Descending(y => y.Timestamp) },
                cancellationToken);
            List<string> existingKeys = [];
            List<ChatMessage> chatHistoryMessages = [];
            await foreach (var record in chatHistoryItems)
            {
                existingKeys.Add(record.Key!);
                chatHistoryMessages.Add(JsonSerializer.Deserialize<ChatMessage>(record.SerializedMessage!)!);
            }
            chatHistoryMessages.Reverse(); // back to chronological order before appending

            // 2. Append the new messages.
            chatHistoryMessages.AddRange(messages);

            // 3. Reduce right after adding (works with either reducer).
            if (this.ReducerTriggerEvent is ChatReducerTriggerEvent.AfterMessageAdded && this.ChatReducer is not null)
            {
                chatHistoryMessages = (await this.ChatReducer.ReduceAsync(chatHistoryMessages, cancellationToken).ConfigureAwait(false)).ToList();
            }
            #endregion

            #region Sync the reduced history to third-party storage
            // Delete only this thread's old records; dropping the whole collection
            // would also erase other threads that share it.
            if (existingKeys.Count > 0)
            {
                await collection.DeleteAsync(existingKeys, cancellationToken);
            }
            #endregion

            // Store the (possibly reduced) messages.
            var now = DateTimeOffset.UtcNow;
            await collection.UpsertAsync(chatHistoryMessages.Select((x, index) => new ChatHistoryItem()
            {
                // Unique key: conversation id + message id (fall back to a GUID when the id is null).
                Key = this.ThreadDbKey + (x.MessageId ?? Guid.NewGuid().ToString("N")),
                Timestamp = now.AddTicks(index),                 // strictly increasing, so order survives a round trip
                ThreadId = this.ThreadDbKey,                     // owning conversation
                SerializedMessage = JsonSerializer.Serialize(x), // serialized message
                MessageText = x.Text                             // message text (for retrieval)
            }), cancellationToken);
        }

        // Read the history, reducing at read time when so configured.
        public override async Task<IEnumerable<ChatMessage>> GetMessagesAsync(CancellationToken cancellationToken = default)
        {
            var collection = this._vectorStore.GetCollection<string, ChatHistoryItem>("ChatHistory");
            await collection.EnsureCollectionExistsAsync(cancellationToken);
            var records = collection.GetAsync(
                x => x.ThreadId == this.ThreadDbKey,
                int.MaxValue,
                new() { OrderBy = x => x.Descending(y => y.Timestamp) },
                cancellationToken);

            List<ChatMessage> messages = [];
            await foreach (var record in records)
            {
                messages.Add(JsonSerializer.Deserialize<ChatMessage>(record.SerializedMessage!)!);
            }
            messages.Reverse(); // return in ascending time order, as the Agent expects

            // Honor the BeforeMessagesRetrieval trigger: storage keeps the full history.
            if (this.ReducerTriggerEvent is ChatReducerTriggerEvent.BeforeMessagesRetrieval && this.ChatReducer is not null)
            {
                return await this.ChatReducer.ReduceAsync(messages, cancellationToken).ConfigureAwait(false);
            }
            return messages;
        }

        // Serialize the store state (enables thread persistence).
        public override JsonElement Serialize(JsonSerializerOptions? jsonSerializerOptions = null)
        {
            return JsonSerializer.SerializeToElement(this.ThreadDbKey);
        }

        // Vector store record model: how each message is stored.
        private sealed class ChatHistoryItem
        {
            [VectorStoreKey] public string? Key { get; set; }                  // unique key
            [VectorStoreData] public string? ThreadId { get; set; }            // conversation id
            [VectorStoreData] public DateTimeOffset? Timestamp { get; set; }   // timestamp
            [VectorStoreData] public string? SerializedMessage { get; set; }   // serialized message
            [VectorStoreData] public string? MessageText { get; set; }         // message text
        }
    }
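Key design highlights:
- Compatibility: the IChatReducer interface accommodates both built-in reducers, so switching requires no change to the store class.
- Data consistency: after reduction, the old stored records are deleted before the new ones are inserted, so what is persisted is the trimmed history. When several threads share one collection, only the current thread's records should be removed.
- Trigger timing: supports AfterMessageAdded (reduce immediately after adding) and BeforeMessagesRetrieval (reduce before querying), chosen as needed.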
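The difference between the two trigger timings is easiest to see in a toy store. The sketch below is a simplified, in-memory illustration with hypothetical names (`Trigger`, `TinyStore`); it mirrors the idea of the two trigger events but is not the framework's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Trigger { AfterMessageAdded, BeforeMessagesRetrieval }

// Hypothetical toy store showing where reduction happens for each trigger.
public sealed class TinyStore
{
    private readonly List<string> _stored = new();
    private readonly Func<IReadOnlyList<string>, List<string>> _reduce;
    private readonly Trigger _trigger;

    public TinyStore(Func<IReadOnlyList<string>, List<string>> reduce, Trigger trigger)
    {
        _reduce = reduce;
        _trigger = trigger;
    }

    public int StoredCount => _stored.Count;

    public void Add(string message)
    {
        _stored.Add(message);
        if (_trigger == Trigger.AfterMessageAdded)
        {
            // Reduce eagerly: what is persisted is already trimmed.
            var reduced = new List<string>(_reduce(_stored)); // copy before clearing
            _stored.Clear();
            _stored.AddRange(reduced);
        }
    }

    public List<string> Get() =>
        _trigger == Trigger.BeforeMessagesRetrieval
            ? _reduce(_stored)   // reduce lazily: storage keeps the full history
            : _stored.ToList();
}
```

With a "keep the last two" reducer and three added messages, the eager store holds two entries while the lazy store holds three but still returns two, which is exactly the storage-versus-audit trade-off between the two settings.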
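(2) Agent integration: pick one of the two reducers, ready to copy
The code is already set up for switching by commenting and uncommenting, so neither option needs major changes:
Option 1: MessageCountingChatReducer (enabled by default, precise control of message count)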
    public static async Task DemoAsync(string apiKey, string modelName, string endpoint)
    {
        var clientOptions = new OpenAIClientOptions { Endpoint = new Uri(endpoint) };

        // Define IChatClient explicitly so either reducer can be plugged in.
        IChatClient chatClient = new OpenAIClient(new ApiKeyCredential(apiKey), clientOptions)
            .GetChatClient(modelName)
            .AsIChatClient();

        var agent = chatClient.CreateAIAgent(new ChatClientAgentOptions
        {
            // The first system message is preserved by the reducer.
            Instructions = "You are an Agent who is good at telling jokes. Keep replies short and funny.",
            Name = "ZerekZhang",
            ChatMessageStoreFactory = ctx =>
            {
                // MessageCountingChatReducer: keep at most 2 non-system messages.
                // The count is passed positionally, as in the full listing.
                return new VectorChatMessageStore(
                    new MessageCountingChatReducer(2),
                    new InMemoryVectorStore(),                 // third-party store (replaceable, e.g. with Redis)
                    ctx.SerializedState,
                    ctx.JsonSerializerOptions,
                    ChatReducerTriggerEvent.AfterMessageAdded  // reduce right after messages are added
                );
            }
        });

        // Thread serialization + restore (the heart of session persistence).
        AgentThread thread = agent.GetNewThread();
        JsonElement serializedThread = thread.Serialize();                      // state can go to a database or file
        AgentThread resumedThread = agent.DeserializeThread(serializedThread);  // restore (simulates a service restart)

        // Interaction loop.
        while (true)
        {
            var userInput = Console.ReadLine();
            if (userInput is null || userInput == "Exit")
            {
                // Verify the reduction: print the history that was kept.
                var messageStore = resumedThread.GetService<VectorChatMessageStore>()!;
                var messages = await messageStore.GetMessagesAsync();
                Console.WriteLine("\nHistory after reduction:");
                foreach (var item in messages)
                {
                    Console.WriteLine($"{item.Role}:{item.Text}");
                }
                break;
            }

            var response = await agent.RunAsync(userInput, resumedThread);
            Console.WriteLine($"Agent Output:{response}\n");
        }
    }
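Option 2: SummarizingChatReducer (uncomment to use, semantic summarization)

    // After uncommenting, this replaces the Agent creation logic of Option 1.
    var agent = chatClient.CreateAIAgent(new ChatClientAgentOptions
    {
        Instructions = "You are an Agent who is good at telling jokes. Keep replies short and funny.",
        Name = "ZerekZhang",
        ChatMessageStoreFactory = ctx =>
        {
            // SummarizingChatReducer, arguments passed positionally as in the full listing.
            // This article describes the two numbers as "summarize once there are more than
            // 2 messages" and "keep at most 10 summaries"; check the constructor documentation
            // of your installed package version for the exact semantics.
            return new VectorChatMessageStore(
                new SummarizingChatReducer(
                    chatClient, // model client used to generate the summary
                    2,
                    10
                ),
                new InMemoryVectorStore(),
                ctx.SerializedState,
                ctx.JsonSerializerOptions,
                ChatReducerTriggerEvent.AfterMessageAdded
            );
        }
    });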
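Production optimization suggestions
- Storage replacement: swap InMemoryVectorStore for Redis (high concurrency) or PostgreSQL (with vector search support) to fit distributed deployment.
- Parameter tuning:
  - MessageCountingChatReducer: set maxMessageCount according to the model's context window (for example, 5 to 8 messages for GPT-3.5).
  - SummarizingChatReducer: set maxMessageCountBeforeSummarization so that the accumulated messages take up roughly one third of the model's context window, balancing semantic retention against token consumption.
- Trigger timing:
  - If the full history must be auditable: use BeforeMessagesRetrieval (store the full history, reduce at query time).
  - If storage must be saved: use AfterMessageAdded (store the reduced history).
- Fallback logic: in production, add a fallback for reducer failures (for example, keep the latest 10 messages by default).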
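The fallback suggestion in the last bullet could look like the sketch below. It is written against plain delegates rather than `IChatReducer` so that it stays self-contained; `SafeReduction`, `ReduceOrKeepLatestAsync`, and `fallbackCount` are hypothetical names, and a real version would also log the failure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SafeReduction
{
    // Tries the supplied reducer; if it throws (for example because the
    // summarization model call failed), falls back to the latest N items.
    public static async Task<IReadOnlyList<T>> ReduceOrKeepLatestAsync<T>(
        IReadOnlyList<T> history,
        Func<IReadOnlyList<T>, CancellationToken, Task<IReadOnlyList<T>>> reduceAsync,
        int fallbackCount = 10,
        CancellationToken cancellationToken = default)
    {
        try
        {
            return await reduceAsync(history, cancellationToken).ConfigureAwait(false);
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Cancellation is allowed to propagate; any other failure degrades
            // gracefully to plain "keep the most recent messages".
            return history.TakeLast(fallbackCount).ToList();
        }
    }
}
```

Wrapping the reducer this way keeps a transient model outage from turning into a failed user request; the cost is that older context is dropped without a summary for that turn.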
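Key takeaways
Based on this demo, achieving "session persistence + long conversations that stay within the limit" takes just 3 steps:
- Dependencies: reference the Microsoft.Agents.AI packages listed earlier so that both built-in reducers are available.
- Storage: use VectorChatMessageStore and pass the reducer through its constructor; reduction and storage sync are then handled automatically.
- Agent configuration: choose the reducer by scenario (MessageCounting for simple cases, Summarizing for complex long conversations) and define IChatClient explicitly so switching is easy.
This approach solves the problem of sessions being easy to lose and hard to share, and it uses Microsoft's native capabilities to work within the model's context limit. No custom reducer needs to be written, which improves development efficiency and lets long-running Agents go to production.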
Complete demo code

    using Microsoft.Agents.AI;
    using Microsoft.Extensions.AI;
    using Microsoft.Extensions.VectorData;
    using Microsoft.SemanticKernel.Connectors.InMemory;
    using OpenAI;
    using System.ClientModel;
    using System.Text.Json;
    using static Microsoft.Agents.AI.InMemoryChatMessageStore;
    using VectorStore = Microsoft.Extensions.VectorData.VectorStore;

    namespace AgentDemo
    {
    #pragma warning disable OPENAI001
    #pragma warning disable MEAI001
        /// <summary>
        /// Third-party storage of Agent conversation records.
        /// </summary>
        internal static partial class AgentConversationSaveBase
        {
            public static async Task DemoAsync(string apiKey, string modelName, string endpoint)
            {
                var clientOptions = new OpenAIClientOptions { Endpoint = new Uri(endpoint) };
                IChatClient chatClient = new OpenAIClient(new ApiKeyCredential(apiKey), clientOptions).GetChatClient(modelName).AsIChatClient();

                var agent = chatClient.CreateAIAgent(new ChatClientAgentOptions
                {
                    Instructions = "You are an Agent who is good at telling jokes.",
                    Name = "ZerekZhang",
                    ChatMessageStoreFactory = ctx =>
                    {
                        return new VectorChatMessageStore(new MessageCountingChatReducer(2), new InMemoryVectorStore(), ctx.SerializedState, ctx.JsonSerializerOptions, ChatReducerTriggerEvent.AfterMessageAdded);
                    }
                });

                //// Demo using SummarizingChatReducer
                //var agent = chatClient.CreateAIAgent(new ChatClientAgentOptions
                //{
                //    Instructions = "You are an Agent who is good at telling jokes.",
                //    Name = "ZerekZhang",
                //    ChatMessageStoreFactory = ctx =>
                //    {
                //        return new VectorChatMessageStore(new SummarizingChatReducer(chatClient, 2, 10), new InMemoryVectorStore(), ctx.SerializedState, ctx.JsonSerializerOptions, ChatReducerTriggerEvent.AfterMessageAdded);
                //    }
                //});

                AgentThread thread = agent.GetNewThread();
                JsonElement serializedThread = thread.Serialize();
                AgentThread resumedThread = agent.DeserializeThread(serializedThread);

                while (true)
                {
                    var userInput = Console.ReadLine();
                    if (userInput is null || userInput == "Exit")
                    {
                        // On exit, query the stored history to verify the reduction.
                        var messageStore = resumedThread.GetService<VectorChatMessageStore>()!;
                        var messages = await messageStore.GetMessagesAsync();
                        Console.WriteLine("\nHistory after reduction:");
                        foreach (var item in messages)
                        {
                            Console.WriteLine($"{item.Role}:{item.Text}");
                        }
                        break;
                    }
                    var response = await agent.RunAsync(userInput, resumedThread);
                    Console.WriteLine($"Agent Output:{response}\n");
                }
            }
        }

        internal sealed class VectorChatMessageStore : ChatMessageStore
        {
            private readonly VectorStore _vectorStore;
            public string? ThreadDbKey { get; private set; }
            public IChatReducer? ChatReducer { get; }
            public ChatReducerTriggerEvent ReducerTriggerEvent { get; }

            public VectorChatMessageStore(VectorStore vectorStore, JsonElement serializedStoreState, JsonSerializerOptions? jsonSerializerOptions = null)
            {
                this._vectorStore = vectorStore ?? throw new ArgumentNullException(nameof(vectorStore));
                if (serializedStoreState.ValueKind is JsonValueKind.String)
                {
                    this.ThreadDbKey = serializedStoreState.Deserialize<string>();
                }
            }

            public VectorChatMessageStore(IChatReducer chatReducer, VectorStore vectorStore, JsonElement serializedStoreState, JsonSerializerOptions? jsonSerializerOptions = null, ChatReducerTriggerEvent reducerTriggerEvent = ChatReducerTriggerEvent.BeforeMessagesRetrieval)
                : this(vectorStore, serializedStoreState, jsonSerializerOptions)
            {
                this.ChatReducer = chatReducer;
                this.ReducerTriggerEvent = reducerTriggerEvent;
            }

            public override async Task AddMessagesAsync(IEnumerable<ChatMessage> messages, CancellationToken cancellationToken = default)
            {
                this.ThreadDbKey ??= Guid.NewGuid().ToString("N");
                var collection = this._vectorStore.GetCollection<string, ChatHistoryItem>("ChatHistory");
                await collection.EnsureCollectionExistsAsync(cancellationToken);

                #region Reduce the chat history
                var chatHistoryItems = collection.GetAsync(x => x.ThreadId == this.ThreadDbKey, int.MaxValue, new() { OrderBy = x => x.Descending(y => y.Timestamp) }, cancellationToken);
                List<string> existingKeys = [];
                List<ChatMessage> chatHistoryMessages = [];
                await foreach (var record in chatHistoryItems)
                {
                    existingKeys.Add(record.Key!);
                    chatHistoryMessages.Add(JsonSerializer.Deserialize<ChatMessage>(record.SerializedMessage!)!);
                }
                chatHistoryMessages.Reverse(); // records arrive newest first; restore chronological order
                chatHistoryMessages.AddRange(messages);
                if (this.ReducerTriggerEvent is ChatReducerTriggerEvent.AfterMessageAdded && this.ChatReducer is not null)
                {
                    chatHistoryMessages = (await this.ChatReducer.ReduceAsync(chatHistoryMessages, cancellationToken).ConfigureAwait(false)).ToList();
                }
                #endregion

                #region Sync the reduced history to third-party storage
                // Delete only this thread's old records rather than the whole collection.
                if (existingKeys.Count > 0)
                {
                    await collection.DeleteAsync(existingKeys, cancellationToken);
                }
                #endregion

                var now = DateTimeOffset.UtcNow;
                await collection.UpsertAsync(chatHistoryMessages.Select((x, index) => new ChatHistoryItem()
                {
                    Key = this.ThreadDbKey + (x.MessageId ?? Guid.NewGuid().ToString("N")),
                    Timestamp = now.AddTicks(index), // strictly increasing to preserve order
                    ThreadId = this.ThreadDbKey,
                    SerializedMessage = JsonSerializer.Serialize(x),
                    MessageText = x.Text
                }), cancellationToken);
            }

            public override async Task<IEnumerable<ChatMessage>> GetMessagesAsync(CancellationToken cancellationToken = default)
            {
                var collection = this._vectorStore.GetCollection<string, ChatHistoryItem>("ChatHistory");
                await collection.EnsureCollectionExistsAsync(cancellationToken);
                var records = collection.GetAsync(x => x.ThreadId == this.ThreadDbKey, int.MaxValue, new() { OrderBy = x => x.Descending(y => y.Timestamp) }, cancellationToken);
                List<ChatMessage> messages = [];
                await foreach (var record in records)
                {
                    messages.Add(JsonSerializer.Deserialize<ChatMessage>(record.SerializedMessage!)!);
                }
                messages.Reverse();
                if (this.ReducerTriggerEvent is ChatReducerTriggerEvent.BeforeMessagesRetrieval && this.ChatReducer is not null)
                {
                    return await this.ChatReducer.ReduceAsync(messages, cancellationToken).ConfigureAwait(false);
                }
                return messages;
            }

            public override JsonElement Serialize(JsonSerializerOptions? jsonSerializerOptions = null)
            {
                return JsonSerializer.SerializeToElement(this.ThreadDbKey);
            }

            private sealed class ChatHistoryItem
            {
                [VectorStoreKey]
                public string? Key { get; set; }
                [VectorStoreData]
                public string? ThreadId { get; set; }
                [VectorStoreData]
                public DateTimeOffset? Timestamp { get; set; }
                [VectorStoreData]
                public string? SerializedMessage { get; set; }
                [VectorStoreData]
                public string? MessageText { get; set; }
            }
        }
    #pragma warning restore OPENAI001
    #pragma warning restore MEAI001
    }
Demo run results

