An Unorthodox Route: Intercepting Requests with HttpClientHandler to Try Out Semantic Kernel Plugins

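Two days ago I tried to run a Semantic Kernel plugin through one-api + dashscope (Alibaba Cloud Lingji) + qwen (Tongyi Qianwen). The attempt failed; see the blog post from two days ago for details.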

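Today I am trying a different, unorthodox route: seeing whether a Semantic Kernel plugin can be made to run without using a large language model at all. The trick is one I picked up from Stephen Toub: use DelegatingHandler(new HttpClientHandler()) to intercept HttpClient requests and respond directly with mock data.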

First, create a .NET console project

dotnet new console
dotnet add package Microsoft.SemanticKernel
dotnet add package Microsoft.Extensions.Http

Following the sample code in the Semantic Kernel source, create a very simple plugin, LightPlugin

using System.ComponentModel;
using Microsoft.SemanticKernel;

public class LightPlugin
{
    public bool IsOn { get; set; } = false;

    [KernelFunction]
    [Description("幫看一下燈是開是關")] // "Check whether the light is on or off"
    public string GetState() => IsOn ? "on" : "off";

    [KernelFunction]
    [Description("開燈或者關燈")] // "Turn the light on or off"
    public string ChangeState(bool newState)
    {
        IsOn = newState;
        var state = GetState();
        // Prints "[Light on]" or "[Light off]"
        Console.WriteLine(state == "on" ? "[開燈啦]" : "[關燈咯]");
        return state;
    }
}

Next, create the unorthodox piece, BypassHandler. Start with the simplest possible feature: printing the content of the HttpClient request

public class BypassHandler() : DelegatingHandler(new HttpClientHandler())
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        Console.WriteLine(await request.Content!.ReadAsStringAsync());
        // return await base.SendAsync(request, cancellationToken);
        return new HttpResponseMessage(HttpStatusCode.OK);
    }
}

Then create the Semantic Kernel Kernel with LightPlugin and BypassHandler

var builder = Kernel.CreateBuilder();
builder.Services.AddOpenAIChatCompletion("qwen-max", "sk-xxxxxx");
builder.Services.ConfigureHttpClientDefaults(b =>
    b.ConfigurePrimaryHttpMessageHandler(() => new BypassHandler()));
builder.Plugins.AddFromType<LightPlugin>();
Kernel kernel = builder.Build();

After that, send a request carrying the prompt and read the response

var history = new ChatHistory();
history.AddUserMessage("請開燈");
Console.WriteLine("User > " + history[0].Content);
var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();

// Enable auto function calling
OpenAIPromptExecutionSettings openAIPromptExecutionSettings = new()
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

var result = await chatCompletionService.GetChatMessageContentAsync(
    history,
    executionSettings: openAIPromptExecutionSettings,
    kernel: kernel);

Console.WriteLine("Assistant > " + result);

Run the console program and BypassHandler prints the request JSON to the console (formatted here for readability):

{
  "messages": [
    {
      "content": "Assistant is a large language model.",
      "role": "system"
    },
    {
      "content": "\u8BF7\u5F00\u706F",
      "role": "user"
    }
  ],
  "temperature": 1,
  "top_p": 1,
  "n": 1,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "model": "qwen-max",
  "tools": [
    {
      "function": {
        "name": "LightPlugin-GetState",
        "description": "\u5E2E\u770B\u4E00\u4E0B\u706F\u662F\u5F00\u662F\u5173",
        "parameters": {
          "type": "object",
          "required": [],
          "properties": {}
        }
      },
      "type": "function"
    },
    {
      "function": {
        "name": "LightPlugin-ChangeState",
        "description": "\u5F00\u706F\u6216\u8005\u5173\u706F",
        "parameters": {
          "type": "object",
          "required": [
            "newState"
          ],
          "properties": {
            "newState": {
              "type": "boolean"
            }
          }
        }
      },
      "type": "function"
    }
  ],
  "tool_choice": "auto"
}

To deserialize this JSON we need to define a type, ChatCompletionRequest. Semantic Kernel does not provide a ready-made one we can use, so here is the implementation:

public class ChatCompletionRequest
{
    [JsonPropertyName("messages")]
    public IReadOnlyList<RequestMessage>? Messages { get; set; }

    [JsonPropertyName("temperature")]
    public double Temperature { get; set; } = 1;

    [JsonPropertyName("top_p")]
    public double TopP { get; set; } = 1;

    [JsonPropertyName("n")]
    public int? N { get; set; } = 1;

    [JsonPropertyName("presence_penalty")]
    public double PresencePenalty { get; set; } = 0;

    [JsonPropertyName("frequency_penalty")]
    public double FrequencyPenalty { get; set; } = 0;

    [JsonPropertyName("model")]
    public required string Model { get; set; }

    [JsonPropertyName("tools")]
    public IReadOnlyList<Tool>? Tools { get; set; }

    [JsonPropertyName("tool_choice")]
    public string? ToolChoice { get; set; }
}

public class RequestMessage
{
    [JsonPropertyName("role")]
    public string? Role { get; set; }

    [JsonPropertyName("name")]
    public string? Name { get; set; }

    [JsonPropertyName("content")]
    public string? Content { get; set; }
}

public class Tool
{
    [JsonPropertyName("function")]
    public FunctionDefinition? Function { get; set; }

    [JsonPropertyName("type")]
    public string? Type { get; set; }
}

public class FunctionDefinition
{
    [JsonPropertyName("name")]
    public string? Name { get; set; }

    [JsonPropertyName("description")]
    public string? Description { get; set; }

    [JsonPropertyName("parameters")]
    public ParameterDefinition Parameters { get; set; }

    public struct ParameterDefinition
    {
        [JsonPropertyName("type")]
        public required string Type { get; set; }

        [JsonPropertyName("description")]
        public string? Description { get; set; }

        [JsonPropertyName("required")]
        public string[]? Required { get; set; }

        [JsonPropertyName("properties")]
        public Dictionary<string, PropertyDefinition>? Properties { get; set; }

        public struct PropertyDefinition
        {
            [JsonPropertyName("type")]
            public required PropertyType Type { get; set; }
        }

        [JsonConverter(typeof(JsonStringEnumConverter))]
        public enum PropertyType
        {
            Number,
            String,
            Boolean
        }
    }
}

With this class we can read the plugin's function information from the request, for example with the following code:

// Look up the function whose description contains "開燈" ("turn on the light").
var function = chatCompletionRequest?.Tools?
    .FirstOrDefault(x => x.Function?.Description?.Contains("開燈") == true)?.Function;
var functionName = function?.Name;
var parameterName = function?.Parameters.Properties?
    .FirstOrDefault(x => x.Value.Type == PropertyType.Boolean).Key;
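
The snippet above assumes a chatCompletionRequest variable already exists. As a rough sketch of how the request body might be turned into an object and searched, the example below deserializes the JSON with System.Text.Json and returns the name of the first tool whose description contains a keyword. The Sketch* types and the RequestParsingSketch helper are hypothetical, trimmed-down stand-ins written for this illustration; they are not part of Semantic Kernel or of the sample repository.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical, trimmed-down stand-ins for the request types above,
// keeping only the members this sketch reads.
public class SketchFunction
{
    [JsonPropertyName("name")]
    public string? Name { get; set; }

    [JsonPropertyName("description")]
    public string? Description { get; set; }
}

public class SketchTool
{
    [JsonPropertyName("function")]
    public SketchFunction? Function { get; set; }
}

public class SketchRequest
{
    [JsonPropertyName("tools")]
    public IReadOnlyList<SketchTool>? Tools { get; set; }
}

public static class RequestParsingSketch
{
    // Returns the name of the first tool whose description contains
    // the keyword, or null when no tool matches.
    public static string? FindFunctionName(string requestJson, string keyword)
    {
        var request = JsonSerializer.Deserialize<SketchRequest>(requestJson);
        return request?.Tools?
            .Select(t => t.Function)
            .FirstOrDefault(f => f?.Description?.Contains(keyword) == true)
            ?.Name;
    }
}
```

Inside the handler, the JSON string would presumably come from request.Content.ReadAsStringAsync(), as in the first version of BypassHandler.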

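Now comes the key step of this unorthodox route: responding, right inside BypassHandler, to the HTTP request that Semantic Kernel sends through OpenAI.ClientCore.

First, create the class used for JSON serialization, ChatCompletionResponse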

public class ChatCompletionResponse
{
    [JsonPropertyName("id")]
    public string? Id { get; set; }

    [JsonPropertyName("object")]
    public string? Object { get; set; }

    [JsonPropertyName("created")]
    public long Created { get; set; }

    [JsonPropertyName("model")]
    public string? Model { get; set; }

    [JsonPropertyName("usage"), JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)]
    public Usage? Usage { get; set; }

    [JsonPropertyName("choices")]
    public List<Choice>? Choices { get; set; }
}

public class Choice
{
    [JsonPropertyName("message")]
    public ResponseMessage? Message { get; set; }

    /// <summary>
    /// The message in this response (when streaming a response).
    /// </summary>
    [JsonPropertyName("delta"), JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)]
    public ResponseMessage? Delta { get; set; }

    [JsonPropertyName("finish_reason")]
    public string? FinishReason { get; set; }

    /// <summary>
    /// The index of this response in the array of choices.
    /// </summary>
    [JsonPropertyName("index")]
    public int Index { get; set; }
}

public class ResponseMessage
{
    [JsonPropertyName("role")]
    public string? Role { get; set; }

    [JsonPropertyName("name"), JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)]
    public string? Name { get; set; }

    [JsonPropertyName("content"), JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)]
    public string? Content { get; set; }

    [JsonPropertyName("tool_calls")]
    public IReadOnlyList<ToolCall>? ToolCalls { get; set; }
}

public class ToolCall
{
    [JsonPropertyName("id")]
    public string? Id { get; set; }

    [JsonPropertyName("function")]
    public FunctionCall? Function { get; set; }

    [JsonPropertyName("type")]
    public string? Type { get; set; }
}

public class Usage
{
    [JsonPropertyName("prompt_tokens")]
    public int PromptTokens { get; set; }

    [JsonPropertyName("completion_tokens")]
    public int CompletionTokens { get; set; }

    [JsonPropertyName("total_tokens")]
    public int TotalTokens { get; set; }
}

public class FunctionCall
{
    [JsonPropertyName("name")]
    public string Name { get; set; } = string.Empty;

    [JsonPropertyName("arguments")]
    public string Arguments { get; set; } = string.Empty;
}

First, try it without any function calling and simply reply with one sentence in the assistant role

public class BypassHandler() : DelegatingHandler(new HttpClientHandler())
{
    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var chatCompletion = new ChatCompletionResponse
        {
            Id = Guid.NewGuid().ToString(),
            Model = "fake-mode",
            Object = "chat.completion",
            Created = DateTimeOffset.Now.ToUnixTimeSeconds(),
            Choices =
               [
                   new()
                   {
                       Message = new ResponseMessage
                       {
                           Content = "自己動手,豐衣足食",
                           Role = "assistant"
                       },
                       FinishReason = "stop"
                   }
               ]
        };

        var json = JsonSerializer.Serialize(chatCompletion, GetJsonSerializerOptions());
        return new HttpResponseMessage
        {
            Content = new StringContent(json, Encoding.UTF8, "application/json")
        };
    }
}
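
The handler above calls GetJsonSerializerOptions(), which this post does not show. The following is only a guess at what such a helper could look like, not the author's implementation: it skips null members when writing and leaves non-ASCII text unescaped so the mock response stays readable. The class name JsonOptionsSketch and both option choices are assumptions made for this sketch.

```csharp
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical helper; the option values are assumptions for illustration.
public static class JsonOptionsSketch
{
    private static readonly JsonSerializerOptions Options = new()
    {
        // Omit members whose value is null (for example an unset "tool_calls").
        DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull,
        // Write non-ASCII characters as-is instead of \uXXXX escapes.
        Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping
    };

    // Reuses one cached instance instead of allocating options per request.
    public static JsonSerializerOptions GetJsonSerializerOptions() => Options;
}
```

Caching a single JsonSerializerOptions instance is the usual recommendation, because System.Text.Json keeps its type metadata cache on the options object.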

Run the console program; the output is:

User > 請開燈
Assistant > 自己動手,豐衣足食

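The response went through. At this point the unorthodox route is halfway there.

Next, extend the chatCompletion created earlier with the ToolCall part needed for function calling.

First, prepare the argument value for ChangeState(bool newState)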

Dictionary<string, bool> arguments = new()
{
    { parameterName, true }
};

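Then change the reply content from "自己動手,豐衣足食" ("do it yourself and you will want for nothing") to "客官,燈已開" ("Dear guest, the light is on")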

Message = new ResponseMessage
{
    Content = "客官,燈已開",
    Role = "assistant"
}

Then create the ToolCalls instance on chatCompletion to respond with a function call

var messages = chatCompletionRequest.Messages;
if (messages.First(x => x.Role == "user").Content.Contains("開燈") == true)
{
    chatCompletion.Choices[0].Message.ToolCalls = new List<ToolCall>()
    {
        new ToolCall
        {
            Id = Guid.NewGuid().ToString(),
            Type = "function",
            Function = new FunctionCall
            {
                Name = function.Name,
                Arguments = JsonSerializer.Serialize(arguments, GetJsonSerializerOptions())
            }
        }
    };
}

Run the console program and see what happens

User > 請開燈
[開燈啦]
[開燈啦]
[開燈啦]
[開燈啦]
[開燈啦]
Assistant > 客官,燈已開

Yay, the light turned on! But it was switched on 5 times, which nearly blew the bulb.

Print the request content in BypassHandler to see what went wrong

var json = await request.Content!.ReadAsStringAsync();
Console.WriteLine(json);

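It turns out there were 5 separate request/response round trips. From the second request onward, the messages part of the JSON additionally contains tool_calls and tool_call_id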

{
  "messages": [
    {
      "content": "\u5BA2\u5B98\uFF0C\u706F\u5DF2\u5F00",
      "tool_calls": [
        {
          "function": {
            "name": "LightPlugin-ChangeState",
            "arguments": "{\u0022newState\u0022:true}"
          },
          "type": "function",
          "id": "76f8dead-b5ad-4e6d-b343-7f78d68fac8e"
        }
      ],
      "role": "assistant"
    },
    {
      "content": "on",
      "tool_call_id": "76f8dead-b5ad-4e6d-b343-7f78d68fac8e",
      "role": "tool"
    }
  ]
}

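That is when it dawned on me: the assistant's earlier function-calling response only asks the plugin to execute the corresponding function. The assistant still has to decide what to do next based on the result, and the tool_calls and tool_call_id in the second request are there to report that result back to the assistant. So this request needs a dedicated response of its own.

Time for the final 100-meter sprint on this unorthodox route!

Add a ToolCallId property to RequestMessage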

public class RequestMessage
{
    [JsonPropertyName("role")]
    public string? Role { get; set; }

    [JsonPropertyName("name")]
    public string? Name { get; set; }

    [JsonPropertyName("content")]
    public string? Content { get; set; }

    [JsonPropertyName("tool_call_id")]
    public string? ToolCallId { get; set; }
}

When responding in BypassHandler, check ToolCallId. If the request carries the result of the plugin function's execution, return only Message.Content and do not respond with a function call

var messages = chatCompletionRequest.Messages;
// This only matches if the ToolCall in the earlier response was given this same Id
var toolCallId = "76f8dead-b5ad-4e6d-b343-7f78d68fac8e";
var toolCallIdMessage = messages.FirstOrDefault(x => x.Role == "tool" && x.ToolCallId == toolCallId);

if (toolCallIdMessage != null && toolCallIdMessage.Content == "on")
{
    // The plugin function has already run: answer in plain text, no tool call
    chatCompletion.Choices[0].Message.Content = "客官,燈已開";
}
else if (messages.First(x => x.Role == "user").Content.Contains("開燈") == true)
{
    // First request: leave the content empty and respond with the tool call
    chatCompletion.Choices[0].Message.Content = "";
    //..
}
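
As a possible variation on the check above (not what the sample code does), the handler could look at the role of the last message in the request instead of matching a fixed tool_call_id: in the second request shown earlier, the final message has role "tool". The ReplySketch class and its ReplyKind enum are hypothetical names introduced only for this sketch, and the rule assumes a simple single-function conversation like the one in this post.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating the decision only; it builds no response.
public static class ReplySketch
{
    public enum ReplyKind { ToolCall, FinalAnswer }

    // roles: the "role" value of each message in the incoming request, in order.
    public static ReplyKind Decide(IReadOnlyList<string?> roles)
    {
        // A trailing "tool" message means the plugin function has already run,
        // so the mock should answer in plain text rather than ask for another call.
        return roles.LastOrDefault() == "tool"
            ? ReplyKind.FinalAnswer
            : ReplyKind.ToolCall;
    }
}
```

With this rule the mock no longer depends on which Id was generated for the ToolCall, at the cost of not verifying that the tool result belongs to the call it issued.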

With the code improved, it is time for the last 10 meters of the sprint. Run the console program once more

User > 請開燈
[開燈啦]
Assistant > 客官,燈已開

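The light is switched on only once. The sprint succeeded and the unorthodox route works. Trying out a Semantic Kernel plugin this way has a flavor of its own.

The complete sample code has been uploaded to GitHub: https://github.com/cnblogs-dudu/sk-plugin-sample-101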
