Building an Intelligent Translation Service with Microsoft Semantic Kernel and GPT-4

At the .NET Conf China 2023 conference, I gave a talk titled ".NET Application Internationalization: AIGC Intelligent Translation + Code Generation".

.NET Conf China 2023 talk: .NET Application Internationalization: AIGC Intelligent Translation + Code Generation

Today I will walk through the code in detail.

I. Prerequisites

1. Create a new Console project

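2. Reference the Semantic Kernel (SK) NuGet package (1.4.0, the latest version at the time of writing), along with Newtonsoft.Json

dotnet add package Microsoft.SemanticKernel --version 1.4.0
dotnet add package Newtonsoft.Json --version 13.0.3

The project file then contains:

<ItemGroup>
  <PackageReference Include="Microsoft.SemanticKernel" Version="1.4.0" />
  <PackageReference Include="Newtonsoft.Json" Version="13.0.3" />
</ItemGroup>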

3. Create a GPT-4 deployment in Azure OpenAI Service. If you do not have an account, you can still follow the code to see how it works.

After the GPT-4 model is deployed, you obtain the following three important values:

Azure OpenAI Deployment Name
Azure OpenAI Endpoint
Azure OpenAI Key
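
These three values are secrets or environment-specific, so one option is to read them from environment variables instead of hard-coding them. The following is only a minimal sketch under that assumption: the AzureOpenAISettings type and the three variable names are hypothetical and do not come from the original project.

```csharp
using System;

// Hypothetical holder for the three Azure OpenAI values listed above.
public sealed record AzureOpenAISettings(string DeploymentName, string Endpoint, string ApiKey)
{
    // The environment variable names are assumptions made for this sketch.
    public static AzureOpenAISettings FromEnvironment() => new(
        Require("AZURE_OPENAI_DEPLOYMENT"),
        Require("AZURE_OPENAI_ENDPOINT"),
        Require("AZURE_OPENAI_KEY"));

    // Fails fast with a clear message when a value is missing.
    private static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
            throw new InvalidOperationException($"Environment variable '{name}' is not set.");
        return value;
    }
}
```

The resulting values could then be passed to AddAzureOpenAIChatCompletion in place of the literal strings.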

II. Writing the translation prompt

 {{$input}}
請將上面的輸入翻譯爲英文,不要返回任何解釋說明,
請扮演一個美國電動汽車充電服務運營商(精通中文和英文),用戶的輸入數據是JSON格式,例如{"1":"充電站", "2":"充電樁"}, 
如果不是JSON格式,請返回無效的輸入。
請使用以下專業術語進行翻譯
 {
    "充電站":"Charging station",
    "電站":"Charging station",
    "場站":"Charging station",
    "充電樁":"Charging point",
    "充電終端":"Charging point",
    "終端":"Charging point",
    "電動汽車":"Electric Vehicle",
    "直流快充":"DC Fast Charger",
    "超級充電站":"Supercharger",
    "智能充電":"Smart Charging",
    "交流慢充":"AC Slow Charging"
}
翻譯結果請以JSON格式返回,例如 {"1":"Charging station", "2":"Charging point"}

A similar prompt handles translation into Portuguese:

{{$input}}
請將上面的輸入翻譯爲葡萄牙語,不要返回任何解釋說明,請扮演一個巴西的電動汽車充電服務運營商(精通葡萄牙語、中文和英文)
用戶的輸入數據是JSON格式,例如{"1":"充電站", "2":"充電樁"}, 如果不是JSON格式,請返回無效的輸入
請使用以下專業術語進行翻譯
 {
  "充電站": "Estação de carregamento",
  "電站": "Estação de carregamento",
  "場站": "Estação de carregamento",
  "充電樁": "Ponto de carregamento",
  "充電終端": "Ponto de carregamento",
  "終端": "Ponto de carregamento",
  "電動汽車": "Veículo Elétrico",
  "直流快充": "Carregador Rápido DC",
  "超級充電站": "Supercharger",
  "智能充電": "Carregamento Inteligente",
  "交流慢充": "Carregamento AC Lento"
}
請以JSON格式返回,例如 {"1":"Estação de carregamento", "2":"Ponto de carregamento"}

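In the project, create a Plugins directory with a TranslatePlugin subdirectory, and under it one subdirectory per target language, such as Translator_en and Translator_pt. Each language subdirectory holds its prompt as skprompt.txt, together with a config.json file.

The contents of config.json are as follows: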

{
    "schema": 1,
    "type": "completion",
    "description": "Translate.",
    "completion": {
         "max_tokens": 2000,
         "temperature": 0.5,
         "top_p": 0.0,
         "presence_penalty": 0.0,
         "frequency_penalty": 0.0
    },
    "input": {
         "parameters": [
              {
                   "name": "input",
                   "description": "The user's input.",
                   "defaultValue": ""
              }
         ]
    }
}
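
One point worth verifying against the installed package: as far as the Semantic Kernel 1.x prompt configuration format goes, sampling settings are read from an execution_settings block and parameters from input_variables, while unrecognized fields such as completion and input appear to be ignored. If that holds for version 1.4.0, the settings above would silently not take effect. Under that assumption, an equivalent file might look like this sketch:

```json
{
    "schema": 1,
    "description": "Translate.",
    "execution_settings": {
        "default": {
            "max_tokens": 2000,
            "temperature": 0.5,
            "top_p": 0.0,
            "presence_penalty": 0.0,
            "frequency_penalty": 0.0
        }
    },
    "input_variables": [
        {
            "name": "input",
            "description": "The user's input.",
            "default": ""
        }
    ]
}
```

The values are copied unchanged from the file above; only the field names and nesting differ.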

III. The Translator class: multilingual text translation

This class translates the user's input text (which the system converts to JSON) into the specified language.

using Microsoft.SemanticKernel;
using Newtonsoft.Json;

namespace LLM_SK;

public class Translator
{
    private readonly Kernel kernel;

    public Translator(Kernel kernel)
    {
        this.kernel = kernel;
    }

    // Translates the given texts (keyed by ID) into the target language, e.g. "en" or "pt".
    public IDictionary<int, string> Translate(IDictionary<int, string> textList, string language)
    {
        // Load the prompt functions stored under Plugins/TranslatePlugin.
        var pluginDirectory = Path.Combine(System.IO.Directory.GetCurrentDirectory(), "Plugins/TranslatePlugin");
        var plugin = kernel.CreatePluginFromPromptDirectory(pluginDirectory, "Translator_" + language + "");

        // The prompt expects its input as a JSON object.
        var json = JsonConvert.SerializeObject(textList);

        if (!string.IsNullOrEmpty(json))
        {
            // Blocks the calling thread until the model responds.
            var output = kernel.InvokeAsync(plugin["Translator_" + language + ""], new() { ["input"] = json }).Result.ToString();
            if (!string.IsNullOrWhiteSpace(output))
            {
                Console.WriteLine(output);
                // Throws a JsonException if the model returned something other than JSON.
                return JsonConvert.DeserializeObject<Dictionary<int, string>>(output)
                    ?? new Dictionary<int, string>();
            }
        }

        return new Dictionary<int, string>();
    }
}
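
Because the prompt tells the model to answer with plain text ("無效的輸入") when the input is not JSON, the output is not guaranteed to deserialize. The sketch below shows one way to parse it defensively. It is an illustration only: TranslationOutputParser is a hypothetical helper, and it uses System.Text.Json from the base class library rather than the Newtonsoft.Json package used in the class above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of the original project.
public static class TranslationOutputParser
{
    // Returns true only when 'output' is a JSON object whose keys are integers
    // and whose values are strings; otherwise 'result' is left empty.
    public static bool TryParse(string output, out Dictionary<int, string> result)
    {
        result = new Dictionary<int, string>();
        if (string.IsNullOrWhiteSpace(output)) return false;

        try
        {
            using var document = JsonDocument.Parse(output);
            if (document.RootElement.ValueKind != JsonValueKind.Object) return false;

            foreach (var property in document.RootElement.EnumerateObject())
            {
                if (!int.TryParse(property.Name, out var key) ||
                    property.Value.ValueKind != JsonValueKind.String)
                {
                    result.Clear();
                    return false;
                }
                result[key] = property.Value.GetString()!;
            }
            return true;
        }
        catch (JsonException)
        {
            // Plain-text replies such as "無效的輸入" end up here.
            result.Clear();
            return false;
        }
    }
}
```

A caller can then decide whether to retry, log the raw output, or fall back to the untranslated text when TryParse returns false.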

The constructor of this class receives a Kernel object, which refers to:

Microsoft.SemanticKernel.Kernel  
//
// Summary:
//     Provides state for use throughout a Semantic Kernel workload.
//
// Remarks:
//     An instance of Microsoft.SemanticKernel.Kernel is passed through to every function
//     invocation and service call throughout the system, providing to each the ability
//     to access shared state and services.
public sealed class Kernel

For now, think of it as the core class for calling large language models; all model calls and interactions go through this Kernel instance.

The code above also reads the prompt template files:

        var pluginDirectory = Path.Combine(System.IO.Directory.GetCurrentDirectory(), "Plugins/TranslatePlugin");
        var plugin = kernel.CreatePluginFromPromptDirectory(pluginDirectory, "Translator_" + language + "");    

This loads a KernelPlugin from the Plugins/TranslatePlugin directory. Each subdirectory, such as Translator_en (English) and Translator_pt (Portuguese), becomes a function of that plugin.
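
Since the directory is resolved from the current working directory, the lookup depends on where the program is started. The following sketch, under stated assumptions, resolves the path from an explicit base directory and checks that the prompt file for the requested language exists. PluginPaths is a hypothetical helper. It checks only skprompt.txt on the assumption that a prompt directory without that file is not loaded as a function.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original project.
public static class PluginPaths
{
    // Resolves <baseDirectory>/Plugins/TranslatePlugin and verifies that the
    // Translator_<language> subdirectory contains its prompt file.
    public static string ResolveTranslatePluginDirectory(string baseDirectory, string language)
    {
        var pluginDirectory = Path.Combine(baseDirectory, "Plugins", "TranslatePlugin");
        var promptPath = Path.Combine(pluginDirectory, "Translator_" + language, "skprompt.txt");

        if (!File.Exists(promptPath))
            throw new FileNotFoundException($"No prompt found for language '{language}'.", promptPath);

        return pluginDirectory;
    }
}
```

Passing AppContext.BaseDirectory as the base directory would make the lookup independent of the working directory, provided the Plugins folder is copied to the build output.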

 var output = kernel.InvokeAsync(plugin["Translator_" + language + ""], new() { ["input"] = json }).Result.ToString();

This invokes the KernelFunction, which in turn calls the GPT-4 model. InvokeAsync is defined as follows:

 //
    // Summary:
    //     Invokes the Microsoft.SemanticKernel.KernelFunction.
    //
    // Parameters:
    //   function:
    //     The Microsoft.SemanticKernel.KernelFunction to invoke.
    //
    //   arguments:
    //     The arguments to pass to the function's invocation, including any Microsoft.SemanticKernel.PromptExecutionSettings.
    //
    //
    //   cancellationToken:
    //     The System.Threading.CancellationToken to monitor for cancellation requests.
    //     The default is System.Threading.CancellationToken.None.
    //
    // Returns:
    //     The result of the function's execution.
    //
    // Exceptions:
    //   T:System.ArgumentNullException:
    //     function is null.
    //
    //   T:Microsoft.SemanticKernel.KernelFunctionCanceledException:
    //     The Microsoft.SemanticKernel.KernelFunction's invocation was canceled.
    //
    // Remarks:
    //     This behaves identically to invoking the specified function with this Microsoft.SemanticKernel.Kernel
    //     as its Microsoft.SemanticKernel.Kernel argument.
    public Task<FunctionResult> InvokeAsync(KernelFunction function, KernelArguments? arguments = null, CancellationToken cancellationToken = default(CancellationToken))
    {
        Verify.NotNull(function, "function");
        return function.InvokeAsync(this, arguments, cancellationToken);
    }
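
InvokeAsync returns a Task and accepts a CancellationToken, so the call can be awaited with a timeout instead of blocking on .Result. The sketch below illustrates only that pattern. To stay self-contained it does not reference Semantic Kernel: the invokeModel delegate is a hypothetical stand-in for the kernel call, and AsyncTranslation is not part of the original project.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original project.
public static class AsyncTranslation
{
    // 'invokeModel' stands in for the kernel.InvokeAsync call: it receives the
    // JSON input and a cancellation token, and returns the raw model output.
    public static async Task<string> TranslateJsonAsync(
        string json,
        Func<string, CancellationToken, Task<string>> invokeModel,
        TimeSpan timeout)
    {
        // Request cancellation if the model does not answer within the timeout.
        using var cts = new CancellationTokenSource(timeout);

        // Awaiting keeps the calling thread free and surfaces the original
        // exception instead of wrapping it in an AggregateException.
        return await invokeModel(json, cts.Token).ConfigureAwait(false);
    }
}
```

With a real kernel, the delegate would presumably wrap the invocation shown earlier and forward the token as the cancellationToken argument.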

Next, wrap this in a GPT4TranslateService class, which builds the Microsoft.SemanticKernel.Kernel instance.

using System.Globalization;
using Microsoft.SemanticKernel;

namespace LLM_SK;

public class GPT4TranslateService
{
    public IDictionary<int, string> Translate(IDictionary<int, string> texts, CultureInfo cultureInfo)
    {
        var kernel = BuildKernel();
        var translator = new Translator(kernel);
        // "en-US" maps to "en" and "pt-BR" maps to "pt".
        return translator.Translate(texts, cultureInfo.TwoLetterISOLanguageName);
    }

    // Private method that builds the Kernel.
    private Kernel BuildKernel()
    {
        var builder = Kernel.CreateBuilder();
        builder.AddAzureOpenAIChatCompletion(
                 "xxxxgpt4",                  // Azure OpenAI Deployment Name
                 "https://****.openai.azure.com/", // Azure OpenAI Endpoint
                 "***************");      // Azure OpenAI Key

        return builder.Build();
    }
}
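
The service derives the prompt directory name from CultureInfo.TwoLetterISOLanguageName, so a culture without a matching Translator_xx directory would only fail later, when the plugin function is looked up. A small guard can reject it earlier. This is a sketch under stated assumptions: TranslationLanguages is a hypothetical helper, and the supported set simply lists the two languages shown in this article.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of the original project.
public static class TranslationLanguages
{
    // Languages for which a Translator_<code> prompt directory is described above.
    private static readonly HashSet<string> Supported =
        new(StringComparer.OrdinalIgnoreCase) { "en", "pt" };

    // Maps a culture such as pt-BR to the two-letter code used in the directory name.
    public static string ToPromptLanguage(CultureInfo culture)
    {
        var code = culture.TwoLetterISOLanguageName;
        if (!Supported.Contains(code))
            throw new NotSupportedException($"No translation prompt for culture '{culture.Name}'.");
        return code;
    }
}
```

Regional variants share one prompt here: en-US and en-GB both resolve to "en".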

IV. Test call

Here we test translation into two languages, English and Portuguese.

using System.Globalization;
using LLM_SK;

var texts = new Dictionary<int, string>
{
    { 1, "電站" }, { 2, "終端不可用" }, { 3, "充電樁不可用" },
    { 4, "場站" }, { 5, "充電站暫未運營" }
};
var translator = new GPT4TranslateService();

// English
translator.Translate(texts, new CultureInfo("en-US"));

// Portuguese
translator.Translate(texts, new CultureInfo("pt-BR"));

Output:

{"1":"Charging station","2":"Charging point unavailable","3":"Charging station unavailable","4":"Charging station","5":"Charging station not in operation yet"}
{"1":"Estação de carregamento","2":"Ponto de carregamento não está disponível","3":"Ponto de carregamento não está disponível","4":"Estação de carregamento","5":"A estação de carregamento ainda não está em operação"}

V. Summary

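The above is a demo and framework for an intelligent translation service built on Semantic Kernel and GPT-4. You can build on this example by adding more dynamic data and API calls, for example writing the JSON data to a database.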

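You can also record cases where the translation is unstable, then handle them manually or keep refining the prompt. For example, entry 3 in the English output above renders 充電樁 as "Charging station", although the glossary maps it to "Charging point".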
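
One way to detect such unstable translations automatically is to check each result against the glossary from the prompt. The following is a minimal sketch of that idea, not the method used in the original project: GlossaryChecker is a hypothetical helper, and it relies on simple substring matching.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original project.
public static class GlossaryChecker
{
    // Returns the glossary targets that the source text calls for but the
    // translation does not contain. Longer source terms are matched first so
    // that "超級充電站" is not also counted as "充電站".
    public static List<string> FindMissingTerms(
        string source, string translation, IReadOnlyDictionary<string, string> glossary)
    {
        var missing = new List<string>();
        var remaining = source;

        foreach (var entry in glossary.OrderByDescending(e => e.Key.Length))
        {
            if (!remaining.Contains(entry.Key, StringComparison.Ordinal)) continue;

            // Blank out the matched term so shorter terms cannot match inside it.
            remaining = remaining.Replace(entry.Key, " ", StringComparison.Ordinal);

            if (!translation.Contains(entry.Value, StringComparison.OrdinalIgnoreCase) &&
                !missing.Contains(entry.Value))
            {
                missing.Add(entry.Value);
            }
        }
        return missing;
    }
}
```

Entries with a non-empty result could be logged for manual review or sent back to the model for another attempt.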

 

周國慶

2024/2/17
