How to Provide Training Data for a Chatbot

What is a conversation, really?

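Our lives are full of conversations: chatting with a boyfriend while preparing dinner, ordering a roast duck at a fast-food restaurant, presenting a summary of the company's quarterly sales. Conversation is everywhere. Conversations differ in length, topic, importance, and setting, yet we rarely stop to ask: why am I having this conversation? What is my goal?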

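In this article, we look at conversation as coordinating joint action. Conversation is dynamic and full of signals and interaction. We can start a conversation the way we planned, but often we cannot guarantee where it will end. "Chatbot" and "conversation" are nouns, but to understand them well it helps to think of them as verbs. How do we interact with others? How do we make sure a conversation moves in the direction we want?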

About conversation

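A large part of any conversation is its context. Imagine you are at a ball and want to ask someone to dance. You may only need to walk over, nod, and say, "May I?" Your partner will understand that you are inviting her to dance. But if you asked a stranger on the street the same way, she would probably be confused, unsure what you were asking, or would take it as a friendly greeting. That is a conversation that does not fit its context.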

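The same applies to chatbots. When you chat with a travel agency's or an airline's chatbot, the context implies that it should be able to book a hotel for you or rebook your flight. Do not expect it to discuss political news or calculus in depth. Context is essential to a successful conversation, so what is a conversation actually about? We can usually break a conversation into four main components: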

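1. Exchanging greetings: This part is easy to understand. When you say "Good morning" and the other person replies "Have you eaten?", both are greetings. There are many ways to greet someone, but the goal is the same: building rapport.

2. Exchanging information: When you say "What are you planning to do tonight?", you expect a corresponding answer. Conversation is a process of asking questions and giving answers.

3. Prompting action: When you say "Let's go shopping tomorrow" or "Could you grab my laptop for me?", part of the conversation is about making plans and requesting that something be done.

4. Establishing opinions: When you say "I agree, grapes taste better than apples", you are establishing your point of view.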

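Of course, it is impossible to force every conversation into four sharply separated categories. But when you build a chatbot, it is important to be aware that utterances can be classified this way: your AI needs to understand these four separate parts and how they relate to one another. Above all, remember that the key to conversation is coordinating joint action. A chatbot needs to take what the speaker says as input and coordinate what happens next and which actions to take. Is the customer asking your company for a refund? Does the person need to provide more information before the chatbot acts? Does a human need to step in to resolve the issue? A chatbot has to make these judgments before it acts.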

Why build a chatbot?

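If you think chatbots are a bit rigid, that is understandable. Chatbots and intelligent assistants have been heavily hyped, and you have every reason to ask how useful a chatbot really is. After all, before a company commits people to building one, it needs to understand why a chatbot is worth building.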

Here are a few prominent reasons:

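Most companies first look at chatbots for customer service. A chatbot can help a prospective customer confirm whether the right clothing size will be in stock next week, rebook a hotel, or even handle more complex and sensitive financial questions.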

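We should recognize that when chat becomes an important part of customer service, a company can greatly reduce the service cost per customer. After all, a chat agent can serve several customers at once, at only about 30% of the cost of a phone interaction. With a chatbot, efficiency improves dramatically, because within a given context most questions are repetitive and predictable (a hotel booking site handles countless questions about cancellations, room upgrades, or check-in times). In many specific contexts, a chatbot can handle most of the problems customers run into, which frees your customer service staff for more complex work that truly requires a human.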

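Stanford professor Chris Potts has a rule of thumb: normal events can be handled with normal language, while unusual events require unusual language. A chatbot generally has no trouble with normal events; it is the unusual ones that it struggles with.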

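Those unusual events are exactly what customer service agents can handle. By routing different conversation scenarios to different channels (phone, chat window, chatbot), you let agents focus on the harder problems, and you no longer need as many agents, which saves money. In addition, a chatbot can work 24 hours a day. It works through the Spring Festival holiday, never takes sick leave, and reliably shows up. That all sounds good.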

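Now for another fact: messaging apps have become very popular in recent years. In fact, according to business surveys, messaging apps are now more popular than social networks. In other words, the people using messaging apps are your customers, and chatbots can be added to those apps seamlessly. If your customers spend more time on WhatsApp and Messenger than on Twitter and LinkedIn, why not take advantage of that? Chatbots are native to that environment. Messaging apps let customers reach you easily, without downloading your company's own chat software or calling you. So why not choose a chatbot? It frees your service staff from repetitive, tedious questions, and it lets your customers interact with your company at any time from the popular messaging apps they already use.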

All right, let's start building a chatbot!

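The first question to consider is why we are building a chatbot at all. Chatbots are often used for customer service: helping consumers make decisions, booking hotels during a trip, answering questions for a large SaaS vendor, or any scenario where many employees would otherwise interact with customers to solve problems. Remember that the goal you want to achieve and what you build matter a great deal. You do not need to build a chatbot like Siri, and if you have ever asked Siri about a niche business scenario, you will have found that it cannot give a satisfying answer either. You also need to decide how your chatbot will be scored: customers served per hour, NPS, or some other metric. You need to monitor these metrics, and a smart chatbot can give you continuous feedback on many important ones. In this article we use an airline chatbot as the example to briefly show how to make your chatbot smarter, more agile, and more robust, so that it ultimately meets your business needs.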
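As a rough illustration of the two example metrics named above, the sketch below computes customers served per hour and a Net Promoter Score from survey ratings. It assumes the usual NPS convention (ratings of 9 or 10 count as promoters, 0 through 6 as detractors); the function names and the sample ratings are made up for illustration and do not come from the original text.

```python
def net_promoter_score(ratings):
    """Return NPS (-100 to 100) for a list of 0-10 survey ratings.

    Assumes the common convention: 9-10 are promoters, 0-6 are detractors.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


def customers_per_hour(sessions_handled, hours_online):
    """Average number of customer sessions the bot handled per hour."""
    if hours_online <= 0:
        raise ValueError("hours_online must be positive")
    return sessions_handled / hours_online


# Hypothetical survey ratings collected after chatbot sessions.
sample_ratings = [10, 9, 9, 8, 7, 6, 3, 10]

if __name__ == "__main__":
    print(net_promoter_score(sample_ratings))  # 4 promoters, 2 detractors
    print(customers_per_hour(120, 8))
```

Tracking a figure like this before and after each round of retraining is one simple way to see whether new training data is actually helping.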

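Chatbots of all kinds are around us. They can tell you tomorrow's weather, send you regular updates on a news story, schedule your meetings, manage your finances, or, if you like, be someone to confide in and befriend. But today we are talking about what underlies all of these chat applications: conversation, and some principles and methods for training a chatbot to converse.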

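Once the goal is set, you need to consider what we can learn from everyday interactions. One "trick" is that behind a chatbot there are many answers that engineers programmed in advance. For example, when you say to Siri, "Tell me a joke," Siri does not really come up with a joke on the spot that it heard last weekend. In reality, Siri consults a lookup table: Apple's engineers anticipated the question and wrote it into Siri's knowledge base. For most companies, this approach is workable. Remember the rule mentioned above, that normal events can be handled with normal language: you can write answers for the questions that come up often in a given scenario. This works because we can predict in advance how customers will interact with us, or we know how they usually behave. Typically your customer service staff know which questions customers ask most, or you can learn the common questions and phrasings from your ticketing system or other data analysis.

So we just need to encode as many cases as possible, right? In fact, a chatbot is not that simple. What we have described so far is only an information retrieval system, or a search system, and a successful chatbot is not a search bar. It needs interaction, it needs conversation, and it needs to coordinate joint action. Below we introduce four kinds of training for your algorithms. You will usually get the training data from your own database or from a data service provider, and this data is what lets your chatbot converse and interact with customers. The four are: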

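Utterances: How many ways are there to say the same thing? Your chatbot needs to know as many phrasings as possible, otherwise it will always be confused.

Relevance: Is a particular answer relevant to a particular question?

Intent detection: Does your chatbot understand your customer's intent or goal? If it does not understand what the customer wants to do, it cannot coordinate joint action.

Entity extraction: "I really want to eat an apple" and "This apple is really tasty" do not mean the same thing. Entity extraction helps an algorithm pick up on these nuances of language.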
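The lookup-table approach described earlier, and the reason it falls short, can be sketched in a few lines. This is a minimal illustration, not how Siri or any real assistant is implemented: the table contents, the `normalise` helper, and the fallback message are all hypothetical.

```python
import re

# Hypothetical canned answers, keyed by a normalised form of the question.
LOOKUP_TABLE = {
    "can i change my flight": "Yes. Open Manage Booking and choose Change Flight.",
    "what time is check in": "Check-in opens 24 hours before departure.",
}

FALLBACK = "Sorry, I did not understand that."


def normalise(text):
    """Lower-case the text, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(cleaned.split())


def lookup_reply(user_text):
    """Return the canned answer for an exact (normalised) match, else a fallback."""
    return LOOKUP_TABLE.get(normalise(user_text), FALLBACK)


if __name__ == "__main__":
    # An anticipated phrasing hits the table...
    print(lookup_reply("Can I change my flight?"))
    # ...but a paraphrase with the same goal falls through to the fallback.
    print(lookup_reply("I need to move my trip to Friday"))
```

The second call is the point of the example: the customer wants the same thing, but an exact-match table cannot see that, which is why the four kinds of training data listed above are needed.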

The four language tasks for training a chatbot

1: Utterance, or, How Many Ways Can You Order a Pizza?

To work at all, your chatbot needs to understand what users are asking it to do. And while you can likely easily identify the most frequent, most normal requests from a user, it's tough to come up with every permutation of those core questions on your own. That's what utterance data collection is all about. The task is simple: set up a task where a bunch of people come up with different ways to ask the same question. What's the question? That's up to you and your team. But you'd be surprised just how many ways there are to ask for the simplest things.

KEY USE CASES

• Transforming FAQ content into a chatbot (you’ve already written answers, but want to match them to lots of different questions)

• Building up voice/text activation for a new feature (how many ways are there to ask for a song to play?)

An example? Reddit's Random Acts of Pizza, where people ask for pizzas and potentially the community responds. If we look at 5,671 requests for pizza, we’ll find that 99.4% of all of them have unique titles. In fact, there were only four repeats at all! Inside the body of the posts themselves, the only repetitions that exist over 27,000 sentences are basically just greetings and assorted gratitude:

This is a good example of the breadth of even simple requests. “Please pass the salt” and “Salt!” are both ways to make a request, after all, but they feel rather different. And while people will interact with chatbots differently than they do with people (think about how you search for shoes or use Google; it's not exactly how you talk to your friends), accruing a database of the ways people ask for things gives your chatbot fuel to answer those requests in kind.
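To get a feel for how varied a collected set of utterances is, one option is to normalise each utterance lightly and count how many distinct phrasings remain. The sketch below is only an illustration of that idea: the `normalise` rule (lower-casing and stripping punctuation) and the sample utterances are assumptions, not the method used to produce the Reddit figures above.

```python
import re
from collections import Counter


def normalise(utterance):
    """Lower-case, strip punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", utterance.lower())
    return " ".join(cleaned.split())


def uniqueness_report(utterances):
    """Summarise how many distinct phrasings a collection contains."""
    counts = Counter(normalise(u) for u in utterances)
    total = sum(counts.values())
    return {
        "total": total,
        "unique": len(counts),
        "unique_ratio": len(counts) / total if total else 0.0,
        # Phrasings that more than one contributor produced.
        "repeated": {text: n for text, n in counts.items() if n > 1},
    }


# Hypothetical responses to "how would you ask to change your flight?"
collected = [
    "Can I change my flight?",
    "can i change my flight",
    "I need to move my flight to another day",
    "Is it possible to rebook?",
]
report = uniqueness_report(collected)

if __name__ == "__main__":
    print(report)
```

A high unique ratio is what you hope for in an utterance job: every distinct phrasing is one more way of asking that your chatbot gets to see before a real customer uses it.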

Now, a section or two ago, we mentioned that we're going to use this eBook to demonstrate how to create the data you need to train a chatbot. We chose to create data around an airline customer service chatbot, but of course, you can do utterance tasks for whatever utterances you want to capture. For our example job, we chose to ask for ways to ask "can I change my flight?" Again, there are no specifics here (like "I need to change flight 563" or "I have to fly to Vegas instead") so the pool of utterance data is artificially limited a bit, but here's how you do it:

Pretty easy, right? Now, one of the things we pride ourselves on is quality control. But with utterance tasks, that can be tough. You can't come up with the "correct" ways to ask this question (in fact, you're trying to accrue just that data) so you can't use the typical test question format most jobs take. We get around this in a pretty simple way: two different, intertwined jobs.

Next, let’s look at relevance:

2: Relevance, or, Are We Making Sense or Not?

Once you have a set of utterances, you want to be able to match them with answers and actions. Relevance tasks do this by giving you training data you can use to map utterances that users might say to the help pages and action triggers in your database. They are usually of the form, “here’s a question, here’s an answer, how relevant is it?”

In doing this mapping, you are likely to find that certain flavors of questions need longer or shorter responses. The more a response just looks like “the best matching paragraphs” or "an adjacent answer from our FAQ section," the less direct help it offers, the less human it feels, and the less satisfied your user is.

To get a sense of how people know what to say, let’s look at the four maxims Paul Grice developed that people follow when talking. If you flout these maxims, things get weird.

1. Quantity: be as informative as you possibly can and give as much information as is needed, and no more

2. Quality: try to be truthful and don’t give information that is false or that is not supported by evidence

3. Relation: try to be relevant and say things that are pertinent to the discussion

4. Manner: try to be as clear, as brief, and as orderly as you can in what you say and avoid obscurity and ambiguity

We can reduce these even more. For Dan Sperber and Deirdre Wilson, the central thing is “Be relevant”. Or more formally:

The issue for chatbots is they can have trouble understanding context. They're certainly worse at it than we are. And because of that, some of their responses are, well, irrelevant. And irrelevant responses make for bad conversations. They don't coordinate joint action.

This is one of the reasons it's much simpler to create a chatbot to handle discrete issues (like rescheduling a flight) than one that just wants to talk about any old thing.

You see similar tasks in search relevance projects: given a query, does this result match? Is it relevant? Doing that with chatbot question/answer pairings gives you the tools you need to tweak your models and make them more accurate. It also will show you where your model is falling down and where it's succeeding.
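One way to turn "here's a question, here's an answer, how relevant is it?" judgments into something you can act on is to average the ratings per question/answer pair and flag the weak pairs. The sketch below assumes a 1 to 4 rating scale, a 2.5 cut-off, and made-up answer identifiers; none of these come from the original text.

```python
from collections import defaultdict
from statistics import mean


def aggregate_relevance(judgments):
    """Average annotator ratings for each (question, answer_id) pair.

    `judgments` is an iterable of (question, answer_id, rating) tuples,
    where rating is assumed to run from 1 (irrelevant) to 4 (very relevant).
    """
    by_pair = defaultdict(list)
    for question, answer_id, rating in judgments:
        by_pair[(question, answer_id)].append(rating)
    return {pair: mean(ratings) for pair, ratings in by_pair.items()}


def weak_pairs(scores, threshold=2.5):
    """Return the pairs whose average rating falls below the threshold."""
    return sorted(pair for pair, score in scores.items() if score < threshold)


# Hypothetical judgments from two annotators per pair.
judgments = [
    ("can I change my flight?", "faq_change_flight", 4),
    ("can I change my flight?", "faq_change_flight", 3),
    ("can I change my flight?", "faq_baggage_fees", 1),
    ("can I change my flight?", "faq_baggage_fees", 2),
]
scores = aggregate_relevance(judgments)

if __name__ == "__main__":
    print(scores)
    print(weak_pairs(scores))
```

The flagged pairs are the places where the model "is falling down": candidates for retraining, for a rewritten answer, or for a hard-coded response.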

3: Intent, or, What Were You Trying to Do Anyway?

When we’re engaging with people in joint activities like conversation, we are (or become) attuned to their intentions. That’s what’s behind the comedy of something like Lucy and Charlie Brown’s “I know you know I know you know” chains of reasoning. Other minds aren’t entirely opaque to us, even if we tend to fill them in with our own projections.

Much like the last example, you see intent work in information retrieval projects like internal search relevance tasks. Basically: does this output match the intent of what someone wanted? When someone searches for an iPhone and they're presented with an iPhone case, does that match their intent? The same is true for chatbot replies. Given a question from your utterance corpus, how relevant is the answer your model or hardcoded bot returns?

The reality is that relevance isn't quite enough for chatbots. Conversation is simply too complicated for simple relevance to make chatbot responses good enough.

Take the airline customer chatbot we're building. Imagine a customer typing "baggage fees?" What do they actually mean? Are they asking what the baggage fees for a particular flight are? Are they demanding a refund for baggage fees they were recently charged? A chatbot that doesn't understand context and intent might just send the customer to an FAQ about baggage fees. And that customer isn't going to be particularly enthused about that interaction.

Intent and relevance are intrinsically linked. You want to start the process by identifying which flags your chatbot will be able to support. Do you want to handle your top ten issues? Top five? You want to tackle as many permutations of those conversations as possible in your relevance and intent tasks. And keep in mind, these tasks are sometimes even more valuable for tuning your bot after it’s been released or with test conversations you conduct with it. You'll be able to analyze whole conversations, find out where they fall down, and give annotators fuller conversations to understand customer intent.

Because, really, that's an important point here: intent shows itself most clearly in the context of a full conversation. That "baggage fees?" comment means a much different thing based on particular, individual conversations.

Intent tasks often present annotators with conversations (or snippets thereof) and ask them if the chatbot is understanding the intent of the customer. In the places it did not, it's important to understand where and why your bot hit a snag. Once that's understood, you can hone your models or hard-code answers to deal with precisely those issues.

Last thing: remember that point we made about your chatbot's personality? That plays here. If your chatbot isn't sure it's going to be relevant (essentially, it's unconfident about output) or is at sea over intent, just ask! Chatbots that deal with requests by asking a series of probing questions to find the exact thing that user is looking to do are far, far more successful than those that make pseudo-guesses where they're not fully confident. When in doubt, your chatbot should aim towards further clarity, not action.
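The "when in doubt, ask" rule can be sketched as a confidence threshold on top of whatever intent classifier you use. Everything named here is an assumption for illustration: the intent labels, the 0.7 threshold, and the idea that the classifier returns a score between 0 and 1 per intent.

```python
# Assumed cut-off: below this confidence the bot asks instead of acting.
CLARIFY_THRESHOLD = 0.7


def decide_next_step(intent_scores, threshold=CLARIFY_THRESHOLD):
    """Choose between acting on the best intent and asking for clarification.

    `intent_scores` maps a hypothetical intent name to a confidence in [0, 1],
    as produced by some upstream classifier that is not shown here.
    Returns a (decision, intent) tuple where decision is "act" or "clarify".
    """
    if not intent_scores:
        return ("clarify", None)
    best_intent, best_score = max(intent_scores.items(), key=lambda kv: kv[1])
    if best_score >= threshold:
        return ("act", best_intent)
    return ("clarify", best_intent)


if __name__ == "__main__":
    # "baggage fees?" on its own: two intents are almost tied, so ask.
    print(decide_next_step({"baggage_fee_info": 0.48, "baggage_fee_refund": 0.45}))
    # "I need to change my flight": one intent clearly wins, so act.
    print(decide_next_step({"change_flight": 0.93, "baggage_fee_info": 0.04}))
```

Where to set the threshold is a tuning decision. Annotated conversations from intent tasks give you the evidence to pick it, since they show how often a confident guess turned out to be wrong.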

4: Entity Recognition, or, Which Washington is this Washington?

Entity recognition is the last major training job for your algorithm. Essentially, it involves looking at passages of text and identifying "entities" within. Those might be places, people, product names, you name it, but it generally works best when looking for specific entities that are valid for your particular use case.

Take our example use case of an airline service chatbot. If you tell it that you're looking to go to Washington, what does that mean? Because it could mean any of the following:

You get the idea. Now, if you're building a chatbot that's looking to engage over American history, Washington has a totally different meaning. Ditto for a bot looking to give out college sports scores. The list goes on.

For starters, this is why more generic, multi-purpose bots are so difficult and why context is so important for any chatbot. But it's also why you need to work on entity extraction for your chatbot project. In fact, named entity recognition is one of the basic building blocks of natural language processing and it allows your bot to function properly.

We've created an entity extraction tool that's very similar to a popular one you may have heard of called BRAT. Essentially, on our platform, you provide users with text blocks and they highlight the entities you care about. You can see an example below:

In that screenshot, we're interested in a few salient things to build into our airline chatbot. Note especially that numbers are important here. Is it a flight number? An arrival time? An amount of ounces for carry-on sunscreen? The more examples of named entities your model sees, the more it learns to understand that sometimes people typing won't write "7:25" and instead just write "725", but your bot will actually understand. That increases your bot's accuracy, its ability to actually converse, and, yes, makes it function in the way it's supposed to: coordinating joint action.
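The "725" versus "7:25" point can be made concrete with a small normaliser for clock times. This is a rule-based sketch under stated assumptions (24-hour times, an optional colon), not a trained entity recognizer, and the pattern and function name are invented for illustration.

```python
import re

# Hours 0-23, an optional colon, then minutes 00-59.
TIME_PATTERN = re.compile(r"\b([01]?\d|2[0-3]):?([0-5]\d)\b")


def normalise_time(token):
    """Return the token as HH:MM if it looks like a clock time, else None."""
    match = TIME_PATTERN.fullmatch(token.strip())
    if match is None:
        return None
    hours, minutes = match.groups()
    return f"{int(hours):02d}:{minutes}"


if __name__ == "__main__":
    for raw in ["7:25", "725", "1925", "2360", "563"]:
        print(raw, "->", normalise_time(raw))
```

Note what the rule cannot do: "725" is accepted as a time even when the customer means flight 725, while "563" is rejected only because 63 is not a valid minute count. Telling those cases apart takes the surrounding words, which is exactly what annotated entity examples give a model.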

CONCLUSION

Nice as it would be, you can't just buy chatbot software out of a box and simply deploy it. You need to test, tune, and train your chatbot. Hopefully, this eBook gave you an understanding of how that's actually done. But we do want to highlight a few of the key takeaways we'd love to leave you with now that we're finished:

• Conversations are about coordinating joint action. The best chatbots have real conversations and, thus, coordinate real joint actions.

• When in doubt, make sure your chatbot is curious. A curious chatbot understands what a user really wants before acting. And people are much more willing to answer a few extra questions than deal with bad outcomes.

• There are four major chatbot data projects. Each is important.

They are:

• Utterance tasks: How many ways are there to say a thing?

• Relevance tasks: Does this response even make sense?

• Intent tasks: What did the user want to happen here?

• Entity extraction: What are these particular words exactly?
