Voice Programming: The Next Frontier in Software Development?

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"From voice to code"},{"type":"text","text":": Two leading voice programming platforms today offer different ways to “recite” code to a computer. One, Serenade, works a little like a digital assistant: it lets you describe the instruction you are coding without requiring you to dictate it word for word. The other, Talon, offers finer-grained control over each line, which also demands a more detailed understanding of each task being programmed into the computer. A simple example is the step-by-step guide, in both Serenade and Talon, to generating Python code that prints “hello” on screen."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"By talking to our gadgets, we "},{"type":"text","marks":[{"type":"strong"}],"text":"interact with them more and more"},{"type":"text","text":". Old friends like Alexa and Siri have now been joined by automotive assistants such as Apple CarPlay and Android Auto, and even by apps sensitive to "},{"type":"link","attrs":{"href":"https:\/\/www.anz.com.au\/ways-to-bank\/mobile-banking-apps\/voice-id\/?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"voice biometrics"}]},{"type":"text","text":" and commands. But what if the technology itself could be built using voice?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That is the premise of voice programming, an approach to software development that uses voice instead of a keyboard and mouse to write code. On voice programming platforms, programmers “speak” commands to manipulate code and create custom commands that adapt to and automate their workflows."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Voice programming is not as simple as it looks; a lot of complex technology sits behind it. The voice programming app "},{"type":"link","attrs":{"href":"https:\/\/serenade.ai\/?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"Serenade"}]},{"type":"text","text":", for example, has a speech-to-text engine developed specifically for code, unlike "},{"type":"link","attrs":{"href":"https:\/\/cloud.google.com\/speech-to-text?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"Google’s Speech-to-Text API"}]},{"type":"text","text":", which is designed for conversational speech. Once a software engineer speaks the code, Serenade’s engine feeds it into a natural-language processing layer, whose machine-learning models are trained to recognize common programming constructs and translate them into syntactically valid code."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Serenade "},{"type":"link","attrs":{"href":"https:\/\/techcrunch.com\/2020\/11\/23\/serenade-snags-2-1m-seed-round-to-turn-speech-into-code\/?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"raised $2.1 million in a seed round in 2020"}]},{"type":"text","text":". The company was born when cofounder Matt Wiethoff was diagnosed with repetitive strain injury (RSI; translator’s note: damage caused by prolonged repetitive use of a group of muscles, a common occupational condition) in 2019. “I quit my job as a software engineer at Quora because I couldn’t do the work anymore,” he says. “It was either pick a job that didn’t involve so much typing, or figure out some kind of solution.”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Ryan Hileman followed a similar path: he quit his full-time job as a software engineer in 2017, after developing hand pain a year earlier. Hileman then started building "},{"type":"link","attrs":{"href":"https:\/\/talonvoice.com\/?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"Talon"}]},{"type":"text","text":", a hands-free programming platform. “The point of Talon is to completely replace the keyboard and mouse,” he says."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Talon has several components: speech recognition, eye tracking, and noise recognition. Talon’s speech-recognition engine is based on Facebook’s "},{"type":"link","attrs":{"href":"https:\/\/ai.facebook.com\/tools\/wav2letter\/?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"Wav2letter"}]},{"type":"text","text":" automatic speech recognition system, which Hileman extended to accommodate voice programming commands. Meanwhile, Talon’s eye tracking and noise recognition simulate navigating with a mouse, "},{"type":"link","attrs":{"href":"https:\/\/www.joshwcomeau.com\/blog\/hands-free-coding\/?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"moving the cursor around the screen based on eye movements and clicking based on mouth pops"}]},{"type":"text","text":". “That sound is easy to make. It takes little effort and can be recognized with low latency, so it is a faster, nonverbal way of clicking the mouse that doesn’t cause vocal strain,” Hileman says."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Programming with Talon sounds like speaking another language, as software engineer and voice coder Emily Shea put it in "},{"type":"link","attrs":{"href":"https:\/\/whalequench.club\/blog\/2019\/09\/14\/strange-loop.html?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"a 2019 conference talk"}]},{"type":"text","text":". Her talk video is full of voice commands such as “slap” (press Return), “undo” (delete), “spring 3” (go to the third line of the file), and “phrase name op equals snake extract word paren mad” (which produces this line of code: name = extract_word(m))."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Programming with Serenade, by contrast, follows a more natural way of speaking code. You can say “delete import” to remove the import statement at the top of a file, or “build” to run a custom build command. You can also say “Add function factorial” to create a function that computes a factorial in JavaScript, and the app handles the syntax (including the “function” keyword, parentheses, and braces), so you don’t have to state each element explicitly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/e2\/38\/e210367b376cf4600422e70807392938.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Voice programming does require a decent microphone, especially if you want to cut out background noise, although Serenade’s models are trained on audio produced by laptop microphones. To run Talon with eye tracking, you also need eye-tracking hardware, though Talon works fine without it. Open-source voice programming platforms such as "},{"type":"link","attrs":{"href":"https:\/\/github.com\/dictation-toolbox\/aenea?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"Aenea"}]},{"type":"text","text":" and "},{"type":"link","attrs":{"href":"https:\/\/github.com\/dictation-toolbox\/Caster?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"Caster"}]},{"type":"text","text":" are free, but both rely on the "},{"type":"link","attrs":{"href":"https:\/\/www.nuance.com\/dragon.html?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"Dragon"}]},{"type":"text","text":" speech-recognition engine, which users must buy themselves. That said, Caster also supports "},{"type":"link","attrs":{"href":"http:\/\/kaldi-asr.org\/?fileGuid=sFMARu7MFl4RkDcW","title":"","type":null},"content":[{"type":"text","text":"Kaldi"}]},{"type":"text","text":" and Windows Speech Recognition; the former is an open-source speech-recognition toolkit, and the latter comes preinstalled on Windows."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Tommy MacWilliam, cofounder of Serenade Labs, says the results speak for themselves. “Being able to describe what you want to do is so simple,” he says. “Saying ‘move these three lines down’ or ‘duplicate this method’ is more fluid than typing or pressing keyboard shortcuts.”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Voice programming also lets people with injuries or chronic pain continue their careers. “Being able to use my voice and simply take my arms out of the equation opens up a less restrictive way of using a computer,” Shea says."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Programming by voice could also lower the barrier to entry for software development. “If they can think about the code they want to write in a logical, structured way,” MacWilliam says, “then we can let machine learning cover the last mile and turn those thoughts into syntactically valid code.”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Voice programming is still in its infancy, and whether it is widely adopted depends on how tied software engineers are to the traditional keyboard-and-mouse way of writing code. But voice programming opens up a range of possibilities; perhaps in the future, brain-computer interfaces will directly translate what you think into code, or into the software itself."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"About the author:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Rina Diane Caballar is a journalist and former software engineer based in Wellington, New Zealand."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original article:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https:\/\/spectrum.ieee.org\/computing\/software\/programming-by-voice-may-be-the-next-frontier-in-software-development"}]}]}