Trusted Programming: The Practice and Vision of Rust Development

{"type":"doc","content":[
{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Authors:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Yijun Yu"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Chief Expert, Trusted Programming"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Huawei Trusted Software Engineering and Open Source Lab"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Huawei Ireland Research Center"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Amanieu d’Antras"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Senior Rust Expert"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Huawei Trusted Software Engineering and Open Source Lab"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Huawei Ireland Research Center"}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":"center","level":2},"content":[{"type":"text","text":"The Innovation Rust Brings"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"StackOverflow surveys show that Rust has been the programming language developers love most since 2015."}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/66\/fb\/66af4faa9b5bd9f285e9335084ayybfb.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Academia is also paying increasing attention to Rust: the number of Rust papers published at top programming-language and software-engineering conferences is growing year over year."}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/79\/c3\/792deb9bfaa2833c67c7d6b9ff45bac3.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Moreover, the Nature article \"Why Scientists are Turning to Rust\", published at the end of 2020, also highlights how highly scientists regard Rust."}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/6e\/6c\/6ed7d2792feee58434ce03eceeeb826c.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"heading","attrs":{"align":"center","level":2},"content":[{"type":"text","text":"Early Adoption of Rust at Huawei"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Huawei aims to lead the evolution of communication system software toward safety and trustworthiness, and Rust plays a major role in that effort. For example, by migrating part of our C\/C++ code to Rust we hope to gain stronger safety while preserving high performance. To support developers in this process we provide a set of automated tools: building on the open-source C2Rust transpiler, we first generate Rust code from C code and then refactor it automatically with source-to-source transformation tools."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Huawei has also developed internal Rust libraries based on the actor concurrency model, so that programmers can take full advantage of Rust language features such as async and await."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Huawei's communication system software is written mainly in C\/C++; where needed, these Rust libraries will make migration from C\/C++ to Rust smoother. As an industry-leading company and a founding member of the Rust Foundation, Huawei is committed to advancing Rust in the communication software industry and will keep contributing to the Rust community."}]},
{"type":"heading","attrs":{"align":"center","level":2},"content":[{"type":"text","text":"Huawei's Contributions to the Rust Community"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We have contributed a number of important features to the Rust community. For example, we recently submitted a series of patches to the Rust compiler that add compilation targets for the big-endian and ILP32 (32-bit) variants of ARM AArch64 chipsets, which are used in our communication products. These improvements allow us and other vendors to run native Rust programs on these common networking hardware architectures. The patches were submitted by our Rust expert Amanieu d’Antras to open-source projects including the LLVM compiler, the libc library, and the Rust compiler."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These compiler changes introduce new end-to-end cross-compilation targets, which makes it easier to build Rust products for custom hardware with simple commands such as:"}]},
{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"cargo build --target aarch64_be-unknown-linux-gnu\ncargo build --target aarch64-unknown-linux-gnu_ilp32\ncargo build --target aarch64_be-unknown-linux-gnu_ilp32"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Huawei is also at the forefront of contributing to the Rust community in China. It was a strategic sponsor of Rust China Conf 2020, held in Shenzhen on December 26-27, 2020, and has promoted a range of community activities, including providing Rust tutorials and Rust coding guidelines for developers in China."}]},
{"type":"heading","attrs":{"align":"center","level":2},"content":[{"type":"text","text":"Configuring Huawei's End-to-End Rust Toolchain"}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/9d\/61\/9dbb0de6yy631016d693c9c08fb97d61.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"(Share of C, C++, and Rust code in the Fuchsia project)"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Rust community offers several end-to-end tools, and we have started collecting information from the interaction between developers and these tools."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here are some examples."}]},
{"type":"heading","attrs":{"align":"center","level":3},"content":[{"type":"text","text":"tokei"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because trusted programming projects usually involve multiple programming languages, we adopted tokei as a multi-language code complexity metrics tool; it recognizes up to 200 programming languages. For example, the open-source Fuchsia project uses many languages, and the statistics below show how many lines of code it contains in each:"}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/52\/9f\/52cf6633e5f6555e06d7fcf2416f3c9f.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In addition, to handle large projects that mix many programming languages, we contributed a new feature to tokei that supports batch processing of language identification."}]},
{"type":"heading","attrs":{"align":"center","level":3},"content":[{"type":"text","text":"cargo-geiger"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To improve safety, we often want to know how much code has been checked by the Rust compiler. Fortunately, cargo-geiger comes close to answering this by counting the fn, expr, struct, impl, and trait items marked with the “unsafe” keyword, both in a crate and in its dependencies."}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/c6\/a5\/c6fd5bb0c023ebbaabcb80bbb70c48a5.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"However, these raw counts do not reflect how much code is safe, so they cannot show what proportion of a Rust project has been checked overall. We therefore submitted patches so that the improved cargo-geiger report also provides the safety-check ratio of a Rust project. Since the patches were accepted, our R&D teams use the tool every day; a typical report makes it easy to see which crates have not yet been fully checked by the Rust compiler."}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/4d\/fb\/4d114b16abc33a41936a2682471502fb.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/ec\/8f\/ecc59855246c49737c844b75d8fcff8f.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"heading","attrs":{"align":"center","level":2},"content":[{"type":"text","text":"Studying Rust through Deep Code Learning"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the code of the open-source Rust community grows and evolves, newcomers need to learn Rust best practices, which include but go beyond the Rust language itself. Applying statistical machine learning to source code data, also known as Big Code, is attracting attention from software engineering research groups around the world. Much like machine learning problems in image processing and natural language processing, which rely on deep neural networks (DNNs) to extract large numbers of features, Big Code may likewise require training DNNs to capture the statistical properties of programs, hence the name “deep code learning”."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this area, Huawei collaborates with The Open University (UK) and Singapore Management University on research that improves on the current state of the art in “cross-language” deep code learning."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, the original deep code learning methods were applied to 52,000 C\/C++ programs in 104 algorithm classes collected from a programming course at Peking University. On this dataset, the tree-based convolutional neural network (TBCNN) reached 94% classification accuracy (AAAI’16). A more recent state-of-the-art (SOTA) approach that uses abstract syntax trees at the statement level reached 98% (ICSE’19). Our recent joint work with The Open University and Singapore Management University on tree-based capsule networks pushed the SOTA further, to 98.4% accuracy (AAAI’21)."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Earlier, we showed on cross-language datasets that a deep code learning model trained on one programming language also applies to another. For example, on the Rosetta Code dataset crawled from GitHub, transferring from Java to C yields 86% algorithm classification accuracy (SANER’19), and the approach also plays an important role in cross-language API mapping from Java to C# (ESEC\/FSE’19). These statistical language models have many applications in software engineering, such as code classification, code search, code recommendation, code summarization, method name prediction, and code clone detection (ICSE’21)."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To study and analyze Rust projects further, we contributed patches to projects such as the Rust parser project tree-sitter and the XML serialization library quick-xml, so that deep code learning models can be trained on the abstract syntax trees of Rust programs. Preliminary results show accuracy as high as 85.5% on the algorithm detection task for Rust code, and we expect this figure to improve as the toolchain matures."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the Visual Studio Code IDE, we are developing an extension that gives programmers suitable algorithm recommendations together with explainability support."}]},
{"type":"heading","attrs":{"align":"center","level":2},"content":[{"type":"text","text":"Conclusion"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In summary, the Rust work under way at the Huawei Trusted Software Engineering and Open Source Lab provides programmers with an intelligent end-to-end IDE toolchain that aims to maximize the safety and performance of their code. The journey toward the vision of trusted programming has only just begun, and we hope to work closely with the Rust community and the Rust Foundation to lead a trustworthiness transformation of the telecom software industry."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Link to the English original"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https:\/\/trusted-programming.github.io\/2021-02-07\/index.html"}]}]}
]}