The Technology Platform Architecture Behind the 2020 U.S. Presidential Election

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"During the 2020 U.S. presidential election, I served as chief technology officer (CTO) of the Biden campaign. At the QCon Plus conference in November 2020, I gave a "},{"type":"link","attrs":{"href":"https:\/\/www.infoq.com\/presentations\/biden-presidential-campaign\/","title":"","type":null},"content":[{"type":"text","text":"talk"}]},{"type":"text","text":" about the architecture our team built for the campaign and the specialized tools we developed to solve a variety of problems. This article is a distilled summary of that talk."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As CTO, I led the campaign's technology team. I was responsible for technology operations overall, and together with my excellent team I built the best technology stack in presidential campaigning."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We touched everything: software engineering, IT operations, cybersecurity, and everything in between."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I joined the campaign from Target, where I was an engineer focused on infrastructure and operations, building highly scalable and reliable distributed systems. During Hillary Clinton's 2016 presidential run, I worked on the campaign's technology platform as an employee of Groundwork, a company dedicated to providing technology support for campaigns."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Campaign organizational structure"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A campaign's intricate organizational structure heavily influences every aspect of its technology choices."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Figure 1 lists the campaign's departments and gives a rough sense of what each one is responsible for."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/imgopt.infoq.com\/fit-in\/1200x2400\/filters:quality(80)\/filters:no_upscale()\/articles\/tech-presidential-campaign\/en\/resources\/1F
igure-1-The-organizational-structure-of-a-political-campaign-1634559286236.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 1: The organizational structure of a campaign"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Every team has its own area of focus. Although technology draws media attention, from a political standpoint it is not the most important thing. Our goal was to reach voters, give them a voice, and bring as many people as possible into the democratic process."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Where technology fits in a campaign"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On a campaign, technology has to do whatever needs doing. Nearly every aspect of a campaign needs technology in some form. My team was responsible for building and managing our cloud platform, all IT operations, vendor onboarding, and more."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Put simply, we built technology by “gluing” vendors and systems together. Our environment did not allow us to work the way one would when developing a mature product."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If we needed a tool that no vendor could provide and no open-source project offered, and we lacked the budget to buy one, we built it ourselves."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Most of what campaign technology does is move data from point A to point B. Strictly speaking, that meant creating a lot of S3 buckets."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Beyond that, we built a great deal over the past few years. We built dozens of websites, large and small. We built a mobile app for the field team. We built Chrome extensions for campaign staff to lighten their workload. We even wrote Word and Excel macros to improve productivity, tedious as that work is. On a presidential campaign, time is everything, and anything we could do to save time, even a single minute, was worth the investment."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Most of our work reduced the campaign staff's workload by automating tasks. Campaign technology is often distilled down to data, digital, IT, or cybersecurity. In reality, it is all of those things, and it is an important part of the campaign."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"What we did"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Our team was small, but we were energetic, highly motivated technologists. We took on every request and did our best with each one. Over the course of the campaign, we built and delivered more than 100 services. We built more than 50 Lambda functions providing all kinds of functionality. For the primaries, we also built a relational organizing mobile app called “Team Joe” that connected thousands of eager voters and volunteers with the people they knew."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We performed more than 10,000 deployments with zero downtime while maintaining stability and reliability."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We built our own search platform on cloud machine learning, with robust automation that saved the campaign tens of thousands of hours of manual work and gave us very deep insights. We built many services on top of a powerful machine learning infrastructure."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From the start, the campaign promised that we would hold ourselves to a high standard and would not accept donations from organizations that harm the planet or from individuals with ulterior motives. To keep that promise, we built an automation framework that vetted our donors in near real time against Federal Election Commission rules and the campaign's pledges."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We won the primary and wanted a new brand for the general election website, so we brought an entirely new experience to joebiden.com."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":nu
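ll}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"An important part of campaigning is reaching voters directly to tell them what is happening in their area or when they can vote. For that, we built a nationwide SMS outreach platform, saving millions of dollars in operating costs along the way. In addition, we brought IT operations to the campaign headquarters and the state headquarters, and when pandemic quarantine began we became an organization that could operate fully remotely."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Through all of this we made sure we had world-class cybersecurity, which was at the core of everything we did."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Regardless, no job on a campaign is one individual's responsibility alone. Everyone has to help whenever and wherever needed, whether that means calling voters to get them to vote, sending text messages, or collecting signatures for the ballot. We did all of it."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Infrastructure and platform"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Everything we did relied on cloud infrastructure. Above all, we did not spend much precious time reinventing the wheel. We used Pantheon to host the main website, joebiden.com. Non-website workloads, APIs, and services were deployed on AWS. We also ran a small Kubernetes cluster on AWS that helped us ship simple workloads faster (mostly cron jobs)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We still needed to deliver services at scale, quickly. For a small team, repeatable build and deployment pipelines became very important. For continuous integration, we used Travis 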
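CI. For continuous delivery, we used Spinnaker."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once services were deployed and running in the cloud, they all needed a core set of capabilities, such as discovering other services and securely accessing configuration and secrets. For that we used HashiCorp's Consul and Vault, which helped us build fully immutable, separate development and production environments in which we rarely had to touch a server by hand. Not never, but very rarely."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A large share of the technical work was done by the analytics team. To make sure they had the best tools and services, we kept analytics data in AWS Redshift. It provided a highly scalable environment with fine-grained control over resource utilization."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We used PostgreSQL as the backend datastore for our services."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From an operations standpoint, we wanted a central view of application logging so that we could troubleshoot and diagnose problems quickly and recover from failures as fast as possible. To that end, we deployed an ELK stack and used AWS Elasticsearch to store the logs. Logs are very important to an application. Metrics provide insight into the running state of a service, and they are a critical source of information when integrated with our on-call rotation. For service-level metrics, we deployed Influx and Grafana and wired them into PagerDuty so that we never had an outage we did not know about. Many automation workloads and tasks do not fit the traditional deployment model, and for those we used, wherever possible, AWS 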
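Lambda. We also used Lambda for workloads that needed to integrate with other parts of the AWS ecosystem and for fan-out jobs triggered by the presence of data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We built a truly polyglot environment. We built services and automation in a variety of languages and frameworks, and we did so with unmatched resilience and speed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We covered a lot of ground during the campaign. Going through everything we did in full detail would take a lifetime, so I will pick a few interesting architectures built by the software engineering side of the technology team. As I dig into them, keep in mind that every detail I discuss could fill at least a dozen more talks covering the breadth of the work our whole technology team did."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Donor vetting"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As mentioned earlier, the campaign committed from the start to rejecting donations from certain organizations and individuals. Without automation, the only way to do this at scale is to have a group of people comb through donations periodically, usually once a quarter, and flag donors who might match the filter criteria."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That process is difficult, time-consuming, and error-prone. When done by hand, it usually means setting a threshold on individual donation amounts and then researching whether each donor above it falls into one of the problematic categories we flag."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To be more efficient, we built a highly scalable automated process that correlates donation details with the set of criteria we wanted to flag. We kicked off the process once a day. It exported donation data from NGP VAN (the source of truth for all donation information) to a CSV file and dumped it into S3. Storing the CSV file in S3 triggered an SNS notification, which in turn activated a series of Lambda functions. Those Lambda functions split the file into smaller chunks, wrote them back to S3, and started the donor-vetting process. While this workflow ran, we could see Lambda at massive scale, with as many as 1,000 concurrent executions of the donor-vetting code."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, we pledged not to accept donations from lobbyists and executives in the oil and gas industry. In just a few minutes, the process could fully vet our donors and check them against blocklists of lobbyists, foreign agents, and oil and gas executives."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When the process finished, the flagged entries were collated into a single CSV file, which was written back to S3 and made available for download. SES then emailed Danielle, one of our developers, to say the flags were ready for further verification. After verification, the results were forwarded to the compliance team, who took the appropriate action, which might be a refund or further investigation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/imgopt.infoq.com\/fit-in\/1200x2400\/filters:quality(80)\/filters:no_upscale()\/articles\/tech-presidential-campaign\/en\/resources\/1Figure-2-The-automated-donor-vetting-process-1634559286236.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 2: The automated donor-vetting process"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To say this process saved the campaign a great deal of time would be an understatement. The donor-vetting pipeline quickly became a core component of the technology platform and an important part of the campaign."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Tattletale"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Early on, our team was still very small, yet we had many vendors and cloud services and not enough time or people to check the security posture of them all. We needed an easy way to define rules for critical user-facing systems and make sure we always followed cybersecurity best practices."}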
]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Tattletale is the framework we built for exactly that. It is one of the most important pieces of technology we built on the campaign."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We defined a set of tasks that used vendor system APIs to make sure features such as two-factor authentication were turned on, or to notify us when a user account was active in a system but the user had not logged in for some time. Dormant accounts are a security risk, so we wanted every configuration to follow least privilege."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In addition, Tattletale rules could check whether a load balancer was unintentionally exposed to the internet, whether IAM permissions were scoped too broadly, and so on. After auditing the rule set, if Tattletale found violations it sent a notification to a Slack channel so the right people could investigate further. It could also email users about a violation so they could take corrective action themselves. If a threshold was exceeded, Tattletale recorded a metric in Grafana, which triggered a PagerDuty escalation policy and immediately notified the on-call technologist."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/imgopt.infoq.com\/fit-in\/1200x2400\/filters:quality(80)\/filters:no_upscale()\/articles\/tech-presidential-campaign\/en\/resources\/1Figure-3-The-Tattletale-schema-1634559286236.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 3: The Tattletale architecture"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When we did not have the time or resources to look at security issues, Tattletale was our cybersecurity eyes. It also made sure we followed a common set of cybersecurity standards and held to the highest standard possible."}]},{"type":"heading","attrs":{"align":null,"level":2
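},"content":[{"type":"text","text":"Conductor and Turbotots"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once our internal tools reached a certain complexity and breadth, we needed a better way to manage their APIs. To manage those systems, we needed to consolidate the UIs we had built. We also needed to standardize the security model so that custom authentication and authorization was not scattered everywhere."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So we created Conductor (named for a train conductor, because Biden loves trains), our internal tools UI, and Turbotots (one of many tools we named after potatoes), our platform API."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/imgopt.infoq.com\/fit-in\/1200x2400\/filters:quality(80)\/filters:no_upscale()\/articles\/tech-presidential-campaign\/en\/resources\/1Figure-4-Conductor-and-Turbotots-1634559286236.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 4: Conductor and Turbotots"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Figure 4 is a simplified view of a complex architecture, but the outline is enough to show what Conductor and Turbotots did."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Conductor became the single entry point for all the internal tools we built for campaign staff. In other words, anyone on the campaign who wanted to use the services we provided had to go through it. Conductor is a React web application deployed to S3 and distributed through CloudFront."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Turbotots is a unified API that provides a common authentication and authorization model for everything we built, and it is what Conductor talks to. We built the AuthN and AuthZ portions of Turbotots on AWS Cognito, which saved a lot of work and gave us simple single sign-on (SSO) through G Suite\/Google Workspace. Authentication is performed by parsing JWTs. To manage them on the front end, we used the React bindings for AWS Amplify, which integrated seamlessly into the application."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The right side of Figure 4 is a little hard to follow, so I will simplify as much as I can. The front-facing API is an API Gateway with a full proxy resource. API Gateway integrates easily with Cognito, which helped us secure API requests. The Cognito authorizer attached to API Gateway also validates the JWT before a request passes through the proxy. That gave us confidence that a request had been fully validated before it reached the backend."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When you develop an API Gateway proxy resource, you can connect it to code running in your VPC through a VPC network load balancer (NLB). It is complicated, but as far as I understand it, this effectively creates an elastic network interface between API Gateway inside AWS and your private VPC. The NLB, in turn, is attached to an auto scaling group of NGINX instances that serves as our unified API. That is the main part of Turbotots: essentially a reverse proxy for all the internal services running in the VPC. With this model, we did not need to expose any VPC resources to the public internet. We could rely on the security built into AWS, which made life a lot easier for all of us."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When a request reached Turbotots, a lightweight Lua script in Turbotots extracted the user-information portion of the JWT and passed that data to the downstream service as part of a new request payload. When the request arrived at the target service, the service could inspect the user information to decide whether the request was valid for that user."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Users could be added to authorization groups in Cognito, giving them different levels of access in downstream services. The principle of least privilege still applied, and there were no default permissions."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Together, Conductor and Turbotots gave internal tools a unified user interface with seamless SSO integration with G Suite accounts. The next section describes how we used the same architecture to expose parts of the API to non-internal users."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Pencil and Turbotots"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pencil is a peer-to-peer (P2P) texting platform that started from a handful of simple early requirements and grew into a large architecture that we invested a lot of time in."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The original motivation for building our own P2P texting platform was cost. I knew we could send text messages through Twilio for less than any other vendor would charge. And in the beginning we did not need most of the features the vendors offered, so it was easy to build a simple texting system that met our needs at the time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the project grew in popularity, its scale expanded dramatically. Pencil was used by thousands of volunteers to reach millions of voters and became an important part of our voter outreach workflow. If you received a text message from a campaign volunteer, it was probably sent through Pencil."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pencil's external architecture will look familiar: we reused the Conductor architecture with only a few configuration changes and no code changes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/imgopt.infoq.com\/fit-in\/1200x2400\/filters:quality(80)\/filters:no_upscale(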
)\/articles\/tech-presidential-campaign\/en\/resources\/1Figure-5-The-Pencil-architecture-looks-a-lot-like-Conductors-1634559286236.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 5: The Pencil architecture looks a lot like Conductor's"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pencil's user-facing component is a React web application deployed to S3 and distributed through CloudFront. The React application, in turn, talks to an API Gateway resource connected to a Cognito authorizer."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Users were invited to the Pencil platform by email, and signing up for a Cognito account was a separate step. The first time a user logged in, Pencil automatically added them to a Cognito user group, which gave them access to Pencil's user API."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From the user's perspective, they clicked a sign-up link and were on their way to sending text messages. After some brief onboarding by the campaign's digital team, the users (volunteers) could get to work."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All of this was easy to do because it was built on the existing Turbotots infrastructure."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Tots"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Tots architecture is distinct from Turbotots and sits downstream of it. In the earlier figures, many of the services listed beneath Turbotots resolve to Tots and are part of the Tots architecture."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/imgopt.infoq.com\/fit-in\/1200x2400\/filters:quality(80)\/filters:no_upscale()\/articles\/tech-presidential-campaign\/en\/resources\/1Figure-6-The-Tots-platform-1634559286236.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 6: The Tots platform"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Tots is an important platform. As we built more and more machine learning tools, we quickly realized we needed to consolidate logic and simplify the architecture."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Across our projects, the most common use of machine learning was extracting a bag of words from a block of text. We stored that data in an Elasticsearch index so the topics in it could be searched quickly. We used AWS Comprehend to extract the bag of words. It is a great service: give it a piece of text and it tells you the people, places, topics, and so on that appear in it. Text entered the platform in many forms, including multimedia, news articles, and document formats."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Much of the machinery in the Tots platform shown in Figure 6 deals with how content must be processed before it is sent to Comprehend. The processes on this platform saved the campaign thousands of hours of human labor, including the time spent transcribing live events (such as debates) into text for campaign staff to read later."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"CouchPotato"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"CouchPotato helped us solve the hardest problem in computer science: getting audio to work on Linux. CouchPotato mattered a great deal to us because it saved a lot of time on audio and video material. It is also one of the most powerful technology platforms we built."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/imgopt.infoq.com\/fit-in\/1200x2400\/filters:quality(80)\/filters:no_upscale()\/articles\/tech-presidential-campaign\/en\/resources\/1Figure-7-The-campaign-used-CouchPotato-to-handle-multimedia-1634559286236.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 7: The campaign used CouchPotato to handle multimedia"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"CouchPotato is the main actor in the architecture shown in Figure 7, but as the lines connecting the separate services suggest, it needed supporting actors throughout the process."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"CouchPotato's main function is to take a URL or media file as input, open the media in an isolated X11 Virtual Frame Buffer, listen to the playback on a PulseAudio playback device (FFmpeg), and record the contents of the X11 session. This produces an MP3 file, which is then sent to AWS Transcribe for automatic speech recognition (ASR)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once ASR finished, CouchPotato made a pass over the generated text to correct common errors. For example, ASR rarely got Mayor Pete Buttigieg's name right. We used regular expressions for these common text replacements."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After that, the transcript was sent to Comprehend, which extracted its bag of words. Finally, the text was indexed into Elasticsearch."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The key to CouchPotato is its use of FFmpeg's segmenting feature to produce smaller chunks of audio. It ran the whole process for each segment and made sure the results stayed unified when indexed into Elasticsearch."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Segmenting was one of CouchPotato's original features, because we used it to transcribe debates in real time. Normally a campaign has a troop of interns watch the debate and type up what is said. We did not have a troop of interns for that, so CouchPotato was born."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Segmenting introduced an ordering problem, though. Sometimes Transcribe finished ASR on a later segment before an earlier one, which meant that all the asynchronously completed work had to be reassembled in the correct order afterward. To me, that sounded like a reactive programming problem."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We spent a lot of time solving the ordering of asynchronously processed events. It was complicated, but we got it done."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is where DocsWeb came in. DocsWeb sent CouchPotato's per-segment transcription output to a Google Doc, which let us share the transcript with the rest of the campaign in near real time. There was a small delay between Transcribe and Elasticsearch, but it was not too bad. We transcribed every debate plus a huge amount of other media, work that would have taken a troop of interns a lifetime."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Part of solving the segment-ordering problem was figuring out how to replace a CouchPotato instance, for example during a deployment or while a real-time transcription was already running. There is a lot more to say on this topic, such as how we stitched audio frames together so that old instances could be phased out and new ones phased in. That is too much to cover here. Just know that HotPotato did a lot of work to let us hot-deploy CouchPotato with zero downtime and zero loss of state."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"KatoPotato is the entry point to the whole architecture. It is an orchestration engine that coordinates data movement between CouchPotato, DocsWeb, and Elasticsearch, and it serves as the main API for starting CouchPotato workloads. It also monitors CouchPotato's health and decides whether HotPotato needs to start a new instance."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sometimes audio capture on Linux simply stops working. CouchPotato could report to KatoPotato whether it was actually capturing audio. If it was not, KatoPotato could start a new CouchPotato instance and swap it in through HotPotato."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"CouchPotato is a platform built on cloud machine learning that let us understand multimedia content quickly without spending precious human time on the task. It was a great piece of engineering, and I am glad it delivered value for us."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Conclusion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There is much more to the technology we built, such as the relational organizing app, the data pipelines connecting S3 and RDS to Redshift, and the microsites we created along the way. I could also talk about how we collaborated in a fast-paced, dynamic environment."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I have shared some of the most interesting architectures we built for the 2020 presidential campaign. 
I am glad to have shared them, and I look forward to sharing more in the future."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"About the author:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Dan Woods"},{"type":"text","text":" is chief information technology officer and vice president of cybersecurity at Shipt. Before serving as CTO of the 2020 presidential campaign, he was a distinguished engineer at Target, focused on infrastructure and operations and building large-scale, reliable distributed systems. Before joining Target, he worked on the technology platform for Hillary Clinton's 2016 campaign. Before working in presidential campaign technology, Woods was a senior software engineer in operations engineering at Netflix, where he helped create Spinnaker, Netflix's open-source continuous delivery platform. Woods is also a member of the open-source team behind the Ratpack web framework and the author of Learning Ratpack (O'Reilly, 2016)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original article"},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.infoq.com\/articles\/tech-presidential-campaign\/","title":"","type":null},"content":[{"type":"text","text":"Building Tech at Presidential Scale"}]}]}]}