AWS Launches MWAA, a Fully Managed Workflow Service for Apache Airflow

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Recently, AWS launched "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/managed-workflows-for-apache-airflow\/","title":"","type":null},"content":[{"type":"text","text":"Amazon Managed Workflows for Apache Airflow"}]},{"type":"text","text":" (MWAA), a fully managed service that simplifies running open-source Apache Airflow on AWS and building workflows to execute ETL jobs and data pipelines."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Apache Airflow is an open-source tool for programmatically authoring, scheduling, and monitoring sequences of processes and tasks known as “workflows.” Developers and data engineers use Apache Airflow to manage workflows, monitor them through a user interface (UI), and extend their functionality with a powerful set of plugins. However, using Apache Airflow requires manual installation, maintenance, and scaling. AWS now addresses this with MWAA, which lets developers and data engineers build and manage their workflows in the cloud without having to manage and scale the Airflow platform infrastructure themselves."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the AWS press release on MWAA, Jesse Dougherty, vice president of application integration, said:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Customers have told us they really like Apache Airflow because it speeds up the development of their data processing and machine learning workflows, but they want to be rid of the burden of scaling, operating, and securing servers. With Amazon MWAA, customers can use the same Apache Airflow platform they use today while gaining the scalability, availability, and security that AWS provides."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A workflow running on Amazon MWAA can use "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/athena","title":"","type":null},"content":[{"type":"text","text":"Amazon Athena"}]},{"type":"text","text":" to retrieve input from a data source (such as "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/s3\/","title":"","type":null},"content":[{"type":"text","text":"Amazon Simple Storage Service"}]},{"type":"text","text":"), run transformations on an "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/emr","title":"","type":null},"content":[{"type":"text","text":"Amazon EMR cluster"}]},{"type":"text","text":", and use the resulting data to train machine learning models on "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/sagemaker\/","title":"","type":null},"content":[{"type":"text","text":"Amazon SageMaker"}]},{"type":"text","text":". In addition, developers and data engineers can use Python to author "},{"type":"link","attrs":{"href":"https:\/\/airflow.apache.org\/docs\/stable\/concepts.html#dags","title":"","type":null},"content":[{"type":"text","text":"directed acyclic graph (DAG)"}]},{"type":"text","text":" workflows in Amazon MWAA."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/74\/d5\/74f3afeb2f712eb17a99eb76e9761ed5.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Source: "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/managed-workflows-for-apache-airflow\/","title":"","type":null},"content":[{"type":"text","text":"https:\/\/aws.amazon.com\/managed-workflows-for-apache-airflow\/"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"AWS Chief Evangelist Danilo Poccia wrote in a blog post introducing MWAA:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"You can get started with Amazon MWAA in three steps:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Create an environment: each environment contains your Airflow cluster, including the scheduler, workers, and web server. Developers and data engineers can create a new Amazon MWAA environment from the console, the "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/cli\/","title":"","type":null},"content":[{"type":"text","text":"AWS Command Line Interface"}]},{"type":"text","text":" (CLI), or an "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/tools\/","title":"","type":null},"content":[{"type":"text","text":"AWS SDK"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Upload your DAGs and plugins to S3: Amazon MWAA automatically loads the code into Airflow."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Run your DAGs in Airflow: run DAGs from the Airflow UI or the command line interface (CLI), and monitor your environment with CloudWatch."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With MWAA, developers and data engineers benefit from open extensibility through plugins: they can create tasks that interact with the AWS or on-premises resources their workflows require, including AWS Batch, Amazon CloudWatch, Amazon DynamoDB, AWS Lambda, Amazon Redshift, Amazon Simple Queue Service (SQS), and Amazon Simple Notification Service (SNS)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Note that AWS offers other workflow management systems as well, such as "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/step-functions\/","title":"","type":null},"content":[{"type":"text","text":"Step Functions"}]},{"type":"text","text":" and "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/glue","title":"","type":null},"content":[{"type":"text","text":"AWS Glue"}]},{"type":"text","text":". A respondent on Hacker News explained in a thread:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It was built by the internal Orchestration team, which also built Step Functions and maintains "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/swf\/","title":"","type":null},"content":[{"type":"text","text":"AWS Simple Workflow"}]},{"type":"text","text":". I think Glue is different from the other workflow systems: it is deeply optimized for ETL. I'm sure more detailed guidance on Step Functions and Apache Airflow will emerge over time, but in short, Step Functions is a fully AWS-native (and serverless) orchestration engine. Apache Airflow, of course, is an open-source project with a diverse ecosystem of plugins."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MWAA is currently available in the following AWS Regions: US East (Ohio and N. Virginia), US West (Oregon), Europe (Stockholm, Ireland, and Frankfurt), and Asia Pacific (Tokyo, Singapore, and Sydney), with more regions to follow. Details about the service are available on the "},{"type":"link","attrs":{"href":"https:\/\/docs.aws.amazon.com\/mwaa\/latest\/userguide\/what-is-mwaa.html","title":"","type":null},"content":[{"type":"text","text":"documentation page"}]},{"type":"text","text":", and pricing details are available on the "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/managed-workflows-for-apache-airflow\/pricing\/","title":"","type":null},"content":[{"type":"text","text":"pricing page"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original article"},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.infoq.com\/news\/2020\/12\/amazon-managed-apache-airflow\/","title":"","type":null},"content":[{"type":"text","text":"AWS Introduces Amazon Managed Workflows for Apache Airflow"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
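To make the DAG idea in the article concrete, here is a minimal sketch in plain Python. It does not use Airflow's API. It only models the Athena, EMR, and SageMaker pipeline described above as a dependency graph and derives a valid run order with the standard library's `graphlib` (Python 3.9+). The task names and the `execution_order` helper are hypothetical, chosen for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical task names mirroring the pipeline described in the article.
# Each key maps to the set of tasks that must finish before it can start.
pipeline = {
    "query_s3_with_athena": set(),
    "transform_on_emr": {"query_s3_with_athena"},
    "train_on_sagemaker": {"transform_on_emr"},
}


def execution_order(dag):
    """Return one valid run order for the tasks in `dag`.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    which is why a workflow graph has to be acyclic.
    """
    return list(TopologicalSorter(dag).static_order())


order = execution_order(pipeline)
print(order)
```

In a real Airflow DAG, the scheduler works out this ordering from the task dependencies you declare. The sketch only shows why the dependency graph must be acyclic: with a cycle, no valid run order exists.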