AWS Introduces MWAA, a Fully Managed Workflow Service for Apache Airflow

Recently, AWS introduced Amazon Managed Workflows for Apache Airflow (MWAA) (https://aws.amazon.com/managed-workflows-for-apache-airflow/), a fully managed service that simplifies running open-source versions of Apache Airflow on AWS and building workflows to execute ETL jobs and data pipelines.

Apache Airflow is an open-source tool for programmatically authoring, scheduling, and monitoring sequences of processes and tasks referred to as "workflows." Developers and data engineers use Apache Airflow to manage workflows, monitor them through a user interface (UI), and extend their functionality through a set of powerful plugins. However, running Apache Airflow yourself requires manual installation, maintenance, and scaling. AWS now addresses this with MWAA, which lets developers and data engineers build and manage their workflows in the cloud without having to manage or scale the infrastructure underneath the Airflow platform.

In the AWS press release on MWAA, Jesse Dougherty, vice president of application integration, said:

"Customers have told us they really like Apache Airflow because it speeds up the development of their data processing and machine learning workflows, but they want it without the burden of scaling, operating, and securing servers. With Amazon MWAA, customers can use the same Apache Airflow platform they use today, with the scalability, availability, and security of AWS."

Amazon MWAA can use Amazon Athena (https://aws.amazon.com/athena) to retrieve input from data sources such as Amazon Simple Storage Service (https://aws.amazon.com/s3/), perform transformations on Amazon EMR clusters (https://aws.amazon.com/emr), and use the resulting data to train machine learning models on Amazon SageMaker (https://aws.amazon.com/sagemaker/). In addition, developers and data engineers can use Python to author workflows in Amazon MWAA as directed acyclic graphs (DAGs) (https://airflow.apache.org/docs/stable/concepts.html#dags).

Image: https://static001.infoq.cn/resource/image/74/d5/74f3afeb2f712eb17a99eb76e9761ed5.png

Source: https://aws.amazon.com/managed-workflows-for-apache-airflow/

Danilo Poccia, chief evangelist at AWS, wrote in a blog post introducing MWAA:

"You can use Amazon MWAA with these three steps:

1. Create an environment: each environment contains your Airflow cluster, including the scheduler, the workers, and the web server. Developers and data engineers can create a new Amazon MWAA environment from the console, the AWS Command Line Interface (CLI) (https://aws.amazon.com/cli/), or the AWS SDKs (https://aws.amazon.com/tools/).

2. Upload your DAGs and plugins to S3: Amazon MWAA loads the code into Airflow automatically.

3. Run your DAGs in Airflow: run your DAGs from the Airflow UI or the command line interface (CLI), and monitor your environment with CloudWatch."

With MWAA, developers and data engineers benefit from open extensibility through plugins: they can create tasks that interact with the AWS or on-premises resources their workflows need, including AWS Batch, Amazon CloudWatch, Amazon DynamoDB, AWS Lambda, Amazon Redshift, Amazon Simple Queue Service (SQS), and Amazon Simple Notification Service (SNS).

Note that AWS offers other workflow management systems as well, such as Step Functions (https://aws.amazon.com/step-functions/) and AWS Glue (https://aws.amazon.com/glue). A respondent in a Hacker News thread explained:

"It was built by the internal Orchestration team, which also built Step Functions and maintains AWS Simple Workflow (https://aws.amazon.com/swf/). I think Glue is different from the other workflow systems: it is heavily optimized for ETL. I am sure more detailed guidance on Step Functions and Apache Airflow will emerge over time, but in short, Step Functions is a fully AWS-native (and serverless) orchestration engine. Apache Airflow, of course, is an open-source project with a diverse ecosystem of plugins."

MWAA is currently available in the following AWS Regions: US East (Ohio and N. Virginia), US West (Oregon), EU (Stockholm, Ireland, and Frankfurt), and Asia Pacific (Tokyo, Singapore, and Sydney), with more regions to follow. Details on the service are available on the documentation page (https://docs.aws.amazon.com/mwaa/latest/userguide/what-is-mwaa.html), and pricing details are available on the pricing page (https://aws.amazon.com/managed-workflows-for-apache-airflow/pricing/).

Original article: AWS Introduces Amazon Managed Workflows for Apache Airflow (https://www.infoq.com/news/2020/12/amazon-managed-apache-airflow/)
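
For readers new to the term, the sketch below illustrates what a directed acyclic graph means for task ordering, using the Athena, EMR, and SageMaker pipeline described in the article. It is not Airflow or MWAA code: it relies only on Python's standard `graphlib` module (Python 3.9+), and the task names and the `execution_order` helper are hypothetical, chosen purely for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical task names. Each task maps to the set of tasks that must
# finish before it can start (its upstream dependencies).
pipeline = {
    "query_s3_with_athena": set(),
    "transform_on_emr": {"query_s3_with_athena"},
    "train_on_sagemaker": {"transform_on_emr"},
}


def execution_order(dag):
    """Return one valid run order for the tasks in `dag`.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    which is exactly what the "acyclic" requirement rules out.
    """
    return list(TopologicalSorter(dag).static_order())


if __name__ == "__main__":
    for step, task in enumerate(execution_order(pipeline), start=1):
        print(step, task)
```

A scheduler such as Airflow works from the same kind of dependency information, which is why a workflow whose tasks depend on each other in a cycle cannot be ordered at all.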