30万行代码的平台升级:给跑着的汽车换轮胎

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#333333","name":"user"}},{"type":"strong"}],"text":"本文最初发布于"},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Mahmoud Hashemi的个人"},{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#333333","name":"user"}},{"type":"strong"}],"text":"博客,经原作者授权由InfoQ中文站翻译并分享。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"2020年可谓反复无常。尽管一切都超出了人们的控制,但随着时间的推移,我发现自己把越来越多的时间投入到一件感觉触手可及的事情中:为我帮助构建的大型企业级Web应用程序"},{"type":"link","attrs":{"href":"https:\/\/simplelegal.com\/","title":null,"type":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"SimpleLegal"}]},{"type":"text","marks":[{"type":"italic"}],"text":"设计一个面向未来的解决方案。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"如今工作已经完成,这次平台升级轻松跻身我做过的最复杂的项目之列,也是目前为止结局最圆满的一个。圆满是要付出代价的,但是借助一些恰当的方法,代价可能不会像你想的那么高。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"概述"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我们将"},{"type":"link","attrs":{"href":"https:\/\/simplelegal.com\/","title":null,"type":null},"content":[{"type":"text","text":"SimpleLegal"}]},{"type":"text","text":"的主要产品,一个30万行的Django-1.11-Python 2.7-Redis-Postgres-10代码库,移植到Django 2.2-Python 
3.8-Postgres-12技术栈,如期完成,而且没有发生重大站点事件。这感觉很棒。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"作为这个项目的技术主管,它看起来是什么样子?对我来说,是这样的:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a7\/a7936f3e65682bf40d198be21b54bc02.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"但作为工程总监,它的成本是多少?"},{"type":"text","marks":[{"type":"strong"}],"text":"3.5年的开发时间,每行代码只需要2美元。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我对这个结果感到特别自豪,因为在这个过程中,我们也大大提高了网站和开发过程本身的速度和可靠性。现在,该产品有了一个光明的未来,已经准备好在销售征求建议书和合规调查问卷上大放异彩了。最重要的是,你不必担心怎样委婉地告诉潜在客户,他们将使用的是不受支持的技术。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"简而言之,这是一笔巨大的、稳健的投资,而且已经取得了回报。如果你来这里只是为了看看我们自己对这项工作的估计,那就是上面这些了。这篇文章是介绍如何让你的团队达到同样的结果。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"背景"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"故事开始于2013年,刚刚从YC孵化出来的SimpleLegal为一家新成立的SaaS法律技术公司做了所有正确的决定:Python、Django、Postgres和Redis。在典型的初创公司模式中,在技术不成障碍的情况下,功能是第一位的。软件包只是顺带
升级。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"到2019年,这条技术跑道的终点已经临近。虽然Python 2可能得到了来自不同供应商的扩展支持,但在2021年,Django 1 CVE补丁的志愿者已经非常少了。Web框架成了风险较大的攻击面,所以是时候偿还我们的技术债务了。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"开端"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因此,我们在2019年第4季度开始了Tech Refresh平台升级计划。其目标是:升级技术栈,同时仍然提供新特性,就像给跑着的汽车换轮胎。我们要小心谨慎,而那需要时间。以下是一些长期项目的基本原则:"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"任何每周工作10小时以上的项目都应该每周花30分钟进行同步。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"每次定期会议都应该有记录。把它放在邀请函里。使用项目日志记录进度、阻碍因素和决策。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"这是一场马拉松,不是短跑。要避免在晚上、周末和假期工作。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我们从一个计划草图开始,经过开放地讨论,最终只有一半正确。有一些早期的猜测成功实现:"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"转到"},{"type":"link","attrs"
:{"href":"https:\/\/github.com\/jazzband\/pip-tools","title":null,"type":null},"content":[{"type":"text","text":"pip-tools"}]},{"type":"text","text":",并根据广泛的变更日志分析解除依赖关系。识别不兼容py23版本的包。(尽管我们已经转向"},{"type":"link","attrs":{"href":"https:\/\/github.com\/python-poetry\/poetry","title":null,"type":null},"content":[{"type":"text","text":"poetry"}],"marks":[{"type":"color","attrs":{"color":"#41accd","name":"user"}}]},{"type":"text","text":"。)"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"在CI中加入行覆盖率报告。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"改进内部测试框架,让开发者可以快速编写测试。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下面有更多相关内容。其他的计划就不那么现实了:"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"在6个月内将CI行覆盖率从大约60%提升到95%。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"在三个月内并行转换app程序包。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"利用美国节假日(感恩节、圣诞节、新年)期间的低流量时间,在2021年之前逐步切换到新应用。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我们年轻!虽然我们天真,但至少我们知道有很多工作要做。为了分担这项工作,我们寻找、雇佣并培训了三名敬业的海外开
发人员。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"导向问题"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"即使新增了开发人员,到2020年中期,我们越来越认识到,95%的覆盖率就是在做梦,更不用说100%了。全部覆盖可能是最佳实践,但3个半开发人员没法做到这样的覆盖范围。我们做了有价值的测试,甚至发现了以前的Bug,但如果我们坚持这个计划,Django 2最终将成为一个2022年的项目。70%,我们决定修改目标。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我们意识到,对于大多数站点来说,CI比大多数用户更敏感。所以我们专注于测试影响最大的代码。怎么才算影响大?1)失败了最易被察觉的代码;2)最难重试的代码。通过查看流量统计数据、批处理作业计划和询问支持人员,你可以在一周内构建出高影响代码清单。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"大约80%的代码库都不在这个高流量\/高影响列表中。那80%该怎么办呢?利用错误检测和快速修复。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"转换Sentry的角色"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/f7\/f7c2752008eb562abd16aedce60c3018.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"创业生活的一个好处是,尝试新工具很容易。我们在SimpleLegal所采用的一种做法是,把每5个周的最后一周(即20%的时间)留给开发人员,让他们专注于开发过程本身。即使是最好的厨师也不能在脏乱的厨房里做出五星级的食物。这是我们改进工作的方法,最终
加快了交付速度。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在这样一个时期,有人想出了一个天才的主意,使用"},{"type":"link","attrs":{"href":"https:\/\/sentry.io\/","title":null,"type":null},"content":[{"type":"text","text":"Sentry"}]},{"type":"text","text":"将专门的错误报告添加到系统中。在一两天内,我们就有了一个网站,你可以访问并获取堆栈跟踪。这非常神奇,但直到Tech Refresh计划开始我们才意识到,虽然集成只需要一天的开发时间,但完全采用却需要团队几个月的时间。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你看,在一个成熟但快速运转的系统上增加Sentry意味着一件事:噪音。我们的网站一直在出错。大多数错误是不可见的,也没有妨碍用户使用,有些用户已经悄悄学会了如何处理长期存在的网站怪癖。很快,我们的开发人员就学会了把Sentry当作调试信息的存储库。2019年,Sentry事件本身并不值得认真对待。2020年,情况发生了变化,负责将平台无缝升级的团队需要把Sentry变成另一种东西:响应性网站质量工具。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我们是怎么做到的呢?第一步,通过"},{"type":"link","attrs":{"href":"https:\/\/docs.sentry.io\/product\/sentry-basics\/guides\/getting-started\/#-how-many-projects-should-i-create","title":null,"type":null},"content":[{"type":"text","text":"以下最佳实践"}]},{"type":"text","text":"增强流入Sentry的数据:"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"将产品拆分成"},{"type":"link","attrs":{"href":"https:\/\/docs.sentry.io\/product\/sentry-basics\/guides\/getting-started\/#-how-many-projects-should-i-create","title":null,"type":null},"content":[{"type":"text","text":"单独的Sentry项目"}]},{"type":"text","text":"。这包括前端和后端。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"
origin":null},"content":[{"type":"text","text":"标记版本。不要用分支来标记开发环境部署,这会导致Releases UI混乱。添加一个单独的分支标签用于搜索。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"把环境分开。这对于定向报警至关重要。Sentry客户端环境是通过域约定和Django的"},{"type":"link","attrs":{"href":"https:\/\/docs.djangoproject.com\/en\/3.1\/ref\/contrib\/sites\/","title":null,"type":null},"content":[{"type":"text","text":"sites框架"}]},{"type":"text","text":"来配置的。为了便于理解,这里有一个基线,我们使用这些环境:"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"生产环境:当前正式版本。DevOps监控。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"沙箱环境:当前正式版本(部分公司会做下一次发布)。供用户测试变更使用。DevOps监控。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":6,"align":null,"origin":null},"content":[{"type":"text","text":"演示\/销售环境:上一个正式版本。主要是内部流量,但在前景演示时外部也可见。DevOps监控。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":7,"align":null,"origin":null},"content":[{"type":"text","text":"金丝雀环境:下一个正式版本。也称为过渡环境。内部流量。Dev监控。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":8,"align":null,"origin":null},"content":[{"type":"text","text":"ProdQA环境:当前正式版本。内部用于重现技术支持问题。Dev监控。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":9,"align":null,"origin":null},"content":[{"type":"text","text":"QA环境:Dev分支、dev发布、内部流量。未监控调试数据。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":10,"align":null,"origin":null},"content":[{"type":"text","text":"本地测
试\/CI环境:默认不发布到Sentry。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"当问题最终被正确标记并且可以搜索之后,我们使用Sentry新增的"},{"type":"link","attrs":{"href":"https:\/\/docs.sentry.io\/product\/discover-queries\/","title":null,"type":null},"content":[{"type":"text","text":"Discover工具"}]},{"type":"text","text":"每周导出问题,并对遗留错误进行优先级排序。我们首先关注的是对于非内部人类用户高可见的生产错误。具体查询是:"},{"type":"codeinline","content":[{"type":"text","text":"has:user !transaction:\/api\/* event.type:error !user.username:*@simplelegal.*"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我们将其分为4类:快速修复(小漏洞)、快速错误(将一个含糊的500错误转变成某种形式的可操作的400错误)、"},{"type":"link","attrs":{"href":"http:\/\/agiledictionary.com\/209\/spike\/","title":null,"type":null},"content":[{"type":"text","text":"Spike"}]},{"type":"text","text":"(比较大的漏洞,需要研究)和Silence(使用Sentry的忽略功能)。在6周的时间里,每周事件量由每周超过2500次下降到了不到500次。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"通过进一步的努力,每周的事件量已经少于100次,并且分散在几个问题上,对于一个精益团队来说,这非常容易管理。虽然“Sentry 
Zero”是最理想的,但我们实现并维持了响应流的真正目标,这在很大程度上要归功于"},{"type":"link","attrs":{"href":"https:\/\/sentry.io\/integrations\/slack\/","title":null,"type":null},"content":[{"type":"text","text":"Slack集成"}]},{"type":"text","text":"。我们的团队不再从支持团队那里获取服务器错误信息。事实上,现在,当客户遇到麻烦时,我们会告诉他们,而我们已经有了一个处理中的工单。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"和支持团队建立紧密的联系非常重要。在上面的策略中,我们嵌入了比真实用户更敏感的CI。虽然完美很诱人,但要求企业用户有一点耐心也是可以的,前提是支持团队已经做好了准备。每周都和他们同步,这样惊喜就少了。如果他们干劲十足,你也可以教他们一些Sentry基础知识。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"新征程"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/05\/05783d1d0f878419369cea675b3ad316.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"随着噪音的消除,我们已准备好快速行动。以下是我们在做出这些改变时积累的一些经验。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"诉诸事务"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如果使用得当,回滚可以使错误看起来像从未发生过,这是快速修复策略的完美补充。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"真正
的原子请求"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"把操作尽可能地放入事务中。打开"},{"type":"link","attrs":{"href":"https:\/\/docs.djangoproject.com\/en\/3.1\/topics\/db\/transactions\/#tying-transactions-to-http-requests","title":null,"type":null},"content":[{"type":"text","text":"ATOMIC_REQUESTS"}]},{"type":"text","text":"(如果没打开的话)。但是,有些请求所做的不仅仅是更改数据库,比如它们会发送通知,将后台任务入队。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在SimpleLegal,我们重新设计了架构,将所有副作用(除了日志记录)推迟到成功返回响应时。中间件可以提供帮助,但我们主要是通过将Redis队列切换到基于PostgreSQL的任务队列\/代理来实现的。这种配置可以确保,如果发生错误,事务将被回滚,任务不会进入队列,用户将得到一个干净的失败。我们在Sentry中对故障进行分类处理,把用户切换回旧站点以解除阻塞,他们下一次重试就会成功。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"事务性测试设置"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"事实证明,事务性对我们的测试策略来说也很关键。SimpleLegal的需求早已超出了Django原始fixture系统的能力。大多数测试都需要复杂的Python设置,这使得编写测试和运行测试都很慢。为了加快编写和运行的速度,我们将整个测试会话封装到一个事务中,然后,在运行任何测试用例之前,我们设置了示例性的基本状态。测试用例使用这些基本状态作为"},{"type":"link","attrs":{"href":"https:\/\/docs.pytest.org\/en\/stable\/fixture.html","title":null,"type":null},"content":[{"type":"text","text":"fixture"}]},{"type":"text","text":",并在每个测试用例之后回滚到基本状态。详情请参阅"},{"type":"link","attrs":{"href":"https:\/\/gist.github.com\/mahmoud\/10f6b6b0a9c5860030693357124131df","title":null,"type":null},"content":[{"type":"text","text":"conftest.py摘录"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"有
些最佳实践并不适合你"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"软件场景的差别如此之大,知道哪些建议不适合你是一门艺术。以下是我们亲身了解到的各种死胡同。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"命名空间的运用"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"考虑到代码被划分成模块、包、Django应用等的方式,把它们作为工作单元可能很有诱惑力。开始时不要这样。代码划分可能非常随意,很难知道你何时就进入了一个有风险的思路。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"假如有自动重构,就像在"},{"type":"link","attrs":{"href":"https:\/\/portingguide.readthedocs.io\/en\/latest\/","title":null,"type":null},"content":[{"type":"text","text":"2to3转换"}]},{"type":"text","text":"中一样,首先要按转换类型进行移植。这样,你只需要查看一个命令和受影响的路径列表。另外,自动修复必须遵循一种模式,这意味着更多的人可以修复重构导致的错误。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"覆盖率工具"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/57\/57cc3e421391def8f4e55d38a17bdba3.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"覆盖率对我们来说是好坏参半。显然,覆盖率优先策略是站不住脚的,但对优先级划分和状态检查,它仍然有用。就单次变更来说,我们发现覆盖率工具有些不可靠。我们一直没有弄清楚覆盖率结果为什么会不确定,我们得出
了这样的结论:“像codecov这样的现成工具可能并不是针对我们这种规模的monorepo设计的。”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在撞上覆盖率墙的过程中,我们研究了其他许多关于覆盖率的解释。对我们来说,“路由覆盖”(即每个URL至少有一个集成测试)和“模型表示覆盖”(即每个模型对象都有一个有用的文本表示,可以用于Sentry调试)比行覆盖优先级高得多。如果有更多的时间,我们会希望围绕这些构建工具,甚至是围绕基于在线分析的覆盖率统计,从而优先考虑流量最高的路由,而不仅仅是流量最高的代码行。如果你听说过这些方法,我们很想和你讨论一下。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"扁平化数据库迁移"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"从表面上看,减少需要升级的文件数量似乎是合理的。事实证明,扁平化"},{"type":"link","attrs":{"href":"https:\/\/docs.djangoproject.com\/en\/3.1\/topics\/migrations\/","title":null,"type":null},"content":[{"type":"text","text":"迁移"}]},{"type":"text","text":"是一种消除文件的低收益策略。更改历史迁移文件结构会使上线过程变得复杂,而升级没有扁平化的迁移文件则很简单。更不用说,如果只是想要加速CI,你可以像我们在"},{"type":"link","attrs":{"href":"https:\/\/openedx.atlassian.net\/wiki\/spaces\/AC\/pages\/23003228\/Everything+About+Database+Migrations#EverythingAboutDatabaseMigrations-SquashingMigrations","title":null,"type":null},"content":[{"type":"text","text":"Open 
edX平台"}]},{"type":"text","text":"上所做的那样:"},{"type":"link","attrs":{"href":"https:\/\/github.com\/edx\/edx-platform\/blob\/66f0f9891f00994f77604a51dbb29736aa605fa8\/scripts\/reset-test-db.sh#L75","title":null,"type":null},"content":[{"type":"text","text":"建立一个基本的DB缓存,每隔几个月检查一次"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"事实证明,"},{"type":"link","attrs":{"href":"https:\/\/sedimental.org\/awesome_python_applications.html#goal-1-a-better-development-cycle","title":null,"type":null},"content":[{"type":"text","text":"你可以从开源应用程序中学到很多东西"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"慢慢适应新技术栈"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如果你有多个应用程序,请使用相对比较小也比较简单的应用程序来试验更改。幸运的是,我们有一个独立的应用,它的测试运行速度更快,这让我们能够更紧凑地了解开发循环。同样地,如果你有多个生产环境,则从影响最小的一个环境开始推出。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"把CI作业复制到新的技术栈中,它们都会失败,但要克制住把它们标记为可选项的冲动。相反,构建一个包含所有测试及其当前测试状态的单文件清单。我们为测试运行程序"},{"type":"link","attrs":{"href":"https:\/\/docs.pytest.org\/en\/stable\/","title":null,"type":null},"content":[{"type":"text","text":"pytest"}]},{"type":"text","text":"构建了一个小扩展,它基于状态清单文件批量跳过测试。然后,ratchet:取消并修复测试,更新文件,检查测试是否通过,然后重复。这比遍布代码库的"},{"type":"link","attrs":{"href":"https:\/\/docs.pytest.org\/en\/latest\/skipping.html#skipping-test-functions","title":null,"type":null},"content":[{"type":"text","text":"pytest标记"}]},{"type":"text","text":"装饰器更方便和可扫描。详情请参阅"},{"type":"link","attrs":{"href":"https:
\/\/gist.github.com\/mahmoud\/10f6b6b0a9c5860030693357124131df","title":null,"type":null},"content":[{"type":"text","text":"conftest.py摘录"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#262626","name":"user"}}],"text":"上线试运行"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在2020年第四季度,我们增加了基础设施,以在相同的数据库支持下并行运行新旧站点。我们进入了这样一个循环:让流量进入新技术栈,积累一批需要修复的Sentry问题,然后把流量切回旧站点,同时记录新技术栈的运行时长。使用新技术栈大约120个小时后,经过昼夜不停地策略性扩展,组织已经建立起足够的信心,我们可以在最关键的时间让新站点保持开启:在月初的周一和周二。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"唯一的问题是"},{"type":"link","attrs":{"href":"https:\/\/www.zdnet.com\/article\/aws-outage-impacts-thousands-of-online-services\/","title":null,"type":null},"content":[{"type":"text","text":"AWS在感恩节周的宕机"}]},{"type":"text","text":"。此时我们已经提前完成了计划,并且对快速修复工作流建立起了足够的信心,不再需要最初的假日测试窗口。为此,我们感谢了很多人。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我们一直用快速修复的方法,直到我们完成。“完成”不是指新系统没有错误,而是指流量在新系统上时事件比旧系统少。然后,继续修复,并开始安排时间删除脚手架。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/7b\/7b1ef7b0ebac8d8b34b99a8bdb4b6ce7.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#
262626","name":"user"}}],"text":"后记"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"所以,一旦你使用了Django、Python、Linux和Postgres当前的LTS版本,任务就完成了,对吧?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"谢天谢地,技术债务从不会到0。虽然按期更新并更换核心技术不是一件小事,但用闪亮的部件替换生锈的部件并不会改变设计。架构技术债务——抽象中的错误,包括缺乏抽象——可能会带来更大的挑战。这些问题的解决方案并不能在项目之间完全推广,但它们确实会受益于这个最新的、无错误的基础。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"对于所有希望更换轮胎的项目,我们希望这次回顾能够帮助你在未来几年充满信心地、务实地改进技术栈。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"查看英文原文:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/sedimental.org\/tech_refresh.html","title":null,"type":null},"content":[{"type":"text","text":"Changing the Tires on a Moving Codebase"}]}]}]}