微信群总是有人发广告?看我用Python写一个自动化机器人消灭他!

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"写在前面"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"微信群牛皮癣"},{"type":"text","text":",指的是在微信群里面恶心群发小广告的用户,是微信群主最痛恨的一波人。如果熟悉早起的读者可以知道我有一个技术交流群,但是自从建群以来就饱受小广告的困扰。他们伪装成正常人的样子混进群然后不停的发送广告轰炸,严重的打乱了群内的技术交流气氛👇"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/45/451cdaac19c0e7b9f330d7842848a1c1.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"或者是一声不吭的去骚扰每一个群成员👇"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d0/d01cb56f958a0708c67e67c7ceff8079.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"虽然不清楚是什么能够驱使他们这样不折不扣的努力成为最强微信群牛皮癣(可能是钞能力),但是在太多次的骚扰之后,我决定拿起Python消灭这些小广告。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"第一回合"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"其实实现思路很简单,总共分两步"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1.识别广告用户"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2.写代码移除之"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"但是这两步,每一步都不简单,先来说说第一步如何准确的识别这些用户,网上没有数据也没有一个好的鉴别标准,只能用我的大脑完成特征识别。经过这几个月,近百份发广告用户的样本训练,基本可以判断一个非正常用户"},{"type":"text","marks":[{"type":"strong"}],"text":"至少满足下面几条中的三条以上"},{"type":"text","text":":"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"没有设置微信号"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"头像为网红女生"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"微信名为特殊符号或者表情"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"没发过朋友圈"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"没有朋友圈背景图"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"通过后不会有除进群申请外的其他回复"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"并且根据历史数据,"},{"type":"text","marks":[{"type":"strong"}],"text":"符合1、3条的用户有极大概率为小广告爱好者"},{"type":"text","text":",那么接下来要做的就是"},{"type":"text","marks":[{"type":"strong"}],"text":"用Python写代码找出微信里面的这些人"},{"type":"text","text":"。在总结出这一规律后很乐观的认为实现这一需求并不困难,因为我在几年前就曾拿过Python研究微信好友,不论是"},{"type":"codeinline","content":[{"type":"text","text":"wxpy"}]},{"type":"text","text":"还是"},{"type":"codeinline","content":[{"type":"text","text":"itchat"}]},{"type":"text","text":"操作起来应该都不复杂,但是事实确证明我还是太年轻了"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"不知从何时起,虽然这些库还能使用但是微信基本已经禁止了大部分人的网页版微信登陆权限,因此当我使用多个微信号分别扫完登陆微信的二维码之后,无一例外的提示我"}]},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"1203\n为了你的帐号安全,此微信号已不允许登录网页微信。\n\n你可以使用Windows微信或Mac微信在电脑端登录。\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"这就让人头疼了,总不能手动的去一个一个check我的几千个微信好友吧,于是我开始思考是否有其他的解决办法。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"第二回合"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如果你经常写Python爬虫,那么你会知道在有些情况下,与其使用"},{"type":"codeinline","content":[{"type":"text","text":"Requests"}]},{"type":"text","text":"对付一些恶心的反爬措施,不如"},{"type":"codeinline","content":[{"type":"text","text":"Selenium"}]},{"type":"text","text":"操作起来方便。所以在发现想使用基于微信API的思路失效后,我将目光转向了相对笨一点的方法————"},{"type":"codeinline","content":[{"type":"text","text":"pynput"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"pynput"},{"type":"text","text":"是一款使用Python来控制和监控电脑鼠标、键盘的第三方库,说到这里你大概明白我想怎么做了,直接用API取数据搞不定,那么我就像Selenium一样,"},{"type":"text","marks":[{"type":"strong"}],"text":"模拟点击"},{"type":"text","text":"一个一个好友来实现我想要的操作。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下面简单说一下这个库,因为没有太多依赖库所以安装起来很简单,直接"},{"type":"codeinline","content":[{"type":"text","text":"pip install pynput"}]},{"type":"text","text":"即可,使用起来也很简单,对于鼠标操作只依赖座标,看个demo👇"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/eb/ebc8c802de4c5bfca1bb4d7cba8d6bf6.gif","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"就像上面GIF演示的一样,先导入"},{"type":"codeinline","content":[{"type":"text","text":"pynput"}]},{"type":"text","text":"并实例一个鼠标控制器,接着将微信在状态栏的位置提交给"},{"type":"codeinline","content":[{"type":"text","text":"mouse.position"}]},{"type":"text","text":",这样鼠标就会移动到该位置,再使用"},{"type":"codeinline","content":[{"type":"text","text":"mouse.press"}]},{"type":"text","text":"来模拟鼠标点击即可自动打开微信。那么问题来了,如何获得我想要的位置的座标?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"pynput除了使可以使用"},{"type":"codeinline","content":[{"type":"text","text":"Controller"}]},{"type":"text","text":"来控制鼠标,也可以监控鼠标,比如使用下面的代码就可以记录下程序启动后鼠标的每一个点击操作所在的位置👇"}]},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"from pynput import mouse\n\ndef on_move(x, y ):\n print('鼠标移动至 {0}'.format(\n (x,y)))\n\ndef on_click(x, y , button, pressed):\n print('{0} 在座标 {1}'.format('鼠标点击' if pressed else '鼠标释放', (x, y)))\n if not pressed:\n return False\n\nwhile True:\n with mouse.Listener(on_move = on_move,on_click = on_click) as listener:\n listener.join()"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/a9/a97e624447d7451efc3078794cedb856.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"那么接下来的任务就简单了,我们只需要保持微信窗口不移动,在记录下每一个关键位置的座标("},{"type":"text","marks":[{"type":"strong"}],"text":"微信图标位置,群聊窗口位置,单个群成员头像位置"},{"type":"text","text":")之后,比如我们想对上面说的第一条规则进行判断即获取每一个群成员微信号是否设置,就可以按照模拟以下操作实现:"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"点击微信app"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"点击需要的群聊"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"依次点击每一个群成员头像"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"移动到微信号的位置"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"双击"},{"type":"text","text":"该微信号"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"复制该微信号判断是否为初始微信号"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在上面的过程中,值得说的是最后一步,复制我们可以使用pynput中的键盘控制器,在双击选中对应微信号之后通过下面的代码实现模拟键盘输入"},{"type":"codeinline","content":[{"type":"text","text":"Command + C"}]},{"type":"text","text":"完成复制操作"}]},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"from pynput.keyboard import Key\nfrom pynput.keyboard import Controller as Controller1\nkeyboard = Controller1()\n\nwith keyboard.pressed(Key.cmd):\n keyboard.press('c')\n keyboard.release('c')"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"但是粘贴则不需要使用pynput通过模拟"},{"type":"codeinline","content":[{"type":"text","text":"command+c"}]},{"type":"text","text":"来粘贴到另一个编辑中复杂过程,我们可以使用第三方库"},{"type":"codeinline","content":[{"type":"text","text":"pyperclip"}]},{"type":"text","text":",直接通过下面两行代码即可将复制好的文字转为"},{"type":"text","marks":[{"type":"strong"}],"text":"字符串"}]},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"import pyperclip\npyperclip.paste()"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在将群成员的微信号转换为字符串后,不论我们是通过判断字符串的长度还是用正则表达式或者是其他的方法都可以轻松的判断该成员的微信号是否为初始微信号,实现规则1的判断,下面的代码与动态图就是一次**完整的过程"}]},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"from pynput.mouse import Button, Controller\nimport time\nfrom pynput.keyboard import Key\nfrom pynput.keyboard import Controller as Controller1\nimport pyperclip\n\nmouse = Controller()\n\n# 点击微信\nmouse.position = (1046.14453125, 4.546875)\ntime.sleep(2)\nmouse.press(Button.left)\nmouse.release(Button.left)\n\n#点击头像\nmouse.position = (1194.140625, 441.05859375)\ntime.sleep(1)\nmouse.press(Button.left)\nmouse.release(Button.left)\n\n# 点击选中文本\nmouse.position = (965.60546875, 284.0390625)\ntime.sleep(1)\nmouse.click(Button.left, 2)\n\nkeyboard = Controller1()\n\nwith keyboard.pressed(Key.cmd):\n keyboard.press('c')\n keyboard.release('c')\n time.sleep(1)\n\nwechatid = pyperclip.paste()\nprint(f\"微信号{wechatid}疑似广告号\" if len(wechatid) > 20 else f\"微信号{wechatid}不是广告号\")"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"可以看到成功将早小起的微信从广告号中排除"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/05/058836df6083db3d029d5f16232df7fb.gif","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/0a/0a11cad1c9504cffca2395cd63efc2ad.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"那么接下来只需要记录下每两个群成员之间间隔的座标距离,之后循环去"},{"type":"text","marks":[{"type":"strong"}],"text":"模拟滚动"},{"type":"text","text":"或者*"},{"type":"text","marks":[{"type":"italic"}],"text":"下拉"},{"type":"text","text":"*来实现上述过程,就可以将群里所有成员的微信号根据规则1进行判断,找到异常的那些成员"},{"type":"text","marks":[{"type":"strong"}],"text":"单独进行判断"},{"type":"text","text":"。"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/61/619f7570bfe98b28ee6ad3f43334beb6.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"可以看到最终是找到了6个疑似广告号的微信,接下来通过其他规则的手动判断最终将两个用户判定为广告高风险用户并移除。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"写在最后"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"通过上面的操作,虽然成功的踢出了两个疑似广告号,但是依旧很难去判断是否真的踢对了人,如果踢错了,那么则粉丝-1,同时也可以发现想用Python准确找到群里的牛皮癣还是非常困难的,使用pynput最多可以完成微信号及头像(使用识图API)的判断,但是更多的信息却很难提取挖掘。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"同时"},{"type":"codeinline","content":[{"type":"text","text":"pynput"}]},{"type":"text","text":"有着和"},{"type":"codeinline","content":[{"type":"text","text":"selenium"}]},{"type":"text","text":"同样的缺点,那就是由于"},{"type":"text","marks":[{"type":"strong"}],"text":"模拟真人操作而导致的速度慢"},{"type":"text","text":",并且它的定位方式仅支持座标,所以还需要保证在操作的过程中微信窗口不可以被移动,否则之前记录的元素将全部失效(建议开发者可以支持更多的定位方式)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如果你有好的思路可以在留言区和我交流,如果你想看更详细的pynput讲解,不要忘记给个三连。如果你对本文的代码感兴趣,可以在公众号早起Python中找到。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"今天的文章就到这里,为了维护良好的群内交流环境,对抗小广告的路还在继续,拜拜~"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章