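Batch-Querying CET-4 Scores in Practice

I'm a beginner, and with some free time on my hands I decided to write a program that looks up CET-4 scores automatically and in bulk. After a fair amount of persistence it finally worked, so I'm sharing what I learned so that we can all improve together. If you spot mistakes or things that could be done better, corrections are very welcome.

Main text:

First, the URL: http://www.neea.edu.cn/cet

At first glance this looks like a very simple site that should be easy to handle, but once I actually got started I did run into a few walls.

This time I used the requests library and created a session() object to make the requests, which makes handling cookies more convenient.

I started with the usual routine: fill in the admission ticket number and the name, capture the traffic, and find that the request carries a cookie. What? Where did that cookie come from? I had opened the page in a private window, so it could not be an old cookie. The obvious conclusion is that it was set by some request made while the page was first loading, so let's start over from the beginning.

Capturing the traffic again and looking closely, the cookie is already present by the time load.js is requested. Here is the cookie: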

Hm_lvt_dc1d69ab90346d48ee02f18510292577=1535592621;

Hm_lpvt_dc1d69ab90346d48ee02f18510292577=1535592621

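That 10-digit number starting with 15 should look familiar: it looks like a Unix timestamp of the time of the visit, and testing confirmed that it is.

The code that obtains the timestamp is the getCtime function in the full source below.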
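
As a rough standalone sketch of the same idea, the snippet below turns an HTTP Date header into a Unix timestamp with the standard library and builds the two cookies from it. The helper names `date_header_to_timestamp` and `build_timestamp_cookies` are made up for illustration, and the sample header is an assumed value chosen to match the cookie captured above; it is not taken from a real response.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def date_header_to_timestamp(date_header: str) -> int:
    """Convert an HTTP Date header (which is in GMT) to Unix seconds."""
    return int(parsedate_to_datetime(date_header).timestamp())


def build_timestamp_cookies(ts: int) -> dict:
    """Build the two timestamp cookies shown above from one timestamp."""
    suffix = "dc1d69ab90346d48ee02f18510292577"  # copied from the captured cookie
    return {f"Hm_lvt_{suffix}": str(ts), f"Hm_lpvt_{suffix}": str(ts)}


# Assumed sample header, picked so that it matches the captured cookie value.
sample_header = "Thu, 30 Aug 2018 01:30:21 GMT"
ts = date_header_to_timestamp(sample_header)
cookies = build_timestamp_cookies(ts)

print(ts)
print(datetime.fromtimestamp(ts, tz=timezone.utc).isoformat())
print(cookies)
```

Parsing the header as a timezone-aware value avoids any manual hour arithmetic, so the result does not depend on the time of day or on the machine's local timezone.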

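Now back to filling in the candidate's details. Clicking the captcha box triggers a request (the Imgs.do request in getCode below) whose response contains the link to the captcha image. The captcha is not drawn on the fly when it loads; each captcha corresponds to an image that was stored in advance. That makes things simple: request the image URL, get the image data, and run recognition on it. The response also has a set-cookie header, but since we are using a session object we don't need to handle that ourselves. It's worth mentioning that the t parameter at the end of the request URL is, fairly obviously, just a random number.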
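
The captcha step can be sketched without touching the network. The endpoint and query parameters come from the full source below; the helper names and the sample response body are assumptions. In particular, the body format is only inferred from the `[13:-3]` slice in the source (a 13-character prefix and a 3-character suffix around the URL), and the image URL is a placeholder.

```python
import random
import re
from urllib.parse import urlencode

CAPTCHA_ENDPOINT = "http://cache.neea.edu.cn/Imgs.do"  # endpoint used in the source


def build_captcha_url(ticket_number: str) -> str:
    """Build the captcha request URL; 't' is just a random cache-busting value."""
    query = urlencode({"c": "CET", "ik": ticket_number, "t": random.random()})
    return f"{CAPTCHA_ENDPOINT}?{query}"


def extract_image_url(body: str) -> str:
    """Pull the first double-quoted http(s) URL out of the response body."""
    match = re.search(r'"(https?://[^"]+)"', body)
    if match is None:
        raise ValueError(f"no image URL found in response: {body!r}")
    return match.group(1)


# Hypothetical response body, shaped to fit the [13:-3] slice used in the source.
sample_body = 'result.imgs("http://example.invalid/captcha/abc123.png");'

url = build_captcha_url("123456789012345")
image_url = extract_image_url(sample_body)
print(url)
print(image_url)
```

Matching the quoted URL with a regular expression gives the same result as the fixed slice on this sample, but fails loudly instead of returning garbage if the wrapper text ever changes.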

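For captcha recognition I used a third-party service. It costs less than one fen (0.01 yuan) per captcha and is convenient, fast, and accurate. The address is http://www.ruokuai.com/; you will need to integrate with it yourself.

Good, now everything is in place and we can run the query.

This last step is simple: a plain POST carrying the earlier cookies, the admission ticket number, the name, and the recognized captcha returns the result, and parsing it yourself gives you the score.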
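
As a hedged sketch of this step, the snippet below builds the form payload the way the full source does and extracts a score from a response body. The helper names and the sample body are invented for illustration: the post does not show the real response, only that the score follows `s:`, so the regular expression may need adjusting against actual output.

```python
import re

PASS_MARK = 425  # threshold used in the source to split pass / not pass


def build_query_payload(ticket_number: str, name: str, captcha: str,
                        exam: str = "CET4_181_DANGCI") -> dict:
    """Build the POST form fields in the 'exam,ticket,name' layout from the source."""
    return {"data": f"{exam},{ticket_number},{name}", "v": captcha}


def extract_score(body: str) -> float:
    """Return the number that follows the first 's:' key followed by digits."""
    match = re.search(r"s:\s*'?(\d+(?:\.\d+)?)", body)
    if match is None:
        raise ValueError(f"no score found in response: {body!r}")
    return float(match.group(1))


# Hypothetical response body; only the 's:<score>' part is implied by the source.
sample_body = "callback({n:'Zhang San',s:430.00});"

payload = build_query_payload("123456789012345", "Zhang San", "ab12")
score = extract_score(sample_body)
passed = score >= PASS_MARK
print(payload, score, passed)
```

Unlike a fixed six-character slice, the pattern copes with scores of different widths, and a wrong captcha or an error response raises a clear exception rather than a confusing `float()` failure.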

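Overall there was nothing especially difficult. Querying CET-6 should work on the same principle, so you can extrapolate from this. I'm just sharing my own experience.

Full source code: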

import random
import time
from email.utils import parsedate_to_datetime

import requests as req

import rk  # Ruokuai client wrapper, listed at the end of this post

def getCtime(number,name,class1):
    # Load the report page once and use the server's Date header as the visit time
    s = req.session()
    url1 = 'http://www.neea.edu.cn/html1/report/17101/6429-1.htm'
    headers1 = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
    }
    rep = s.get(url1, headers=headers1)
    # The Date header is in GMT; parse it as a timezone-aware datetime instead of
    # slicing the string and shifting the hour by hand, which fails for some hours of the day
    ctime = int(parsedate_to_datetime(rep.headers['Date']).timestamp())
    getCode(s,ctime,number,name,class1)
def getCode(s,ctime,number,name,class1):
    # 't' is just a random number appended to the captcha request
    n = random.random()
    url2 = 'http://cache.neea.edu.cn/Imgs.do?c=CET&ik='+str(number)+'&t=' + str(n)
    headers2 = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36',
        'Connection': 'keep-alive',
        'Cache-Control': 'no-cache',
        'Pragma': 'no-cache',
        'Referer': 'http://cet.neea.edu.cn/cet/'
    }
    # Reproduce the timestamp cookies that the browser sends
    s.cookies.set('Hm_lvt_dc1d69ab90346d48ee02f18510292577', str(ctime))
    s.cookies.set('Hm_lpvt_dc1d69ab90346d48ee02f18510292577', str(ctime))
    s.cookies.set('language', '1')
    response = s.get(url2, headers=headers2).text
    # The body wraps the image URL; drop the 13-character prefix and 3-character suffix
    codeImage = response[13:-3]
    image = req.get(codeImage).content
    # Send the image bytes to Ruokuai for recognition (type 3040)
    R = rk.RClient('your Ruokuai username','your Ruokuai password')
    result = R.rk_create(image,3040)
    getGrade(s,headers2,result,number,name,class1)
def getGrade(s,headers2,code,number,name,class1):
    url3 = 'http://cache.neea.edu.cn/cet/query'
    # Form fields: exam ID, ticket number and name joined by commas, plus the captcha text
    data1 = {
        'data': 'CET4_181_DANGCI,'+str(number)+','+name,
        'v': code
    }
    headers3 = headers2
    response = s.post(url3,headers = headers3,data = data1)
    result = response.text
    # The total score is the 6 characters that follow 's:' in the response
    index = result.index('s:')
    relResult = result[index+2:index+8]
    # Scores of 425 and above go to the pass file, the rest to the not-pass file
    if float(relResult) >= 425:
        with open('D:/Pass.txt','a') as f1:
            f1.write(name+'----'+class1+'----'+relResult+'\n')
    else:
        with open('D:/notPass.txt','a') as f2:
            f2.write(name+'----'+class1+'----'+relResult+'\n')
    s.close()
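if __name__ == '__main__':
    # Each line: 15-digit ticket number, name starting at column 19, class after the last '-'
    with open('D:/cet4.txt','r') as f:
        for i in f:
            number = i[0:15]
            name = i[19:22]
            # A two-character name leaves a '-' in the slice, so shorten it
            if '-' in name:
                name = i[19:21]
            last = i.rfind('-')
            # Strip only the newline so a final line without one keeps its last character
            class1 = i[last+1:].rstrip('\n')
            print(number,name,class1)
            try:
                getCtime(number, name, class1)
            except Exception as e:
                # Report the failure instead of skipping the candidate silently
                print('query failed:', number, e)
                continue
            time.sleep(0.2)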
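
The fixed column offsets in the main block only work for names of two or three characters. As a sketch of a more tolerant alternative, the snippet below splits each line on a separator instead. The `number----name----class` layout is an assumption inferred from the slice positions and from the `----` used in the output files; the input file format is not shown in the post, and `Candidate` and `parse_candidate` are hypothetical names.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    number: str      # 15-digit admission ticket number
    name: str
    class_name: str


def parse_candidate(line: str, sep: str = "----") -> Candidate:
    """Split one input line into its three fields (assumed layout)."""
    parts = line.rstrip("\r\n").split(sep)
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields separated by {sep!r}: {line!r}")
    return Candidate(*(part.strip() for part in parts))


# Made-up sample line in the assumed layout.
candidate = parse_candidate("123456789012345----Zhang San----Class 3\n")
print(candidate)
```

Splitting on the separator works for names of any length and rejects malformed lines up front, before any request is sent.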

The packaged Ruokuai client module (imported above as rk):

#!/usr/bin/env python
# coding:utf-8
import requests
from hashlib import md5
class RClient(object):
    def __init__(self, username, password, soft_id = 1, soft_key = 'b40ffbee5c1cf4e38028c197eb2fc751'):
        self.username = username
        self.password = md5(password.encode()).hexdigest()
        self.soft_id = soft_id
        self.soft_key = soft_key
        self.base_params = {
            'username': self.username,
            'password': self.password,
            'softid': self.soft_id,
            'softkey': self.soft_key,
        }
        self.headers = {
            'Connection': 'Keep-Alive',
            'Expect': '100-continue',
            'User-Agent': 'ben',
        }
    def rk_create(self, im, im_type, timeout=60):
        """
        im: image bytes
        im_type: question (captcha) type
        """
        params = {
            'typeid': im_type,
            'timeout': timeout,
        }
        params.update(self.base_params)
        files = {'image': ('a.jpg', im)}
        r = requests.post('http://api.ruokuai.com/create.json', data=params, files=files, headers=self.headers)
        result = r.json()
        return result['Result']
    def rk_report_error(self, im_id):
        """
        im_id: ID of the question being reported as wrong
        """
        params = {
            'id': im_id,
        }
        params.update(self.base_params)
        r = requests.post('http://api.ruokuai.com/reporterror.json', data=params, headers=self.headers)
        return r.json()
