About the Python Web Crawler Course Project

Python Web Crawler Course Project
Let's Memorize Words

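Project Introduction

During four years of university, the CET-4 and CET-6 English certificates are all but indispensable. This project, run from PyCharm, gives a short test of the vocabulary we have learned. Words that are not yet well mastered are saved to a file for timely review, while words that are already fairly familiar get further reinforcement, all with the aim of earning the CET-4 and CET-6 certificates sooner.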

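Goal Definition

Goal: extend the Shanbay vocabulary test with a wrong-word book and a visualization of the accuracy rate.
Main problem solved: to check whether an answer is correct, we had to work out which field identifies each word. It turned out that rank identifies it uniquely.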

3. Flowchart

4. Implementation / 5. Analysis

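Step 1: Fetch the word banks with a GET request

import requests
url = 'https://www.shanbay.com/api/v1/vocabtest/category/?'
word_type_json = requests.get(url).json()    # json() parses the downloaded content
word_type = word_type_json['data'] # word bank categories (the parsed result is a dict; the value under data lists them)
n = 0 # counter
for category in word_type:    # renamed from type to avoid shadowing the built-in
    print(str(n) + '. ' + category[1])
    n += 1
choose_type = int(input('Enter the number of the word bank to test: '))
ciku = word_type[choose_type][0]     # code of the chosen word bank
# string concatenation with + joins the two strings into the word bank URL
test = requests.get('https://www.shanbay.com/api/v1/vocabtest/vocabularies/?category=' + ciku)
words = test.json()['data'] # information for every word
# lists for storing words
danci = [] # known words (the word text only)
words_knows = [] # known words (the full dict for each word)
not_knows = [] # unknown words
num = int(input('\nEnter how many words you want to test: '))
print ('\nThe test begins. If you know the word, type 1 and press Enter; otherwise just press Enter:')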

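The fourth line reads word_type_json['data'] because the word bank categories are stored under the data key of the returned dictionary. Each category is then numbered and displayed so that the user can choose one. The result is shown in the figure:

On the Shanbay website: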
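As a small offline sketch of the numbering and URL construction described in this step, the helpers below number the categories with enumerate and build the word bank URL with urlencode instead of plain string concatenation. The names format_menu and build_vocab_url and the sample category values are assumptions made for illustration; only the endpoint and the [code, label] layout come from the code above.

```python
from urllib.parse import urlencode

# Endpoint taken from the article's code.
VOCAB_URL = 'https://www.shanbay.com/api/v1/vocabtest/vocabularies/'

def format_menu(categories):
    """Return one numbered menu line per [code, label] entry."""
    return [f'{i}. {entry[1]}' for i, entry in enumerate(categories)]

def build_vocab_url(code):
    """Build the word bank URL for one category code, escaping the value."""
    return VOCAB_URL + '?' + urlencode({'category': code})

# Hypothetical sample shaped like word_type_json['data'] (assumed values).
sample_categories = [['CET4', 'CET-4'], ['CET6', 'CET-6']]

menu = format_menu(sample_categories)
for line in menu:
    print(line)

chosen_url = build_vocab_url(sample_categories[1][0])
print(chosen_url)
```

Using urlencode keeps the URL valid even if a category code ever contains characters that need escaping.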

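Step 2: As in Shanbay, the user marks the words they know

# the user marks each word as known or unknown
n=1   # counter
for x in words:
    print('No. ' + str(n) + ': ' + x['content']) # show a single word
    answer = input('Type 1 if you know it, otherwise just press Enter: ')
    if answer == '1':
        danci.append(x['content']) # store the chosen word, text only
        words_knows.append(x) # store the whole dict for the word, because it is needed later
    else:
        not_knows.append(x) # unknown words go here
    if n == num:
        break
    n += 1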

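The Shanbay site returns far too many words in one response,
so we fetch them all first and let the user enter num, the number of words to test. The code counts with n starting at 1 and increments it after each word: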

if n == num:
    break
n += 1
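The same counting can also be written without a manual counter. The sketch below is an alternative, not the article's code: it uses itertools.islice to stop after num words and enumerate to number them from 1. The function name numbered_words and the sample words are assumptions for illustration.

```python
from itertools import islice

def numbered_words(words, num):
    """Pair the first num words with a 1-based counter."""
    return [(n, w['content']) for n, w in enumerate(islice(words, num), start=1)]

# Hypothetical sample shaped like the entries of words (assumed values).
sample_words = [{'content': 'apple'}, {'content': 'banana'}, {'content': 'cherry'}]

first_two = numbered_words(sample_words, 2)
for n, content in first_two:
    print('No. ' + str(n) + ': ' + content)
```

One difference worth noting: with num set to 0 this version shows no words, whereas the loop above never reaches n == num and therefore walks through the whole list.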

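Step 3: Checking the words
First inspect the response on the official site to find the field that holds the correct translation

With that, the word check can be built.
In the full code, the line that prints each option: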

print(str(i+1) + y['definition_choices'][i]['definition'])

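This prints the option number followed by the Chinese definition.
In the full code, the line that checks the answer:

if y['definition_choices'][choice-1]['rank'] == y['rank']: # correct choice
The rank value that is unique to each word determines whether the user chose correctly.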
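The rank comparison can be isolated in a small function, which also makes it easy to reject out-of-range choices. This is a sketch under assumptions: the function name is_correct and the sample word (including its rank values and placeholder definitions) are invented for illustration, and only the field names rank, definition_choices, definition and content come from the code above.

```python
def is_correct(word, choice):
    """Return True if the 1-based choice points at the option whose rank
    matches the word's own rank."""
    options = word['definition_choices']
    if not 1 <= choice <= len(options):
        return False  # treat an out-of-range choice as a wrong answer
    return options[choice - 1]['rank'] == word['rank']

# Hypothetical word shaped like one entry of words (assumed values).
sample_word = {
    'content': 'apple',
    'rank': 120,
    'definition_choices': [
        {'rank': 87, 'definition': 'n. banana'},
        {'rank': 120, 'definition': 'n. apple'},
        {'rank': 45, 'definition': 'n. cherry'},
        {'rank': 300, 'definition': 'n. grape'},
    ],
}

print(is_correct(sample_word, 2))
print(is_correct(sample_word, 1))
```

In the original line, a choice of 0 silently selects the last option (index -1) and a choice of 5 raises an IndexError, so a range check of this kind is a reasonable safeguard.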

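Step 4: Generating the wrong-word book and the unknown-word book
This is the unknown-word book; the wrong-word book works in much the same way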

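# f, n and mean are set up beforehand (see the full code)
for x in not_knows:
    n += 1
    right_ans = x['rank']  # rank of the correct definition
    for i in range(4):
        if x['definition_choices'][i]['rank'] == right_ans:
            mean = x['definition_choices'][i]['definition']
            break
    word = x['content'] + '  ' + mean  # the word followed by its correct definition
    print(str(n) + '.' + word)
    f.write(word + '\n')
f.close()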
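The lookup and the file writing above can be sketched as two small helpers. This is one possible restructuring, not the article's code: correct_definition and save_word_book are hypothetical names, the sample word is invented, and the demo writes to a temporary directory so that it leaves no files behind.

```python
import os
import tempfile

def correct_definition(word):
    """Return the definition whose rank matches the word's rank, or ''."""
    return next(
        (c['definition'] for c in word['definition_choices'] if c['rank'] == word['rank']),
        '',
    )

def save_word_book(words, path):
    """Append one 'word  definition' line per word and echo it numbered."""
    with open(path, 'a', encoding='utf-8') as f:  # the with block closes the file
        for n, word in enumerate(words, start=1):
            line = word['content'] + '  ' + correct_definition(word)
            print(str(n) + '.' + line)
            f.write(line + '\n')

# Hypothetical word shaped like one entry of not_knows (assumed values).
sample_word = {
    'content': 'apple',
    'rank': 120,
    'definition_choices': [
        {'rank': 87, 'definition': 'n. banana'},
        {'rank': 120, 'definition': 'n. apple'},
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    book = os.path.join(tmp, 'unknown_words.txt')
    save_word_book([sample_word], book)
    with open(book, encoding='utf-8') as f:
        saved = f.read()
```

Because the definition is looked up fresh for every word, a word with no matching rank gets an empty definition instead of inheriting mean from the previous word, which is what can happen in the loop above.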

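Step 5: Viewing the wrong-word book
Because open with a relative file name creates the file in the current working directory, the current path is printed here

import os
print("Your word book was generated in:")
print(os.getcwd())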
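If the full path of the file itself is more useful than the directory, os.path.abspath resolves a relative name against the current working directory. The sketch below assumes a hypothetical file name, wrong_words.txt, and a hypothetical helper name, book_location.

```python
import os

def book_location(filename):
    """Return the absolute path that a relative file name resolves to."""
    return os.path.abspath(filename)

# 'wrong_words.txt' is a hypothetical file name used for illustration.
location = book_location('wrong_words.txt')
print('Your word book is saved at:')
print(location)
```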

Full Code

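import requests
print('')
url = 'https://www.shanbay.com/api/v1/vocabtest/category/?'
word_type_json = requests.get(url).json()    # json() parses the downloaded content
word_type = word_type_json['data'] # word bank categories (the parsed result is a dict; the value under data lists them)
n = 0 # counter
for category in word_type:    # renamed from type to avoid shadowing the built-in
    print(str(n) + '. ' + category[1])
    n += 1
choose_type = int(input('Enter the number of the word bank to test: '))
ciku = word_type[choose_type][0]     # code of the chosen word bank
# string concatenation with + joins the two strings into the word bank URL
test = requests.get('https://www.shanbay.com/api/v1/vocabtest/vocabularies/?category=' + ciku)
words = test.json()['data'] # information for every word
# lists for storing words
danci = [] # known words (the word text only)
words_knows = [] # known words (the full dict for each word)
not_knows = [] # unknown words
num = int(input('\nEnter how many words you want to test: '))
print ('\nThe test begins. If you know the word, type 1 and press Enter; otherwise just press Enter:')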
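# the user marks each word as known or unknown
n=1 # counter
for x in words:
    print('No. ' + str(n) + ': ' + x['content']) # show a single word
    answer = input('Type 1 if you know it, otherwise just press Enter: ')
    if answer == '1':
        danci.append(x['content']) # store the chosen word, text only
        words_knows.append(x) # store the whole dict for the word, because it is needed later
    else:
        not_knows.append(x) # unknown words go here
    if n == num:
        break
    n += 1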


# word check
# defined before the if so that the chart at the end can always read them
wrong_words = [] # wrong-word list (full word info, needed when saving locally)
right_num = 0 # number of correct choices
if len(danci) != 0:
    print('\nOf the ' + str(num) + ' words above, you believe you know ' + str(len(danci)) + '. They are:')
    for word in danci:
        print(word, end=' ')
    # check them with multiple-choice questions
    print ('\n\nNow let us check whether you have really mastered them:\n')
    n = 1 # counter
    for y in words_knows:
        print('No. ' + str(n) + ':')
        for i in range(4): # iterate over the four options
            print(str(i+1) + y['definition_choices'][i]['definition'])
        choice = int(input('Choose the correct translation of \" ' + y['content'] + ' \": '))
        if y['definition_choices'][choice-1]['rank'] == y['rank']: # correct choice
            right_num += 1
        else:
            wrong_words.append(y) # otherwise add the whole word to the wrong-word list
        n += 1
        print('\n')
    print ('Your result:')
    print (str(num) + ' words in total, of which you know ' + str(len(danci)) + ',')
    print('actually mastered ' + str(right_num) + ', wrong ' + str(len(wrong_words)) + '.')


    # wrong-word book
    if len(wrong_words) != 0:
        save = input ('\n\nPrint the wrong words and save them locally? (Enter Y or N): ')
        if save == 'Y':
            f = open('wrong_words.txt', 'a+', encoding='utf-8')
            print ('The words you got wrong are:')
            n = 1
            mean = ''
            for z in wrong_words:
                right_ans = z['rank'] # rank of the correct definition
                for i in range(4):
                    if z['definition_choices'][i]['rank'] == right_ans: # found the correct definition
                        mean = z['definition_choices'][i]['definition']
                        break
                word = z['content'] + '  ' + mean # join the word and its correct definition
                print (str(n) + '.' + word)
                f.write(word + '\n')
                n += 1
            f.close()
            print('\nThe words you got wrong have been saved to the current directory')
    else:
        print('Keep it up')
else:
    print('\nYou do not recognize a single one?')


# unknown-word book
if len(not_knows) != 0:
    save = input ('\nPrint the words you do not know and save them locally? (Enter Y or N): ')
    if save == 'Y':
        f = open('unknown_words.txt', 'a+', encoding='utf-8')
        print ('These are the words you do not know:')
        n=0
        mean = ''
        for x in not_knows:
            n += 1
            right_ans = x['rank'] # rank of the correct definition
            for i in range(4):
                if x['definition_choices'][i]['rank'] == right_ans:
                    mean = x['definition_choices'][i]['definition']
                    break
            word = x['content'] + '  ' + mean
            print(str(n) + '.' + word)
            f.write(word + '\n')
        f.close()
        print ('\nThe unknown words have been saved to the current directory. See you next time')
    else:
        print('\nSee you next time')
else:    # every word was marked as known
    print('\nWell done!')
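# show where the word books were generated
import os
print("Your word book was generated in:")
print(os.getcwd())
import warnings
warnings.filterwarnings('ignore')
# data visualization
import matplotlib.pyplot as plt
counts = [right_num, len(wrong_words)]    # numeric counts rather than strings
if sum(counts) > 0:    # skip the chart when no question was answered
    plt.rcParams['font.sans-serif'] = 'SimHei'    # only needed when the chart text is Chinese
    plt.title('Accuracy')
    plt.pie(counts, autopct='%.2f%%', labels=['Correct', 'Wrong'])
    plt.show()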
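The percentages that the pie chart displays can also be computed directly, which is handy for checking the chart or for printing the accuracy as text. The helper below is a sketch with a hypothetical name, accuracy_percent. It returns None when no question was answered, a case in which a pie chart has nothing to show.

```python
def accuracy_percent(right_num, wrong_num):
    """Return the share of correct answers in percent, or None if there
    were no answers at all."""
    total = right_num + wrong_num
    if total == 0:
        return None
    return 100.0 * right_num / total

print(accuracy_percent(3, 1))
print(accuracy_percent(0, 0))
```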
