Baidu PC-side ranking script. It requires Python 3; install any commonly used libraries you need with pip.

baidupcrank.py

# -*- coding: utf-8 -*-

import urllib.request
import re


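def getPage(url):
    """
    Request a page and return its source as text.
    Note: ① `ua` is not defined here; leave a place to define it before calling
    this function (it is set in the __main__ block).
    """
    headers = {"User-Agent": ua}  # ① set the request headers
    request = urllib.request.Request(url, headers=headers)
    response = urllib.request.urlopen(request)
    return response.read().decode("utf-8")  # author's note: HTTPS requests can fail here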


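def extract_content(html):
    """
    Extract information from the page.
    :param html: page source
    :return: list of HTML fragments found between the two marker divs
    """
    items = re.findall(r'<div class="pic-wrap">(.*?)<div class="js_supplyState">',
                       html, re.S | re.I)
    return items


"""
Explanation: ① findall returns every non-overlapping match, not just the first one.
     ② The r before the pattern marks a raw string, so Python does not treat backslashes as escape characters.
     ③ re.S makes the dot match newlines as well; re.I ignores case.
     ④ match and search return at most one match (match only tries at the start of the string).
     ⑤ re.sub performs replacement. For example, with phone = "2004-959-559 # this is a foreign phone number",
     num = re.sub(r'#.*$', "", phone) removes the # and everything after it; $ anchors the pattern at the end of the string.
"""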


if __name__ == "__main__":
    # User-Agent string read by getPage() as a module-level name
    ua = "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
    url = 'https://www.cnblogs.com/xieshengsen/p/6727064.html'
    result = getPage(url)
    # print(result)
    result = extract_content(result)
    print(result)
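
The regex notes in the script above (findall, raw strings, re.S, re.I, match/search, re.sub) can be tried out without any network access. The following is only a minimal sketch: the sample HTML string and the `item` class name are made up for illustration and are not taken from any real page.

```python
import re

# Hypothetical sample HTML, invented for this illustration only.
html = '<DIV class="item">first\nline</DIV><div class="item">second</div>'

pattern = r'<div class="item">(.*?)</div>'

# findall returns every non-overlapping match; re.S lets "." cross the
# newline and re.I lets "<DIV" match "<div".
all_items = re.findall(pattern, html, re.S | re.I)

# Without the flags, the upper-case tag prevents the first block from matching.
plain_items = re.findall(pattern, html)

# search returns only the first match found anywhere in the string.
first = re.search(pattern, html, re.S | re.I)

# match only succeeds if the pattern matches at the very start of the string.
at_start = re.match(r'class', html)

# re.sub replaces matches; here "#" and everything up to the end of the string is removed.
phone = "2004-959-559 # this is a foreign phone number"
num = re.sub(r'#.*$', "", phone)

print(all_items)
print(plain_items)
print(first.group(1))
print(at_start)
print(repr(num))
```

Note that `num` keeps the space that preceded the `#`; calling `num.strip()` would remove it.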

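def main():
    # Named `data` rather than `dict` so the built-in dict type is not shadowed.
    data = {'Name': 'Zara',
            "res": ['physics', 'chemistry', 1997, 2000, {'Age': 7, 'Class': 'First'}],
            'Age': 7, 'Class': 'First'}

    # 'res' holds a list; index 4 is the nested dict; 'Age' is a key in it.
    print(data['res'][4]['Age'])

if __name__ == '__main__':
    main()

"""
Note: the printed result is 7.
Explanation: curly braces {} denote a dict and square brackets [] denote a list. A dict value is retrieved by its key; a list item is retrieved by its index.
"""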

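The key-versus-index rule above can be taken one step at a time, and it helps to see what happens when a key or index is missing. This is a minimal sketch; the `record` variable and the `"Height"` key are hypothetical names used only for illustration.

```python
# Hypothetical record, modelled on the example above.
record = {
    "Name": "Zara",
    "res": ["physics", "chemistry", 1997, 2000, {"Age": 7, "Class": "First"}],
}

# dict: look up by key; list: look up by position (0-based).
subjects = record["res"]      # the list stored under "res"
last_item = subjects[4]       # fifth element, which is itself a dict
age = last_item["Age"]

# A missing key raises KeyError; dict.get returns a default instead.
height = record.get("Height", "unknown")

# A list index that is out of range raises IndexError.
out_of_range = False
try:
    subjects[10]
except IndexError:
    out_of_range = True

print(age, height, out_of_range)
```
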
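Introduction to the urlparse module in Python

The urlparse module is mainly used to parse URLs: it splits a URL into its components, or joins components back together, according to a fixed format. In Python 3 the same functions live in urllib.parse.

1. urlparse.urlparse

Splits a URL into 6 parts and returns a tuple of 6 string items: scheme, network location, path, parameters, query, and fragment.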

from urllib.parse import urlparse  # Python 3; Python 2 used `import urlparse`

url_change = urlparse('https://i.cnblogs.com/EditPosts.aspx?opt=1')

print(url_change)

Output:

ParseResult(scheme='https', netloc='i.cnblogs.com', path='/EditPosts.aspx', params='', query='opt=1', fragment='')

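Here scheme is the protocol, netloc is the network location (the host name), path is the path on that host, params holds parameters attached to the last path segment, query is the query string, and fragment is the part after #.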
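
Since the module is described as both splitting and joining URLs, a short round trip may help. This is a sketch using Python 3's urllib.parse; the replacement query `opt=2` and the page name `Files.aspx` are made-up values for illustration, not real endpoints.

```python
from urllib.parse import urlparse, parse_qs, urlunparse, urljoin

parts = urlparse('https://i.cnblogs.com/EditPosts.aspx?opt=1')

# The result is a named tuple: fields are available by name or by index.
host = parts.netloc
path = parts[2]

# parse_qs turns the query string into a dict mapping each name to a list of values.
query = parse_qs(parts.query)

# _replace returns a copy with one field changed; urlunparse joins the six
# parts back into a URL string. "opt=2" is a made-up value.
rebuilt = urlunparse(parts._replace(query='opt=2'))

# urljoin resolves a relative reference against a base URL.
# "Files.aspx" is a hypothetical page name.
joined = urljoin('https://i.cnblogs.com/EditPosts.aspx?opt=1', 'Files.aspx')

print(host, path, query)
print(rebuilt)
print(joined)
```

Accessing fields by name (`parts.netloc`) is usually easier to read than indexing, but both work because the result is a tuple subclass.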
