How to keep the same proxy IP across multiple requests in a crawler

 

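You can keep multiple requests on the same TCP session, and therefore on the same proxy IP, by setting the Proxy-Connection: Keep-Alive and Connection: Keep-Alive headers. The implementation is as follows: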

    # -*- coding: utf-8 -*-
    import requests
    import random
    import requests.adapters

    # Target pages to visit
    targetUrlList = [
        "https://www.bilibili.com/",
        "https://www.bilibili.com/",
        "https://www.bilibili.com/",
    ]

    # Proxy server (product website: www.16yun.cn)
    proxyHost = "t.16yun.cn"
    proxyPort = "31111"

    # Proxy authentication credentials
    proxyUser = "16HGRRIK"
    proxyPass = "458687"

    proxyMeta = "http://%(user)s:%(pass)s@%(host)s:%(port)s" % {
        "host": proxyHost,
        "port": proxyPort,
        "user": proxyUser,
        "pass": proxyPass,
    }

    # Route both http and https traffic through the HTTP proxy
    proxies = {
        "http": proxyMeta,
        "https": proxyMeta,
    }

    # Set the IP-switching header: requests that share a tunnel value keep the same exit IP
    tunnel = random.randint(1, 10000)
    headers = {"Proxy-Tunnel": str(tunnel)}


    class HTTPAdapter(requests.adapters.HTTPAdapter):
        def proxy_headers(self, proxy):
            headers = super(HTTPAdapter, self).proxy_headers(proxy)
            if hasattr(self, 'tunnel'):
                headers['Proxy-Tunnel'] = self.tunnel
            return headers


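    # Visit the site three times with the same tunnel value; every request should keep the same exit IP
    for i in range(3):
        s = requests.Session()

        a = HTTPAdapter()

        # Set the IP-switching header on the adapter (header values should be strings)
        a.tunnel = str(tunnel)
        s.mount('https://', a)

        for url in targetUrlList:
            r = s.get(url, proxies=proxies)
            print(url, r.status_code)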
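
The listing above relies on the Proxy-Tunnel header and never sets the two Keep-Alive headers mentioned in the text. The following is a minimal sketch of one way to combine them on a single reused session. The names TunnelAdapter, build_session and fetch_all are hypothetical and not part of the original code. Sending Proxy-Connection on the CONNECT request is an assumption about where the proxy reads that header, so check the provider's documentation before relying on it.

```python
import requests
import requests.adapters


class TunnelAdapter(requests.adapters.HTTPAdapter):
    """Hypothetical adapter: adds keep-alive and tunnel headers for the proxy."""

    def __init__(self, tunnel, **kwargs):
        # Header values should be strings, so convert once here.
        self.tunnel = str(tunnel)
        super().__init__(**kwargs)

    def proxy_headers(self, proxy):
        # requests passes these headers to the proxy when it opens the
        # CONNECT tunnel for an https:// target.
        headers = super().proxy_headers(proxy)
        headers["Proxy-Connection"] = "Keep-Alive"  # assumption: read by the proxy
        headers["Proxy-Tunnel"] = self.tunnel
        return headers


def build_session(tunnel):
    """Create one session that is reused for every request."""
    session = requests.Session()
    session.headers["Connection"] = "Keep-Alive"
    session.mount("https://", TunnelAdapter(tunnel))
    return session


def fetch_all(session, urls, proxies):
    """Fetch each URL through the proxy on the same session; return status codes."""
    return [session.get(url, proxies=proxies, timeout=10).status_code for url in urls]
```

For example, calling fetch_all(build_session(tunnel), targetUrlList, proxies) with the variables defined in the listing above would send all three requests through one session, so the pooled connection to the proxy can be reused instead of being reopened for each loop iteration.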
