scrapy-Amazon

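import scrapy
from scrapy import Request


class MobileSpider(scrapy.Spider):
    name = 'mobile'
    # Must match the host in start_urls; with 'amazon.com' the offsite
    # filter would drop every follow-up request to amazon.cn.
    allowed_domains = ['amazon.cn']
    start_urls = ['https://www.amazon.cn/s?k=mobile+phone&s=price-desc-rank&__mk_zh_CN=%E4%BA%9A%E9%A9%AC%E9%80%8A%E7%BD%91%E7%AB%99&crid=215CPRDHDI9WF&qid=1584240877&sprefix=mobile%2Caps%2C182&ref=sr_st_price-desc-rank']
    # Throttle through Scrapy itself rather than time.sleep(), which
    # blocks the whole crawler while it waits.
    custom_settings = {'DOWNLOAD_DELAY': 5}

    def parse(self, response):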
        # The three lists are extracted independently and paired by position,
        # so a result without a price shifts every later row.
        titles = response.xpath('//div[@class="sg-col-inner"]//h2//span[contains(@class,"a-size-base-plus")]/text()').getall()
        hrefs = response.xpath('//div[@class="sg-col-inner"]//h2/a/@href').getall()
        prices = response.xpath('//div[@class="a-section a-spacing-none a-spacing-top-small"]//span[@class="a-price"]/span[@class="a-offscreen"]/text()').getall()
        # Turn relative product links into absolute URLs.
        hrefs = [response.urljoin(href) for href in hrefs]
        for title, href, price in zip(titles, hrefs, prices):
            yield {
                "title": title,
                "hrefs": href,
                "price": price,
            }

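        # Follow the "next page" link. The last page has none, so guard
        # against None instead of concatenating it onto the base URL.
        next_href = response.xpath('//ul[@class="a-pagination"]/li[@class="a-last"]/a/@href').get()
        if next_href:
            yield Request(response.urljoin(next_href))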
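
The spider above pairs titles, links, and prices by position with zip(), which goes wrong as soon as one result has no price. A minimal sketch of the alternative, building one record per result card, is shown below. It uses only the standard library's ElementTree as a stand-in for Scrapy's selectors, and the markup, the "card" and "price" class names, and the extract_items helper are all hypothetical. They are not Amazon's real page structure. In a spider the analogous move would be to loop over a card-level response.xpath(...) and call relative XPaths on each card.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for a search results page.
# The second card deliberately has no price.
SAMPLE = """
<div>
  <div class="card"><h2><a href="/dp/A1">Phone A</a></h2><span class="price">100</span></div>
  <div class="card"><h2><a href="/dp/B2">Phone B</a></h2></div>
  <div class="card"><h2><a href="/dp/C3">Phone C</a></h2><span class="price">300</span></div>
</div>
"""


def extract_items(markup, base="https://www.amazon.cn"):
    """Build one record per card so a missing price cannot shift later rows."""
    root = ET.fromstring(markup)
    items = []
    for card in root.findall(".//div[@class='card']"):
        link = card.find("./h2/a")
        price = card.find("./span[@class='price']")
        items.append({
            "title": link.text,
            "hrefs": base + link.get("href"),
            # Keep the row and record the gap instead of dropping or shifting it.
            "price": price.text if price is not None else None,
        })
    return items


items = extract_items(SAMPLE)
for item in items:
    print(item)
```

Each record keeps its own title, link, and price together, and a card without a price yields None for that field rather than borrowing the next card's value.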
