An Introduction to Playwright, a Better Alternative to Selenium, and a Look at Its Future

Playwright is developed by Microsoft and was created specifically for end-to-end testing. It supports all modern rendering engines, including Chromium, WebKit, and Firefox. You can test on Windows, Linux, and macOS, locally or in CI, headless or headed, with native mobile emulation.

Installation

Install the playwright library:

pip install --upgrade pip  
pip install playwright  

Then install the browsers:

playwright install

Synchronous and asynchronous APIs

Once installed, you can import the Playwright library. Three browsers are supported (chromium, firefox, and webkit).

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("http://playwright.dev")
    print(page.title())
    browser.close()

Playwright offers both a synchronous and an asynchronous API. If your project uses asyncio, use the async API:

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto("http://playwright.dev")
        print(await page.title())
        await browser.close()

asyncio.run(main())

First example

Visit https://playwright.dev/ and save a screenshot:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.webkit.launch()
    page = browser.new_page()
    page.goto("https://playwright.dev/")
    page.screenshot(path="example.png")
    browser.close()

The code above saves the page to example.png.

By default, Playwright runs browsers in headless mode, so no window is shown. To see the browser UI, pass the headless=False flag when launching the browser. You can also use slow_mo to slow down execution.

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.webkit.launch(headless=False)
    page = browser.new_page()
    page.goto("https://playwright.dev/")
    page.screenshot(path="example.png")
    browser.close()

With headless=False, we can see the browser UI.

The browser window shown when headless=False
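
To slow things down as well, slow_mo (in milliseconds) can be combined with headless=False. A minimal sketch, with an arbitrary 500 ms delay:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # slow_mo pauses for the given number of milliseconds between each operation
    browser = p.webkit.launch(headless=False, slow_mo=500)
    page = browser.new_page()
    page.goto("https://playwright.dev/")
    browser.close()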

Usage guide

Actions: interacting with form elements

Playwright can interact with HTML input elements: text inputs, checkboxes, radio buttons, select options, mouse clicks, typing characters, keys and shortcuts, as well as uploading files and focusing elements.

For example, for text inputs:

# Text input
page.get_by_role("textbox").fill("Peter")

# Date input
page.get_by_label("Birth date").fill("2020-02-02")

# Time input
page.get_by_label("Appointment time").fill("13:15")

# Local datetime input
page.get_by_label("Local time").fill("2020-03-02T05:15")

Get the text box with get_by_role or get_by_label, then fill it with fill.

For checkboxes and radio buttons, use check:

# Check the checkbox
page.get_by_label('I agree to the terms above').check()

# Assert the checked state
expect(page.get_by_label('Subscribe to newsletter')).to_be_checked()

# Select the radio button
page.get_by_label('XL').check()

For select dropdowns, use select_option:

# Single selection matching the value or label  
page.get_by_label('Choose a color').select_option('blue')  
  
# Single selection matching the label  
page.get_by_label('Choose a color').select_option(label='Blue')  
  
# Multiple selected items  
page.get_by_label('Choose multiple colors').select_option(['red', 'green', 'blue'])

You can also simulate clicks. Besides a regular click, Shift+Click, hover, and clicking at a specific position are all supported.

# Generic click
page.get_by_role("button").click()

# Double click
page.get_by_text("Item").dblclick()

# Right click
page.get_by_text("Item").click(button="right")

# Shift + click
page.get_by_text("Item").click(modifiers=["Shift"])

# Hover over element
page.get_by_text("Item").hover()

# Click the top left corner
page.get_by_text("Item").click(position={ "x": 0, "y": 0})

Uploading files is also straightforward:

page.get_by_label("Upload file").set_input_files('myfile.pdf')

Drag and drop:

page.locator("#item-to-be-dragged").drag_to(page.locator("#item-to-drop-at"))

Auto-waiting

Pages take time to load. Before you can click a button, for example, the page has to be ready, so Playwright performs auto-waiting to make sure each action behaves as expected.

For example, before performing locator.click(), Playwright makes sure that (a small sketch follows this list):

  • the locator resolves to exactly one element
  • the element is Visible
  • the element is Stable, i.e. not animating
  • the element Receives Events, i.e. not obscured by other elements
  • the element is Enabled
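
A minimal sketch (the "Submit" button and the alert role here are made-up examples): the click needs no extra waiting code, and an explicit expect assertion is only used when you want to verify a state yourself:

from playwright.sync_api import expect

# click() auto-waits until the button is attached, visible, stable, enabled and receives events
page.get_by_role("button", name="Submit").click()

# web-first assertion: retries until the condition holds or the timeout (ms) expires
expect(page.get_by_role("alert")).to_be_visible(timeout=10_000)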

Authentication

Some sites require authentication, usually through a login form, which we can automate:

page = context.new_page()
page.goto('https://github.com/login')

# Interact with login form
page.get_by_label("Username or email address").fill("username")
page.get_by_label("Password").fill("password")
page.get_by_role("button", name="Sign in").click()
# Continue with the test

To avoid logging in every time, you can save and restore the authentication state:

# Save storage state into the file.
storage = context.storage_state(path="state.json")

# Create a new context with the saved storage state.
context = browser.new_context(storage_state="state.json")

Some sites keep their state in session storage, which can also be handled:

import os
# Get session storage and store as env variable
session_storage = page.evaluate("() => JSON.stringify(sessionStorage)")
os.environ["SESSION_STORAGE"] = session_storage

# Set session storage in a new context
session_storage = os.environ["SESSION_STORAGE"]
context.add_init_script("""(storage => {
  if (window.location.hostname === 'example.com') {
    const entries = JSON.parse(storage)
    for (const [key, value] of Object.entries(entries)) {
      window.sessionStorage.setItem(key, value)
    }
  }
})('""" + session_storage + "')")

The idea is to run a small piece of JavaScript via page.evaluate to read the page's sessionStorage, and then inject the saved session_storage with an init script when a new context starts.

Loading a Chrome extension

When launching the context with launch_persistent_context, you can point it at the extension path:

from playwright.sync_api import sync_playwright, Playwright

path_to_extension = "./my-extension"
user_data_dir = "/tmp/test-user-data-dir"


def run(playwright: Playwright):
    context = playwright.chromium.launch_persistent_context(
        user_data_dir,
        headless=False,
        args=[
            f"--disable-extensions-except={path_to_extension}",
            f"--load-extension={path_to_extension}",
        ],
    )
    if len(context.background_pages) == 0:
        background_page = context.wait_for_event('backgroundpage')
    else:
        background_page = context.background_pages[0]

    # Test the background page as you would any other page.
    context.close()


with sync_playwright() as playwright:
    run(playwright)

launch_persistent_context actually accepts many more options, including video recording.
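
For instance, a small sketch of video recording (record_video_dir and record_video_size are real parameters of launch_persistent_context; the directory and resolution here are arbitrary):

context = playwright.chromium.launch_persistent_context(
    user_data_dir,
    headless=False,
    record_video_dir="videos/",                        # one .webm file per page
    record_video_size={"width": 1280, "height": 720},
)
page = context.new_page()
page.goto("https://playwright.dev/")
context.close()                                        # videos are finalized when the context closes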

— More usage guides to come —

A Playwright scraping demo

Let's load a Xiaohongshu cookie, open the search page, and parse the search results.

First, open a browser and grab the cookie from a logged-in Xiaohongshu session: press F12, watch the network requests, pick any request, and copy its cookie.

The F12 network panel

Then store it in the COOKIE variable:

COOKIE = 'the cookie you copied'

Start Playwright and load the cookie:

def load_cookie():
    # Parse the saved cookie string
    cookies = []
    lines = COOKIE.split(";")
    for line in lines:
        name, value = line.strip().split('=', 1)
        cookies.append({'name': name, 'value': value, 'domain': '.xiaohongshu.com', 'path': '/', 'expires': -1})
    # Add the cookies to the browser context
    context.add_cookies(cookies)


with sync_playwright() as playwright:
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()

    # Load the cookie
    load_cookie()

Then open the search page and parse the results:

# Create a new page and open the Xiaohongshu search page
page = context.new_page()
page.goto('https://www.xiaohongshu.com/search_result?keyword=AI&source=unknown&type=51')

# Parse the search results
html = page.content()
cards = parse_cards(html)
print(cards)

Xiaohongshu search results

page.content() returns the HTML, which we can then parse with ordinary HTML tooling. You can even hand this part to an LLM, with a prompt along the lines of: "python playwright, parse the multiple cards like this out of the page, including title, image, url and like count; the html is ..."

# Parse the result cards (uses BeautifulSoup: from bs4 import BeautifulSoup)
def parse_cards(html):  
    cards = []  
    soup = BeautifulSoup(html, "html.parser")  
    for card in soup.find_all("section", class_="note-item"):  
        title = card.find("a", class_="title")  
        if not title:  
            continue  
        title = title.text.strip()  
        image_url = card.find("img")["src"]  
        url = card.find("a", class_="cover")["href"]  
        like_count = card.find("span", class_="count").text.strip()  
        if "w" in like_count:  
            like_count = str(float(like_count.replace("w", "")) * 10000)  
        cards.append({  
            "title": title,  
            "image_url": image_url,  
            "url": url,  
            "like_count": like_count  
        })  
    return cards

And finally, the result:

[
  {
    'title': '人生建議,2024一定要學會AI,真的會開掛‼️',
    'image_url': 'https://sns-webpic-qc.xhscdn.com/202405141753/7bec9f3771d1787c19343079183c95fd/1040g008310an3v106g005pahpd9gl25gv5aqt08!nc_n_webp_mw_1',
    'url': '/search_result/65f2b75a000000000d00f8f3',
    'like_count': '14000.0'
  }
  # ... more results omitted
]

By default only the first page of results is loaded. To load a few more pages, we can make the browser scroll. While more results are loading, a loading animation is shown, so we can treat the loading indicator becoming hidden as the signal that loading has finished:

# Scroll-load 10 pages; scrolling to the footer triggers auto-loading
for i in range(10):
    page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
    # Wait for the loading indicator to disappear
    page.wait_for_selector(".feeds-loading", state="hidden")

Full code:

from playwright.sync_api import sync_playwright  
from bs4 import BeautifulSoup  
  
COOKIE = '...your cookie...'
  
  
# Parse the result cards
def parse_cards(html):  
    cards = []  
    soup = BeautifulSoup(html, "html.parser")  
    for card in soup.find_all("section", class_="note-item"):  
        title = card.find("a", class_="title")  
        if not title:  
            continue  
        title = title.text.strip()  
        image_url = card.find("img")["src"]  
        url = card.find("a", class_="cover")["href"]  
        like_count = card.find("span", class_="count").text.strip()  
        if "w" in like_count:  
            like_count = str(float(like_count.replace("w", "")) * 10000)  
        cards.append({  
            "title": title,  
            "image_url": image_url,  
            "url": url,  
            "like_count": like_count  
        })  
    return cards  
  
  
def load_cookie():  
    # Parse the saved cookie string
    cookies = []  
    lines = COOKIE.split(";")  
    for line in lines:  
        name, value = line.strip().split('=', 1)  
        cookies.append({'name': name, 'value': value, 'domain': '.xiaohongshu.com', 'path': '/', 'expires': -1})  
    # Add the cookies to the browser context
    context.add_cookies(cookies)  
  
  
with sync_playwright() as playwright:  
    browser = playwright.chromium.launch(headless=False)  
    context = browser.new_context()  
  
    # Load the cookie
    load_cookie()  
  
    # Create a new page and open the Xiaohongshu search page
    page = context.new_page()  
    page.goto('https://www.xiaohongshu.com/search_result?keyword=AI&source=unknown&type=51')  
  
    # Scroll-load 10 pages; scrolling to the footer triggers auto-loading
    for i in range(10):  
        page.evaluate("window.scrollTo(0, document.body.scrollHeight)")  
        # Wait for the loading indicator to disappear
        page.wait_for_selector(".feeds-loading", state="hidden")  
  
    # Parse the search results
    html = page.content()  
    cards = parse_cards(html)  
    print(cards)  
  
    page.pause()  
  
    browser.close()

Driving Playwright with an LLM

In the skyvern framework, an LLM recognizes the web page and decides which page elements to act on, and Playwright then operates those elements, automating tasks such as buying car insurance.

Let's take a quick look at how it works.

skyvern first defines a set of action enums that represent operations on page elements:

class ActionType(StrEnum):  
    CLICK = "click"  
    INPUT_TEXT = "input_text"  
    UPLOAD_FILE = "upload_file"  
  
    # This action is not used in the current implementation. Click actions are used instead."  
    DOWNLOAD_FILE = "download_file"  
  
    SELECT_OPTION = "select_option"  
    CHECKBOX = "checkbox"  
    WAIT = "wait"  
    NULL_ACTION = "null_action"  
    SOLVE_CAPTCHA = "solve_captcha"  
    TERMINATE = "terminate"  
    COMPLETE = "complete"  
    # Note: Remember to update ActionTypeUnion with new actions

When scraping a page, it collects the page content, the elements, and screenshots with everything combined:

async def scrape_web_unsafe(  
    browser_state: BrowserState,  
    url: str,  
) -> ScrapedPage:  
    """  
    Asynchronous function that performs web scraping without any built-in error handling. This function is intended
    for use cases where the caller handles exceptions or in controlled environments. It directly scrapes the provided
    URL or continues on the given page.

    :param browser_context: BrowserContext instance used for scraping.
    :param url: URL of the web page to be scraped. Used only when creating a new page.
    :param page: Optional Page instance for scraping, a new page is created if None.
    :return: Tuple containing Page instance, base64 encoded screenshot, and page elements.
    :note: This function does not handle exceptions. Ensure proper error handling in the calling context.
    """
    # We only create a new page if one does not exist. This is to allow keeping the same page since we want to
    # continue working on the same page that we're taking actions on.
    # *This also means URL is only used when creating a new page, and not when using an existing page.
    page = await browser_state.get_or_create_page(url)

    # Take screenshots of the page with the bounding boxes. We will remove the bounding boxes later.
    # Scroll to the top of the page and take a screenshot.
    # Scroll to the next page and take a screenshot until we reach the end of the page.
    # We check if the scroll_y_px_old is the same as scroll_y_px to determine if we have reached the end of the page.
    # This also solves the issue where we can't scroll due to a popup. (e.g. geico first popup on the homepage after
    # clicking start my quote)
    LOG.info("Waiting for 5 seconds before scraping the website.")  
    await asyncio.sleep(5)  
  
    screenshots: list[bytes] = []  
    scroll_y_px_old = -30.0  
    scroll_y_px = await scroll_to_top(page, drow_boxes=True)  
    # Checking max number of screenshots to prevent infinite loop  
    # We are checking the difference between the old and new scroll_y_px to determine if we have reached the end of the
    # page. If the difference is less than 25, we assume we have reached the end of the page.
    while (
        abs(scroll_y_px_old - scroll_y_px) > 25  
        and len(screenshots) < SettingsManager.get_settings().MAX_NUM_SCREENSHOTS  
    ):  
        screenshot = await browser_state.take_screenshot(full_page=False)  
        screenshots.append(screenshot)  
        scroll_y_px_old = scroll_y_px  
        LOG.info("Scrolling to next page", url=url, num_screenshots=len(screenshots))  
        scroll_y_px = await scroll_to_next_page(page, drow_boxes=True)  
        LOG.info("Scrolled to next page", scroll_y_px=scroll_y_px, scroll_y_px_old=scroll_y_px_old)  
    await remove_bounding_boxes(page)  
    await scroll_to_top(page, drow_boxes=False)  
  
    elements, element_tree = await get_interactable_element_tree(page)  
    element_tree = cleanup_elements(copy.deepcopy(element_tree))  
  
    _build_element_links(elements)  
  
    id_to_xpath_dict = {}  
    id_to_element_dict = {}  
    for element in elements:  
        element_id = element["id"]  
        # get_interactable_element_tree marks each interactable element with a unique_id attribute  
        id_to_xpath_dict[element_id] = f"//*[@{SKYVERN_ID_ATTR}='{element_id}']"  
        id_to_element_dict[element_id] = element  
  
    text_content = await get_all_visible_text(page)  
    return ScrapedPage(  
        elements=elements,  
        id_to_xpath_dict=id_to_xpath_dict,  
        id_to_element_dict=id_to_element_dict,  
        element_tree=element_tree,  
        element_tree_trimmed=trim_element_tree(copy.deepcopy(element_tree)),  
        screenshots=screenshots,  
        url=page.url,  
        html=await page.content(),  
        extracted_text=text_content,  
    )

Note the ScrapedPage returned above: it carries the elements, screenshots, and related data.

Given a task prompt from the user, skyvern combines the ScrapedPage with a built-in prompt and asks the LLM to decide what to do. Here is the built-in prompt:

Identify actions to help user progress towards the user goal using the DOM elements given in the list and the screenshot of the website.  
Include only the elements that are relevant to the user goal, without altering or imagining new elements.  
Use the details from the user details to fill in necessary values. Always satisfy required fields if the field isn't already filled in. Don't return any action for the same field, if this field is already filled in and the value is the same as the one you would have filled in.  
MAKE SURE YOU OUTPUT VALID JSON. No text before or after JSON, no trailing commas, no comments (//), no unnecessary quotes, etc.  
Each element is tagged with an ID.  
If you see any information in red in the page screenshot, this means a condition wasn't satisfied. prioritize actions with the red information.  
If you see a popup in the page screenshot, prioritize actions on the popup.  
  
Reply in JSON format with the following keys:  
{  
    "actions": array // An array of actions. Here's the format of each action:  
    [{  
        "reasoning": str, // The reasoning behind the action. Be specific, referencing any user information and their fields and element ids in your reasoning. Mention why you chose the action type, and why you chose the element id. Keep the reasoning short and to the point.  
        "confidence_float": float, // The confidence of the action. Pick a number between 0.0 and 1.0. 0.0 means no confidence, 1.0 means full confidence  
        "action_type": str, // It's a string enum: "CLICK", "INPUT_TEXT", "UPLOAD_FILE", "SELECT_OPTION", "WAIT", "SOLVE_CAPTCHA", "COMPLETE", "TERMINATE". "CLICK" is an element you'd like to click. "INPUT_TEXT" is an element you'd like to input text into. "UPLOAD_FILE" is an element you'd like to upload a file into. "SELECT_OPTION" is an element you'd like to select an option from. "WAIT" action should be used if there are no actions to take and there is some indication on screen that waiting could yield more actions. "WAIT" should not be used if there are actions to take. "SOLVE_CAPTCHA" should be used if there's a captcha to solve on the screen. "COMPLETE" is used when the user goal has been achieved AND if there's any data extraction goal, you should be able to get data from the page. Never return a COMPLETE action unless the user goal is achieved. "TERMINATE" is used to terminate the whole task with a failure when it doesn't seem like the user goal can be achieved. Do not use "TERMINATE" if waiting could lead the user towards the goal. Only return "TERMINATE" if you are on a page where the user goal cannot be achieved. All other actions are ignored when "TERMINATE" is returned.  
        "id": int, // The id of the element to take action on. The id has to be one from the elements list  
        "text": str, // Text for INPUT_TEXT action only  
        "file_url": str, // The url of the file to upload if applicable. This field must be present for UPLOAD_FILE but can also be present for CLICK only if the click is to upload the file. It should be null otherwise.  
        "option": {  // The option to select for SELECT_OPTION action only. null if not SELECT_OPTION action  
            "label": str, // the label of the option if any. MAKE SURE YOU USE THIS LABEL TO SELECT THE OPTION. DO NOT PUT ANYTHING OTHER THAN A VALID OPTION LABEL HERE  
            "index": int, // the id corresponding to the optionIndex under the the select element.  
            "value": str // the value of the option. MAKE SURE YOU USE THIS VALUE TO SELECT THE OPTION. DO NOT PUT ANYTHING OTHER THAN A VALID OPTION VALUE HERE  
        },  
{% if error_code_mapping_str %}  
        "errors": array // A list of errors. This is used to surface any errors that matches the current situation for COMPLETE and TERMINATE actions. For other actions or if no error description suits the current situation on the screenshots, return an empty list. You are allowed to return multiple errors if there are multiple errors on the page.  
        [{  
            "error_code": str, // The error code from the user's error code list  
            "reasoning": str, // The reasoning behind the error. Be specific, referencing any user information and their fields in your reasoning. Keep the reasoning short and to the point.  
            "confidence_float": float // The confidence of the error. Pick a number between 0.0 and 1.0. 0.0 means no confidence, 1.0 means full confidence  
        }]  
{% endif %}  
    }],  
}  
  
{% if action_history %}  
Consider the action history from the last step and the screenshot together, if actions from the last step don't yield positive impact, try other actions or other action combinations.  
{% endif %}
...

Parts of the prompt are omitted here. It instructs a multimodal LLM to decide on actions based on the supplied elements, screenshots, and the user's own prompt. skyvern then parses those actions and drives the page elements through Playwright, until the task is finished or the LLM says to stop.
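
A rough sketch of that dispatch loop (this is not Skyvern's actual code; llm_decide_actions and the surrounding structure are illustrative assumptions, only id_to_xpath_dict and the action JSON shape come from the snippets above):

# Illustrative sketch only, not Skyvern's real implementation
async def run_step(page, scraped_page, user_goal):
    # Ask the multimodal LLM for the next actions, given the element tree and screenshots
    actions = await llm_decide_actions(user_goal, scraped_page)  # hypothetical helper

    for action in actions:
        xpath = scraped_page.id_to_xpath_dict[action["id"]]
        locator = page.locator(f"xpath={xpath}")
        if action["action_type"] == "CLICK":
            await locator.click()
        elif action["action_type"] == "INPUT_TEXT":
            await locator.fill(action["text"])
        elif action["action_type"] == "SELECT_OPTION":
            await locator.select_option(label=action["option"]["label"])
        elif action["action_type"] in ("COMPLETE", "TERMINATE"):
            return action["action_type"]
    return "CONTINUE"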

As you can see, skyvern makes good use of a large model's multimodal recognition to decide for itself how to carry out concrete tasks. There is a catch, though: does every step really need an LLM decision? In theory, for a repeated task it may be better to record the action path once and only fall back to the LLM when necessary, as sketched below.
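
One way that idea could look (purely a sketch; load_recorded_actions, save_recorded_actions, apply_action, and llm_decide_actions are hypothetical helpers):

# Replay a recorded action path for a known task; only ask the LLM when replay fails
async def run_task(page, task_id, user_goal, scraped_page):
    recorded = load_recorded_actions(task_id)        # e.g. read from a JSON file
    if recorded:
        try:
            for action in recorded:
                await apply_action(page, action)     # same dispatch as the sketch above
            return "COMPLETE"
        except Exception:
            pass                                     # page changed, fall back to the LLM
    actions = await llm_decide_actions(user_goal, scraped_page)
    for action in actions:
        await apply_action(page, action)
    save_recorded_actions(task_id, actions)          # remember the path for next time
    return "COMPLETE"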

Playwright vs. Selenium

Here is a comparison reposted from https://www.cnblogs.com/yoyoketang/p/17387733.html for reference:

| No. | Feature | Playwright | Selenium | Which is better |
| --- | --- | --- | --- | --- |
| 1 | Learning resources | Relatively few | | Selenium |
| 2 | User base | Appeared later, relatively small user base | Appeared earlier, large user base | Selenium |
| 3 | Supported languages | TypeScript, JavaScript, Python, .NET, Java | C#, Java, Perl, PHP, Python, Ruby | Selenium |
| 4 | Supported browsers | Chromium (including Chrome and MS Edge), WebKit, Firefox | IE (7, 8, 9, 10, 11), Firefox, Safari, Google Chrome, Opera, Edge, etc. | Selenium |
| 5 | Cross-platform | Windows, Linux (only some Ubuntu releases), macOS | Windows, Linux, macOS all supported | Selenium |
| 6 | Browser installation | Installed via the command line | You install them yourself | Playwright |
| 7 | Browser driver | No driver needed | Must download a driver matching the browser version | Playwright |
| 8 | Startup speed | | | Playwright |
| 9 | context environment isolation | | | Playwright |
| 10 | Headless mode | Headless by default, GUI optional | GUI by default, headless optional | Playwright |
| 11 | Incognito mode | Incognito by default, great for testing, but scraper users may be blocked from some pages | Non-incognito by default, which scraper users particularly like | Selenium |
| 12 | Page waiting | wait_for_load_state waits precisely for one of four states: commit, domcontentloaded, load, networkidle | implicitly_wait waits for the page to finish loading | Playwright |
| 13 | Element locating | Several built-in locators that map closely to how the business describes the page, more locating options | The classic eight locating strategies | Playwright |
| 14 | Element waiting | Locators wait automatically | You have to wrap your own wait methods | Playwright |
| 15 | Clicking and other actions | Intelligently checks element state, position, and clickability | You have to wrap WebDriverWait.until yourself, which is fairly hard | Playwright |
| 16 | Locator errors | Tells you how many elements matched and suggests locators | Errors leave you guessing and ruling out possibilities yourself | Playwright |
| 17 | Element off-screen | Detects the element position and scrolls it into view automatically | You have to handle scrolling yourself | Playwright |
| 18 | iframe | Operated through an object, no switching | Requires switching back and forth | Playwright |
| 19 | alert | Auto-dismissed by a default listener, supports async listening | You handle it yourself, no async listening | Playwright |
| 20 | File upload | Listens for file-upload events, handled elegantly | Cannot handle non-input uploads | Playwright |
| 21 | File download | Can listen for downloads | Can only set the browser's default download location | Playwright |
| 22 | Multiple windows/tabs | Can listen for window events, easy to work with | Requires switching back and forth | Playwright |
| 23 | Event listening | Can listen for all kinds of events | No event listening | Playwright |
| 24 | Capturing AJAX requests | Can capture AJAX requests and responses | Cannot capture them | Playwright |
| 25 | Mocking | Can mock any API response you want | No mock support | Playwright |
| 26 | Assertions | Provides rich expect assertions | You have to wrap WebDriverWait.until yourself, which is fairly hard | Playwright |
| 27 | Video recording | Records videos of test runs | | Playwright |
| 28 | Trace | | | Playwright |
| 29 | Breakpoint debugging | | | Playwright |
| 30 | Recording (codegen) | Can generate pytest test cases | Recording is fairly basic | Playwright |
| 31 | Mouse and keyboard actions | Simple and convenient to call | Requires importing modules, more complex | Playwright |
| 32 | base_url | Supports a global base_url | Not available | Playwright |
| 33 | API testing | Built-in API testing | Not available | Playwright |
| 34 | Distributed execution | | selenium-grid | Selenium |
| 35 | Protocol | WebSocket protocol, can get page state in real time | HTTP protocol, only returns the state at that moment, you must poll yourself | Playwright |
| 36 | Executing JavaScript | Can execute JavaScript on page, iframe, and element objects | Can only execute JavaScript on the driver object | Playwright |
| 37 | Async support | Both sync and async APIs | No async API | Playwright |
| 38 | Interviews | Asked about less often | Asked about more often | Selenium |
| 39 | Learning curve | Easy, no wrapping needed, works out of the box | Harder, needs wrapping | Playwright |

The original author's assessment:

Playwright's strengths are simplicity, powerful features, and high stability; its weaknesses are that it is relatively new, with a smaller user base and fewer learning resources.
Selenium's strengths are flexibility, a large user base, and abundant learning material; its weaknesses are that some features require your own wrappers or extra modules, startup is slow, and stability is weaker.

My own take: Playwright is a new test framework that Microsoft built by absorbing the strengths of the frameworks before it. Standing on the shoulders of giants, and with Microsoft behind it, its floor is high. If you have no legacy constraints, Playwright should be the first choice.

Summary

Playwright is an emerging automation testing tool with a rich feature set and API, and it already sits quietly behind many scraping and automation tools. Multimodal LLMs make Playwright even more capable, and we can expect an explosion of automated, intelligent RPA tools.
