Getting started - Library

Installation

Pip

pip install --upgrade pip
pip install playwright
playwright install

Conda

conda config --add channels conda-forge
conda config --add channels microsoft
conda install playwright
playwright install

These commands download the Playwright package and install browser binaries for Chromium, Firefox and WebKit. To modify this behavior, see installation parameters.

Usage

Once installed, you can import Playwright in a Python script and launch any of the 3 browsers (chromium, firefox, webkit).

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://playwright.net.cn")
    print(page.title())
    browser.close()

Playwright supports two variations of the API: synchronous and asynchronous. If your project uses asyncio, you should use the async API:

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto("https://playwright.net.cn")
        print(await page.title())
        await browser.close()

asyncio.run(main())

First script

In our first script, we will navigate to https://playwright.net.cn/ and take a screenshot in WebKit.

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.webkit.launch()
    page = browser.new_page()
    page.goto("https://playwright.net.cn/")
    page.screenshot(path="example.png")
    browser.close()

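By default, Playwright runs the browsers in headless mode. To see the browser UI, set the headless option to False. You can also use slow_mo to slow down execution. Learn more in the debugging tools section.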

firefox.launch(headless=False, slow_mo=50)
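The one-line call above can be placed in a full script. The following is a minimal sketch, assuming you want to switch between a normal run and a headed, slowed-down debugging run; `launch_kwargs` and `screenshot` are hypothetical helper names, not part of Playwright. Only the `headless` and `slow_mo` options mentioned above are used.

```python
from typing import Any, Dict


def launch_kwargs(debug: bool, slow_mo_ms: float = 50) -> Dict[str, Any]:
    """Build keyword arguments for launch() (hypothetical helper)."""
    if debug:
        # Headed browser, with each operation slowed by slow_mo_ms milliseconds.
        return {"headless": False, "slow_mo": slow_mo_ms}
    # Matches Playwright's default: headless, no artificial delay.
    return {"headless": True}


def screenshot(url: str, path: str, debug: bool = False) -> None:
    """Open url in Firefox and save a screenshot to path."""
    # Imported lazily so launch_kwargs() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.firefox.launch(**launch_kwargs(debug))
        page = browser.new_page()
        page.goto(url)
        page.screenshot(path=path)
        browser.close()
```

Calling `screenshot("https://playwright.net.cn/", "example.png", debug=True)` would open a visible Firefox window, provided the browsers have been installed with `playwright install`.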

Interactive mode (REPL)

You can launch the interactive Python REPL:

python

and then launch Playwright within it for quick experimentation:

from playwright.sync_api import sync_playwright
playwright = sync_playwright().start()
# Use playwright.chromium, playwright.firefox or playwright.webkit
# Pass headless=False to launch() to see the browser UI
browser = playwright.chromium.launch()
page = browser.new_page()
page.goto("https://playwright.net.cn/")
page.screenshot(path="example.png")
browser.close()
playwright.stop()

Async REPL such as the asyncio REPL:

python -m asyncio

from playwright.async_api import async_playwright
playwright = await async_playwright().start()
browser = await playwright.chromium.launch()
page = await browser.new_page()
await page.goto("https://playwright.net.cn/")
await page.screenshot(path="example.png")
await browser.close()
await playwright.stop()

Pyinstaller

You can use Playwright with Pyinstaller to create standalone executables.

main.py
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://playwright.net.cn/")
    page.screenshot(path="example.png")
    browser.close()

If you want to bundle browsers with the executables:

PLAYWRIGHT_BROWSERS_PATH=0 playwright install chromium
pyinstaller -F main.py

Note

Bundling the browsers with the executables produces larger binaries. It is recommended to bundle only the browsers you use.

Known issues

time.sleep() leads to outdated state

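Most likely you don't need to wait manually, since Playwright has auto-waiting. If you still rely on a fixed delay, use page.wait_for_timeout(5000) instead of time.sleep(5). It is better not to wait for a timeout at all, but it is sometimes useful for debugging. In those cases, use the wait_for_timeout method rather than the time module: Playwright relies internally on asynchronous operations, and they cannot be processed correctly while time.sleep(5) blocks.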
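As an illustration of the substitution described above, here is a minimal sketch. `pause_for_debugging` is a hypothetical helper name; the only Playwright call it makes is `page.wait_for_timeout`, which takes its argument in milliseconds, whereas `time.sleep` takes seconds.

```python
def pause_for_debugging(page, milliseconds: float = 5000) -> None:
    """Pause through Playwright instead of time.sleep() (hypothetical helper).

    `page` is expected to be a sync-API Page object.
    """
    # Rough equivalent of time.sleep(milliseconds / 1000), but the delay
    # goes through Playwright, so its internal async operations keep running.
    page.wait_for_timeout(milliseconds)
```

With the async API the same call would be awaited: `await page.wait_for_timeout(5000)`.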

Incompatible with asyncio's SelectorEventLoop on Windows

Playwright runs the driver in a subprocess, so on Windows it requires asyncio's ProactorEventLoop, because SelectorEventLoop does not support async subprocesses.

On Windows with Python 3.7, Playwright sets the default event loop to ProactorEventLoop, since that is the default on Python 3.8+.
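If another library may have switched the event loop, a quick check can help diagnose the problem. This is a minimal sketch, not part of Playwright: `loop_supports_subprocesses` and `check_current_loop` are hypothetical helper names, and the check uses only `asyncio.ProactorEventLoop`, which is defined only on Windows builds of Python.

```python
import asyncio
import sys


def loop_supports_subprocesses(loop: asyncio.AbstractEventLoop) -> bool:
    """Return False only for a non-Proactor loop on Windows (hypothetical helper)."""
    if sys.platform != "win32":
        # The SelectorEventLoop limitation described above is Windows-specific.
        return True
    # asyncio.ProactorEventLoop is only referenced on Windows, where it exists.
    return isinstance(loop, asyncio.ProactorEventLoop)


async def check_current_loop() -> bool:
    """Check the loop that is currently running this coroutine."""
    return loop_supports_subprocesses(asyncio.get_running_loop())
```

Running `asyncio.run(check_current_loop())` before starting Playwright shows whether the loop in use is suitable.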

Threading

Playwright's API is not thread-safe. If you use Playwright in a multi-threaded environment, create one playwright instance per thread. For more details, see the threading issue.
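One way to follow this rule is to create, use, and stop the instance entirely inside the function each thread runs. The following is a minimal sketch under that assumption; `fetch_title` and `titles_in_threads` are hypothetical names, and the thread pool from the standard library is one possible choice, not something Playwright prescribes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def fetch_title(url: str) -> str:
    """Return the title of url using a Playwright instance owned by this thread."""
    # Imported lazily so the thread-pool helper below works without Playwright.
    from playwright.sync_api import sync_playwright

    # The instance is started and stopped within the calling thread,
    # so no Playwright object is ever shared between threads.
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        title = page.title()
        browser.close()
        return title


def titles_in_threads(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_title,
    max_workers: int = 2,
) -> List[str]:
    """Run fetch once per URL on a thread pool, keeping the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

For example, `titles_in_threads(["https://playwright.net.cn/"])` would launch Chromium in a worker thread and return the page title in a list.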