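Network
Introduction
Playwright provides APIs to monitor and modify browser network traffic, both HTTP and HTTPS. Any request a page makes, including XHR and fetch requests, can be tracked, modified, and handled.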
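Mock APIs
Check out our API mocking guide to learn how to
- mock API requests so that the real API is never hit
- perform the API request and modify the response
- use HAR files to mock network requests.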
HTTP Authentication
Perform HTTP authentication by passing credentials when the browser context is created.

Sync:

context = browser.new_context(
    http_credentials={"username": "bill", "password": "pa55w0rd"}
)
page = context.new_page()
page.goto("https://example.com")

Async:

context = await browser.new_context(
    http_credentials={"username": "bill", "password": "pa55w0rd"}
)
page = await context.new_page()
await page.goto("https://example.com")
HTTP Proxy
You can configure pages to load over an HTTP(S) proxy or SOCKSv5. The proxy can be set globally for the entire browser or individually for each browser context.
You can optionally specify a username and password for an HTTP(S) proxy, and you can also specify hosts that bypass the proxy.
Here is an example of a global proxy.

Sync:

browser = chromium.launch(proxy={
    "server": "http://myproxy.com:3128",
    "username": "usr",
    "password": "pwd"
})

Async:

browser = await chromium.launch(proxy={
    "server": "http://myproxy.com:3128",
    "username": "usr",
    "password": "pwd"
})

The proxy can also be specified per context.

Sync:

browser = chromium.launch()
context = browser.new_context(proxy={"server": "http://myproxy.com:3128"})

Async:

browser = await chromium.launch()
context = await browser.new_context(proxy={"server": "http://myproxy.com:3128"})
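The examples above do not show the option for hosts that bypass the proxy. As a minimal sketch, the hypothetical helper build_proxy_settings below assembles the proxy dictionary, assuming the "bypass" key takes a comma-separated string of hosts. The helper name and the bypass host names are placeholders for illustration, not part of Playwright.

```python
from typing import Dict, Iterable, Optional


def build_proxy_settings(
    server: str,
    username: Optional[str] = None,
    password: Optional[str] = None,
    bypass: Optional[Iterable[str]] = None,
) -> Dict[str, str]:
    """Assemble a proxy dictionary, omitting options that were not provided."""
    settings = {"server": server}
    if username is not None:
        settings["username"] = username
    if password is not None:
        settings["password"] = password
    if bypass:
        # Hosts that should skip the proxy, joined into one comma-separated string.
        settings["bypass"] = ",".join(bypass)
    return settings


# Placeholder hosts; replace them with the hosts that should skip the proxy.
proxy = build_proxy_settings(
    "http://myproxy.com:3128",
    username="usr",
    password="pwd",
    bypass=["localhost", ".internal.example.com"],
)
# The dictionary can then be passed on, for example:
#   browser = chromium.launch(proxy=proxy)
#   context = browser.new_context(proxy=proxy)
```

Building the dictionary in one place keeps the global and per-context variants consistent when both are used in the same test suite.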
Network events
You can monitor all Requests and Responses.

Sync:

from playwright.sync_api import sync_playwright, Playwright

def run(playwright: Playwright):
    chromium = playwright.chromium
    browser = chromium.launch()
    page = browser.new_page()
    # Subscribe to "request" and "response" events.
    page.on("request", lambda request: print(">>", request.method, request.url))
    page.on("response", lambda response: print("<<", response.status, response.url))
    page.goto("https://example.com")
    browser.close()

with sync_playwright() as playwright:
    run(playwright)

Async:

import asyncio
from playwright.async_api import async_playwright, Playwright

async def run(playwright: Playwright):
    chromium = playwright.chromium
    browser = await chromium.launch()
    page = await browser.new_page()
    # Subscribe to "request" and "response" events.
    page.on("request", lambda request: print(">>", request.method, request.url))
    page.on("response", lambda response: print("<<", response.status, response.url))
    await page.goto("https://example.com")
    await browser.close()

async def main():
    async with async_playwright() as playwright:
        await run(playwright)

asyncio.run(main())
Or wait for a network response after a button click with page.expect_response().

Sync:

# Use a glob URL pattern
with page.expect_response("**/api/fetch_data") as response_info:
    page.get_by_text("Update").click()
response = response_info.value

Async:

# Use a glob URL pattern
async with page.expect_response("**/api/fetch_data") as response_info:
    await page.get_by_text("Update").click()
response = await response_info.value
Variations
Wait for Responses with page.expect_response(), matching by regular expression or by predicate.

Sync:

import re

# Use a regular expression
with page.expect_response(re.compile(r"\.jpeg$")) as response_info:
    page.get_by_text("Update").click()
response = response_info.value

# Use a predicate taking a response object
# (`token` is a string defined elsewhere in your script)
with page.expect_response(lambda response: token in response.url) as response_info:
    page.get_by_text("Update").click()
response = response_info.value

Async:

import re

# Use a regular expression
async with page.expect_response(re.compile(r"\.jpeg$")) as response_info:
    await page.get_by_text("Update").click()
response = await response_info.value

# Use a predicate taking a response object
# (`token` is a string defined elsewhere in your script)
async with page.expect_response(lambda response: token in response.url) as response_info:
    await page.get_by_text("Update").click()
response = await response_info.value
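Handle requests

Sync:

# `test_data` holds the mocked response body and is defined elsewhere in your script
page.route(
    "**/api/fetch_data",
    lambda route: route.fulfill(status=200, body=test_data))
page.goto("https://example.com")

Async:

# `test_data` holds the mocked response body and is defined elsewhere in your script
await page.route(
    "**/api/fetch_data",
    lambda route: route.fulfill(status=200, body=test_data))
await page.goto("https://example.com")

You can mock API endpoints by handling network requests in your Playwright script.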
Variations
Use page.route() to set up a route on a single page, or browser_context.route() to set it up on the entire browser context. A context-level route also applies to popup windows and opened links.

Sync:

context.route(
    "**/api/login",
    lambda route: route.fulfill(status=200, body="accept"))
page.goto("https://example.com")

Async:

await context.route(
    "**/api/login",
    lambda route: route.fulfill(status=200, body="accept"))
await page.goto("https://example.com")
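Modify requests

Sync:

# Delete header
def handle_route(route):
    headers = route.request.headers
    # pop() with a default avoids a KeyError on requests that lack the header.
    headers.pop("x-secret", None)
    route.continue_(headers=headers)

page.route("**/*", handle_route)

# Continue requests as POST.
page.route("**/*", lambda route: route.continue_(method="POST"))

Async:

# Delete header
async def handle_route(route):
    headers = route.request.headers
    # pop() with a default avoids a KeyError on requests that lack the header.
    headers.pop("x-secret", None)
    await route.continue_(headers=headers)

await page.route("**/*", handle_route)

# Continue requests as POST.
await page.route("**/*", lambda route: route.continue_(method="POST"))

You can continue requests with modifications. The first example above removes an HTTP header from outgoing requests; the second changes the request method to POST.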
Abort requests
You can abort requests using page.route() and route.abort().

Sync:

# Abort based on the URL
page.route("**/*.{png,jpg,jpeg}", lambda route: route.abort())

# Abort based on the request type
page.route("**/*", lambda route: route.abort() if route.request.resource_type == "image" else route.continue_())

Async:

# Abort based on the URL
await page.route("**/*.{png,jpg,jpeg}", lambda route: route.abort())

# Abort based on the request type
await page.route("**/*", lambda route: route.abort() if route.request.resource_type == "image" else route.continue_())
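When more than one resource type should be blocked, a named handler can be easier to read than a conditional lambda. The following is a minimal sketch for the sync API; the handler name block_heavy_resources and the chosen set of resource types are assumptions made for illustration.

```python
# Resource types to block; adjust the set to your needs (illustrative choice).
BLOCKED_RESOURCE_TYPES = {"image", "media", "font"}


def block_heavy_resources(route):
    """Abort requests for blocked resource types and let everything else through."""
    if route.request.resource_type in BLOCKED_RESOURCE_TYPES:
        route.abort()
    else:
        route.continue_()


# Registration, assuming a sync `page` object already exists:
#   page.route("**/*", block_heavy_resources)
```

Every routed request must end in exactly one of abort(), continue_(), or fulfill(), which is why the handler calls continue_() for everything it does not block.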
Modify responses
To modify a response, use APIRequestContext to get the original response and then pass the response to route.fulfill(). You can override individual fields on the response via options.

Sync:

from playwright.sync_api import Route

def handle_route(route: Route) -> None:
    # Fetch original response.
    response = route.fetch()
    # Add a prefix to the title.
    body = response.text()
    body = body.replace("<title>", "<title>My prefix:")
    route.fulfill(
        # Pass all fields from the response.
        response=response,
        # Override response body.
        body=body,
        # Force content type to be html.
        headers={**response.headers, "content-type": "text/html"},
    )

page.route("**/title.html", handle_route)

Async:

from playwright.async_api import Route

async def handle_route(route: Route) -> None:
    # Fetch original response.
    response = await route.fetch()
    # Add a prefix to the title.
    body = await response.text()
    body = body.replace("<title>", "<title>My prefix:")
    await route.fulfill(
        # Pass all fields from the response.
        response=response,
        # Override response body.
        body=body,
        # Force content type to be html.
        headers={**response.headers, "content-type": "text/html"},
    )

await page.route("**/title.html", handle_route)
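Glob URL patterns
Playwright uses simplified glob patterns for URL matching in network interception methods such as page.route() or page.expect_response(). These patterns support basic wildcards:
- Asterisks:
  - A single * matches any characters except /
  - A double ** matches any characters including /
- A question mark ? matches any single character except /
- Curly braces {} can be used to match a list of options separated by commas ,

Examples:
- https://example.com/*.js matches https://example.com/file.js but not https://example.com/path/file.js
- **/*.js matches both https://example.com/file.js and https://example.com/path/file.js
- **/*.{png,jpg,jpeg} matches image requests whose URLs end in .png, .jpg, or .jpeg

Important notes:
- A glob pattern must match the entire URL, not just a part of it.
- When using globs for URL matching, consider the full URL structure, including the protocol and path separators.
- For more complex matching requirements, consider using a regular expression (for example, one built with re.compile()) instead of a glob pattern.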
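To make the wildcard rules above concrete, here is a small sketch that translates such a glob into a Python regular expression and checks it against a full URL. It follows only the rules listed in this section and is not Playwright's actual matcher; the function names glob_to_regex and glob_matches are hypothetical, and unbalanced braces are not handled.

```python
import re


def glob_to_regex(glob: str) -> "re.Pattern[str]":
    """Translate a simplified URL glob into a compiled regular expression."""
    parts = []
    in_braces = False
    i = 0
    while i < len(glob):
        char = glob[i]
        if char == "*":
            if glob.startswith("**", i):
                # A run of two or more stars matches any characters, including "/".
                while i < len(glob) and glob[i] == "*":
                    i += 1
                parts.append(".*")
                continue
            # A single star matches any characters except "/".
            parts.append("[^/]*")
        elif char == "?":
            # A question mark matches exactly one character except "/".
            parts.append("[^/]")
        elif char == "{":
            in_braces = True
            parts.append("(?:")
        elif char == "}":
            in_braces = False
            parts.append(")")
        elif char == "," and in_braces:
            # Inside braces, commas separate the alternatives.
            parts.append("|")
        else:
            parts.append(re.escape(char))
        i += 1
    return re.compile("".join(parts))


def glob_matches(glob: str, url: str) -> bool:
    """Return True only if the glob matches the entire URL."""
    return glob_to_regex(glob).fullmatch(url) is not None


# Example calls (results are booleans):
same_directory = glob_matches("https://example.com/*.js", "https://example.com/file.js")
nested_path = glob_matches("https://example.com/*.js", "https://example.com/path/file.js")
with_query = glob_matches("**/*.{png,jpg,jpeg}", "https://example.com/logo.png?v=2")
```

Because the whole URL has to match, the last call fails on the trailing query string; a pattern such as **/*.{png,jpg,jpeg}* or a regular expression would be needed to cover that case.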
WebSockets
Playwright supports inspecting, mocking, and modifying WebSockets out of the box. See our API mocking guide to learn how to mock WebSockets.
Every time a WebSocket is created, the page.on("websocket") event is fired. This event carries the WebSocket instance, which can be used for further WebSocket frame inspection:

def on_web_socket(ws):
    print(f"WebSocket opened: {ws.url}")
    ws.on("framesent", lambda payload: print(payload))
    ws.on("framereceived", lambda payload: print(payload))
    ws.on("close", lambda payload: print("WebSocket closed"))

page.on("websocket", on_web_socket)
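Missing Network Events and Service Workers
Playwright's built-in browser_context.route() and page.route() allow your tests to natively route requests and perform mocking and interception.
- If you're using Playwright's native browser_context.route() and page.route() and network events appear to be missing, disable Service Workers by setting service_workers to 'block'.
- You may be using a mocking tool such as Mock Service Worker (MSW). While this tool works out of the box for mocking responses, it adds its own Service Worker that takes over the network requests, making them invisible to browser_context.route() and page.route(). If you are interested in both network testing and mocking, consider using the built-in browser_context.route() and page.route() for response mocking.
- If you want not only to test and mock the network with Service Workers in place, but also to route and listen for requests made by the Service Workers themselves, see Playwright's experimental Service Worker support.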
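The first item above can be sketched as a small wrapper that always creates contexts with Service Workers blocked. The wrapper name new_context_blocking_service_workers is hypothetical; it assumes a sync Browser object and forwards any other options unchanged.

```python
def new_context_blocking_service_workers(browser, **context_options):
    """Create a browser context in which Service Workers cannot register."""
    # service_workers="block" is the setting described above;
    # all other keyword arguments are passed through untouched.
    return browser.new_context(service_workers="block", **context_options)


# Usage with the sync API, assuming `browser` was launched earlier:
#   context = new_context_blocking_service_workers(browser)
#   page = context.new_page()
#   page.route("**/api/**", lambda route: route.fulfill(status=200, body="mocked"))
```

With Service Workers blocked, requests that a worker would otherwise answer reach the routes registered through browser_context.route() and page.route().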