agent-skill/Scrapling-Skill/SKILL.md (2 additions & 2 deletions)

@@ -1,7 +1,7 @@
---
name: scrapling-official
description: Scrape web pages using Scrapling with anti-bot bypass (like Cloudflare Turnstile), stealth headless browsing, a spiders framework, adaptive scraping, and JavaScript rendering. Use when asked to scrape, crawl, or extract data from websites; when web_fetch fails; when the site has anti-bot protections; or when asked to write Python scraping/crawling code or spiders.
agent-skill/Scrapling-Skill/references/mcp-server.md (34 additions & 13 deletions)

@@ -1,8 +1,8 @@
# Scrapling MCP Server
-The Scrapling MCP server exposes nine web scraping tools over the MCP protocol. It supports CSS-selector-based content narrowing (reducing tokens by extracting only relevant elements before returning results), three levels of scraping capability (plain HTTP, browser-rendered, and stealth/anti-bot bypass), and persistent browser session management.
+The Scrapling MCP server exposes ten tools over the MCP protocol. It supports CSS-selector-based content narrowing (reducing tokens by extracting only relevant elements before returning results), three levels of scraping capability (plain HTTP, browser-rendered, and stealth/anti-bot bypass), persistent browser session management, and page screenshots returned as real image content blocks.
-All scraping tools return a `ResponseModel` with fields: `status` (int), `content` (list of strings), `url` (str).
+All scraping tools return a `ResponseModel` with fields: `status` (int), `content` (list of strings), `url` (str). The `screenshot` tool returns a list of MCP content blocks: an `ImageContent` (the screenshot bytes) followed by a `TextContent` (the post-redirect URL).
## Tools
@@ -99,17 +99,18 @@ Opens a browser session that stays alive across multiple fetch calls, avoiding t
Plus all other browser session parameters (`google_search`, `real_chrome`, `cdp_url`, `locale`, `timezone_id`, `useragent`, `extra_headers`, `cookies`, `disable_resources`, `network_idle`, `wait_selector`, `wait_selector_state`).
@@ -131,6 +132,25 @@ Returns a list of `SessionInfo` objects, each with `session_id`, `session_type`,
No parameters.
+
+### `screenshot` -- Capture a page screenshot
+
+Navigates to a URL inside an existing browser session and returns the screenshot as an MCP `ImageContent` block (the bytes the model can see directly, not a base64 string in JSON) followed by a `TextContent` block carrying the post-redirect URL.
+
+Requires an open browser session. Call `open_session` first, then pass the `session_id` here. Both `dynamic` and `stealthy` sessions are accepted.
| Multiple pages from the same site |`open_session` + `fetch`/`stealthy_fetch` with `session_id`|
+| Need a screenshot of a page |`open_session` + `screenshot` with `session_id`|
Start with `get` (fastest, lowest resource cost). Escalate to `fetch` if content requires JS rendering. Escalate to `stealthy_fetch` only if blocked. For multiple pages from the same site, use a persistent session to avoid browser launch overhead.
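The escalation strategy described above (try `get`, then `fetch`, then `stealthy_fetch`) can be sketched as a small client-side helper. This is a minimal sketch, not part of the diff: `call_tool(name, arguments)` is a hypothetical stand-in for however your MCP client invokes a tool, the assumption that each tool takes a `url` argument and returns the `ResponseModel` fields (`status`, `content`, `url`) as a dict follows the documentation above, and the `css_selector` parameter name is an assumption.

```python
# Hypothetical escalation helper for the Scrapling MCP tools described above.
# `call_tool(name, arguments)` is an assumed stand-in for your MCP client's
# tool-call method; each tool is assumed to return a dict with the
# ResponseModel fields: status (int), content (list of str), url (str).

def scrape_with_escalation(call_tool, url, css_selector=None):
    """Try plain HTTP first, then JS rendering, then stealth/anti-bot bypass."""
    arguments = {"url": url}
    if css_selector is not None:
        # Assumed parameter name for CSS-selector content narrowing.
        arguments["css_selector"] = css_selector
    result = None
    for tool in ("get", "fetch", "stealthy_fetch"):
        result = call_tool(tool, arguments)
        # Stop escalating once we get a successful, non-empty response.
        if result["status"] == 200 and result["content"]:
            return result
    # All three levels tried; return the last attempt for inspection.
    return result
```

For repeated pages from the same site, the same idea applies with `open_session` plus `fetch`/`stealthy_fetch` carrying a `session_id`, so the browser launch cost is paid once.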