Scrapegraph Start Smartscraper
scrapegraph_start_smartscraper
Extract content from a webpage using AI by providing a natural language prompt and a URL.
When to Use
Use this tool when you need to extract content from a webpage using AI by providing a natural language prompt and a URL. This is part of the Scrapegraphai API provider on xpay✦.
MCP Connection
Connect to xpay✦ to access this tool (and 10+ others):
{
  "mcpServers": {
    "xpay": {
      "url": "https://mcp.xpay.sh/mcp?key=YOUR_API_KEY"
    }
  }
}
For Claude Code:
claude mcp add --transport http xpay "https://mcp.xpay.sh/mcp?key=YOUR_API_KEY"
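If you prefer to generate the configuration rather than paste the key by hand, the following is a minimal sketch that builds the same JSON structure shown above. The environment variable name `XPAY_API_KEY` and the helper `build_mcp_config` are assumptions made for illustration; they are not part of xpay✦.

```python
import json
import os
from urllib.parse import quote


def build_mcp_config(api_key: str) -> dict:
    """Return the mcpServers config shown above with the key filled in."""
    # URL-encode the key so special characters cannot break the query string.
    url = f"https://mcp.xpay.sh/mcp?key={quote(api_key, safe='')}"
    return {"mcpServers": {"xpay": {"url": url}}}


# XPAY_API_KEY is a hypothetical variable name; fall back to the placeholder.
api_key = os.environ.get("XPAY_API_KEY", "YOUR_API_KEY")
print(json.dumps(build_mcp_config(api_key), indent=2))
```

Reading the key from the environment keeps it out of files that may end up in version control.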
How to Execute
Use the xpay✦ meta-tools to run this tool:
- xpay_details: Get full input schema: xpay_details("scrapegraph/scrapegraph_start_smartscraper")
- xpay_run: Execute: xpay_run("scrapegraph/scrapegraph_start_smartscraper", { ...inputs })
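The inputs object passed to xpay_run is a plain JSON object keyed by the parameter names listed under Input Parameters. A minimal sketch of assembling one follows; the helper `build_inputs`, the prompt text, and the example URL are hypothetical, and the sketch only prints the object rather than calling xpay✦.

```python
import json

TOOL_SLUG = "scrapegraph/scrapegraph_start_smartscraper"


def build_inputs(user_prompt: str, website_url: str, **optional) -> dict:
    """Assemble the inputs object, dropping optional values left as None."""
    inputs = {"user_prompt": user_prompt, "website_url": website_url}
    inputs.update({k: v for k, v in optional.items() if v is not None})
    return inputs


# Illustrative values only; the URL and prompt are placeholders.
inputs = build_inputs(
    "Extract the product names and prices",
    "https://example.com/products",
    total_pages=2,
    render_heavy_js=True,
    cookies=None,  # omitted from the result because it is None
)
print(TOOL_SLUG)
print(json.dumps(inputs, indent=2))
```

The printed object is what would take the place of `{ ...inputs }` in the xpay_run call.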
Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| headers | object | No | Optional custom HTTP headers to send with the request. Useful for setting User-Agent, cookies, authentication tokens, and other request metadata. Example: {"User-Agent": "Mozilla/5.0...", "Cookie": "session=abc123"} |
| website_markdown | string | No | Raw Markdown content to process directly (max 2MB). Mutually exclusive with website_url and website_html. Perfect for extracting structured data from Markdown documentation, README files, or any content already in Markdown format. |
| steps | string | No | Optional array of interaction steps to perform on the webpage before extraction. Each step is a string describing the action to take (e.g., “click on filter button”, “wait for results to load”). Example: ["click on search button", "type query in search box", "wait for results"] |
| cookies | object | No | Optional cookies object for authentication and session management. Useful for accessing authenticated pages or maintaining session state. Example: {"session_id": "abc123", "auth_token": "xyz789"} |
| user_prompt | string | Yes | Natural language description of what information you want to extract from the webpage. |
| total_pages | number | No | Optional parameter to enable pagination and scrape multiple pages. Specify the number of pages to extract data from. Default: 1. Range: 1-100. |
| render_heavy_js | boolean | No | Optional parameter to enable enhanced JavaScript rendering for heavy JS websites (React, Vue, Angular, SPAs). Use when standard rendering doesn’t capture all content. Default: false. |
| number_of_scrolls | number | No | Optional parameter for infinite scroll pages. Specify how many times to scroll down to load more content before extraction. Default: 0. Range: 0-50. |
| website_html | string | No | Raw HTML content to process directly (max 2MB). Mutually exclusive with website_url and website_markdown. Useful when you already have HTML content cached or want to process modified HTML. |
| website_url | string | Yes | The URL of the webpage you want to extract information from. You must provide exactly one of: website_url, website_html, or website_markdown. |
| stealth | boolean | No | Enable stealth mode to bypass bot protection using advanced anti-detection techniques. Adds +4 credits to the request cost. |
| output_schema | object | No | Optional schema to structure the output. If provided, the AI will attempt to format the results according to this schema. |
| mock | boolean | No | Optional parameter to enable mock mode. When set to true, the request will return mock data instead of performing an actual extraction. Useful for testing and development. Default: false. |
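The constraints in the table can be checked locally before spending a call. The sketch below is one possible pre-flight check, not the provider's own validation. It follows the "exactly one of website_url, website_html, or website_markdown" wording (even though the table also marks website_url as required), treats "2MB" as 2 × 1024 × 1024 bytes of UTF-8, and assumes total_pages and number_of_scrolls are integers. The function name `validate_inputs` is hypothetical.

```python
from typing import List

MAX_RAW_BYTES = 2 * 1024 * 1024  # assumption: "2MB" means 2 MiB of UTF-8
SOURCE_FIELDS = ("website_url", "website_html", "website_markdown")
RANGES = (("total_pages", 1, 100), ("number_of_scrolls", 0, 50))


def validate_inputs(inputs: dict) -> List[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    errors = []

    prompt = inputs.get("user_prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("user_prompt is required")

    # The three content sources are mutually exclusive.
    provided = [f for f in SOURCE_FIELDS if inputs.get(f)]
    if len(provided) != 1:
        errors.append(
            f"exactly one of {', '.join(SOURCE_FIELDS)} is required, "
            f"got {len(provided)}"
        )

    for field in ("website_html", "website_markdown"):
        value = inputs.get(field)
        if isinstance(value, str) and len(value.encode("utf-8")) > MAX_RAW_BYTES:
            errors.append(f"{field} exceeds the 2MB limit")

    for field, low, high in RANGES:
        if field in inputs:
            value = inputs[field]
            # bool is a subclass of int in Python, so reject it explicitly.
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or not low <= value <= high:
                errors.append(f"{field} must be an integer in {low}-{high}")

    return errors
```

Returning a list of messages instead of raising on the first failure lets a caller report every problem in a single pass.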
Pricing
- Cost: $0.08/call
- Balance check: Use xpay_balance to check remaining credits
- Get your API key at xpay.tools ($5 free credits included)
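As a rough budgeting aid, the arithmetic behind these figures can be sketched as follows. It works in whole cents to avoid floating-point rounding and ignores any surcharges such as stealth mode, whose dollar value is not stated here. The function name `calls_affordable` is hypothetical.

```python
COST_PER_CALL_CENTS = 8   # $0.08/call, from the pricing above
FREE_CREDITS_CENTS = 500  # $5 free credits


def calls_affordable(balance_cents: int, cost_cents: int = COST_PER_CALL_CENTS) -> int:
    """Number of whole calls a balance covers, ignoring any surcharges."""
    return balance_cents // cost_cents


print(calls_affordable(FREE_CREDITS_CENTS))  # 500 // 8 = 62
```

Under these assumptions, the free credits cover 62 calls, with 4 cents left over.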
Related Skills
- Scrapegraphai API (all tools) — 11 tools
- Scrapegraph Scrape — $0.06/call
- Scrapegraph Start Searchscraper — $0.24/call
- Scrapegraph Start Smartcrawler — $0.08/call
- Scrapegraph Start Sitemap — $0.02/call
- Scrapegraph Start Markdownify — $0.06/call
Links
- Tool page: https://xpay.tools/scrapegraph/scrapegraph-start-smartscraper/
- Provider: https://xpay.tools/scrapegraph/
- All tools: https://xpay.tools/explore
Pricing
- Cost: $0.08/call
- Model: Flat rate
- Provider: Scrapegraphai API

