Python SDK
Install and use the MrScraper Python SDK (mrscraper-sdk) to call the MrScraper API asynchronously.
The mrscraper-sdk is a typed Python client for the MrScraper web scraping API. All client methods are async and must be used with asyncio or another async runtime.
Requirements
- Python 3.9 or later
- Basic familiarity with async/await syntax in Python
Package Information
See the mrscraper-sdk on PyPI for the latest release, version history, and metadata.
Installation
Install the SDK from PyPI using pip:
pip install mrscraper-sdk

Import the client in your Python code:
from mrscraper import MrScraper

Authentication
Initialize the client with your MrScraper API token. You can get your API token from the MrScraper dashboard.
from mrscraper import MrScraper
client = MrScraper(token="YOUR_MRSCRAPER_API_TOKEN")

Best Practice
Store your API token in environment variables rather than hardcoding it in your source code:
import os
from mrscraper import MrScraper
client = MrScraper(token=os.getenv("MRSCRAPER_API_TOKEN"))

For more information on generating API tokens, see the Generate Token guide.
Core Methods
Fetch Raw HTML
Use fetch_html to load a page with the MrScraper stealth browser and return rendered HTML content.
import asyncio
from mrscraper import MrScraper
async def main():
    client = MrScraper(token="YOUR_MRSCRAPER_API_TOKEN")
    result = await client.fetch_html(
        "https://example.com/product",
        geo_code="US",
        timeout=120,
        block_resources=False,
    )
    print(result["data"])  # raw HTML string

asyncio.run(main())

Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | Target page URL to fetch |
| timeout | integer | No | Request timeout in seconds (default: 60) |
| geo_code | string | No | Geographic/proxy region (e.g., "US", "UK", "SG") |
| block_resources | boolean | No | When True, blocks images, CSS, and fonts to reduce bandwidth |
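Because every client method is async, multiple fetches can run concurrently. The sketch below is a hypothetical helper (not part of the SDK) that caps in-flight requests with a semaphore; `fake_fetch` stands in for `client.fetch_html` so the example runs on its own:

```python
import asyncio

async def fetch_many(fetch, urls, max_concurrency=5):
    # Run one fetch per URL concurrently, but allow at most
    # max_concurrency requests in flight at once.
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with sem:
            return await fetch(url)

    # gather preserves input order in its results.
    return await asyncio.gather(*(bounded(u) for u in urls))

# Stand-in for client.fetch_html so the sketch runs without the SDK.
async def fake_fetch(url):
    await asyncio.sleep(0)
    return {"data": f"<html>{url}</html>"}

results = asyncio.run(
    fetch_many(fake_fetch, ["https://example.com/a", "https://example.com/b"])
)
print(len(results))  # 2
```

With the real client, pass `client.fetch_html` (or a small wrapper that adds `geo_code` and `timeout`) as the `fetch` argument.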
Create AI Scraper
Create and run an AI-powered scraper using natural language instructions. The response includes scraper metadata and an ID for future reruns.
result = await client.create_scraper(
    url="https://example.com/products",
    message="Extract all product names, prices, and ratings",
    agent="listing",  # "general" | "listing" | "map"
    proxy_country="US",
)
scraper_id = result["data"]["data"]["id"]
print("Scraper ID:", scraper_id)

Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | Target URL to scrape |
| message | string | Yes | Natural language instructions for data extraction |
| agent | string | No | Agent type: "general", "listing", or "map" (default: "general") |
| proxy_country | string | No | Country code for proxy (e.g., "US", "UK") |
Agent Types:
| Agent | Best For | Additional Parameters |
|---|---|---|
| "general" | Single pages, product details, articles | None |
| "listing" | Product listings, search results, directories | max_pages |
| "map" | Site crawling, URL discovery, sitemaps | max_depth, max_pages, limit, include_patterns, exclude_patterns |
Note
For detailed information on agents, see AI Scraper Agents.
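The `result["data"]["data"]["id"]` lookup above raises `KeyError` if the response shape ever differs. A small defensive helper (hypothetical, not part of the SDK) can make saving IDs for reruns safer:

```python
def extract_scraper_id(response):
    # Walk the nested create_scraper response defensively; the double
    # "data" nesting mirrors the example above. Returns None instead
    # of raising if the shape differs.
    inner = response.get("data", {}).get("data", {})
    return inner.get("id") if isinstance(inner, dict) else None

sample = {"data": {"data": {"id": "scraper_12345"}}}
print(extract_scraper_id(sample))  # scraper_12345
print(extract_scraper_id({}))     # None
```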
Rerun AI Scraper
Rerun an existing AI scraper on a new URL without creating a new scraper configuration.
result = await client.rerun_scraper(
    scraper_id="scraper_12345",
    url="https://example.com/products?page=2",
)

Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| scraper_id | string | Yes | ID of the existing scraper |
| url | string | Yes | New URL to scrape |
| max_depth | integer | No | For map agents: link depth to follow |
| max_pages | integer | No | For map/listing agents: maximum pages to scrape |
| limit | integer | No | For map agents: maximum results to return |
| include_patterns | list | No | For map agents: regex patterns for URLs to include |
| exclude_patterns | list | No | For map agents: regex patterns for URLs to exclude |
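The include/exclude patterns are applied by the API, so the exact matching semantics are the server's. As a local sanity check before submitting a map run, you can try your regexes against sample URLs; this sketch is hypothetical and assumes search-style (substring) matching, which may differ from the server's behavior:

```python
import re

def matches_patterns(url, include=None, exclude=None):
    # A URL passes if it matches at least one include pattern (when
    # include patterns are given) and no exclude pattern.
    if include and not any(re.search(p, url) for p in include):
        return False
    if exclude and any(re.search(p, url) for p in exclude):
        return False
    return True

print(matches_patterns("https://example.com/products/1", include=[r"/products/"]))  # True
print(matches_patterns("https://example.com/blog/post", include=[r"/products/"]))   # False
```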
Bulk Rerun AI Scraper
Run an existing AI scraper on multiple URLs in a single request.
result = await client.bulk_rerun_ai_scraper(
    scraper_id="scraper_12345",
    urls=[
        "https://example.com/products/item1",
        "https://example.com/products/item2",
        "https://example.com/products/item3",
    ],
)

Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| scraper_id | string | Yes | ID of the existing AI scraper |
| urls | list | Yes | List of URLs to scrape (non-empty) |
Performance Tip
Bulk operations are more efficient than individual rerun calls. Use this method when scraping multiple URLs to reduce API calls and improve performance.
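If you have more URLs than you want to submit in one request, a plain batching helper keeps each bulk call to a manageable size. `chunk` here is a hypothetical utility, not an SDK function, and the batch size of 50 is an arbitrary example:

```python
def chunk(items, size):
    # Split a list into consecutive batches of at most `size` items.
    return [items[i:i + size] for i in range(0, len(items), size)]

urls = [f"https://example.com/products/item{i}" for i in range(1, 8)]
batches = chunk(urls, 3)
print([len(b) for b in batches])  # [3, 3, 1]

# With the real client, submit one bulk call per batch:
# for batch in chunk(urls, 50):
#     await client.bulk_rerun_ai_scraper(scraper_id="scraper_12345", urls=batch)
```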
Rerun Manual Scraper
Rerun an existing manual scraper (created in the MrScraper dashboard) on a new URL.
result = await client.rerun_manual_scraper(
    scraper_id="manual_scraper_67890",
    url="https://example.com/products/new-item",
)

Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| scraper_id | string | Yes | ID of the existing manual scraper |
| url | string | Yes | New URL to scrape |
See Manual Rerun in the API docs for more details.
Bulk Rerun Manual Scraper
Run an existing manual scraper on multiple URLs in a single request.
result = await client.bulk_rerun_manual_scraper(
    scraper_id="manual_scraper_67890",
    urls=[
        "https://www.example.com/products/item1",
        "https://www.example.com/products/item2",
        "https://www.example.com/products/item3",
    ],
)

Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| scraper_id | string | Yes | ID of the manual scraper from the dashboard |
| urls | list | Yes | List of URLs to scrape (non-empty) |
Retrieving Results
Get All Results
Retrieve a paginated list of scraping results with filtering and sorting options.
page = await client.get_all_results(
    sort_field="updatedAt",
    sort_order="DESC",
    page_size=20,
    page=1,
    search="product",
    date_range_column="updatedAt",
    start_at="2026-01-01",
    end_at="2026-01-31",
)
print(page["data"])

Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| sort_field | string | No | Field to sort by (e.g., "updatedAt", "createdAt") |
| sort_order | string | No | Sort direction: "ASC" or "DESC" |
| page_size | integer | No | Number of results per page (default: 20) |
| page | integer | No | Page number to retrieve (default: 1) |
| search | string | No | Keyword to filter results (optional) |
| date_range_column | string | No | Date field to filter by (e.g., "updatedAt") |
| start_at | string | No | Start date for filtering (ISO 8601 format) |
| end_at | string | No | End date for filtering (ISO 8601 format) |
Note
- The `search` parameter is optional. Omit it if you don't need keyword filtering and only want date-based or paginated results.
- Related REST documentation: Get All Results in Range.
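To walk every page without tracking offsets by hand, the paging pattern above can be wrapped in an async generator. This is a sketch, not an SDK feature: it assumes `resp["data"]` holds the page's items and that a short page marks the end, and `fake_get_page` stands in for `client.get_all_results` so the example runs on its own:

```python
import asyncio

async def iter_all_results(get_page, page_size=20):
    # Request pages until one comes back shorter than page_size.
    page = 1
    while True:
        resp = await get_page(page=page, page_size=page_size)
        items = resp.get("data") or []
        for item in items:
            yield item
        if len(items) < page_size:
            return
        page += 1

# Stand-in for client.get_all_results: 5 items served page by page.
async def fake_get_page(page, page_size):
    data = list(range(5))
    start = (page - 1) * page_size
    return {"data": data[start:start + page_size]}

async def main():
    return [item async for item in iter_all_results(fake_get_page, page_size=2)]

print(asyncio.run(main()))  # [0, 1, 2, 3, 4]
```

With the real client, pass a wrapper such as `lambda page, page_size: client.get_all_results(page=page, page_size=page_size, sort_field="updatedAt")`.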
Get Result by ID
Retrieve a single scraping result using its unique ID.
result = await client.get_result_by_id("result_12345")
print(result["data"])

Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
result_id | string | Yes | Unique identifier of the result |
Note
Related REST documentation: Result Detail
Error Handling
The SDK raises typed exceptions from mrscraper.exceptions for different error scenarios.
Exception Types
| Exception | When It's Raised |
|---|---|
| MrScraperError | Base class for all SDK errors |
| AuthenticationError | Invalid or missing API token (HTTP 401) |
| APIError | Non-success API response (exposes .status_code) |
| NetworkError | Network timeouts or connection failures |
Example Error Handling
from mrscraper.exceptions import AuthenticationError, APIError, NetworkError
try:
    result = await client.fetch_html("https://example.com")
except AuthenticationError:
    print("Authentication failed. Check your API token at https://app.mrscraper.com")
except APIError as e:
    print(f"API error {e.status_code}: {e}")
except NetworkError as e:
    print(f"Network problem: {e}")

Best Practices
- Always wrap SDK calls in try-except blocks
- Log errors with appropriate context for debugging
- Implement retry logic for `NetworkError` exceptions
- Verify your API token if you encounter `AuthenticationError`
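Retry logic for transient failures can be factored into a small helper. `with_retries` below is a hypothetical utility, not part of the SDK; with the real client you would pass `retryable=(NetworkError,)` and something like `op=lambda: client.fetch_html(url)`. The stand-in `flaky` operation lets the sketch run on its own:

```python
import asyncio

async def with_retries(op, attempts=3, base_delay=0.1, retryable=(Exception,)):
    # Await op(); on a retryable exception, back off exponentially
    # (base_delay, 2*base_delay, ...) and try again, re-raising after
    # the final attempt.
    for attempt in range(attempts):
        try:
            return await op()
        except retryable:
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(base_delay * (2 ** attempt))

# Stand-in operation that fails twice, then succeeds.
calls = {"n": 0}

async def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

print(asyncio.run(with_retries(flaky, retryable=(ConnectionError,))))  # ok
```

Keeping `retryable` narrow matters: retrying on `AuthenticationError` would just repeat a failure that needs a fixed token, not more attempts.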
Migration from Firecrawl
If you're migrating from the Firecrawl Python SDK, use this mapping to find equivalent MrScraper methods.
| Firecrawl Method | MrScraper Equivalent |
|---|---|
| Scrape a URL with HTML output | fetch_html |
| Scrape a URL with structured JSON | create_scraper with agent="general" or rerun_scraper |
| Crawl a website or map URLs | create_scraper with agent="map" or rerun_scraper with map options |
| Batch scrape | bulk_rerun_ai_scraper or bulk_rerun_manual_scraper |
| Scrape listing/pagination pages | create_scraper with agent="listing" and max_pages parameter |
Key Differences
- Agent-Based Approach: MrScraper uses specialized agents (`general`, `listing`, `map`) instead of format-based parameters.
- Natural Language Instructions: Instead of defining extraction schemas, use the `message` parameter to describe what data you want in plain English.
- Pagination Handling: Use `agent="listing"` with `max_pages` to automatically handle paginated content.