Python SDK

Install and use the MrScraper Python SDK (mrscraper-sdk) to call the MrScraper API asynchronously.

The mrscraper-sdk is a typed Python client for the MrScraper web scraping API. All client methods are async and must be used with asyncio or another async runtime.

Requirements

  • Python 3.9 or later
  • Basic familiarity with async/await syntax in Python

Package Information

See the mrscraper-sdk on PyPI for the latest release, version history, and metadata.

Installation

Install the SDK from PyPI using pip:

pip install mrscraper-sdk

Import the client in your Python code:

from mrscraper import MrScraper

Authentication

Initialize the client with your MrScraper API token. You can get your API token from the MrScraper dashboard.

from mrscraper import MrScraper

client = MrScraper(token="YOUR_MRSCRAPER_API_TOKEN")

Best Practice

Store your API token in environment variables rather than hardcoding it in your source code:

import os
from mrscraper import MrScraper

client = MrScraper(token=os.getenv("MRSCRAPER_API_TOKEN"))

For more information on generating API tokens, see the Generate Token guide.
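Building on the environment-variable snippet above, a small helper can fail fast when the token is missing instead of sending unauthenticated requests later. This is a sketch; the helper name is illustrative, not part of the SDK:

```python
import os

def load_token(env_var: str = "MRSCRAPER_API_TOKEN") -> str:
    """Read the API token from the environment, failing fast if absent."""
    token = os.getenv(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable")
    return token

# client = MrScraper(token=load_token())
```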

Core Methods

Fetch Raw HTML

Use fetch_html to load a page with the MrScraper stealth browser and return rendered HTML content.

import asyncio
from mrscraper import MrScraper

async def main():
    client = MrScraper(token="YOUR_MRSCRAPER_API_TOKEN")

    result = await client.fetch_html(
        "https://example.com/product",
        geo_code="US",
        timeout=120,
        block_resources=False,
    )
    print(result["data"])  # raw HTML string

asyncio.run(main())

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | Target page URL to fetch |
| timeout | integer | No | Request timeout in seconds (default: 60) |
| geo_code | string | No | Geographic/proxy region (e.g., "US", "UK", "SG") |
| block_resources | boolean | No | When True, blocks images, CSS, and fonts to reduce bandwidth |

Create AI Scraper

Create and run an AI-powered scraper using natural language instructions. The response includes scraper metadata and an ID for future reruns.

result = await client.create_scraper(
    url="https://example.com/products",
    message="Extract all product names, prices, and ratings",
    agent="listing",  # "general" | "listing" | "map"
    proxy_country="US",
)
scraper_id = result["data"]["data"]["id"]
print("Scraper ID:", scraper_id)

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | Target URL to scrape |
| message | string | Yes | Natural language instructions for data extraction |
| agent | string | No | Agent type: "general", "listing", or "map" (default: "general") |
| proxy_country | string | No | Country code for proxy (e.g., "US", "UK") |

Agent Types:

| Agent | Best For | Additional Parameters |
| --- | --- | --- |
| "general" | Single pages, product details, articles | None |
| "listing" | Product listings, search results, directories | max_pages |
| "map" | Site crawling, URL discovery, sitemaps | max_depth, max_pages, limit, include_patterns, exclude_patterns |

Note

For detailed information on agents, see AI Scraper Agents.
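As a sketch of how the map agent's extra parameters fit together, the keyword names below mirror the tables above; the specific values and URL are illustrative assumptions, not verified API output:

```python
# Parameters for a hypothetical map-agent run. The keys mirror the
# parameter tables on this page; values are illustrative only.
map_params = {
    "url": "https://example.com",
    "message": "Discover all product detail page URLs",
    "agent": "map",
    "max_depth": 2,                       # follow links two levels deep
    "max_pages": 50,                      # stop after 50 pages
    "include_patterns": [r"/products/"],  # only crawl product URLs
    "exclude_patterns": [r"/reviews/"],   # skip review pages
}

# Inside an async function:
# result = await client.create_scraper(**map_params)
```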

Rerun AI Scraper

Rerun an existing AI scraper on a new URL without creating a new scraper configuration.

result = await client.rerun_scraper(
    scraper_id="scraper_12345",
    url="https://example.com/products?page=2",
)

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| scraper_id | string | Yes | ID of the existing scraper |
| url | string | Yes | New URL to scrape |
| max_depth | integer | No | For map agents: link depth to follow |
| max_pages | integer | No | For map/listing agents: maximum pages to scrape |
| limit | integer | No | For map agents: maximum results to return |
| include_patterns | list | No | For map agents: regex patterns for URLs to include |
| exclude_patterns | list | No | For map agents: regex patterns for URLs to exclude |

Bulk Rerun AI Scraper

Run an existing AI scraper on multiple URLs in a single request.

result = await client.bulk_rerun_ai_scraper(
    scraper_id="scraper_12345",
    urls=[
        "https://example.com/products/item1",
        "https://example.com/products/item2",
        "https://example.com/products/item3",
    ],
)

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| scraper_id | string | Yes | ID of the existing AI scraper |
| urls | list | Yes | List of URLs to scrape (non-empty) |

Performance Tip

Bulk operations are more efficient than individual rerun calls. Use this method when scraping multiple URLs to reduce API calls and improve performance.
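If you have many URLs, splitting them into fixed-size batches keeps each bulk request a manageable size. The helper below is a generic sketch (any per-request cap is an assumption, not a documented limit):

```python
def chunked(urls: list, size: int) -> list:
    """Split a list of URLs into batches of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

# Inside an async function:
# for batch in chunked(all_urls, 50):
#     await client.bulk_rerun_ai_scraper(scraper_id=scraper_id, urls=batch)
```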

Rerun Manual Scraper

Rerun an existing manual scraper (created in the MrScraper dashboard) on a new URL.

result = await client.rerun_manual_scraper(
    scraper_id="manual_scraper_67890",
    url="https://example.com/products/new-item",
)

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| scraper_id | string | Yes | ID of the existing manual scraper |
| url | string | Yes | New URL to scrape |

See Manual Rerun in the API docs for more details.

Bulk Rerun Manual Scraper

Run an existing manual scraper on multiple URLs in a single request.

result = await client.bulk_rerun_manual_scraper(
    scraper_id="manual_scraper_67890",
    urls=[
        "https://www.example.com/products/item1",
        "https://www.example.com/products/item2",
        "https://www.example.com/products/item3",
    ],
)

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| scraper_id | string | Yes | ID of the manual scraper from the dashboard |
| urls | list | Yes | List of URLs to scrape (non-empty) |

Retrieving Results

Get All Results

Retrieve a paginated list of scraping results with filtering and sorting options.

page = await client.get_all_results(
    sort_field="updatedAt",
    sort_order="DESC",
    page_size=20,
    page=1,
    search="product",
    date_range_column="updatedAt",
    start_at="2026-01-01",
    end_at="2026-01-31",
)
print(page["data"])

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| sort_field | string | No | Field to sort by (e.g., "updatedAt", "createdAt") |
| sort_order | string | No | Sort direction: "ASC" or "DESC" |
| page_size | integer | No | Number of results per page (default: 20) |
| page | integer | No | Page number to retrieve (default: 1) |
| search | string | No | Keyword to filter results (optional) |
| date_range_column | string | No | Date field to filter by (e.g., "updatedAt") |
| start_at | string | No | Start date for filtering (ISO 8601 format) |
| end_at | string | No | End date for filtering (ISO 8601 format) |

Note

  • The search parameter is optional. Omit it if you don't need keyword filtering and only want date-based or paginated results.
  • Related REST documentation: Get All Results in Range.
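To walk every page of results, you can loop until a page comes back shorter than page_size. The helper below is a generic sketch: it takes any coroutine function that returns a list of rows per page (for example, a thin wrapper around get_all_results that unpacks its "data" field, which is an assumption about the response shape):

```python
async def fetch_all_pages(get_page, page_size: int = 20) -> list:
    """Collect rows across pages until a page comes back short.

    `get_page` is any coroutine function accepting `page` and `page_size`
    keyword arguments and returning a list of result rows.
    """
    results, page = [], 1
    while True:
        batch = await get_page(page=page, page_size=page_size)
        results.extend(batch)
        if len(batch) < page_size:  # last (possibly empty) page reached
            return results
        page += 1
```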

Get Result by ID

Retrieve a single scraping result using its unique ID.

result = await client.get_result_by_id("result_12345")
print(result["data"])

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| result_id | string | Yes | Unique identifier of the result |

Note

Related REST documentation: Result Detail

Error Handling

The SDK raises typed exceptions from mrscraper.exceptions for different error scenarios.

Exception Types

| Exception | When It's Raised |
| --- | --- |
| MrScraperError | Base class for all SDK errors |
| AuthenticationError | Invalid or missing API token (HTTP 401) |
| APIError | Non-success API response (exposes .status_code) |
| NetworkError | Network timeouts or connection failures |

Example Error Handling

from mrscraper.exceptions import AuthenticationError, APIError, NetworkError

# Inside an async function, with `client` already initialized:
try:
    result = await client.fetch_html("https://example.com")
except AuthenticationError:
    print("Authentication failed. Check your API token at https://app.mrscraper.com")
except APIError as e:
    print(f"API error {e.status_code}: {e}")
except NetworkError as e:
    print(f"Network problem: {e}")

Best Practices

  • Always wrap SDK calls in try-except blocks
  • Log errors with appropriate context for debugging
  • Implement retry logic for NetworkError exceptions
  • Verify your API token if you encounter AuthenticationError
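Retry logic for transient failures can be factored into a small backoff helper. This is a generic sketch, not an SDK feature; in real use you would catch the SDK's NetworkError rather than bare Exception:

```python
import asyncio

async def with_retries(make_call, attempts: int = 3, base_delay: float = 0.5):
    """Retry an async call with exponential backoff.

    `make_call` is a zero-argument coroutine function. In practice,
    catch mrscraper.exceptions.NetworkError instead of bare Exception.
    """
    for attempt in range(attempts):
        try:
            return await make_call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error
            await asyncio.sleep(base_delay * 2 ** attempt)

# Inside an async function:
# html = await with_retries(lambda: client.fetch_html("https://example.com"))
```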

Migration from Firecrawl

If you're migrating from the Firecrawl Python SDK, use this mapping to find equivalent MrScraper methods.

| Firecrawl Method | MrScraper Equivalent |
| --- | --- |
| Scrape a URL with HTML output | fetch_html |
| Scrape a URL with structured JSON | create_scraper with agent="general" or rerun_scraper |
| Crawl a website or map URLs | create_scraper with agent="map" or rerun_scraper with map options |
| Batch scrape | bulk_rerun_ai_scraper or bulk_rerun_manual_scraper |
| Scrape listing/pagination pages | create_scraper with agent="listing" and max_pages |

Key Differences

  • Agent-Based Approach: MrScraper uses specialized agents (general, listing, map) instead of format-based parameters.

  • Natural Language Instructions: Instead of defining extraction schemas, use the message parameter to describe what data you want in plain English.

  • Pagination Handling: Use agent="listing" with max_pages to automatically handle paginated content.
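As a concrete sketch of the migration, a Firecrawl-style paginated crawl maps onto a single listing-agent call; the URL and values below are illustrative assumptions, not verified API output:

```python
# Where a Firecrawl crawl defined an extraction schema, the listing agent
# takes plain-English instructions plus max_pages. Values are examples.
listing_params = {
    "url": "https://example.com/search?q=laptops",
    "message": "Extract each product's name, price, and rating",
    "agent": "listing",
    "max_pages": 5,  # follow pagination automatically
}

# Inside an async function:
# result = await client.create_scraper(**listing_params)
```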
