General Agent
Use the AI Scraper General Agent to extract structured data from single web pages using natural language prompts.
The General Agent is designed to extract data from a single web page, such as a product detail page, profile page, article page, or property detail page, where you already have the final URL. Use it when you need specific information from one page, without navigation or crawling across listings.
Note
Avoid using this for listing or catalog pages containing multiple products. For those pages, use the Listing Agent.
General Agent Usage
| Category | Scenarios | Example URLs |
|---|---|---|
| Use General Agent | - Extract structured data from a single page (product, article, profile, event, job posting)<br>- Scrape a product or listing detail page without visiting other pages<br>- You already have the final target URL<br>- No navigation, pagination, clicking, or scrolling required<br>- Page can be processed in one static or dynamically rendered view | https://www.walmart.com/ip/15377670482<br>https://www.zillow.com/homedetails/101-Frederica-St-UNIT-301-Owensboro-KY-42301/455517252_zpid/<br>https://www.bbc.co.uk/news/articles/ckgmy90z991o |
| Do NOT Use General Agent | - Listing pages with multiple products/items (use Listing Agent)<br>- Search results pages<br>- Category pages with multiple products<br>- Directory pages with many profiles/businesses<br>- Multi-page extraction requiring pagination | https://www.amazon.com/s?k=laptops<br>https://www.walmart.com/browse/electronics<br>https://www.zillow.com/homes/fo |
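If you are triaging many URLs before choosing an agent, a rough pre-check can flag the ones that look like listing or search pages. The sketch below is only a heuristic: `looks_like_listing` and its hint sets are hypothetical, are inferred solely from the example URLs in the table above, and are not part of MrScraper. Extend the hints for the sites you work with.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical hints, inferred only from the example URLs in the table above.
LISTING_FIRST_SEGMENTS = {"s", "browse", "homes"}
SEARCH_QUERY_KEYS = {"k"}


def looks_like_listing(url: str) -> bool:
    """Return True if the URL resembles a search, category, or listing page."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.lower().split("/") if s]
    # Check the first path segment against the known listing-style prefixes.
    if segments and segments[0] in LISTING_FIRST_SEGMENTS:
        return True
    # A search-style query parameter is another sign of a results page.
    return bool(SEARCH_QUERY_KEYS & set(parse_qs(parsed.query)))


for url in (
    "https://www.walmart.com/ip/15377670482",
    "https://www.amazon.com/s?k=laptops",
):
    agent = "Listing Agent" if looks_like_listing(url) else "General Agent"
    print(f"{url} -> {agent}")
```

A URL that passes this check can still be a listing page, so treat the result as a hint and confirm by opening the page.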
Limitations
The General Agent has the following limitations:
- No Browser Automation: Cannot click, scroll, or interact with dynamic elements
- No Pagination: Cannot navigate to next pages or load more content automatically
- Single Page Only: Cannot open or follow links to detail pages from listing URLs
- No Form Submission: Cannot fill out forms, log in, or submit data
- Static Content Focus: Works best with content that's immediately visible on page load
- No Multi-Step Workflows: Cannot perform sequences like "click product → extract details → go back"
Note
If you need to scrape multiple items from listing pages or navigate through pagination, use the Listing Agent or Map Agent instead.
Example Usage
Follow these steps to use the General Agent from your dashboard:
- Log in to MrScraper, then click Scraper in the left sidebar
- Click New AI Scraper + at the top to create a new scraper
- Select the General Scraper Agent and enter the URL to scrape
- Choose either the Cheap or the Super agent type. The General Agent uses the Super type by default for optimal accuracy.
- Wait for the AI to process the provided URL
- Enter your prompt describing the data you want to extract
- The AI will analyze your prompt and extract the requested data
- Once complete, review your results or export them as JSON or CSV
Example: Scraping E-Commerce Product Details
Example URL:
https://www.walmart.com/ip/15377670482
Initial Extracted Data:
{
"data": {
"id": "1",
"name": "Restored Dell Latitude 3190 | 11.6\" Touchscreen Laptop PC | Intel Core Pentium Silver N5030 (1.1 GHz) | 8GB RAM | 128GB SSD | Windows 11 Pro (Refurbished)",
"price": "158.00",
"rating": "4.6",
"seller": {
"name": "Discount Computer Depot",
"rating": "3.8",
"reviews_count": "9721"
},
"source": "product",
"reviews": "18",
"features": {
"Display Features": "11.6-inch touchscreen display",
"Memory & Storage": "8GB RAM, 128GB SSD",
"Operating System": "Windows 11 Pro",
"Processor Details": "Intel Core Pentium Silver N5030, 1.1 GHz",
"Graphics Capability": "Intel UHD Graphics 605",
"Connectivity Options": "Display Port",
"Integrated Peripherals": "Built-in webcam"
},
"shipping": {
"method": "Shipping",
"arrival": "Dec 16",
"availability": "Free"
},
"return_policy": "Free 90-day returns"
}
}
Refining with a Follow-up Prompt:
Remove the seller and shipping fields from the extracted JSON.
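The follow-up prompt asks the agent to do the trimming. If you have already exported a result as JSON, the same trimming can also be done locally. The following is a minimal sketch, assuming the exported result has the `{"data": {...}}` shape shown in this example; `drop_fields` is a hypothetical helper, not a MrScraper function, and the sample is abbreviated from the output above.

```python
import json


def drop_fields(result: dict, fields: set[str]) -> dict:
    """Return a copy of result with the named keys removed from result["data"]."""
    data = {key: value for key, value in result["data"].items() if key not in fields}
    return {**result, "data": data}


# Abbreviated copy of the initial extracted data from this example.
sample = {
    "data": {
        "id": "1",
        "price": "158.00",
        "rating": "4.6",
        "seller": {"name": "Discount Computer Depot", "rating": "3.8"},
        "shipping": {"method": "Shipping", "arrival": "Dec 16"},
        "return_policy": "Free 90-day returns",
    }
}

refined = drop_fields(sample, {"seller", "shipping"})
print(json.dumps(refined, indent=2))
```

The helper builds a new dictionary rather than editing the original, so the full extraction stays available if you need the removed fields later. The agent's own response to the follow-up prompt follows.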
Refined Output:
{
"data": {
"id": "1",
"name": "Restored Dell Latitude 3190 | 11.6\" Touchscreen Laptop PC | Intel Core Pentium Silver N5030 (1.1 GHz) | 8GB RAM | 128GB SSD | Windows 11 Pro (Refurbished)",
"price": "158.00",
"rating": "4.6",
"source": "product",
"reviews": "18",
"features": {
"Display Features": "11.6-inch touchscreen display",
"Memory & Storage": "8GB RAM, 128GB SSD",
"Operating System": "Windows 11 Pro",
"Processor Details": "Intel Core Pentium Silver N5030, 1.1 GHz",
"Graphics Capability": "Intel UHD Graphics 605",
"Connectivity Options": "Display Port",
"Integrated Peripherals": "Built-in webcam"
},
"return_policy": "Free 90-day returns"
}
}
Tips and Best Practices
- Be Specific with Your Prompts: Clear, detailed prompts yield better results. Instead of "Get product info," try "Extract product name, price, rating, and available colors"
- Validate Before Scaling: Always review a sample extraction before automating large-scale scraping jobs
- Use Cheap Mode First: Start with Cheap mode to test whether extraction works. If it fails, switch to Super mode.
- Refine Iteratively: If the initial results are incomplete, adjust your prompt and re-run, or build on the previous extraction with a follow-up prompt
- Use Structured Prompts: Frame your requests clearly, e.g., "Extract: product title, price, SKU, availability status, and customer ratings"
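Reviewing a sample extraction can be partly automated. The sketch below checks that the fields you asked for are present and non-empty, assuming the result uses the `{"data": {...}}` shape from the example above; `missing_fields` and the `REQUIRED` tuple are hypothetical and should be adapted to your own prompt.

```python
# Hypothetical list of fields that the prompt asked for.
REQUIRED = ("name", "price", "rating")


def missing_fields(result: dict, required=REQUIRED) -> list[str]:
    """Return the required fields that are absent or empty in result["data"]."""
    data = result.get("data") or {}
    # Treat None, empty strings, and empty containers as missing.
    return [field for field in required if data.get(field) in (None, "", [], {})]


sample = {"data": {"name": "Restored Dell Latitude 3190", "price": "158.00", "rating": ""}}
problems = missing_fields(sample)
if problems:
    print("Incomplete extraction, missing:", ", ".join(problems))
```

Running a check like this on a handful of pages before a large job shows whether the prompt needs refining.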
Tip
For the best results, describe both what data you want and how you want it structured in your prompt.
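If you reuse the same prompt pattern across many pages, it can help to assemble the prompt from a field list and a structure hint. This is a minimal sketch; `build_prompt` is a hypothetical helper and the wording of the structure hint is only an illustration, not a format required by the General Agent.

```python
def build_prompt(fields: list[str], structure: str = "") -> str:
    """Compose a prompt that names the fields and, optionally, the output structure."""
    prompt = "Extract: " + ", ".join(fields)
    if structure:
        prompt += ". " + structure
    return prompt


prompt = build_prompt(
    ["product title", "price", "SKU", "availability status", "customer ratings"],
    "Return price as a number and customer ratings as an object with average and count.",
)
print(prompt)
```

Keeping the field list in code makes it easy to compare the prompt against the keys that come back in the result.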