Three Agents Flow

Learn how to use MrScraper's Multi-Agent Flow to extract comprehensive data from a single seed URL using the Map, Listing, and General Agents.

In this workflow, you’ll move through Map Agent → Listing Agent → General Agent. This is the most complete approach for extracting data from an entire website when all you have is the main domain URL.

When to Use This Workflow

Use this workflow when you:

  • Only have the seed URL (e.g., https://example.com) and want to discover all available products
  • Need to scrape an entire e-commerce site without manually finding category pages
  • Want comprehensive data coverage across all sections of a website
  • Don't know the site structure or specific listing page URLs

How It Works

Discover URLs with Map Agent

Input the seed URL to discover all pages on the website.

Filter Listing URLs

Identify and filter URLs that contain product listings or categories.

Extract Listings with Listing Agent

Scrape all listing pages to collect product URLs and basic information.

Extract Details with General Agent

Use General Agent on each product URL to get comprehensive data.

Step-by-Step Process

Step 1: Discover All URLs with Map Agent

Start by using the Map Agent to discover every URL on the website:

// Input
{
  "url": "https://books.toscrape.com",
  "agent": "map"
}

// Output (sample)
{
  "urls": [
    "https://books.toscrape.com",
    "https://books.toscrape.com/catalogue/category/books_1/index.html",
    "https://books.toscrape.com/catalogue/category/books/travel_2/index.html",
    "https://books.toscrape.com/catalogue/category/books/mystery_3/index.html",
    "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
    "https://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html"
  ],
  "count": 1053
}

Step 2: Filter Listing URLs

Filter the discovered URLs to identify only listing/category pages. You can do this by:

  • URL pattern matching: Look for patterns like /category/, /browse/, /search/
  • URL structure analysis: Identify URLs that typically contain multiple products
  • Manual filtering: Review and select relevant category pages

// Example filtering logic
// allUrls is the "urls" array returned by the Map Agent in Step 1
const listingUrls = allUrls.filter(url => {
  // '/category/' also matches '/catalogue/category/' paths
  return url.includes('/category/') || /\/page-\d+/.test(url);
});

// Filtered result (sample)
[
  "https://books.toscrape.com/catalogue/category/books/travel_2/index.html",
  "https://books.toscrape.com/catalogue/category/books/mystery_3/index.html",
  "https://books.toscrape.com/catalogue/category/books/fiction_10/index.html"
]
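
The "URL structure analysis" option can also be sketched by parsing each URL and inspecting its path segments instead of matching raw substrings. The example below is a minimal sketch that assumes the path layout seen in the sample Map Agent output (listing pages under /catalogue/category/, detail pages directly under /catalogue/). The classifyUrl helper is hypothetical and not part of MrScraper; other sites will need different rules.

```javascript
// Hypothetical helper: classify a URL by its path segments.
// Assumes the books.toscrape.com layout shown in the Map Agent sample output.
function classifyUrl(rawUrl) {
  let segments;
  try {
    segments = new URL(rawUrl).pathname.split('/').filter(Boolean);
  } catch {
    return 'other'; // not a valid absolute URL
  }
  if (segments[0] !== 'catalogue') return 'other';
  if (segments[1] === 'category') return 'listing';
  // Detail pages look like /catalogue/<slug>_<id>/index.html
  if (segments.length === 3 && /_\d+$/.test(segments[1])) return 'detail';
  return 'other';
}

const sampleUrls = [
  'https://books.toscrape.com',
  'https://books.toscrape.com/catalogue/category/books/travel_2/index.html',
  'https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html',
];

// Keep only the URLs classified as listing pages
const listingOnly = sampleUrls.filter(url => classifyUrl(url) === 'listing');
console.log(listingOnly);
```

Classifying by path segments keeps the rules in one place, so the same helper can later be used to pick out detail pages directly when the Map Agent has already discovered them.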

Step 3: Extract Listings with Listing Agent

Use the Listing Agent on each filtered URL to get all product listings and their detail page URLs:

// Input
{
  "url": "https://books.toscrape.com/catalogue/category/books/travel_2/index.html",
  "agent": "listing",
  "prompt": "Extract all book titles, prices, ratings, availability, and detail page URLs"
}

// Output (sample)
{
  "response": [
    {
      "page_num": 0,
      "data": {
        "mode": "direct",
        "data": [
          {
            "id": "1",
            "title": "It's Only the Himalayas",
            "price": "£45.17",
            "rating": "2",
            "availability": "In stock",
            "url": "https://books.toscrape.com/catalogue/its-only-the-himalayas_981/index.html"
          },
          {
            "id": "2",
            "title": "Full Moon over Noah's Ark",
            "price": "£49.43",
            "rating": "4",
            "availability": "In stock",
            "url": "https://books.toscrape.com/catalogue/full-moon-over-noahs-ark_811/index.html"
          }
        ]
      }
    }
  ]
}
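
Before moving to Step 4, the detail page URLs have to be pulled out of the nested Listing Agent output. The following is a minimal sketch that assumes the response shape shown in the sample above (response[].data.data[].url). The collectDetailUrls helper is a hypothetical name used for illustration only.

```javascript
// Collect unique detail-page URLs from a Listing Agent result.
// Assumes the nested shape in the sample output: response[].data.data[].url
function collectDetailUrls(listingOutput) {
  const urls = new Set();
  for (const page of listingOutput?.response ?? []) {
    for (const item of page?.data?.data ?? []) {
      // Skip rows where the agent returned no usable URL
      if (typeof item?.url === 'string' && item.url.length > 0) {
        urls.add(item.url);
      }
    }
  }
  return [...urls];
}

// Trimmed copy of the sample output above
const listingOutput = {
  response: [
    {
      page_num: 0,
      data: {
        mode: 'direct',
        data: [
          { id: '1', url: 'https://books.toscrape.com/catalogue/its-only-the-himalayas_981/index.html' },
          { id: '2', url: 'https://books.toscrape.com/catalogue/full-moon-over-noahs-ark_811/index.html' },
        ],
      },
    },
  ],
};

const detailUrls = collectDetailUrls(listingOutput);
console.log(detailUrls.length); // 2
```

Using a Set removes duplicates, which helps when the same product appears on more than one listing page.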

Step 4: Extract Detailed Data with General Agent

Loop through each detail page URL and use the General Agent to extract comprehensive product information:

// Input (for each URL from Listing Agent)
{
  "url": "https://books.toscrape.com/catalogue/its-only-the-himalayas_981/index.html",
  "agent": "general",
  "prompt": "Extract book title, price, rating, availability, product description, UPC, number of reviews, and category"
}

// Output (sample)
{
  "data": {
    "id": "1",
    "title": "It's Only the Himalayas",
    "price": "£45.17",
    "rating": "2",
    "availability": "In stock (19 available)",
    "description": "Wherever you go, whatever you do, just don't do anything stupid. ' (Tess' Nan)Tess, an unlucky-in-love city girl, has...",
    "upc": "a22124811bfa8350",
    "reviews_count": "0",
    "category": "Travel",
    "product_type": "Books",
    "tax": "£0.00"
  }
}
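
The loop over detail page URLs can be sketched as follows. This is a minimal sketch, not MrScraper's own client: runAgent is a hypothetical async function supplied by the caller that accepts a request object like the Input above and resolves to the agent's output. Requests run one at a time, and failures are recorded rather than thrown so that one bad page does not stop the run.

```javascript
// Run the General Agent over each detail URL, one at a time.
// `runAgent` is a hypothetical caller-supplied async function (an assumption,
// not a MrScraper API): it takes { url, agent, prompt } and resolves to the output.
async function scrapeDetails(detailUrls, runAgent, prompt) {
  const results = [];
  const failures = [];
  for (const url of detailUrls) {
    try {
      const output = await runAgent({ url, agent: 'general', prompt });
      // Keep the source URL next to the extracted fields
      results.push({ url, ...output.data });
    } catch (err) {
      // Record the failure and continue with the next URL
      failures.push({ url, error: String(err?.message ?? err) });
    }
  }
  return { results, failures };
}

// Stub standing in for a real agent call, for illustration only
const fakeRunAgent = async ({ url }) => {
  if (url.includes('missing')) throw new Error('page not found');
  return { data: { title: "It's Only the Himalayas", category: 'Travel' } };
};

scrapeDetails(
  [
    'https://books.toscrape.com/catalogue/its-only-the-himalayas_981/index.html',
    'https://books.toscrape.com/catalogue/missing_0/index.html',
  ],
  fakeRunAgent,
  'Extract book title, price, rating, availability, and category'
).then(({ results, failures }) => {
  console.log(results.length, failures.length); // 1 1
});
```

Processing sequentially is the simplest choice and keeps the load on the target site low. For large sites, the failures list can be fed back into the same function for a retry pass.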