MrScraper Documentation

Step-by-step instructions for scraping content from Wetu with the manual workflow.

This guide explains how to scrape Wetu using the Manual Workflow. Because Wetu pages rely on dynamic scripts and complex HTML, the AI scraper may miss key content. The Manual Workflow lets you inject custom JavaScript so you can control how the page loads and extract exactly what you need.

Scenario

In this scenario, you need to extract all internal and external links from a Wetu site. Because the website’s structure is complex, the AI scraper cannot process it correctly. To handle this, you will use the Manual Workflow to run a custom JavaScript extraction instead.

Step 1: Create a Manual Workflow with Custom JavaScript

Navigate to your MrScraper dashboard.
Click Scraper in the left sidebar, then click New Manual Scraper + at the top to create a new manual scraper.
Enter https://content.wetu.com/Africa in the URL field, then press Submit.
Add an Inject JavaScript step.
Set the name field to wetu_content_central and the script timeout field to 300, then paste this code into the code field:

Custom JavaScript to scrape Wetu content

(async () => {
    const tableRows = Array.from(document.querySelectorAll('#brochures > tbody > tr'));

    const slugify = s => s
        .toString()
        .normalize('NFKD') // decompose accented characters into base letter + combining mark
        .replace(/[\u0300-\u036f]/g, '') // strip the combining marks (removes accents)
        .toLowerCase()
        .trim()
        .replace(/[^a-z0-9\s-]/g, '') // remove invalid chars
        .replace(/\s+/g, '_') // spaces to underscores
        .replace(/-+/g, '_') // hyphens to underscores
        .replace(/_+/g, '_') // collapse multiple underscores
        .replace(/^_+|_+$/g, ''); // trim underscores

    const makeSubUrls = (id, slug) => {
        if (!id || !slug) return [];
        return [
            { section: 'Overview', url: `https://wetu.com/iBrochure/en/Home/${id}/${slug}` },

            { section: 'About Us - Why Stay Here', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Why-Stay-Here` },
            { section: 'About Us - Why-Do-This', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Why-Do-This` },
            { section: 'About Us - Facilities', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Facilities` },
            { section: 'About Us - Documentation', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Documentation` },
            { section: 'About Us - Specials', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Specials` },
            { section: 'About Us - Rates', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Rates` },

            { section: 'Stay - Room Types', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Room-Types` },
            { section: 'Stay - Suites', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Suites` },
            { section: 'Stay - Unit Types', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Unit-Types` },
            { section: 'Stay - Sleeping Arrangements', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/SleepingArrangements` },
            { section: 'Stay - Tents', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Tents` },

            { section: 'Gallery - Photos', url: `https://wetu.com/iBrochure/en/Photos/${id}/${slug}` },
            { section: 'Gallery - Download Photos', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/ImageLibrary` },
            { section: 'Gallery - Videos', url: `https://wetu.com/iBrochure/en/Videos/${id}/${slug}` },
            { section: 'Gallery - Download Videos', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/VideoLibrary` },
            { section: 'Gallery - Virtual Tours', url: `https://wetu.com/iBrochure/en/Virtual-Tours/${id}/${slug}` },

            { section: 'Enjoy - Activities', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Activities` },
            { section: 'Enjoy - Restaurants', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Restaurants` },
            { section: 'Enjoy - Options', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Options` },

            { section: 'Map - Location', url: `https://wetu.com/iBrochure/en/Map/${id}/${slug}/Location` },

            { section: 'Contact', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Contact` }
        ];
    };

    const dataArray = tableRows.map(row => {
        const nameCell = row.querySelector('td:nth-child(1)');
        const destinationCell = row.querySelector('td:nth-child(2)');
        const countryCell = row.querySelector('td:nth-child(3)');
        const linkEl = row.querySelector('td:nth-child(6) a:first-child');

        const name = nameCell ? nameCell.textContent.trim() : '';
        const destination = destinationCell ? destinationCell.textContent.trim() : '';
        const country = countryCell ? countryCell.textContent.trim() : '';
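        // Keep the same object shape as regular rows even when no brochure link exists
        if (!linkEl) return { id: null, name, destination, country, link: null, subUrls: [] };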

        let id = '';
        const href = linkEl.getAttribute('href') || '';
        try {
            id = new URL(href, location.href).pathname.split('/').filter(Boolean).pop();
        } catch (e) {
            id = href.split('/').filter(Boolean).pop();
        }

        const nameSlug = slugify(name || id);
        const link = `https://wetu.com/iBrochure/en/Home/${id}/${nameSlug}`;
        const subUrls = makeSubUrls(id, nameSlug);

        return { id, name, destination, country, link, subUrls };
    });

    console.log(JSON.stringify(dataArray, null, 2));

    return dataArray;
})();

Note

This script is tailored for the https://content.wetu.com/Africa platform. If you need to scrape a different website, please contact us to help you create the appropriate custom JavaScript.
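The slug in each generated URL comes from the property name, so it can help to check the slug logic outside the browser before running the scraper. The sketch below copies the slugify function from the script above and applies it to the property name shown in the example output. It assumes a Node.js environment; nothing here is part of the MrScraper API.

```javascript
// Standalone copy of the slugify helper from the injected script,
// so the name-to-slug conversion can be checked without a browser.
const slugify = (s) => s
    .toString()
    .normalize('NFKD')                 // decompose accented characters
    .replace(/[\u0300-\u036f]/g, '')   // strip combining marks
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9\s-]/g, '')      // drop anything that is not a letter, digit, space, or hyphen
    .replace(/\s+/g, '_')              // spaces to underscores
    .replace(/-+/g, '_')               // hyphens to underscores
    .replace(/_+/g, '_')               // collapse repeated underscores
    .replace(/^_+|_+$/g, '');          // trim leading and trailing underscores

// Property name and ID taken from the example output in this guide.
const slug = slugify('!Xaus Lodge');
const overviewUrl = `https://wetu.com/iBrochure/en/Home/24061/${slug}`;

console.log(slug);        // xaus_lodge
console.log(overviewUrl); // https://wetu.com/iBrochure/en/Home/24061/xaus_lodge
```

Punctuation such as the leading "!" is dropped rather than encoded, which is why the resulting link ends in xaus_lodge.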

Save the workflow by clicking the Save button at the bottom of the page.
Press Run Scraper to start scraping.
Wait until the scraping is complete.
Example output:

Scrape Result

{
  "wetu_content_central": [
    {
      "id": "24061",
      "name": "!Xaus Lodge",
      "destination": "Kgalagadi Transfrontier Park (South Africa)",
      "country": "South Africa",
      "link": "https://wetu.com/iBrochure/en/Home/24061/xaus_lodge",
      "subUrls": [
        {
          "section": "Overview",
          "url": "https://wetu.com/iBrochure/en/Home/24061/xaus_lodge"
        },
        {
          "section": "About Us - Why Stay Here",
          "url": "https://wetu.com/iBrochure/en/Information/24061/xaus_lodge/Why-Stay-Here"
        },
        {
          "section": "About Us - Why-Do-This",
          "url": "https://wetu.com/iBrochure/en/Information/24061/xaus_lodge/Why-Do-This"
        },
        {
          "section": "About Us - Facilities",
          "url": "https://wetu.com/iBrochure/en/Information/24061/xaus_lodge/Facilities"
        },
        {
          "section": "About Us - Documentation",
          "url": "https://wetu.com/iBrochure/en/Information/24061/xaus_lodge/Documentation"
        }
      ]
    }
  ]
}
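Step 2 works on one URL at a time, so the nested result is often easier to use once it is flattened into a plain list of URLs. The following is a minimal sketch, assuming the result has the shape shown in the example output (a wetu_content_central array whose items carry link and subUrls). The collectUrls helper is a hypothetical name introduced for illustration, not a MrScraper function.

```javascript
// Trimmed copy of the example output above, used as sample input.
const scrapeResult = {
    wetu_content_central: [
        {
            id: '24061',
            name: '!Xaus Lodge',
            link: 'https://wetu.com/iBrochure/en/Home/24061/xaus_lodge',
            subUrls: [
                { section: 'Overview', url: 'https://wetu.com/iBrochure/en/Home/24061/xaus_lodge' },
                { section: 'About Us - Facilities', url: 'https://wetu.com/iBrochure/en/Information/24061/xaus_lodge/Facilities' }
            ]
        }
    ]
};

// Hypothetical helper: flattens every property into one row per unique URL.
const collectUrls = (result, key = 'wetu_content_central') => {
    const seen = new Set();
    const rows = [];
    for (const property of result[key] ?? []) {
        // Treat the main link as the Overview entry, then add the sub-pages.
        const candidates = [
            { section: 'Overview', url: property.link },
            ...(property.subUrls ?? [])
        ];
        for (const { section, url } of candidates) {
            if (!url || seen.has(url)) continue; // skip rows without a link and duplicates
            seen.add(url);
            rows.push({ id: property.id, name: property.name, section, url });
        }
    }
    return rows;
};

const urlRows = collectUrls(scrapeResult);
console.log(urlRows.map((row) => row.url));
```

Because the main link and the Overview sub-URL are identical, the duplicate is dropped and the sample input yields two rows.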

To call the scraper through the API, click the Settings button.
Select Manual Scraper API Access.
Copy the provided cURL example and use it in Postman, or call the endpoint documented under Rerun a Manual Scraper.

Step 2: Scrape the Returned Link Detail Page

You can retrieve page details in two ways: by creating a new scraper or by calling our API. In both cases, use a link returned by the manual workflow in Step 1 as the URL to scrape for detailed data.

New Scraper

Open the dashboard at https://v3.app.mrscraper.com/auth/login.
Click Scraper in the left sidebar, then click New AI Scraper + at the top to create a new AI scraper.
Enter the link of the page in the URL field, then press Submit to start scraping.
Wait until the scraping is complete.
Example output:

Scrape Detail Result

{
  "name": "!Xaus Lodge",
  "images": [
    "/ImageHandler/w1920x1080/24061/024.jpg",
    "/ImageHandler/w1920x1080/24061/006.jpg",
    "/ImageHandler/w1920x1080/24061/022.jpg"
  ],
  "fast_facts": {
    "rating": "Other",
    "check_in_time": "14:30",
    "check_out_time": "11:00",
    "number_of_rooms": 12,
    "spoken_languages": [
      "Afrikaans",
      "English"
    ],
    "special_interests": [
      "Adventure",
      "Birding",
      "History & Culture",
      "Indigenous Culture / Art",
      "Leisure",
      "Nature",
      "Relaxation",
      "Star Gazing",
      "Wildlife"
    ]
  },
  "contact_link": "/iBrochure/en/Information/24061/xaus_lodge/Contact",
  "book_now_link": "https://www.nightsbridge.co.za/bridge/book?bbid=13334",
  "gallery_links": [
    {
      "link": "/iBrochure/en/Photos/24061/xaus_lodge",
      "type": "Images"
    },
    {
      "link": "/iBrochure/en/Information/24061/xaus_lodge/ImageLibrary",
      "type": "Download Images"
    },
    {
      "link": "/iBrochure/en/Videos/24061/xaus_lodge",
      "type": "Videos"
    }
  ],
  "location_link": "/iBrochure/en/Map/24061/xaus_lodge/Location",
  "overview_link": "/iBrochure/en/Home/24061/xaus_lodge",
  "activities_link": "/iBrochure/en/Information/24061/xaus_lodge/Activities",
  "room_types_link": "/iBrochure/en/Information/24061/xaus_lodge/Room-Types"
}
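Most links in the detail result are relative paths such as /iBrochure/en/Photos/24061/xaus_lodge. A relative path resolves against the page it was scraped from, so the standard URL constructor can turn these into absolute URLs. The sketch below assumes the detail page was scraped from the https://wetu.com link returned in Step 1 and that image paths are served from that same host. The absolutize helper is a hypothetical name used only for illustration.

```javascript
// URL of the page that was scraped (the link returned by Step 1).
const pageUrl = 'https://wetu.com/iBrochure/en/Home/24061/xaus_lodge';

// Trimmed copy of the detail result above, used as sample input.
const detail = {
    name: '!Xaus Lodge',
    images: ['/ImageHandler/w1920x1080/24061/024.jpg'],
    fast_facts: { check_in_time: '14:30', number_of_rooms: 12 },
    contact_link: '/iBrochure/en/Information/24061/xaus_lodge/Contact',
    book_now_link: 'https://www.nightsbridge.co.za/bridge/book?bbid=13334',
    gallery_links: [{ link: '/iBrochure/en/Photos/24061/xaus_lodge', type: 'Images' }]
};

// Hypothetical helper: walks the result and resolves every string that
// starts with "/" against the base URL. Other values are returned unchanged.
const absolutize = (node, base) => {
    if (Array.isArray(node)) return node.map((item) => absolutize(item, base));
    if (node && typeof node === 'object') {
        return Object.fromEntries(
            Object.entries(node).map(([key, value]) => [key, absolutize(value, base)])
        );
    }
    if (typeof node === 'string' && node.startsWith('/')) {
        return new URL(node, base).href;
    }
    return node;
};

const absoluteDetail = absolutize(detail, pageUrl);
console.log(absoluteDetail.contact_link);
console.log(absoluteDetail.images[0]);
```

Links that are already absolute, such as book_now_link, pass through untouched, as do non-link values like check_in_time.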

API Calling

Select your AI scraper and click the Settings button.
Select AI Scraper API Access.
Copy the provided cURL example and use it in Postman, or call the endpoint documented under Rerun an AI Scraper.
