Wetu
Step-by-step instructions for scraping content from Wetu with the manual workflow.
This guide explains how to scrape Wetu using the Manual Workflow. Because Wetu pages rely on dynamic scripts and complex HTML, the AI scraper may miss key content. The Manual Workflow lets you inject custom JavaScript so you can control how the page loads and extract exactly what you need.
Scenario
In this scenario, you need to extract the brochure links listed on a Wetu site, together with the links to each brochure's sections. Because the website's structure is complex, the AI scraper cannot process it correctly, so you will use the Manual Workflow to run a custom JavaScript extraction instead.
Step 1: Create a Manual Workflow with Custom JavaScript
- Navigate to your MrScraper dashboard.
- Click Scraper in the left sidebar, then click New Manual Scraper + at the top to create a new manual scraper.
- Enter https://content.wetu.com/Africa in the URL field, then press submit.
- Add an Inject JavaScript step.
- Set the name field to wetu_content_central and the script timeout field to 300. Then paste this code into the code field:
(async () => {
  // Each row of the #brochures table is one brochure entry.
  const tableRows = Array.from(document.querySelectorAll('#brochures > tbody > tr'));

  // Turn a property name into the slug used in iBrochure URLs.
  const slugify = s => s
    .toString()
    .normalize('NFKD')                // split accented characters into base + combining mark
    .replace(/[\u0300-\u036f]/g, '')  // remove the combining marks (accents)
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9\s-]/g, '')     // remove invalid chars
    .replace(/\s+/g, '_')             // spaces to underscores
    .replace(/-+/g, '_')              // hyphens to underscores
    .replace(/_+/g, '_')              // collapse multiple underscores
    .replace(/^_+|_+$/g, '');         // trim underscores

  // Build the list of section URLs for one brochure.
  const makeSubUrls = (id, slug) => {
    if (!id || !slug) return [];
    return [
      { section: 'Overview', url: `https://wetu.com/iBrochure/en/Home/${id}/${slug}` },
      { section: 'About Us - Why Stay Here', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Why-Stay-Here` },
      { section: 'About Us - Why-Do-This', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Why-Do-This` },
      { section: 'About Us - Facilities', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Facilities` },
      { section: 'About Us - Documentation', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Documentation` },
      { section: 'About Us - Specials', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Specials` },
      { section: 'About Us - Rates', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Rates` },
      { section: 'Stay - Room Types', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Room-Types` },
      { section: 'Stay - Suites', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Suites` },
      { section: 'Stay - Unit Types', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Unit-Types` },
      { section: 'Stay - Sleeping Arrangements', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/SleepingArrangements` },
      { section: 'Stay - Tents', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Tents` },
      { section: 'Gallery - Photos', url: `https://wetu.com/iBrochure/en/Photos/${id}/${slug}` },
      { section: 'Gallery - Download Photos', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/ImageLibrary` },
      { section: 'Gallery - Videos', url: `https://wetu.com/iBrochure/en/Videos/${id}/${slug}` },
      { section: 'Gallery - Download Videos', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/VideoLibrary` },
      { section: 'Gallery - Virtual Tours', url: `https://wetu.com/iBrochure/en/Virtual-Tours/${id}/${slug}` },
      { section: 'Enjoy - Activities', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Activities` },
      { section: 'Enjoy - Restaurants', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Restaurants` },
      { section: 'Enjoy - Options', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Options` },
      { section: 'Map - Location', url: `https://wetu.com/iBrochure/en/Map/${id}/${slug}/Location` },
      { section: 'Contact', url: `https://wetu.com/iBrochure/en/Information/${id}/${slug}/Contact` }
    ];
  };

  const dataArray = tableRows.map(row => {
    const nameCell = row.querySelector('td:nth-child(1)');
    const destinationCell = row.querySelector('td:nth-child(2)');
    const countryCell = row.querySelector('td:nth-child(3)');
    const linkEl = row.querySelector('td:nth-child(6) a:first-child');
    const name = nameCell ? nameCell.textContent.trim() : '';
    const destination = destinationCell ? destinationCell.textContent.trim() : '';
    const country = countryCell ? countryCell.textContent.trim() : '';

    // Rows without a brochure link keep the same shape as rows with one.
    if (!linkEl) return { id: null, name, destination, country, link: null, subUrls: [] };

    // The brochure ID is the last path segment of the link.
    let id = '';
    const href = linkEl.getAttribute('href') || '';
    try {
      id = new URL(href, location.href).pathname.split('/').filter(Boolean).pop();
    } catch (e) {
      id = href.split('/').filter(Boolean).pop();
    }

    // Fall back to an empty string so slugify never receives undefined.
    const nameSlug = slugify(name || id || '');
    const link = `https://wetu.com/iBrochure/en/Home/${id}/${nameSlug}`;
    const subUrls = makeSubUrls(id, nameSlug);
    return { id, name, destination, country, link, subUrls };
  });

  console.log(JSON.stringify(dataArray, null, 2));
  return dataArray;
})();
Note
This script is tailored to the https://content.wetu.com/Africa page. If you need to scrape a different website, please contact us for help creating the appropriate custom JavaScript.
- Save the workflow by pressing the Save button at the bottom of the page.
- Press Run Scraper to start scraping.
- Wait until the scraping is complete.
- Example output:
{
"wetu_content_central": [
{
"id": "24061",
"name": "!Xaus Lodge",
"destination": "Kgalagadi Transfrontier Park (South Africa)",
"country": "South Africa",
"link": "https://wetu.com/iBrochure/en/Home/24061/xaus_lodge",
"subUrls": [
{
"section": "Overview",
"url": "https://wetu.com/iBrochure/en/Home/24061/xaus_lodge"
},
{
"section": "About Us - Why Stay Here",
"url": "https://wetu.com/iBrochure/en/Information/24061/xaus_lodge/Why-Stay-Here"
},
{
"section": "About Us - Why-Do-This",
"url": "https://wetu.com/iBrochure/en/Information/24061/xaus_lodge/Why-Do-This"
},
{
"section": "About Us - Facilities",
"url": "https://wetu.com/iBrochure/en/Information/24061/xaus_lodge/Facilities"
},
{
"section": "About Us - Documentation",
"url": "https://wetu.com/iBrochure/en/Information/24061/xaus_lodge/Documentation"
}
]
}
]
}
- To call the scraper through the API, click the Settings button.
- Select Manual Scraper API Access.
- Copy the provided cURL example and use it in Postman, or call the endpoint described in Rerun a Manual Scraper.
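The link values in the Step 1 output depend on how the script turns a property name into a slug. The sketch below repeats the slugify function from the script so it can be tried outside the browser (for example in Node.js); the sample names other than !Xaus Lodge are made up for illustration.

```javascript
// Same slug rules as the Step 1 script, copied here for a standalone check.
const slugify = s => s
  .toString()
  .normalize('NFKD')                // split accented characters into base + combining mark
  .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
  .toLowerCase()
  .trim()
  .replace(/[^a-z0-9\s-]/g, '')     // drop anything that is not a letter, digit, space, or hyphen
  .replace(/\s+/g, '_')             // spaces to underscores
  .replace(/-+/g, '_')              // hyphens to underscores
  .replace(/_+/g, '_')              // collapse repeated underscores
  .replace(/^_+|_+$/g, '');         // trim leading/trailing underscores

// '!Xaus Lodge' comes from the example output; the other names are hypothetical.
const samples = ['!Xaus Lodge', 'Café  du-Parc', '  Hotel -- 5 Stars  '];
const slugs = samples.map(slugify);
console.log(slugs);
```

If a generated link does not open the expected brochure, comparing the slug produced here with the slug in the brochure's real URL is a quick first check.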
Step 2: Scrape the Returned Link Detail Page
You can retrieve page details in two ways: by creating a new scraper or by calling our API. Use the link returned from the manual workflow above. This is the URL you will scrape to get the detailed data.
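If you want to scrape several pages, it can help to flatten the Step 1 result into a plain list of URLs first. The following is a minimal sketch, assuming the result has the shape shown in the Step 1 example output; collectDetailUrls is a hypothetical helper name and is not part of MrScraper.

```javascript
// Hypothetical helper: gather every brochure and section URL from the Step 1 result.
// Assumes the result is keyed by the Inject JavaScript step name, as in the example output.
function collectDetailUrls(result, stepName = 'wetu_content_central') {
  const rows = Array.isArray(result?.[stepName]) ? result[stepName] : [];
  const urls = new Set(); // a Set removes duplicates (the Overview URL equals the main link)
  for (const row of rows) {
    if (row.link) urls.add(row.link);
    for (const sub of row.subUrls ?? []) {
      if (sub.url) urls.add(sub.url);
    }
  }
  return [...urls];
}

// Trimmed copy of the Step 1 example output, used only to demonstrate the helper.
const sample = {
  wetu_content_central: [
    {
      id: '24061',
      name: '!Xaus Lodge',
      link: 'https://wetu.com/iBrochure/en/Home/24061/xaus_lodge',
      subUrls: [
        { section: 'Overview', url: 'https://wetu.com/iBrochure/en/Home/24061/xaus_lodge' },
        { section: 'About Us - Why Stay Here', url: 'https://wetu.com/iBrochure/en/Information/24061/xaus_lodge/Why-Stay-Here' }
      ]
    },
    // A row without a brochure link contributes nothing.
    { id: null, name: 'No link', link: null, subUrls: [] }
  ]
};

const detailUrls = collectDetailUrls(sample);
console.log(detailUrls);
```

Each URL in the resulting list can then be used as the input for either of the two options described in this step.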
New Scraper
- Open the dashboard at https://v3.app.mrscraper.com/auth/login.
- Click Scraper in the left sidebar, then click New AI Scraper + at the top to create a new AI scraper.
- Enter the link of the page in the URL field, then press submit to start scraping.
- Wait until the scraping is complete.
- Example output:
{
"name": "!Xaus Lodge",
"images": [
"/ImageHandler/w1920x1080/24061/024.jpg",
"/ImageHandler/w1920x1080/24061/006.jpg",
"/ImageHandler/w1920x1080/24061/022.jpg"
],
"fast_facts": {
"rating": "Other",
"check_in_time": "14:30",
"check_out_time": "11:00",
"number_of_rooms": 12,
"spoken_languages": [
"Afrikaans",
"English"
],
"special_interests": [
"Adventure",
"Birding",
"History & Culture",
"Indigenous Culture / Art",
"Leisure",
"Nature",
"Relaxation",
"Star Gazing",
"Wildlife"
]
},
"contact_link": "/iBrochure/en/Information/24061/xaus_lodge/Contact",
"book_now_link": "https://www.nightsbridge.co.za/bridge/book?bbid=13334",
"gallery_links": [
{
"link": "/iBrochure/en/Photos/24061/xaus_lodge",
"type": "Images"
},
{
"link": "/iBrochure/en/Information/24061/xaus_lodge/ImageLibrary",
"type": "Download Images"
},
{
"link": "/iBrochure/en/Videos/24061/xaus_lodge",
"type": "Videos"
}
],
"location_link": "/iBrochure/en/Map/24061/xaus_lodge/Location",
"overview_link": "/iBrochure/en/Home/24061/xaus_lodge",
"activities_link": "/iBrochure/en/Information/24061/xaus_lodge/Activities",
"room_types_link": "/iBrochure/en/Information/24061/xaus_lodge/Room-Types"
}
API Calling
- Select your AI scraper and click the Settings button.
- Select AI Scraper API Access.
- Copy the provided cURL example and use it in Postman, or call the endpoint described in Rerun an AI Scraper.
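The detail output shown earlier in this step contains relative paths such as /iBrochure/en/Photos/24061/xaus_lodge. Below is a rough sketch of converting them to absolute URLs, assuming they resolve against https://wetu.com (the host used by the links from Step 1); toAbsolute is a hypothetical helper, and the base URL should be adjusted if your pages are served from a different host.

```javascript
// Assumption: relative paths in the detail output belong to https://wetu.com.
const BASE_URL = 'https://wetu.com';

// Hypothetical helper: walk any JSON value and make strings that start with '/' absolute.
function toAbsolute(value, base = BASE_URL) {
  if (typeof value === 'string') {
    return value.startsWith('/') ? new URL(value, base).href : value;
  }
  if (Array.isArray(value)) {
    return value.map(item => toAbsolute(item, base));
  }
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, toAbsolute(item, base)])
    );
  }
  return value; // numbers, booleans, and null pass through unchanged
}

// Trimmed copy of the detail example output.
const detail = {
  name: '!Xaus Lodge',
  images: ['/ImageHandler/w1920x1080/24061/024.jpg'],
  fast_facts: { check_in_time: '14:30', number_of_rooms: 12 },
  book_now_link: 'https://www.nightsbridge.co.za/bridge/book?bbid=13334',
  gallery_links: [{ link: '/iBrochure/en/Photos/24061/xaus_lodge', type: 'Images' }]
};

const absolute = toAbsolute(detail);
console.log(JSON.stringify(absolute, null, 2));
```

Values that are already absolute, such as book_now_link, and non-path strings, such as check_in_time, are left untouched.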