Manual Scraper
How to set up Manual Scraper
This page explains how to set up a Manual Scraper in MrScraper. It’s a more customizable option for integrating a scraper into your apps and workflows. Unlike ScrapeGPT, you’ll need to provide the CSS selectors for the data you want.
Requirements
- MrScraper console account.
- CSS selectors for the data that you want to retrieve.
Manual Scraping Example
In this example, we’ll retrieve events data from Luma, returning results based on the defined workflow.
Follow the steps below to create a Manual Scraper and extract the data:
- Log in to your MrScraper App Dashboard.
- Navigate to the Scraper menu in the Sidebar.
- Click the Create Scraper button on the top right of the page.
- Select Manual as the scraper type, then fill in the Scraper Name and the default URLs. Then click the Create button.
- Navigate to the Workflow tab and define the workflow for the scraper by adding steps. The following step types are available:
- Scrape data: Extracts the specified data from the target website.
- Infinite scroll: Automates scrolling on pages with endless content to load more data.
- Wait time: Pauses the scraping process for a set duration to allow dynamic content to load.
- Click element: Simulates clicking an element, such as a button or link, to trigger actions or load more data.
- Take screenshot: Captures a screenshot of the current webpage or a specific area for reference.
- Input text: Automatically types text into fields, such as search bars or forms, to interact with the website.
- Algolia Crawler: Scrapes data specifically from websites that use the Algolia search engine for content retrieval.
- If you choose Scrape Data, a new step form is created. Enter the name, CSS selector, data type, and quantity yourself.
- Check the result format by clicking the Preview Data button in the top right corner of the screen.
- To get a list of objects complete with their properties, set the type of the first Scrape Data step to Collection (List of sub-items) and its quantity to All Matches.
- To find the Parent selector, right-click the site you want to scrape and click Inspect. Then click the Add item to collection button.
- After you click the button, fill in the same form for the collection items, that is, the children of the parent you created in the previous step.
- Click Save changes when you’re done.
- Click the Run scraper button on the top right of the page to run the scraper.
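To make the Collection idea concrete, here is a minimal Python sketch of what a parent selector with child fields and an All Matches quantity amounts to. It is not MrScraper's implementation: the sample markup and its class names are invented for illustration (they are not Luma's real markup), the function `scrape_collection` is hypothetical, and the standard library's limited XPath support stands in for CSS selectors so that the example runs without extra packages.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for an events page; the class names are
# invented for illustration only.
SAMPLE_HTML = """
<div>
  <div class="event-card">
    <h3 class="title">Python Meetup</h3>
    <span class="date">May 2</span>
  </div>
  <div class="event-card">
    <h3 class="title">Data Night</h3>
    <span class="date">May 9</span>
  </div>
</div>
"""


def scrape_collection(markup, parent_path, child_paths):
    """Return one dict per parent match, with one field per child path."""
    root = ET.fromstring(markup)
    items = []
    # Every parent match becomes one item, like the "All Matches" quantity.
    for parent in root.findall(parent_path):
        item = {}
        for name, path in child_paths.items():
            # Child paths are resolved relative to the current parent only.
            node = parent.find(path)
            has_text = node is not None and node.text
            item[name] = node.text.strip() if has_text else None
        items.append(item)
    return items


events = scrape_collection(
    SAMPLE_HTML,
    parent_path=".//div[@class='event-card']",
    child_paths={
        "title": "./h3[@class='title']",
        "date": "./span[@class='date']",
    },
)
print(events)
```

The point of the sketch is the scoping: each child selector is evaluated inside one parent match, which is why the parent selector should match the repeating container (one event card) rather than the whole page.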
Tutorial Video
Here is the tutorial demo video for creating a Manual Scraper in MrScraper.
And here is the result snippet of the scraping result presented in the video:
Features
The Manual Scraper offers optional features that you can configure as needed when setting up your scraper:
- Pagination: Configure settings for handling pagination to scrape additional content across multiple pages.
- Scheduler: Set up the timing for when the scraper should run, allowing for automated or scheduled scrapes.
- Proxy: Manage proxy settings to rotate or apply specific proxies during the scraping process.
- Advanced: Adjust advanced options to fine-tune the scraper’s behavior or performance.
- Parsers: Modify the extracted data by adding custom parsers.
- Logs: View and monitor detailed logs of the scraping activity, including successes, errors, and other events for troubleshooting or auditing purposes.
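As a rough illustration of what a parser does conceptually, the Python sketch below post-processes extracted records field by field. This page does not describe MrScraper's parser interface, so the function `apply_parsers`, the field names, and the sample record are all assumptions made for the example.

```python
def apply_parsers(records, parsers):
    """Apply each field's parser to every record; other fields pass through."""
    return [
        {key: parsers.get(key, lambda v: v)(value) for key, value in record.items()}
        for record in records
    ]


# Hypothetical raw output with untrimmed text and a price stored as a string.
raw = [{"title": "  Python Meetup ", "price": "$12.50", "city": "Berlin"}]

cleaned = apply_parsers(
    raw,
    {
        "title": str.strip,                         # trim surrounding whitespace
        "price": lambda v: float(v.lstrip("$")),    # "$12.50" -> 12.5
    },
)
print(cleaned)
```

Fields without a parser, such as `city` here, are returned unchanged.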