Bulk Scraping
Learn how to efficiently scrape multiple URLs using our bulk scraping features.
MrScraper’s Bulk Scraping feature allows you to extract data from multiple URLs in a single operation, saving you time and effort. You can upload a list of URLs and apply the same scraping configuration to all of them, making it easy to gather large datasets quickly.
Tip
This feature is perfect for users who need to scrape data from multiple pages with similar structures, such as product listings, articles, or profiles.
Usage Example
To perform a bulk scrape, follow these steps:
- Open a new or existing scraper in your MrScraper dashboard.
- Click the Bulk Scraping button in the top section.
- You can either upload an Excel file (.xlsx, .xls) containing the list of URLs or paste them directly into the provided text area (one URL per line). For example:
https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html
https://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html
https://books.toscrape.com/catalogue/soumission_998/index.html
https://books.toscrape.com/catalogue/sharp-objects_997/index.html
Excel Format
The file should have a column with the header "url" or "URL". Each row should contain one URL.
- Once the URLs are added, click Save and Run All to initiate the process.
Note
Tokens are consumed for each URL scraped in bulk, so make sure your account has enough tokens before you start a bulk scrape.
- Once the run completes, you can view and download the scraped data for each URL individually. For example:
Result:
{
  "1": {
    "id": "1",
    "name": "A Light in the Attic",
    "rows": [
      [
        "A Light in the Attic",
        "£51.77",
        "In stock"
      ]
    ],
    "source": "table",
    "headers": [
      "Title",
      "Price",
      "Availability"
    ]
  }
}

{
  "1": {
    "id": "1",
    "name": "Tipping the Velvet",
    "rows": [
      [
        "Tipping the Velvet",
        "£53.74",
        "In stock"
      ]
    ],
    "source": "table",
    "headers": [
      "Title",
      "Price",
      "Availability"
    ]
  }
}

{
  "1": {
    "id": "1",
    "name": "Soumission",
    "rows": [
      [
        "Soumission",
        "£50.10",
        "In stock (20 available)"
      ]
    ],
    "source": "product",
    "headers": [
      "Title",
      "Price",
      "Availability"
    ]
  }
}

{
  "1": {
    "id": "1",
    "name": "Sharp Objects",
    "rows": [
      [
        "Sharp Objects",
        "£47.82",
        "In stock (20 available)"
      ]
    ],
    "source": "product",
    "headers": [
      "Title",
      "Price",
      "Availability"
    ]
  }
}
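Because each URL produces its own result object, you may want to flatten the results into plain records before combining them. The following is a minimal sketch in Python, assuming each downloaded result has the shape shown above (a "headers" list and a "rows" list per entry). The function name rows_to_records and the inline sample are hypothetical and are not part of MrScraper.

```python
import json


def rows_to_records(result: dict) -> list[dict]:
    """Convert one per-URL result object into a list of {header: value} records.

    Assumes the shape shown in the example results: each entry holds a
    "headers" list and a "rows" list whose values line up by position.
    """
    records = []
    for entry in result.values():
        headers = entry.get("headers", [])
        for row in entry.get("rows", []):
            # Pair each header with the value in the same position.
            record = dict(zip(headers, row))
            # Keep the "source" field so records from different pages stay traceable.
            record["source"] = entry.get("source")
            records.append(record)
    return records


# Sample copied from the first example result; a real run would instead
# load the downloaded file for each URL.
sample = {
    "1": {
        "id": "1",
        "name": "A Light in the Attic",
        "rows": [["A Light in the Attic", "£51.77", "In stock"]],
        "source": "table",
        "headers": ["Title", "Price", "Availability"],
    }
}

records = rows_to_records(sample)
print(json.dumps(records, indent=2))
```

Calling rows_to_records on each per-URL result and concatenating the returned lists gives a single dataset for the whole bulk run.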