Transform web content into clean, LLM-ready Markdown! Scrape multiple pages, extract main content, and convert to Markdown format. Perfect for AI researchers, data scientists, and LLM developers. Fast, efficient, and customizable. Supercharge your AI training data today!
This powerful Apify Actor transforms web content into clean, readable Markdown format, perfect for training Large Language Models (LLMs). It's an essential tool for AI researchers, data scientists, and developers working on natural language processing tasks.
Configure your scraping job with these options:
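- `urls`: List of URLs to start scraping from.

For each scraped page, you'll get a record with fields such as `url`, `title`, `markdown`, and `timestamp` (see the sample output below).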
If you encounter any issues or have questions, please reach out through Apify's support channels.
Transform web content into clean, LLM-ready Markdown with just a few clicks!
Here is a complete input example in JSON:
{
  "urls": [
    "https://apify.com",
    "https://www.google.com"
  ]
}
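If you assemble the input programmatically, a small helper can catch malformed URLs before the run starts. The following is a minimal sketch, not part of the Actor itself: the function name `build_input` and its validation rules (absolute `http`/`https` URLs only, exact duplicates dropped) are assumptions made for illustration. Only the `urls` key comes from the example above.

```python
import json
from urllib.parse import urlparse


def build_input(urls):
    """Return an input dict with a single "urls" key, shaped like the example above.

    The checks below are illustrative assumptions, not the Actor's own validation.
    """
    cleaned = []
    for raw in urls:
        url = raw.strip()
        parts = urlparse(url)
        # Accept only absolute http(s) URLs that include a host.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {raw!r}")
        # Drop exact duplicates while preserving the original order.
        if url not in cleaned:
            cleaned.append(url)
    if not cleaned:
        raise ValueError("At least one URL is required")
    return {"urls": cleaned}


if __name__ == "__main__":
    actor_input = build_input(["https://apify.com", "https://www.google.com"])
    # Paste the printed JSON into the Actor's input editor.
    print(json.dumps(actor_input, indent=2))
```

The resulting dictionary serializes to the same JSON shape as the input example, so it can be pasted into the input editor or passed to whichever client you use to start the run.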
The results will be wrapped into a dataset which you can always find in the Storage tab. Here's an excerpt from the data you'd get if you apply the input parameters above:
And here is the same data but in JSON. You can choose in which format to download your data: JSON, JSONL, Excel spreadsheet, HTML table, CSV, or XML.
[
  {
    "url": "https://apify.com",
    "markdown": "# Apify: Full-stack web scraping and data extraction platform
Apify is the largest ecosystem where developers build, deploy, and publish data extraction and web automation tools. We call them Actors.
[

### TikTok Data Extractor
clockworks/free-tiktok-scraper
Extract data about videos, users, and channels based on hashtags or scrape full user profiles including posts, total likes, name, nickname, numbers of comments, shares, followers, following, and more.
](https://apify.com/clockworks/free-tiktok-scraper)[

### Google Maps Extractor
compass/google-maps-extractor
Extract data from hundreds of places fast. Scrape Google Maps by keyword, category, location, URLs & other filters. Get addresses, contact info, opening hours, popular times, prices, menus & more. Export scraped data, run the scraper via API, schedule and monitor runs, or integrate with other tools.

Compass
](https://apify.com/compass/google-maps-extractor)[

### Instagram Scraper
apify/instagram-scraper
Scrape and download Instagram posts, profiles, places, hashtags, photos, and comments. Get data from Instagram using one or more Instagram URLs or search queries. Export scraped data, run the scraper via API, schedule and monitor runs or integrate with other tools.

Apify
](https://apify.com/apify/instagram-scraper)[

### Website Content Crawler
apify/website-content-crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

Apify
](https://apify.com/apify/website-content-crawler)[

### Amazon Scraper
junglee/free-amazon-product-scraper
Gets you product data from Amazon. Unofficial API. Scrapes and downloads product information without using the Amazon API, including reviews, prices, descriptions, and ASIN.

Junglee
](https://apify.com/junglee/free-amazon-product-scraper)[

### Build your own Actor
you/new-idea
Apify gives you all the tools and documentation you need to build reliable scrapers. Fast.

You? 🫵
](https://apify.com/templates)
[

### TikTok Data Extractor
clockworks/free-tiktok-scraper
Extract data about videos, users, and channels based on hashtags or scrape full user profiles including posts, total likes, name, nickname, numbers of comments, shares, followers, following, and more.

Clockworks
](https://apify.com/clockworks/free-tiktok-scraper)
[

### Google Maps Extractor
compass/google-maps-extractor
Extract data from hundreds of places fast. Scrape Google Maps by keyword, category, location, URLs & other filters. Get addresses, contact info, opening hours, popular times, prices, menus & more. Export scraped data, run the scraper via API, schedule and monitor runs, or integrate with other tools.

Compass
](https://apify.com/compass/google-maps-extractor)
[Browse 3,000+ Actors](https://apify.com/store)
Trusted by global technology leaders
Not just a web scraping API
---------------------------
Easily integrate
Zapierany appGitHubGoogle SheetsPineconeAirbyteKeboolaGoogle DriveSlackZapier
with Actors
--------------------------------------------------------------------------------------------------------------
Build reliable web scrapers. Fast.
----------------------------------
### We love open source
Apify works great with both Python and JavaScript, as well as Playwright, Puppeteer, Selenium, Scrapy, and Crawlee - our own web crawling and browser automation library.
[](https://crawlee.dev/)
```
import { PuppeteerCrawler, Dataset } from "crawlee";

const crawler = new PuppeteerCrawler({
    async requestHandler({ request, page, enqueueLinks }) {
        await Dataset.pushData({
            url: request.url,
            title: await page.title(),
        });
        await enqueueLinks();
    },
});

await crawler.run(["https://crawlee.dev"]);
```
Publish Actors. Get paid.
-------------------------
### Reach thousands of new customers
Building and running a SaaS is hard. Building an Actor and selling it on Apify Store is 10x easier. Get visitors from day one.

#### No upfront costs
Publishing your Actor is free of charge - the customers pay for the computing resources. New creators get $500 free platform credits.
#### Rely on Apify infra
Actors scale automatically as you gain new users. You don't need to worry about compute, storage, proxies, or authentication.
#### Billing is on us
Handling payments, taxes, and invoicing is a painful part of running a SaaS. Apify does all that and sends you a net payout every month.",
    "timestamp": "2025-01-10T07:10:53.476Z"
  },
  {
    "url": "https://apify.com/actors",
    "title": "Actors - fast and easy scraping in the cloud · Apify",
    "markdown": "Actors are serverless cloud programs that run on the Apify platform and do computing jobs. They are called Actors because, like human actors, they perform actions based on a script.

### Long-running serverless jobs[](#long-running-serverless-jobs)
Apify Actors can perform time-consuming jobs that are longer than the lifespan of a single HTTP transaction.

### Publish your Actor[](#publish-your-actor)
Join hundreds of developers who share their Actors on Apify Store and earn money from coding.
[Go to Apify Store](/store)

### Auto-generated user interface[](#auto-generated-user-interface)
Actors can easily define a user interface for their input configuration. Take advantage of lower-level features and settings, or run Actors using our API.
[Learn about Input Schema](https://docs.apify.com/academy/deploying-your-code/input-schema)


Host code anywhere
Edit your code on our platform, fetch from a Git repository, or push from your machine.

Docker support
Actors run inside Docker containers on Apify servers. Use a custom Dockerfile.

Ready for scale
Run as many Actors as you need. The Apify platform provisions the necessary resources.

Custom memory and CPU
Assign each Actor any RAM volume needed. CPU share is allocated automatically.

Command-line tool
Develop and test your Actors locally, push them to the Apify platform when you're ready.

Logging
View and download logs to debug your code and monitor performance on production.

Actorize your Scrapy spiders[](#actorize-your-scrapy-spiders)
-------------------------------------------------------------
Deploy your Scrapy code to the cloud with just a few commands. Turn your Scrapy projects into Actors, run, schedule, monitor and monetize them.
[Learn more](/run-scrapy-in-cloud)"
  },
  ...
]
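Once you have downloaded the dataset as JSON, the records can be reshaped for an LLM pipeline. The sketch below is one possible approach, not something the Actor does for you: it assumes each record carries the `url`, `title`, and `markdown` fields seen in the sample output, and the helper name `records_to_jsonl` and the output keys `source`, `title`, and `text` are hypothetical choices for illustration.

```python
import json


def records_to_jsonl(records):
    """Convert dataset records into JSONL text, one scraped page per line.

    Output keys ("source", "title", "text") are illustrative, not an Apify format.
    """
    lines = []
    for record in records:
        markdown = (record.get("markdown") or "").strip()
        if not markdown:
            # Skip pages where no content was extracted.
            continue
        row = {
            "source": record.get("url"),
            "title": record.get("title"),  # None when the record has no title
            "text": markdown,
        }
        # json.dumps escapes newlines inside the Markdown, so each row stays on one line.
        lines.append(json.dumps(row, ensure_ascii=False))
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        {"url": "https://apify.com", "markdown": "# Apify\n\nFull-stack web scraping."},
        {"url": "https://apify.com/actors", "title": "Actors", "markdown": ""},
    ]
    print(records_to_jsonl(sample))
```

Records with empty Markdown are dropped rather than written as blank rows, which keeps the resulting file free of entries that carry no training text.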
Yes, if you're scraping publicly available data for personal or internal use. Always review the website's Terms of Service before large-scale use or redistribution.
No. This is a no-code tool: just enter the URLs you want to scrape and run the Actor directly from your dashboard or the Actor's page on Apify.
It extracts each page's URL, title (if available), main content converted to Markdown, and a timestamp. You can export all of it to Excel or JSON.
Yes, you can scrape multiple pages by adding more URLs to the `urls` input.
You can use the Try Now button on this page to go to the scraper. You'll be guided to enter your URLs and get structured results. No setup needed!