Automated website screenshot crawler built with Pyppeteer and Apify. This open-source actor captures screenshots of the websites at the URLs you specify, uploads them to the Apify Key-Value Store, and records the screenshot URLs in a dataset, making it ideal for monitoring website changes, archiving web content, or capturing visuals for reports. The actor uses Pyppeteer for browser automation and screenshot generation.
You can find the source code for this actor in my GitHub account: https://github.com/DZ-ABDLHAKIM
The input for this actor should be JSON containing the necessary configuration. The only required field is `link_urls`, which must be an array of website URLs. All other fields are optional. Here's a detailed description of the input fields:
| Field | Type | Description | Allowed Values |
|---|---|---|---|
| `link_urls` | Array | An array of website URLs to capture screenshots of. | Any valid URL |
| `Sleep` | Number | Duration (in seconds) to wait after the page has loaded before taking a screenshot. | Minimum: 0, Maximum: 3600 |
| `waitUntil` | String | Event to wait for before taking the screenshot. | One of: `"load"`, `"domcontentloaded"`, `"networkidle2"`, `"networkidle0"` |
| `cookies` | Array | Cookies to set for the browser session. | Array of cookie objects |
| `fullPage` | Boolean | Whether to capture the full page or just the viewport. | `true` or `false` |
| `window_Width` | Number | Width of the browser viewport (in pixels). | Minimum: 100, Maximum: 3840 |
| `window_Height` | Number | Height of the browser viewport (in pixels). | Minimum: 100, Maximum: 2160 |
| `scrollToBottom` | Boolean | Whether to scroll to the bottom of the page before taking the screenshot. | `true` or `false` |
| `distance` | Number | Distance (in pixels) to scroll down on each scroll action. | Minimum: 0 |
| `delay` | Number | Delay (in milliseconds) between scroll actions. | Minimum: 0, Maximum: 3600000 |
| `delayAfterScrolling` | Number | Delay (in milliseconds) after scrolling to the bottom of the page before taking the screenshot. | Minimum: 0, Maximum: 3600000 |
| `waitUntilNetworkIdleAfterScroll` | Boolean | Whether to wait for the network to become idle after scrolling to the bottom of the page. | `true` or `false` |
| `waitUntilNetworkIdleAfterScrollTimeout` | Number | Maximum wait time (in milliseconds) for the network to become idle after scrolling. | Minimum: 1000, Maximum: 3600000 |
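For example, an input combining several of these fields might look like the following (the URLs and the cookie object are illustrative; the cookie follows Puppeteer's `{name, value, domain}` convention):

```json
{
    "link_urls": ["https://example.com", "https://apify.com"],
    "Sleep": 2,
    "waitUntil": "networkidle2",
    "cookies": [
        { "name": "session", "value": "abc123", "domain": "example.com" }
    ],
    "fullPage": true,
    "window_Width": 1920,
    "window_Height": 1080,
    "scrollToBottom": true,
    "distance": 100,
    "delay": 50
}
```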
For more information about the `waitUntil` parameter, refer to the documentation for Puppeteer's `page.goto()` function. In short, `"load"` waits for the load event, `"domcontentloaded"` for the DOMContentLoaded event, `"networkidle0"` until there are no network connections for at least 500 ms, and `"networkidle2"` until there are no more than two.
Once the actor finishes, it stores a screenshot of each website as a file in the Key-Value Store associated with the run. The screenshot URLs are also stored in a dataset for easy access.
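As a sketch, assuming you use the official `apify-client` Python package, fetching the results programmatically could look like this (the token and actor ID are placeholders):

```python
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

# Start the actor and wait for it to finish.
# "<ActorId>" stands for this actor's ID or "username/actor-name".
run = client.actor("<ActorId>").call(run_input={
    "link_urls": ["https://example.com"],
    "fullPage": True,
})

# Each dataset item is expected to contain a screenshot URL.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
```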
For each URL, the actor proceeds as follows:

1. Cookies are set using `page.setCookie()`, and the viewport is configured with the specified width and height.
2. The page is loaded using `page.goto()`, waiting for the specified `waitUntil` event.
3. If the `scrollToBottom` option is enabled, the actor executes a scrolling script that scrolls down the page by the defined `distance` in pixels.
4. The actor waits for the `Sleep` duration before capturing the screenshot, then saves it under a random filename.
5. Screenshots are uploaded to the Key-Value Store using `Actor.set_value()`, with their URLs stored in the dataset.

This open-source actor automates the process of capturing and storing screenshots of multiple web pages, making it a valuable tool for monitoring website changes, archiving content, or generating visual reports.
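The following is a minimal sketch of that flow with Pyppeteer and the Apify Python SDK. It is illustrative rather than the actor's actual source; the default values and the scrolling script are assumptions:

```python
import asyncio
import uuid

from apify import Actor
from pyppeteer import launch


async def capture(url: str, config: dict) -> None:
    # Assumes this runs inside an initialized Actor context
    # (e.g. `async with Actor:`), so Actor.set_value() works.
    browser = await launch()
    page = await browser.newPage()

    # Step 1: set cookies and configure the viewport.
    for cookie in config.get("cookies", []):
        await page.setCookie(cookie)
    await page.setViewport({
        "width": config.get("window_Width", 1280),
        "height": config.get("window_Height", 720),
    })

    # Step 2: load the page, waiting for the configured event.
    await page.goto(url, {"waitUntil": config.get("waitUntil", "load")})

    # Step 3: optionally scroll to the bottom in `distance`-pixel steps.
    if config.get("scrollToBottom"):
        await page.evaluate(
            """async (distance, delay) => {
                while (window.scrollY + window.innerHeight <
                       document.body.scrollHeight) {
                    window.scrollBy(0, distance);
                    await new Promise(r => setTimeout(r, delay));
                }
            }""",
            config.get("distance", 100),
            config.get("delay", 50),
        )

    # Steps 4-5: wait, capture, and store under a random filename.
    await asyncio.sleep(config.get("Sleep", 0))
    image = await page.screenshot({"fullPage": config.get("fullPage", False)})
    key = f"screenshot_{uuid.uuid4().hex}"
    await Actor.set_value(key, image, content_type="image/png")

    await browser.close()
```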
To get started, you can develop this actor locally by following these steps:
1. Install `apify-cli`:

   Using Homebrew:

   ```bash
   brew install apify-cli
   ```

   Using NPM:

   ```bash
   npm install -g apify-cli
   ```

2. Pull the Actor using its unique `<ActorId>`:

   ```bash
   apify pull <ActorId>
   ```
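You can then run it locally with the standard `apify-cli` run command (not specific to this actor):

```bash
apify run
```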
For any inquiries, you can reach me at:
- Email: fridaytechnolog@gmail.com
- GitHub: https://github.com/DZ-ABDLHAKIM
- Twitter: https://x.com/DZ_45Omar