BlueSky Feed Scraper

Scrapes data from a specified Bluesky feed URL and outputs detailed information about the posts, including metadata, authors, embedded media, and statistics such as likes, replies, and reposts.

Bluesky Feed Scraper for Apify

This is an Apify actor that scrapes data from a specified Bluesky feed URL and outputs detailed information about the posts, including metadata, authors, embedded media, and statistics such as likes, replies, and reposts.

Features

  • Scrapes Bluesky feed posts from a given feed URL.
  • Extracts detailed post data, including:
    • Author details (DID, handle, display name, avatar URL, etc.).
    • Post text, tags, and languages.
    • Embedded images, with metadata (alt text, aspect ratio, URLs).
    • Engagement statistics (likes, replies, reposts, quotes).
    • Thread and reply information.
    • Record metadata, including creation and indexing timestamps.

Input

The actor requires the following input:

Field  Type    Description
url    String  The URL of the Bluesky feed you want to scrape. Example: https://bsky.app/profile/username/feed

Example Input

{
  "url": "https://bsky.app/profile/c3rmen.bsky.social/feed"
}
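Before starting a run, it can help to sanity-check the input locally. The sketch below is an assumption about what a reasonable check looks like, not the actor's own validation: it only verifies that the URL is on bsky.app and has the /profile/<handle>/feed shape used in the example above. The helper name build_input is hypothetical.

```python
from urllib.parse import urlparse


def build_input(feed_url: str) -> dict:
    """Return an actor input dict after a basic shape check on the feed URL."""
    parsed = urlparse(feed_url)
    if parsed.scheme != "https" or parsed.netloc != "bsky.app":
        raise ValueError(f"expected an https://bsky.app URL, got: {feed_url}")
    # Path segments for the documented example: ["profile", "<handle>", "feed", ...]
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 3 or parts[0] != "profile" or parts[2] != "feed":
        raise ValueError(f"expected /profile/<handle>/feed in the path, got: {parsed.path}")
    return {"url": feed_url}


if __name__ == "__main__":
    print(build_input("https://bsky.app/profile/c3rmen.bsky.social/feed"))
```

The check is deliberately loose: it accepts extra path segments after feed, since the exact URL forms the actor supports are not listed here.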

Output

The actor produces a JSON array where each object represents a post from the feed. The structure includes:

  • uri and cid: Unique identifiers for the post.
  • author: Details about the author (DID, handle, avatar, etc.).
  • record: Post text, tags, languages, and embedded media.
  • embed: View-ready image metadata (e.g., thumbnails, full-size URLs).
  • Engagement metrics (replyCount, repostCount, likeCount, quoteCount).
  • Thread and reply-related data.
  • Timestamps (createdAt, indexedAt).
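As a rough illustration of working with this structure, the sketch below pulls a few of the listed fields out of one output item and adds up the four engagement counters. The function name summarize_post and the derived totalEngagement field are hypothetical and not part of the actor's output; missing keys fall back to defaults, because not every field is guaranteed to be present on every post.

```python
from typing import Any


def summarize_post(post: dict[str, Any]) -> dict[str, Any]:
    """Pick a few fields out of one output item; missing keys fall back to defaults."""
    author = post.get("author", {})
    record = post.get("record", {})
    counts = {
        key: int(post.get(key, 0))
        for key in ("replyCount", "repostCount", "likeCount", "quoteCount")
    }
    return {
        "uri": post.get("uri"),
        "handle": author.get("handle"),
        "text": record.get("text", ""),
        "createdAt": record.get("createdAt"),  # when the author created the post
        "indexedAt": post.get("indexedAt"),    # when the post was indexed
        "totalEngagement": sum(counts.values()),  # derived here, not an actor field
        **counts,
    }
```

Applied to a list of items, the same function makes it easy to sort posts, for example with sorted(map(summarize_post, posts), key=lambda p: p["totalEngagement"], reverse=True).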

Example Output

[
  {
    "uri": "at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.post/3lbsizxfxa22r",
    "cid": "bafyreifohcetdw6e5mudaz6anigzsm5ssjpm3oreyxu4a2l665k7hpxo4q",
    "author": {
      "did": "did:plc:z72i7hdynmk6r22z27h6tvur",
      "handle": "bsky.app",
      "displayName": "Bluesky",
      "avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:z72i7hdynmk6r22z27h6tvur/bafkreihagr2cmvl2jt4mgx3sppwe2it3fwolkrbtjrhcnwjk4jdijhsoze@jpeg",
      "associated": {
        "chat": {
          "allowIncoming": "none"
        }
      },
      "labels": [],
      "createdAt": "2023-04-12T04:53:57.057Z"
    },
    "record": {
      "createdAt": "2024-11-25T21:52:30.840Z",
      "embed": {
        "external": {
          "description": "Bluesky is social media as it should be. Find your community among millions of users, unleash your creativity, and have some fun again. https://bsky.app",
          "thumb": {
            "ref": {
              "$link": "bafkreihh7dthuxfqel6zwcmxapcu47tr34rat7thjtxlfmrwidvxfsmqne"
            },
            "mimeType": "image/jpeg",
            "size": 384236,
            "$type": "blob"
          },
          "title": "BlueskySocial - Twitch",
          "uri": "https://www.twitch.tv/blueskysocial"
        },
        "$type": "app.bsky.embed.external"
      },
      "facets": [
        {
          "features": [
            {
              "did": "did:plc:qjeavhlw222ppsre4rscd3n2",
              "$type": "app.bsky.richtext.facet#mention"
            }
          ],
          "index": {
            "byteEnd": 55,
            "byteStart": 40
          },
          "$type": "app.bsky.richtext.facet"
        },
        {
          "features": [
            {
              "did": "did:plc:ragtjsm2j2vknwkz3zp4oxrd",
              "$type": "app.bsky.richtext.facet#mention"
            }
          ],
          "index": {
            "byteEnd": 76,
            "byteStart": 64
          },
          "$type": "app.bsky.richtext.facet"
        },
        {
          "features": [
            {
              "did": "did:plc:4ewnpnebeh7zuk5pbardaxqz",
              "$type": "app.bsky.richtext.facet#mention"
            }
          ],
          "index": {
            "byteEnd": 226,
            "byteStart": 203
          },
          "$type": "app.bsky.richtext.facet"
        }
      ],
      "langs": [
        "en"
      ],
      "text": "Join us for another livestream with COO @rose.bsky.team and CTO @pfrazee.com, where they'll share team updates, the story of how Bluesky began, and what’s next. \n\nPlus, a special guest appearance from @flavorflav.bsky.social! 🎉\n\nToday 11/25 @ 5 pm PT / 8 pm ET / 1 am GMT / 10am JST",
      "$type": "app.bsky.feed.post"
    },
    "embed": {
      "external": {
        "uri": "https://www.twitch.tv/blueskysocial",
        "title": "BlueskySocial - Twitch",
        "description": "Bluesky is social media as it should be. Find your community among millions of users, unleash your creativity, and have some fun again. https://bsky.app",
        "thumb": "https://cdn.bsky.app/img/feed_thumbnail/plain/did:plc:z72i7hdynmk6r22z27h6tvur/bafkreihh7dthuxfqel6zwcmxapcu47tr34rat7thjtxlfmrwidvxfsmqne@jpeg"
      },
      "$type": "app.bsky.embed.external#view"
    },
    "replyCount": 324,
    "repostCount": 1041,
    "likeCount": 9147,
    "quoteCount": 84,
    "indexedAt": "2024-11-25T21:52:35.058Z",
    "labels": []
  },
  // ...more posts
]
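The facets array in the example record marks mentions by byte range rather than by character position: byteStart and byteEnd are offsets into the UTF-8 encoding of record.text. This matters as soon as the text contains multi-byte characters such as the curly apostrophe or the emoji in the example. The sketch below shows one way to recover the mentioned handles; extract_mentions is a hypothetical helper, not something the actor provides.

```python
def extract_mentions(record: dict) -> list[tuple[str, str]]:
    """Return (mention text, DID) pairs found in a post record's facets."""
    # Offsets are byte positions, so slice the encoded text, not the str.
    raw = record.get("text", "").encode("utf-8")
    mentions = []
    for facet in record.get("facets", []):
        index = facet.get("index", {})
        start, end = index.get("byteStart"), index.get("byteEnd")
        if start is None or end is None:
            continue  # skip facets without a usable byte range
        for feature in facet.get("features", []):
            if feature.get("$type") == "app.bsky.richtext.facet#mention":
                text = raw[start:end].decode("utf-8", errors="replace")
                mentions.append((text, feature.get("did", "")))
    return mentions
```

On the example record above, the three byte ranges correspond to @rose.bsky.team, @pfrazee.com, and @flavorflav.bsky.social. Slicing the Python string directly with the same numbers would be off by two characters for the third mention, because the curly apostrophe before it occupies three bytes.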

Usage

  1. Deploy the Actor: Use the Apify console to set up and deploy this actor.
  2. Provide Input: Supply the url in the input configuration.
  3. Run the Actor: Start the actor, and it will scrape the feed URL and return the posts as JSON.
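The same three steps can also be driven from code instead of the console. The following is a minimal sketch using the apify-client Python package, under several assumptions that this document does not confirm: the actor ID is a placeholder, the API token is read from an environment variable named APIFY_TOKEN, and the posts are written to the run's default dataset. The scrape_feed function is hypothetical and takes the client as a parameter so it can be exercised without network access.

```python
from typing import Any, Iterable

# Placeholder only: the real actor ID is not given in this document.
ACTOR_ID = "<username>/<actor-name>"


def scrape_feed(client: Any, actor_id: str, feed_url: str) -> list[dict]:
    """Start a run with the documented input and collect the dataset items."""
    # call() starts the actor and waits for the run to finish.
    run = client.actor(actor_id).call(run_input={"url": feed_url})
    # Assumption: results are stored in the run's default dataset.
    items: Iterable[dict] = client.dataset(run["defaultDatasetId"]).iterate_items()
    return list(items)


if __name__ == "__main__":
    import os

    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    posts = scrape_feed(client, ACTOR_ID, "https://bsky.app/profile/c3rmen.bsky.social/feed")
    print(f"fetched {len(posts)} posts")
```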

Notes

  • Ensure the url is publicly accessible.
  • The actor fetches only visible posts; private or restricted feeds will not be included.

Feel free to suggest additional features or report any issues! 🚀

Frequently Asked Questions

Is it legal to scrape public data?

Generally yes, if you're scraping publicly available data for personal or internal use. Always review the website's Terms of Service before large-scale use or redistribution.

Do I need to code to use this scraper?

No. This is a no-code tool: just enter the Bluesky feed URL and run the scraper directly from your dashboard or Apify actor page.

What data does it extract?

It extracts post identifiers (uri, cid), author details, post text, tags and languages, embedded media, engagement counts (replies, reposts, likes, quotes), and creation and indexing timestamps. You can export all of it to Excel or JSON.

Can I scrape multiple pages or filter the results?

The only input documented above is url. No paging or filtering options are described, so any filtering has to be applied to the returned JSON after the run.

How do I get started?

You can use the Try Now button on this page to go to the scraper. You'll be guided to enter a feed URL and get structured results. No setup needed!