A simple tool to help analyze GitHub repository commits. It checks out the repository and gets all relevant commit messages, then uses OpenAI to answer questions asked by the user. This is done through the PydanticAI framework.
You can pick this Actor from the Apify Store and run it on the Apify platform.
Enter the repository name and your question, then start the Actor. Optionally, you can choose a specific branch if your question does not relate to the repository's default branch.
If you do not provide your own OpenAI API key, the Actor uses ours, which adds to the cost of running the Actor. You can pass your own OpenAI API key to significantly reduce the run costs.
Inputs:
prompt: Show several most complicated changes done last month.
repository: apify/crawlee-python
Result:
Here are some of the most complicated changes from last month in the apify/crawlee-python repository:

- Status Code Handling Update: This refactor removed parameters and methods related to HTTP error status codes in HTTP clients, moved the logic to a different class, and updated tests to ensure proper handling of session-blocking status codes and error codes that require retries. It was a significant change due to its impact on multiple components such as Session, SessionPool, PlaywrightCrawler, and HttpCrawler.
- Session Cookie Management: The approach to handling cookies in a session changed from a plain dictionary to a more sophisticated SessionCookies class incorporating CookieJar. It supports basic cookie parameters and multiple domains, and required extensive updates to tests to cover multi-domain scenarios.
- Fingerprint Integration: The browserforge package was integrated to enable fingerprint and header generation in PlaywrightCrawler, adding significant functionality that enhances crawling with generated fingerprints.

These complex changes involved substantial modifications to multiple parts of the codebase, including handling complex data structures, refactoring logic spread across different modules, and careful testing to ensure stability.
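For reference, running the Actor programmatically with the example inputs above could look like the minimal sketch below, using the Apify Python client. The Actor ID placeholder and the exact input field names are assumptions; check the Actor's input schema on its Apify Store page.

```python
from apify_client import ApifyClient

# Assumed Actor ID and input field names; verify against the Actor's
# input schema before running.
client = ApifyClient('<YOUR_APIFY_TOKEN>')
run = client.actor('<ACTOR_ID>').call(
    run_input={
        'prompt': 'Show several most complicated changes done last month.',
        'repository': 'apify/crawlee-python',
        # 'branch': 'develop',        # optional: analyze a non-default branch
        # 'openaiApiKey': 'sk-...',   # optional: pass your own key to cut costs
    },
)

# The answer is typically stored in the run's default dataset.
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item)
```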
This Actor defines one main AI agent that is responsible for processing the prompt and returning the desired output. It uses one tool that gets the commit summaries for the main agent.
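A minimal sketch of what such an agent can look like in PydanticAI follows; the model name, prompts, and tool body are illustrative assumptions, not the Actor's actual code:

```python
from pydantic_ai import Agent, RunContext

# Main agent: answers the user's question from the repository's commit history.
main_agent = Agent(
    'openai:gpt-4o',  # assumed model; the Actor may use a different one
    system_prompt=(
        'You answer questions about a GitHub repository. '
        'First call the tool to fetch commit summaries relevant to the question.'
    ),
)

@main_agent.tool
async def get_commit_summaries(ctx: RunContext[None], question: str) -> str:
    """Placeholder tool body; a fuller sketch of this tool follows below."""
    return 'feat: example commit summary (placeholder)'

result = main_agent.run_sync('Show several most complicated changes done last month.')
print(result.output)  # `result.data` in older PydanticAI releases
```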
The tool for getting the commit summaries is responsible for suggesting the relevant time scope of the prompt, getting the raw commit messages within that time scope, and prefiltering the commits based on whether they seem relevant to the main prompt. It uses two different AI agents through what the PydanticAI documentation describes as programmatic agent hand-off, sketched below:
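In this sketch the hand-off simply runs the two helper agents in sequence inside the tool, passing ctx.usage along so token usage is tracked across agents. The helper model names, the TimeScope schema, and the get_raw_commit_messages helper are all hypothetical; main_agent is the one from the sketch above.

```python
from datetime import datetime

from pydantic import BaseModel
from pydantic_ai import Agent, RunContext


class TimeScope(BaseModel):
    # Hypothetical structured output: the commit window inferred from the prompt.
    since: datetime
    until: datetime


# Helper agent 1: suggest the time scope the question refers to.
scope_agent = Agent(
    'openai:gpt-4o-mini',
    output_type=TimeScope,  # `result_type` in older PydanticAI releases
    system_prompt='Infer the time range of commits the question refers to.',
)

# Helper agent 2: prefilter raw commit messages for relevance.
filter_agent = Agent(
    'openai:gpt-4o-mini',
    system_prompt='Return only the commit messages relevant to the question.',
)


@main_agent.tool  # `main_agent` as defined in the previous sketch
async def get_commit_summaries(ctx: RunContext[None], question: str) -> str:
    """Programmatic hand-off: run the two helper agents in sequence."""
    scope = await scope_agent.run(question, usage=ctx.usage)
    # `get_raw_commit_messages` is a hypothetical helper that reads the
    # checked-out repository's git log within the suggested time scope.
    raw = get_raw_commit_messages(scope.output.since, scope.output.until)
    filtered = await filter_agent.run(
        f'Question: {question}\n\nCommit messages:\n{raw}',
        usage=ctx.usage,
    )
    return filtered.output
```

Passing usage=ctx.usage is the detail PydanticAI's multi-agent documentation highlights for this pattern: it lets the parent run account for tokens spent by the delegated agents.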