Filter 'Active On GitHub, Not Cloned' By Recency

by Jule

Guys, let's talk about how we can make the 'active on GitHub, not cloned' list more relevant and up to date!

Current Behavior

When we fetch active repositories, we get a mix of recent and old ones. This is misleading: repos that haven't been active in a long time still show up as 'active on GitHub, not cloned'. We already have the data to filter by recency, but we're not using it.

What We Want

We want to treat 'active' as 'pushed in the last N days'. Let's default N to 90 and make it configurable via an environment variable. Setting it to 0 could disable the filter, which preserves the current behavior.
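
As a rough sketch of what that could look like, here is one way to read the setting and test a single timestamp against it. The variable name ACTIVE_RECENCY_DAYS and both helper names are made up for illustration, and the code assumes pushed_at arrives as a GitHub-style ISO 8601 UTC string such as "2024-01-01T00:00:00Z".

```python
import os
from datetime import datetime, timedelta, timezone

DEFAULT_RECENCY_DAYS = 90  # default proposed above


def recency_days(environ=os.environ):
    """Read the recency window from the environment.

    ACTIVE_RECENCY_DAYS is a hypothetical variable name. Missing,
    non-numeric, or negative values fall back to the default.
    """
    raw = environ.get("ACTIVE_RECENCY_DAYS", "").strip()
    try:
        days = int(raw)
    except ValueError:
        return DEFAULT_RECENCY_DAYS
    return days if days >= 0 else DEFAULT_RECENCY_DAYS


def is_recent(pushed_at, days, now=None):
    """Return True if pushed_at falls within the last `days` days.

    days == 0 disables the filter, so every repo counts as recent.
    Assumes pushed_at carries a UTC offset or a trailing "Z".
    """
    if days == 0:
        return True
    now = now or datetime.now(timezone.utc)
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    pushed = datetime.fromisoformat(pushed_at.replace("Z", "+00:00"))
    return pushed >= now - timedelta(days=days)
```

Falling back to the default on bad input is a design choice, not a requirement; failing loudly at startup would be just as reasonable.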

Where the Filter Goes

There are two reasonable spots to apply this filter:

  1. At insert time - Drop stale rows before they hit the active_remote_repos table. This keeps the table small and the rest of the codebase oblivious to the cutoff.

  2. At query time - Filter in get_active_repo_by_full_name and whatever populates the dashboard panel. This lets the user change the recency window and have it take effect on the next page load.
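
To make the query-time option concrete, here is a minimal sketch using sqlite3. The table and function names come from above, but the storage engine, the column names (full_name, pushed_at), and the function signature are assumptions. It also assumes pushed_at is stored as a UTC string in the form "YYYY-MM-DDTHH:MM:SSZ", so that plain string comparison orders timestamps correctly.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def cutoff_iso(days, now=None):
    """Format the oldest acceptable push time like the stored timestamps."""
    now = now or datetime.now(timezone.utc)
    return (now - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%SZ")


def get_active_repo_by_full_name(conn, full_name, days, now=None):
    """Look up one repo, hiding it if its last push is older than the cutoff.

    days == 0 skips the recency clause entirely (the current behavior).
    """
    sql = (
        "SELECT full_name, pushed_at FROM active_remote_repos "
        "WHERE full_name = ?"
    )
    params = [full_name]
    if days > 0:
        # Same-format ISO 8601 UTC strings sort chronologically as text.
        sql += " AND pushed_at >= ?"
        params.append(cutoff_iso(days, now))
    return conn.execute(sql, params).fetchone()
```

Because the cutoff is computed on each call, a changed setting applies to the next lookup without touching the stored rows. The insert-time option would use the same cutoff, but as a check before writing to the table.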

Option 2 is probably nicer, as it matches the README's promise of rebuilding the schema on every process start.

Adjacent: Also Exclude Archived/Fork-Only

We should also exclude archived repos by default, as they're explicitly not active. For forks, we can keep them visible by default, maybe with a UI tag so the user knows it's a fork.
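
A small sketch of that rule, assuming each repo is a dict shaped like a GitHub REST API repository object, which carries boolean archived and fork fields. The function name, the include_archived switch, and the tags key are hypothetical.

```python
def visible_repos(repos, include_archived=False):
    """Drop archived repos by default and tag forks for the UI.

    Each repo is assumed to be a dict with optional boolean "archived"
    and "fork" keys. The returned dicts are copies with an added "tags"
    list, a made-up field the dashboard could render as a badge.
    """
    visible = []
    for repo in repos:
        if repo.get("archived", False) and not include_archived:
            continue  # archived repos are explicitly not active
        tags = ["fork"] if repo.get("fork", False) else []
        visible.append({**repo, "tags": tags})
    return visible
```

Forks stay in the list either way; only the tag differs, which matches the idea of keeping them visible but labeled.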

Let's make our GitHub experience more relevant and useful by implementing these changes! Remember, the goal is to show users the repos that are truly active and relevant to them.

Happy coding!