The Intersection of Language X with Tool Y

If

  • you like language X and are learning tool Y,

  • or you're learning language X, and you like tool Y

I recommend examining a successful repo that uses both, so that at least something will be familiar. Unfortunately, GitHub's advanced search doesn't make this an easy thing to search for.

So I'm going to hack together something awful and publish it. Then, surely, a solution will come along that makes my approach seem embarrassingly bad.

What follows is an embarrassingly bad way to search GitHub.

If at first you don't succeed...

I like nix and I'm dabbling in go, so today I want go projects with a flake.nix file in them.

GitHub claims to support searches like this. It said I should use this syntax: path:/flake.nix language:Go

Then it said,

Your search did not match any code

a dubious claim.

...hack together something awful

First, we need a large list of projects in our desired language. So I found a list of go projects and cloned it. (I did this previously with nim, and in that case I used the repo for its package manager.)

Then I found the GitHub URLs:

❯ grep -E -o -R 'github\.com/[a-zA-Z0-9_-]*/[a-zA-Z0-9_-]*' | grep -v awesome | sed 's/^.*://' | sort | uniq

Here they are:

github.com/0xcafed00d/joystick
github.com/0xERR0R/blocky
github.com/1set/cronrange
github.com/1set/gut
...

Except there are 2383 of them. Here's the ugly part: I'm going to send out two HTTP requests for each of these, one for master and one for main:

  • https://raw.githubusercontent.com/{owner}/{repo}/master/flake.nix

  • https://raw.githubusercontent.com/{owner}/{repo}/main/flake.nix

That's 4766 requests.

Most will fail, but some will succeed. The successes show which repos contain a flake.nix file at their root, which is a good indicator that they're using nix in the way that I want.
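The URL rewrite is simple enough to sketch as a small shell function. The name candidate_urls is mine, made up for illustration, and it bakes in the same guess as above: that the default branch is called either master or main.

```shell
# candidate_urls is a hypothetical helper: given "github.com/owner/repo",
# print the two raw URLs where a root-level flake.nix might live.
candidate_urls() {
    # Drop the "github.com/" prefix and put the raw-content host in front.
    raw="https://raw.githubusercontent.com/${1#github.com/}"
    for branch in master main
    do
        printf '%s/%s/flake.nix\n' "$raw" "$branch"
    done
}

candidate_urls github.com/joerdav/xc
```

Nothing is fetched here. It only prints the two URLs that would be requested for that repo.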

To do this, I added a while loop that checks each of these for a flake.nix:

❯ grep -E -o -R 'github\.com/[a-zA-Z0-9_-]*/[a-zA-Z0-9_-]*' | grep -v awesome | sed 's/^.*://' | sort | uniq | while read -r pkg
do
    # Point the same owner/repo path at the raw-content host
    rawpkg=$(echo "$pkg" | sed 's#github.com#https://raw.githubusercontent.com#')
    echo "$pkg"
    echo -n '    '
    for branch in master main
    do
        # Keep only the status line, and print it only if it is a 200
        curl -si "$rawpkg/$branch/flake.nix" | head -1 | grep 200
    done
    echo
done

Which, after 45 minutes, produces output like this:

github.com/0xcafed00d/joystick
github.com/0xERR0R/blocky
...
github.com/JoelOtter/termloop
github.com/joerdav/xc
    HTTP/2 200
github.com/JohannesKaufmann/html-to-markdown
...
github.com/sadlil/go-trigger
github.com/sagikazarmark/modern-go-application
    HTTP/2 200
github.com/SaidinWoT/timespan
...

Of the go projects listed, I found two with flake.nix. That's two more than GitHub's search found.

(Conveniently, one of them had this file, which happens to solve the problem that sent me searching in the first place.)

It could be better

This sucks. It creates a lot of web traffic for a single search. It takes forever. It might even be against GitHub's rules. You probably shouldn't do it.

But it works. And so far I've found it helpful.

It could be made less bad on my end, but the "right" way to solve this problem is for GitHub to fix its search tool. 🤞 they do. Until then, I'm sharing my awful workaround. Enjoy!
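For the curious, here is roughly what "less bad on my end" might look like. Treat it as a sketch that I have not run against all 2383 repos. The function name has_flake is made up, and it assumes the raw host answers a HEAD request with the same status code it would give a GET. It asks curl for nothing but the status code, so no file bodies get downloaded, and it stops at the first branch that answers 200.

```shell
# has_flake is a hypothetical helper. It prints the branch that has a
# root-level flake.nix and returns 0, or returns 1 if neither branch does.
has_flake() {
    raw="https://raw.githubusercontent.com/${1#github.com/}"
    for branch in master main
    do
        # -s: quiet, -I: HEAD request, -o /dev/null: discard the headers,
        # -w '%{http_code}': print only the numeric status code.
        code=$(curl -s -I -o /dev/null -w '%{http_code}' "$raw/$branch/flake.nix")
        if [ "$code" = "200" ]
        then
            echo "$branch"
            return 0
        fi
    done
    return 1
}
```

It drops into the loop in place of the inner for, something like has_flake "$pkg" && echo "$pkg". Adding a sleep between repos would make it slower still, but kinder to GitHub.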