Carl Sednaoui

Getting ~28,000 devs’ email addresses in a couple of hours

TLDR: It is really easy to scrape ~28,000 profile email addresses from GitHub. If you don’t want unsolicited emails then I’d recommend changing your public email address to your Twitter handle.

The backstory

New York City was brought to a halt this past Sunday night by Hurricane Sandy. As it turns out, rare are the occasions when I find myself at home with free time and energy left… so, as any good dev-in-training would, I decided to pop open a terminal window. While refactoring some code for CourseBacon (which you should totally go visit if you’re looking to learn something new), I suddenly felt an urge to do something a bit more exciting than working on old code.

Being an avid HN follower, I’ve noticed that most devs tend to hate (almost) every recruiter out there - and with good reason. On the same note, I recently got an email from a good friend saying that he had just received a crappy recruiter email that had nothing to do with his skill set. He also mentioned that the recruiter got his email address via GitHub.

My friend was extremely surprised when I told him that he can remove his public email address directly from his GitHub profile settings by simply leaving the “Email” field blank (I choose to show my Twitter handle - more on that later).

The fun begins

As the gears started turning in my head I wondered how many devs were making the same “mistake” as my friend - leaving their email address exposed when they really didn’t mean to.

After taking a quick look at some GitHub search results and glancing at their API, I came up with the following code (EDIT: the gist has been deleted). I let this bad boy run for a couple of hours (I was grabbing drinks with the neighbors at this point) and, next thing you know, here’s the result:

Yup, that’s right, over 28,000 email addresses straight from GitHub. There is absolutely nothing advanced about this code and, as you can see, anyone can create a “small” database of roughly 28,000 devs’ email addresses. If you want to get fancy you can also get each user’s number of repos, followers and so on.
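The original gist is gone, so what follows is not the code I ran. It is only a minimal sketch of why nothing advanced is needed. It assumes a public profile record shaped like the JSON that GitHub’s REST API returns for a user, with `email`, `public_repos` and `followers` fields. The sample payload and the `public_fields` helper are made up for illustration.

```python
import json

# Hypothetical sample payload, shaped like a public user profile record.
# The login, email and counts are invented for this example.
SAMPLE = (
    '{"login": "example-dev", "email": "dev@example.com", '
    '"public_repos": 12, "followers": 34}'
)


def public_fields(profile):
    """Pick out the publicly visible fields mentioned above."""
    return {
        "login": profile.get("login"),
        # The email is null (None) or empty when the user has not made one public.
        "email": profile.get("email") or None,
        "public_repos": profile.get("public_repos", 0),
        "followers": profile.get("followers", 0),
    }


record = public_fields(json.loads(SAMPLE))
print(record)
```

The point is how little is involved: the email, when present, sits in a plain field right next to the repo and follower counts.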

This is absolutely shocking when you think about it! Since when is it so easy to get ~28,000 email addresses of people making an average of 70k+? As a matter of fact, what email addresses are you looking for? Some @google.com emails? Got it! Some @facebook.com emails? You bet.

Note: I will NOT be sharing this list with anyone. Also, I understand that there is a possibility that many (ok, let’s say most) people on this list have chosen to leave their email address visible because they want to engage with the community. But, if you hate unsolicited mail as much as I do, I would advise that instead of having your email publicly available you leave your Twitter handle. People can engage with you through Twitter and, if needed, ask for your email address then and there. By doing this you’ll get zero unsolicited emails from the GitHub vultures.

Just like you, I really, really hate spam. But, before complaining about how annoying unsolicited emails are, I’d like to challenge you to see if there is anything YOU can do to proactively protect your inbox (that’s right, protect it).
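One proactive step is to check what your own public profile gives away. Here is a minimal sketch, assuming the public `https://api.github.com/users/<username>` endpoint and its `email` field, which is null when no public email is set. The helper names `fetch_public_profile` and `exposed_email` are mine, not part of any library, and unauthenticated requests are rate limited, so treat this as a one-off check.

```python
import json
import urllib.request


def fetch_public_profile(username):
    """Fetch the public profile JSON for one user (makes a network call)."""
    url = f"https://api.github.com/users/{username}"
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "profile-self-check",  # arbitrary label for this script
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def exposed_email(profile):
    """Return the public email in a profile dict, or None if nothing is exposed."""
    email = (profile.get("email") or "").strip()
    return email or None


# Usage (replace the placeholder with your own username):
#   email = exposed_email(fetch_public_profile("your-username"))
#   print("Exposed:" if email else "No public email set.", email or "")
```

If this prints an address you did not mean to share, blank out the “Email” field in your profile settings and run it again.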

To all my NYC folks, I hope that you’re safe and that (unlike me) you do have power and phone signal in your area.

I like to Tweet about marketing and programming, you should follow me.