Developers Keep Leaving Secret Keys to Corporate Data Vulnerable on GitHub



The hackers who stole data on 50,000 Uber drivers in 2014 didn’t have to do much hacking at all. They got into the company’s database using login credentials they’d found on GitHub, the code-sharing website used by more than 14 million developers. An Uber employee had uploaded the credentials to GitHub by accident, and left them on a public page for months.

For years, developers have been inadvertently publishing credentials that grant access to myriad systems, such as databases, web hosting accounts, encrypted email, and various apps. It’s an easy mistake to make that can lead to catastrophic breaches, particularly when the credentials can unlock systems that are crucial to business functions.

In a blog post published last week, the security firm Detectify said it analyzed public GitHub repositories and found more than 1,500 unique “access tokens” that could be used to retrieve private messages from Slack—the popular office messaging app that many companies rely on as their primary communication platform.

“These tokens belong to different users and companies,” the firm said in its post, adding that some of the tokens were linked to “Forbes 500 companies, payment providers, multiple Internet service providers and health care providers.”

To make matters worse, many of the credentials that turned up in the firm’s search functioned “exactly like having the complete username and password for the user,” according to the blog post. “Even for a user with two factor authentication enabled, you can still access Slack with nothing else but this token.”

When developers accidentally publish these kinds of credentials, they’re often working on public, open-source code projects. In the case of Slack, those projects could be things such as chat bots, which interact with users inside the messaging app. The code behind such bots might not be sensitive, but the credentials that allow the bot to connect to a particular Slack group are.

The problem is, when developers publish their bot code to GitHub in order to share it with others, they sometimes forget to remove the credentials they’ve hard-coded into the project.
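One common way to avoid that mistake is to keep the token out of the source file entirely and read it from the environment when the program starts. The following is a minimal sketch of that idea, not code from any of the projects mentioned here; the variable name `SLACK_BOT_TOKEN` and the function `load_slack_token` are hypothetical names chosen for illustration.

```python
import os


def load_slack_token(env=os.environ):
    """Return the bot token from the environment, or raise if it is missing.

    SLACK_BOT_TOKEN is a hypothetical variable name used for illustration.
    """
    token = env.get("SLACK_BOT_TOKEN", "").strip()
    if not token:
        # Failing loudly is safer than falling back to a hard-coded default.
        raise RuntimeError("SLACK_BOT_TOKEN is not set")
    return token
```

With this arrangement, the code published to a public repository contains only the name of the variable, and the secret itself lives on the machine that runs the bot.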

All-powerful access tokens, like the ones Slack gives out to developers, are common among most modern, consumer-facing apps. Google, Facebook and Twitter all give out such tokens so that developers can access their respective APIs. The trouble is, developers may very well accidentally upload their access credentials to public pages—and they frequently do.

And at any given time, hackers may have programs running that crawl GitHub, searching for exposed credentials. So even if a developer happens to catch the mistake quickly, it can be too late.

The issue has become so widespread for Amazon’s web services division, for example, that the company now constantly monitors public repositories on GitHub, looking for access keys that its customers have inadvertently published. When Amazon finds the keys, it emails its customers to let them know they’re at risk.

Programmers have also been called out for accidentally publishing their own private encryption keys, which are created to send and receive secure messages but only work if they’re kept secret. As early as 2013, links to GitHub searches revealing published encryption keys were spreading across social media, and many are still on GitHub today.
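The automated searching described above, whether done by attackers or by a company protecting its customers, can be approximated with simple pattern matching. The sketch below is only an illustration, not the tooling that Amazon, Detectify or anyone else actually uses. The three patterns are rough assumptions based on commonly seen formats (an `AKIA` prefix for Amazon access key IDs, an `xox` prefix for Slack tokens, and the header line of a PEM private key), and the function name `find_secrets` is hypothetical.

```python
import re

# Rough, assumed patterns for illustration; real scanners use many more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "slack_token": re.compile(r"\bxox[abprs]-[0-9A-Za-z-]{10,}"),
    "private_key_block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def find_secrets(text):
    """Return (line_number, pattern_name) pairs for lines that look like credentials."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

A developer could run a check like this over a project before publishing it. A match is only a hint that a line deserves a second look, and a clean result does not prove that no secret is present.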

After Uber’s data breach in 2014, in which the names and driver’s license numbers of 50,000 of its drivers were stolen, the company filed a “John Doe” lawsuit against the unknown thief. By filing the suit, Uber was able to convince a California district court to issue subpoenas to GitHub and Comcast.

The GitHub subpoena ordered that the company release the IP addresses of everyone who visited any page on its website where Uber’s login credentials had been published. The Comcast subpoena required that the company turn over the names and addresses of the people linked to those IP addresses.

GitHub said it could not comment on the matter, but it did apparently turn over the IP addresses to Uber. Citing anonymous sources, Reuters reported last October that one of the IP addresses had been traced back to Chris Lambert—the chief technology officer of Uber’s biggest rival, Lyft.

By December, the US Justice Department had launched an investigation into the data breach, but it has not said whether it is looking into Lyft’s possible involvement. In a statement sent by email, Lyft said it was not in any way involved in the data breach.

“We investigated this matter long ago,” the statement said, “and there is no evidence that any Lyft employee downloaded the Uber driver information or database, or had anything to do with Uber’s May 2014 data breach.”