Researchers warned for years that Facebook's phone number lookup tools were ripe for abuse. Then 500 million users' data was stolen.

Facebook fixed a security vulnerability in August 2019 that had allowed hackers to scrape more than 500 million users' data from its app.

But the company had been warned about the risks of the tools that enabled the data exfiltration years before it finally took action.

And when security researchers reported an almost identical vulnerability on Facebook-owned WhatsApp only a month after that fix, Facebook initially denied it was even a problem, then took months to tackle it.

Insider spoke to security researchers and reviewed academic literature showing that Facebook was repeatedly cautioned, as early as 2012, about the risks of data scraping via its various phone number look-up tools. Last week, Insider reported that 533 million users had their personal data stolen from the social network through those tools and posted online. That Facebook data included phone numbers, locations, and other personally identifiable information.

These incidents illustrate how features core to Facebook's products have enabled data misuse and raise questions about how the company assesses the severity of threats to user data as well as how it communicates these issues across its various products.

In a statement, Facebook spokesperson Joe Osborne said: "Over the years, we've worked with the security community to improve measures that protect people's privacy, no matter how unlikely the threat. As shown with LinkedIn, no company can completely eliminate scraping, but we can all strengthen our defenses. WhatsApp has made server-side changes to help prevent the possibility of large-scale crawling exercises like those outlined in the report. And Facebook has made changes to the contact importer feature to make scraping more difficult."

Half a billion users' data stolen

The issue centers around how Facebook allows users to search for contacts via their phone numbers — and particularly through its "contact importer" tool.

This allows users to upload their contact books containing phone numbers of their friends and contacts, to see if they're on the app, view information about them, and add them as friends. Versions of the tool have existed for both Facebook and WhatsApp.

Attackers could, researchers said, create a fake contact book with hypothetical numbers in it, and then harvest the personal data of the real people who these numbers happened to correspond to. 

Sometime before August 2019, exactly this happened. An unidentified attacker stole personal data of 533 million people from Facebook, and Insider's Aaron Holmes first reported last week that this data is now freely available online. After the company detected malicious activity in 2019, it made changes to the feature in August of that year to prevent it, the company wrote in a blog post this week. It's unclear if this high-profile theft specifically was what sparked the fix.

But long before the 2019 attack, these security risks were well known within the industry. Researchers had written extensively about how automated "scraping" techniques could be used to harvest personal data from online platforms. And at least as early as 2012, academics cautioned that phone number look-up tools specifically were open to abuse, and successfully demonstrated how such attacks worked on both WhatsApp and Facebook.

An academic paper prepared for the Network and Distributed System Security Symposium in 2012 evaluated the security of WhatsApp, two years before it was acquired by Facebook. The researchers found that by generating random phone numbers, they were able to upload an address book of 10 million potential numbers — every number in the San Diego, California, area code — to the messaging app, which yielded 21,095 numbers that used WhatsApp, as well as their "About" bios. 

Five years later, a separate team of researchers warned that another of Facebook's tools for looking up phone numbers was vulnerable. They used the social network's search tool and hundreds of thousands of random numbers to obtain more than 80,000 users' data, including "friends, current city, home town, education, family, work and relationship." Their findings were published in ISPEC 2017: Information Security Practice and Experience.

One way to tackle these issues is with rate-limiting: tools that automatically detect when a user is making an unusually high number of queries or searches, and then stop or slow them.
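To make the idea concrete, here is a minimal Python sketch of a per-account sliding-window rate limiter. It is an illustration only and does not describe how Facebook or WhatsApp built their defenses; the class name `LookupRateLimiter`, its parameters, and the limits in the example are all hypothetical.

```python
from collections import defaultdict, deque


class LookupRateLimiter:
    """Hypothetical limiter: each account may check at most
    `max_lookups` phone numbers within any `window_seconds` span."""

    def __init__(self, max_lookups, window_seconds):
        self.max_lookups = max_lookups
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # account -> (timestamp, count) pairs
        self._totals = defaultdict(int)    # account -> lookups inside the window

    def allow(self, account_id, count, now):
        """Return True and record the batch if it fits in the window."""
        events = self._events[account_id]
        # Forget batches that have aged out of the window.
        while events and events[0][0] <= now - self.window_seconds:
            _, expired = events.popleft()
            self._totals[account_id] -= expired
        if self._totals[account_id] + count > self.max_lookups:
            return False  # over the limit: reject (or slow down) the request
        events.append((now, count))
        self._totals[account_id] += count
        return True


# Example with an assumed limit of 1,000 lookups per day (86,400 seconds).
limiter = LookupRateLimiter(max_lookups=1000, window_seconds=86_400)
first = limiter.allow("account-a", 300, now=0)       # True: 300 of 1,000 used
too_big = limiter.allow("account-a", 800, now=10)    # False: would reach 1,100
fits = limiter.allow("account-a", 700, now=20)       # True: exactly 1,000 used
next_day = limiter.allow("account-a", 300, now=86_400)  # True: first batch expired
```

A limiter like this only counts requests per account, so an attacker who spreads queries across many accounts can still get through. That is one reason such limits are usually paired with other signals.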

"The existence of scraping attacks on both WhatsApp and Facebook is not itself a surprising revelation," Christoph Hagen, a researcher who investigated WhatsApp in 2019, told Insider. "It has been done in the past."

"It's more surprising that it was still relatively easy when we tried it," he said.

Facebook still downplayed the issue, researchers say

In 2019, one month after Facebook tackled the contact lookup issue on its core app, Hagen and another researcher got in touch with the company.

They had been investigating Facebook-owned messaging app WhatsApp, and found that it was still vulnerable to large-scale contact data-scraping. Hagen, a research assistant at the University of Würzburg, and Christian Weinert, a doctoral researcher at TU Darmstadt, were able to test 50 million phone numbers — roughly 10% of all US numbers — and identify more than 4.5 million WhatsApp users. (WhatsApp had at this point set some limits on the maximum number of contact lookups in a span of time, but this was 60,000 per day, so the researchers were still able to check millions of numbers in a little over a month.)
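A rough back-of-the-envelope calculation shows why a 60,000-per-day cap did little to stop enumeration at this scale. The Python sketch below uses the figures quoted above; the 35-day duration is an assumption standing in for "a little over a month," and the article does not say how many accounts the researchers actually used.

```python
import math


def accounts_needed(total_numbers, per_account_daily_limit, days):
    """Smallest number of accounts that can check `total_numbers`
    within `days`, given a per-account daily lookup limit."""
    return math.ceil(total_numbers / (per_account_daily_limit * days))


TOTAL_NUMBERS = 50_000_000  # numbers tested, per the article
DAILY_LIMIT = 60_000        # lookups per day, per the article
ASSUMED_DAYS = 35           # assumption: "a little over a month"

# A single account would need roughly 833 days at this limit.
single_account_days = TOTAL_NUMBERS / DAILY_LIMIT

# Under these assumptions, a few dozen accounts finish in about five weeks.
accounts = accounts_needed(TOTAL_NUMBERS, DAILY_LIMIT, ASSUMED_DAYS)

# More than 4.5 million hits out of 50 million is a hit rate of at least 9%.
hit_rate = 4_500_000 / TOTAL_NUMBERS
```

Under these assumptions the work fits in about two dozen accounts, which suggests a per-account cap alone sets a fairly low bar for a determined scraper.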

They could obtain users' profile pictures and "About" bios, which could be mined for info about users' identities. (Unlike the 2019 Facebook data leakage, it did not include full names, dates of birth, or location, and it also did not include messages sent through WhatsApp. However, the info could be combined with other leaked datasets floating around the internet to build up comprehensive profiles on millions of people.)

They wrote to Facebook in September 2019 to alert the company to their findings. "It exposes WhatsApp's users to spam callers and fraud, especially those with public profile pictures. In many cases the profile pictures and user status will reveal additional information, such as name, gender, age, language, relationship status, preferences, faith, or nationality," they wrote in an email viewed by Insider. "This is especially troubling for any minors registered with WhatsApp, who might be exposed to malicious parties."

Facebook's initial response was to dismiss the warning. A security staffer replied to the researchers that "much of what you describe seems like intended behavior." The company then closed the report, adding: "This isn't something we'd consider valid under our bug bounty program. There are legit use cases where a user might want to upload many contacts (for example, an enterprise company may have over 200,000 employees)."

Weinert said they managed to connect with a Facebook security staffer separately, which led to Facebook re-opening its report for further investigation a month later, in late October. 

A spokesperson for the company disputed the characterization that it dismissed the German researchers' report, despite records reviewed by Insider that make clear Facebook closed the researchers' report until they chased it up.

Facebook ultimately made changes to its systems in 2020 to better detect and prevent large-scale scraping on WhatsApp. The exact timeframe for that fix isn't clear: A Facebook spokesperson said it was fixed in early 2020, but couldn't give a specific date. On February 28, 2020, Facebook's security team had told the researchers via email that "the effort is still ongoing" and it was "aiming to implement [the] changes in the next few months." The company asked for repeated delays to the publication of the researchers' planned paper on their findings, and finally told them in July 2020 they had fixed the issue, almost 10 months after their ticket was first submitted. 

'Not considered critical'

It's not clear why Facebook, when it worked to address the issue on its core app in August 2019, did not also work to fix it on the other platforms it owns, like WhatsApp, but instead downplayed researchers' concerns. Weinert said that "given Facebook's official description ... the issue is exactly the same (i.e., insufficient rate limits / data scraping protection measures on the contact discovery API)."

It's also not immediately apparent why Facebook took an extended period of time to address it on WhatsApp when it knew that attackers had already found and exploited the equivalent issue on its core app. A Facebook spokesperson said the company couldn't confirm whether malicious attackers had exploited the WhatsApp weakness, but they thought it was unlikely. They later added the company hadn't seen evidence of this.

"Given that in any case it took at least a couple of months before they deployed any fixes tells me this issue was not treated with highest priority and not considered critical," Weinert wrote in an email to Insider. But, he added, Facebook's security team may have been prioritizing more severe security threats, and "designing and deploying effective anti-scraping protections without disturbing actually legitimate use on such a large production system is not something that can happen overnight."

Weinert and Hagen went on to publish their findings in September 2020. Facebook awarded them $5,000 via its "bug bounty" program for responsible disclosure of security issues, which they donated to the advocacy group Electronic Frontier Foundation. The researchers also identified similar issues in rival messaging apps Telegram and Signal, which also made changes to their systems in response.

Data scraping is an industry-wide dilemma

Data scraping isn't a problem unique to Facebook; numerous other tech companies have struggled with how best to handle it over the years.

Still, it has caused repeated headaches for the social networking giant.

In 2019, Insider reported that one of Facebook's trusted and vetted "marketing partners," the startup Hyp3r, was exploiting vulnerabilities on Instagram to harvest millions of users' data, posts, and locations. In response, Facebook issued a cease and desist against Hyp3r and ultimately launched a large-scale review of hundreds of its partners.

And in September 2020, Insider reported on how WhatsApp-tracking apps are exploiting the messaging app to monitor when users are active, when they're sleeping, and who they're likely talking to on the app. 

Got a tip? Contact Insider reporter Rob Price via encrypted messaging app Signal (+1 650-636-6268), encrypted email (robaeprice@protonmail.com), standard email (rprice@businessinsider.com), Telegram/Wickr/WeChat (robaeprice), or Twitter DM (@robaeprice). We can keep sources anonymous. Use a non-work device to reach out. PR pitches by standard email only, please.

Documents can also be mailed to Business Insider's San Francisco offices at: Rob Price, Business Insider, 535 Mission Street, 14th Floor, San Francisco, CA 94105, USA
