The Security Ledger Podcast

The Security Ledger Podcast


Episode 211: Scrapin’ ain’t Hackin’. Or is it?

April 16, 2021

In just the last two weeks, three of the world’s most prominent social networks have been linked to stories about data leaks. Troves of information on both Facebook and LinkedIn users – hundreds of millions of them – turned up for sale in marketplaces in the cyber underground. Then, earlier this week, a hacker forum published a database purporting to be information on users of the new Clubhouse social network. 

Andrew Sellers is the Chief Technology Officer at QOMPLX Inc.

To hear Facebook, LinkedIn and Clubhouse speak, however, nothing is amiss. All took pains to explain that they were not the victims of a hack, just “scraping” of public data on their  users by individuals. Facebook went so far as to insist that it would not notify the 530 million users whose names, phone numbers, birth dates and other information were scraped from its site.

So which is it? Is scraping the same as hacking or just an example of “zealous” use of a social media platform? And if it isn’t considered hacking…should it be? As more and more online platforms open their doors to API-based access, what restrictions and security should be attached to those APIs to prevent wanton abuse? 

To discuss these issues and more, we invited Andrew Sellers into the Security Ledger studios. Andrew is the Chief Technology Officer at the firm QOMPLX* where he oversees the technology, engineering, data science, and delivery aspects of QOMPLX’s next-generation operational risk management and situational awareness products. He is also an expert in data scraping with specific expertise in large-scale heterogeneous network design, deep-web data extraction, and data theory. 

While the recent incidents affecting LinkedIn, Facebook and Clubhouse may not technically qualify as “hacks,” Andrew told me, they do raise troubling questions about the data security and data management practices of large social media networks, and beg the question of whether more needs to be done to regulate the storage and retention of data on these platforms. 

(*) QOMPLX is a sponsor of The Security Ledger.