The Most Important Exchange of the Zuckerberg Hearing

By Alexis C. Madrigal, The Atlantic | April 12, 2018

The Facebook CEO’s defense of data collection is slipperier than it seems.

In his second day of congressional hearings, Mark Zuckerberg began the proceedings in the House of Representatives on Wednesday with an identical opening statement to the one he gave in the Senate on Tuesday.

But from that point forward, the proceedings went in a very different direction. The House members were much more aggressive and more pointed in their questioning, repeatedly cutting off the Facebook CEO so he couldn’t “filibuster,” as Representative Marsha Blackburn put it. Representatives from both parties came back time and again to what Facebook knows, what Facebook tells users about what it knows, and what Facebook lets advertisers do with what it knows.

One particular exchange, with Representative Joe Kennedy III, got to the crux of why it was so hard to pin down Zuckerberg on the extent of Facebook’s data-gathering operation.

Throughout the hearings, Zuckerberg fell back on a standard defense about the platform: Facebook users own their data, and therefore have “complete control” over the information that Facebook holds about them. It’s true that users can shape their digital representation, including (some things about) how they are targeted on Facebook. It’s true that users can download (most of) the content they have entered (in a difficult-to-transport, mishmash format). And it is also true that users can delete their accounts, and the data in them, from Facebook.

But Zuckerberg’s standard response is slipperier than it seems. Kennedy pushed Zuckerberg on how accurate that representation really is, though he struggled to frame the question precisely enough to pin Zuckerberg down: “Do the advertisers that are using your platform ... get access to information that the user doesn’t actually think is either, one, being generated, or, two, is public?” he asked. “One of the challenges with trust here is that there is an awful lot of information that’s generated that people don’t think that they’re generating and that advertisers are being able to target because Facebook collects it.”

Many people in the House asked about web-browsing data from beyond Facebook’s site and apps, for example. But Facebook collects much more than that. When website operators install a tool called the Facebook pixel, Facebook can track things with even more resolution than a single URL. This is how Facebook describes the tool to marketers:

When someone visits your website and takes an action (for example, buying something), the Facebook pixel is triggered and reports this action. This way, you’ll know when a customer took an action after seeing your Facebook ad. You’ll also be able to reach this customer again by using a custom audience.

That set of data—all the times a user has triggered a Facebook pixel—is something that Facebook stores and attaches, in some form, to a user’s profile. Zuckerberg said in a later exchange that “we only store [web logs] temporarily and we convert the web logs into a set of ad interests that you might be interested in.”

If that is an accurate description of the system, it’s quite protective of privacy. At the very least, it is not the worst-case scenario one could imagine, and might be a good compromise from Facebook’s and many users’ perspectives.

But the raw data that Facebook uses to create user-interest inferences is not available to users. It’s data about them, but it’s not their data. One European Facebook user has been petitioning to see this data—and Facebook acknowledged that it exists—but so far, has been unable to obtain it.

When he responded to Kennedy, Zuckerberg did not acknowledge any of this, but he did admit that Facebook has other types of data that it uses to increase the efficiency of its ads. He said:

My understanding is that the targeting options that are available for advertisers are generally things that are based on what people share. Now once an advertiser chooses how they want to target something, Facebook also does its own work to help rank and determine which ads are going to be interesting to which people. So we may use metadata or other behaviors of what you’ve shown that you’re interested in News Feed or other places in order to make our systems more relevant to you, but that’s a little bit different from giving that as an option to an advertiser.

Kennedy responded: “I don’t understand how users then own that data.”

This apparent contradiction relies on the company’s distinction between the content someone has intentionally shared—which Facebook mines for valuable targeting information—and the data that Facebook quietly collects around the web, gathers from physical locations, and infers about users based on people who have a similar digital profile. As the journalist Rob Horning put it, that second set of data is something of a “product” that Facebook makes, a “synthetic” mix of actual data gathered, data purchased from outsiders, and data inferred by machine intelligence.

With Facebook, the concept of owning your data begins to verge on meaningless if it doesn’t include that second, more holistic concept: not just the data users create and upload explicitly, but all the other information that has become attached to their profiles by other means.

But one can see, from Facebook’s perspective, how complicated that would be. Their techniques for placing users into particular buckets or assigning them certain targeting parameters are literally the basis for the company’s valuation. In a less techno-pessimistic time, Zuckerberg described people’s data in completely different terms. In October 2013, he told investors that this data helps Facebook “build the clearest models of everything there is to know in the world.”

Facebook puts out a series of interests for users to peruse or turn off, but it keeps the models to itself. The models make Facebook ads work well, which helps small and medium-size businesses compete more effectively with megacorporations on this one particular score. Yet they introduce new asymmetries into the world. Gullible people can be targeted over and over with ads for businesses that stop just short of scams. People prone to believing hoaxes and conspiracies can be hit with ads that reinforce their most corrosive beliefs. Politicians can use blizzards of ads to precisely target different voter types.

As with all advertising, one has to ask: When does persuasion become manipulation or coercion? If Facebook advertisers crossed that line, would the company even know it? Dozens of times throughout the proceedings, Zuckerberg testified that he wasn’t sure about the specifics of his own service. It seemed preposterous, but with billions of users and millions of advertisers, who exactly could know what was happening?

Most of the ways that people think they protect their privacy can’t account for this new and more complex reality, which Kennedy recognized in his closing remark.

“You focus a lot of your testimony ... on the individual privacy aspects of this, but we haven’t talked about the societal implications of it ... The underlying issue here is that your platform has become a mix of ... news, entertainment, and social media that is up for manipulation,” he said. “The changes to individual privacy don’t seem to be sufficient to address that underlying issue.”
