
Quantifying the user value of social media data (2021)

November 01, 2021

authors:

  • Avinash Collis, The University of Texas at Austin
  • Alex Moehring, Massachusetts Institute of Technology
  • Ananya Sen, Carnegie Mellon University
  • Alessandro Acquisti, Carnegie Mellon University

This blog article is derived from the authors’ paper titled “Information Frictions and Heterogeneity in Valuations of Personal Data” (updated in September 2022), a project of the Economics of Digital Services (EODS) research initiative led by Penn’s Center for Technology, Innovation & Competition (CTIC) and The Warren Center for Network & Data Services. CTIC and The Warren Center are grateful to the John S. and James L. Knight Foundation for its major support of the EODS initiative.

__________

Personal data create significant value for digital platforms, either as an input that facilitates algorithmic targeting or as a general source of revenue. Due to concerns over the potentially unequal allocation of that value between data holders (the platforms) and data subjects (the users), proposals to share data-based revenues with consumers have emerged. California Governor Gavin Newsom has proposed data “dividends” to compensate the consumers who create the online footprints. Academics have argued that users’ online data should be viewed as “labor” and compensated accordingly.

The design of feasible and efficient frameworks for data dividends or data markets, however, faces a challenge. Whereas platforms can quantify and assess the value of users’ data to them, users face a much more significant hurdle in pinpointing the value of data to themselves. Two decades of empirical research on consumer valuation of data and data privacy across disciplines such as economics, marketing, and computer science have highlighted that individuals’ valuations of personal data are highly uncertain, context-dependent, and—crucially—marred by endemic problems of asymmetric information: not only do consumers rarely know how their data are used, but they often lack information on the value that other entities extract from their data or the costs they may ultimately bear when their data are misused.

In this study, we looked at how much consumers value their personal data on Facebook and how their valuations change under the influence of real-world informational interventions. We found that the median Facebook user values their data at $750. Through a randomized experiment, we also found evidence that consumers respond to information interventions that give them a signal about the value of their data, drawn from previous settlements and Facebook’s revenue projections.

Methodology and analysis

We examined the way consumers’ valuations of personal data change under the influence of real-world informational interventions by using an incentive-compatible mechanism to capture experimental participants’ willingness to share their actual social media data for monetary compensation. First, we documented how users value their Facebook data and how these valuations differ across user characteristics by asking individuals how much money we would have to pay them to share all of their Facebook data with us. This includes not only their public profile but also their pictures and private messages, which contain potentially more sensitive information. Next, we analyzed whether these valuations are “sticky” or whether they respond to real-world information. We randomly assigned individuals to information treatments based on accurate data about Facebook revenue projections and data breach settlements, and analyzed how such information affects valuations as a function of individuals’ traits. The revenue treatment highlights Facebook’s expected revenue in the coming years, and the settlement treatment highlights payments from a recent data breach settlement involving Facebook. After providing these information treatments, we allowed individuals to revise their valuations if they wished to do so.
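To make “incentive compatible” concrete, here is a minimal sketch of one standard textbook mechanism of this kind, a Becker-DeGroot-Marschak (BDM) style rule. This article does not say which mechanism the study used, so the BDM rule, the uniform offer distribution, the `max_offer` cap, and the function names are all assumptions made for illustration. Under this rule a participant states a willingness to accept (WTA), a random offer is drawn, and the participant shares their data and receives the offer only if the offer is at least the stated WTA.

```python
import random


def bdm_outcome(stated_wta, max_offer=10_000.0, rng=None):
    """Run one hypothetical BDM round.

    Returns (shares_data, payment). The offer is drawn uniformly from
    [0, max_offer]; both the distribution and the cap are assumptions.
    """
    rng = rng or random.Random()
    offer = rng.uniform(0.0, max_offer)
    if offer >= stated_wta:
        return True, offer  # participant is paid the drawn offer
    return False, 0.0  # no trade, no payment


def expected_surplus(true_value, stated_wta, max_offer=10_000.0):
    """Expected (payment - true_value) under uniform offers.

    Closed form of the integral of (offer - true_value) / max_offer
    over offers from stated_wta to max_offer.
    """
    s = min(max(stated_wta, 0.0), max_offer)
    gain = (max_offer**2 - s**2) / 2 - true_value * (max_offer - s)
    return gain / max_offer


# A participant who truly values their data at $750 does best by saying so.
truthful = expected_surplus(750, 750)
understated = expected_surplus(750, 250)
overstated = expected_surplus(750, 2_000)
print(truthful > understated, truthful > overstated)
```

Because the stated WTA only decides whether a trade happens and never how much is paid, understating risks selling below one's true value and overstating risks missing profitable offers. That is why such a mechanism gives participants a reason to report their valuation honestly.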

We used two samples of respondents to measure data valuations. The first came from a nationally representative sample of Internet users in the United States recruited in collaboration with YouGov (“YouGov sample”). The second came from the Data Dividend Project (DDP), a data advocacy group started by former Democratic presidential candidate Andrew Yang (“DDP sample”). DDP members want to be part of a movement that ensures that technology companies share a part of their revenue when they monetize data, and they believe in digital privacy as a fundamental right. This sample provides a rare insight into a select group of individuals who would not be found in online survey samples precisely because of their views on data and privacy. The two samples complement each other, allowing us to paint a broad and comprehensive picture of data valuations across different types of individuals.

We found that the distribution of valuations before the information treatments is bimodal, with valuations clustered at less than $250 and over $10,000, and a median Willingness to Accept (WTA) of $750 for Facebook data in the YouGov sample (Figure 1). Not only are valuations highly dispersed, but there is also substantial heterogeneity along demographic traits. We documented a clear divide in data valuations across race, gender, and income. For instance, the median valuation for a Black user is $500 while it is $1,000 for a White user; the median is $600 for a female user relative to $1,000 for a male user. Additionally, the WTA is monotonically increasing in income.

Figure 1

The median valuation for the DDP sample is higher, at $1,000. The consistency across the two samples provides external validity and credibility to our estimates. Additionally, the variance in valuations is much lower for the DDP sample than for the YouGov sample, suggesting that more information and knowledge about this issue could reduce dispersion in consumers’ valuations of personal data.

Following the information treatments, close to one third of participants in the YouGov sample revised their data valuations. The probability of revision is highly asymmetric: individuals whose WTA < $400 drive the effect, with a 53% probability of revision. Furthermore, 98.2% of the individuals who updated their valuations revised them upward. This reduced the dispersion in data valuations, but only by raising the valuations of low-valuation individuals, a group in which women, low-income, and Black participants are overrepresented (Figure 2). Further, we found that this revision of valuations is significantly larger in response to the data settlement treatment than to the revenue treatment, suggesting that information frictions related to privacy could partially explain low valuations in the literature. We found very similar results for the DDP sample, with 28.8% of individuals revising their valuations, again predominantly driven by those with baseline WTA < $400 revising up. A textual analysis of participant responses to a question asking why they did or did not revise their valuations provided suggestive evidence that the information treatments induced people to revise by updating their beliefs about the value of their data. Moreover, the settlement treatment also induced slightly more responses suggesting concern about misuse of data by firms.
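The revision statistics above (the share of participants who revised, the share of revisers who moved upward, and the revision rate among those with a baseline WTA below $400) can be computed from paired before-and-after valuations. The sketch below shows one way to do so. The function name, the data layout, and the six example responses are hypothetical and made up for illustration; they are not data from the study.

```python
from statistics import median


def revision_summary(pairs, low_cutoff=400.0):
    """Summarize revisions from (baseline_wta, revised_wta) pairs.

    `low_cutoff` mirrors the $400 baseline threshold mentioned in the text.
    """
    revised = [(b, r) for b, r in pairs if r != b]
    low = [(b, r) for b, r in pairs if b < low_cutoff]
    low_revised = [(b, r) for b, r in low if r != b]
    return {
        # share of all participants who changed their valuation
        "share_revised": len(revised) / len(pairs),
        # among revisers, share who moved to a higher valuation
        "share_upward": (
            sum(r > b for b, r in revised) / len(revised) if revised else 0.0
        ),
        # revision rate among low-baseline participants
        "share_revised_low": len(low_revised) / len(low) if low else 0.0,
        "median_before": median(b for b, _ in pairs),
        "median_after": median(r for _, r in pairs),
    }


# Made-up illustrative responses, NOT data from the study.
responses = [
    (100, 500), (200, 200), (300, 1000),
    (750, 750), (1000, 1000), (12000, 12000),
]
summary = revision_summary(responses)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this toy example only low-baseline participants revise, and they all revise upward, so the median rises while the top of the distribution stays put. This is the same qualitative pattern as the reduction in dispersion described above.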

Figure 2

The findings highlight that while consumer valuations of personal data are highly heterogeneous and dispersed, informational treatments can partly reduce that heterogeneity, leading to an overall reduction in dispersion. Dispersion in valuations, however, persists following informational treatments, suggesting that consumers’ valuations of their data are a composite of objective factors (including knowledge of the market value of data) and inherently, deeply subjective ones. Information frictions matter especially for low-valuation individuals, and interventions aimed at countering information asymmetry in the markets for data can help individuals coordinate on a price.

Policy implications

The valuations elicited in this study are important from a policy and managerial perspective. Policymakers need to account for how users perceive their own digital footprint and how these perceptions might vary systematically across demographic groups, especially under-represented ones. Firms also need to price products to consumers or to provide access to data brokers and advertising services. This has become increasingly important in the face of personalized pricing and advertising strategies by digital firms. In addition, to the best of our knowledge, we are among the first to study how individuals’ data valuations respond to real information about data valuations, which is increasingly available.