
authors:
- Tin Cheuk Leung, Associate Professor of Economics, Wake Forest University
- Koleman Strumpf, Professor of Economics, Wake Forest University
__________
This blog article is derived from the authors’ research paper titled All the Headlines that Are Fit to Change, a project of the Economics of Digital Services initiative led by Penn’s Center for Technology, Innovation & Competition (CTIC) and The Warren Center for Network & Data Services. CTIC and the Warren Center are grateful to the John S. and James L. Knight Foundation for its generous support of the EODS initiative.
__________
There are growing concerns about increased polarization of U.S. voters. Many have traced the growing divide to online sources such as social media or recommendation algorithms that create echo chambers that reinforce political beliefs. Yet another channel may be news providers. Some have argued that news sites seek specific political niches of readers and that this focus has led to more slanted coverage and less exposure to countering viewpoints.
In this project, we explore this second channel through a novel tactic that became possible when news moved onto the Internet (digitization). Newspapers can now test and alter the headline of an article without changing any of the story text or the web address. A concern is that this tool could be used to politicize a story, or that a headline could be changed in response to social pressure because the initial version is seen as not strident enough. For example, in March 2020 the headline of a New York Times article discussing the failure of President Trump’s COVID relief bill to get Congressional approval was changed three times, with each iteration being more favorable to Democrats (in response, the president unleashed a series of admonishing tweets).
The project has two goals: understanding the reasons for headline changes (which articles get them? can they be linked to external pressure?) and understanding the implications of the changes (do they lead to more biased articles? do they affect the popularity of the articles?).
We investigate headline change strategies by focusing on one of the leading U.S. newspapers, The New York Times (NYT), and show that it uses headline changes extensively. Unlike changes in text, such as corrections for factual errors, headline changes are not flagged: the newspaper gives readers no way to know that one occurred (though it has disclosed that it uses headline changes as a general practice). While our work focuses on one newspaper, such headline changes are a common industry strategy thanks to external companies that provide technical and consulting expertise to a large proportion of both large and small media sites.
From February through June of 2021, we scraped the NYT homepage every five minutes, collecting the list of articles on the homepage, the headline each one used, and other article attributes. We also downloaded data from the NYT API, which provides additional metadata and a ranking of the popularity of articles on the site. For every article in our database, we used Twitter’s API to collect every tweet mentioning the article. We use the Twitter data to calculate an absolute measure of article popularity, the political beliefs and geographic location of users tweeting the articles, each tweet’s sentiment (favorable or unfavorable) and slant (pro-Democrat or pro-Republican), as well as the social pressure that the newspaper receives.
The NYT employs two kinds of headline change strategies. One approach is to simply change to a new headline. A second approach is to use an A/B test, in which two headlines are compared by showing them separately to observationally equivalent readers and then after a test period picking a headline that is more successful under some criterion. In our data 14% of articles get an immediate headline change, and 6% get an A/B test (the randomization used to determine which headline a reader gets is thought to be based on the user-agent sent to the site, so our scraping software uses a random user-agent in each scraping pass; we identify an A/B test when we observe more than one headline switch within an hour).
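The identification rule described above (more than one headline switch within an hour marks an A/B test) can be sketched in a few lines. This is a minimal illustration, not our actual pipeline: the function name `classify_article`, the label strings, and the input layout (a time-sorted list of timestamp and headline pairs for one article) are assumptions made for the example.

```python
from datetime import datetime, timedelta


def classify_article(snapshots, window=timedelta(hours=1)):
    """Label one article from its scraped headline history.

    snapshots: list of (timestamp, headline) tuples sorted by time,
    one entry per scraping pass (hypothetical layout).
    """
    # Times at which the observed headline differs from the previous pass.
    switch_times = [
        t
        for (_, prev), (t, cur) in zip(snapshots, snapshots[1:])
        if prev != cur
    ]
    if not switch_times:
        return "no_change"
    # Two consecutive switches inside the window imply more than one
    # switch within an hour, which is treated here as an A/B test.
    for earlier, later in zip(switch_times, switch_times[1:]):
        if later - earlier <= window:
            return "ab_test"
    return "simple_change"
```

Checking only consecutive switches is enough, because if any two switches fall within the window, the switches adjacent to each other between them do as well.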
We found that articles with and without headline changes (of either type) differ in several ways. First, articles that had headline changes are more popular, both in terms of the likelihood of being in the NYT’s top 20 in views, shares, or emails, and in terms of the number of tweets mentioning the article. Their headlines also carry more unfavorable sentiment, and they generate more unfavorable tweets than articles with no headline change. Headline change articles are more likely to be news as opposed to softer topics and tend to be in the U.S. section of the newspaper.
The Twitter data give a sense of the geographic coverage of the newspaper. Almost two-thirds of tweets mentioning an NYT article are from a U.S.-based user (the next ranked countries are the UK, Canada, India, and France, which together account for an eighth of all NYT tweets). Within the United States, New York and California account for almost a third of the tweets. At the city level, users in New York City, Los Angeles, and Chicago account for about a third of the tweets. But some cities have an outsize role relative to their population, most notably Washington, D.C., which accounts for seven percent of all tweets (the city statistics are limited to relatively large cities due to challenges in identifying smaller ones).
Our preliminary results suggest that external pressure might be a factor in headline changes. In particular, we observe that in the hours leading up to a headline change, an article generates more unfavorable tweets and more slanted tweets from accounts that the reporter of the article follows. This is consistent with the newspaper responding to extensive external pressure, as measured by unfavorable tweets, by changing headlines. Headline changes are also more likely when there is greater divergence between the sentiment of the initial headline and the article abstract, suggesting the change may help correct a misalignment between headline and content. Finally, we find that articles that attract more partisan Twitter users (those leaning heavily to the left or the right) are more likely to get headline changes.
We also used an instrumental-variable approach to estimate the causal impact of a headline change. We found that after a headline change an article tends to perform better and generate more polarizing tweets. Interestingly, the impact of an A/B test is much bigger than that of a simple headline switch. In the hour after a simple headline switch, the total number of tweets and the number of pro-Democrat tweets that mention the article increase by 0.4 to 0.6 standard deviations, while the corresponding numbers after an A/B test increase by 3.2 to 4.7 standard deviations.
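For readers unfamiliar with instrumental variables, the general idea can be illustrated with a generic two-stage least squares (2SLS) sketch. Everything here is an assumption made for illustration: the data are synthetic, the variable names (`z` for the instrument, `x` for the treatment, `y` for the outcome) are placeholders, the treatment is continuous for simplicity, and the specific instrument and specification used in the paper are not reproduced.

```python
import numpy as np


def two_stage_least_squares(y, x, z):
    """Generic 2SLS slope of y on x, using z as the instrument for x."""
    const = np.ones(len(y))
    Z = np.column_stack([const, z])
    # First stage: project the treatment on the instrument.
    gamma, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ gamma
    # Second stage: regress the outcome on the fitted treatment.
    X_hat = np.column_stack([const, x_hat])
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta[1]


def ols_slope(y, x):
    """Naive OLS slope of y on x, for comparison."""
    X = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


def simulate(n=20000, true_effect=2.0, seed=0):
    """Synthetic data with an unobserved confounder u (illustrative only)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # instrument: moves x, unrelated to u
    u = rng.normal(size=n)  # unobserved confounder
    x = 0.8 * z + u + rng.normal(size=n)
    y = true_effect * x + 3.0 * u + rng.normal(size=n)
    return y, x, z


y, x, z = simulate()
iv_estimate = two_stage_least_squares(y, x, z)
ols_estimate = ols_slope(y, x)
```

In this synthetic setup the naive regression overstates the effect because the confounder raises both the treatment and the outcome, while the 2SLS estimate should land close to the true effect of 2.0 built into the simulation.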
To the extent that tweet shares are a proxy for revenue, these preliminary estimates suggest that the NYT is using headline changes for both economic and ideological or persuasive reasons. Digitization, via headline changes, can create echo chambers and worsen polarization. It can also increase a newspaper’s profits.