- Joan Calzada, Associate Professor, Universitat de Barcelona
- Nestor Duch-Brown, Joint Research Centre (JRC)
- Ricard Gil, Associate Professor, Smith School of Business, Queen’s University
This blog article is derived from the authors’ paper titled Do Search Engines Increase Concentration in Media Markets?, a project of the Economics of Digital Services (EODS) research initiative led by Penn’s Center for Technology, Innovation and Competition (CTIC) and The Warren Center for Network & Data Services. CTIC and The Warren Center are grateful to the John S. and James L. Knight Foundation for its generous support of the EODS initiative.
Search engines are essential intermediaries for accessing news content on the Internet. In European countries, around 45% of news outlets’ visits come from consumers who go directly to the sites’ addresses when looking for news, 35% come from search engines (mostly Google), and around 12% come from social networks (see Figure 1). Consumers frequently look for the latest news on Google, Bing, or Yahoo rather than visiting online newspapers directly. They expect search engines to answer their queries with links to the latest breaking news and information on the top stories, weather, business, entertainment, and politics.
This situation raises the question of whether search engines can affect consumers’ access to a diversity of high-quality news, opinion-based editorials, and information analyses through different sources of information. The concern is not just about the possibility that search engines can bias the visits received by news outlets, but also about the risk that some publishers can become too large and therefore too influential. Our project addresses this question by examining how changes in Google’s indexation algorithms can affect news outlets’ search visits. Specifically, we investigated the effects of Google’s core algorithm updates on the concentration of the European media markets.
What we found is that in the period 2018-2020 Google’s core algorithm updates had an overall negative effect on the number of keywords that news outlets had in top search results. This finding has important implications, as the number of keywords in the top search positions is positively related to the search visits that news outlets receive. Our analysis also showed that core updates can have important effects on the concentration of the media market, although the effects of each update differ across countries.
Google’s indexation of web pages
Google Search controls around 95% of the search market in most European countries. It uses bots to crawl pages on the web, going from site to site, collecting information about these pages, and indexing them. When a consumer searches for keywords or a phrase, Google uses algorithms to analyze the web pages it has indexed and rank them according to multiple factors that determine the order in which the links to the sites appear in its results pages.
Figure 1: Share of Desktop Direct, Search and Social Networks Daily Visits (October 2017 - December 2020)
Google Search ranks web pages on the basis of their relevance to the query (dynamic ranking) and the authoritativeness of the pages or the domains (static ranking). Dynamic ranking is calculated at search time and depends on the search query, the user’s location, the location of the page, the day, the time, and the query history, among other aspects. Static ranking reflects features of the pages that are independent of the query (length of the page, frequency with which the keywords appear, number of images, compression ratio of text, among others), and it is calculated before the time of indexing. Considering this, news outlets with a low static ranking (low domain authority) might find it difficult to obtain traffic for heavily searched keywords, but they can rank highly for specific queries that concern their region or their niche market.
The success of news outlets in the search market depends on how well they rank relative to their closest competitors. They need to find a balance between maintaining the high standard of their news content and making it easy for Googlebot to index that content. News outlets can increase their search audience by competing for the keywords that generate the most traffic and by investing resources to optimize their search results (they gather information on keyword volume and trends, keywords targeted by competitors, and combinations of keywords that target specific queries). Their objective is to maximize the number of keywords that they place in Google’s organic top search results. Indeed, Google’s first page of results is estimated to capture 71% of search traffic clicks and the second page less than 5.5% of the clicks.
More generally, the effect of Google Search on the concentration of the media market depends on how its algorithms weight domain authority and content accuracy. The empirical literature has shown that digital search increases the traffic going to sites that are relatively less visited, a situation known as “the long tail.” It is unclear, however, whether Google Search thickens the long tail in the news market and favors information diversity.
Google’s core updates
Every day Google introduces changes in its algorithm and systems to improve the search results for consumers and to correct different types of bugs. But a few times per year Google makes large “core algorithm updates” that generate significant modifications in the way it ranks and indexes search results. According to Google, these changes “are designed to ensure that overall, we’re delivering on our mission to present relevant and authoritative content to searchers.”
The rollout of core updates is global. It affects all Google search regions and languages, and it is not focused on specific types of search queries or on particular website characteristics. The updates generate fluctuations in news sites’ search rankings in the days and weeks after their implementation, which in turn changes their search traffic. Google announces the launch of its core updates because “they typically produce some widely notable effects. Some sites may note drops or gains during them. We know those with sites that experience drops will be looking for a fix, and we want to ensure they don’t try to fix the wrong things. Moreover, there might not be anything to fix at all.” In the period 2018-2020, Google confirmed the launch of nine core algorithm updates, which are the ones we use in our study (see the red vertical lines in Figures 1 and 2).
News outlet indexation and search visits
A crucial step in understanding how Google Search can shape the media market is to examine the effects of its indexation algorithms on news outlets’ search visits. One difficulty in addressing this question is that the number of keywords that news outlets obtain in top search results can be correlated with relevant but unobserved characteristics of their webpages or their content. To deal with this endogeneity problem, our empirical model adopts an instrumental variable identification strategy. Specifically, considering that algorithm updates have a direct effect on news sites’ indexation and are a source of exogenous variation for news outlet visits, we use Google’s core algorithm updates as an instrument for the number of keywords that news outlets have in top search results.
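The logic of this identification strategy can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the variable names (`update`, `quality`, `log_keywords`, `log_visits`), the coefficients, and the use of a single binary instrument are hypothetical and are not the specification estimated in the paper. With one binary instrument, two-stage least squares reduces to the Wald estimator used below.

```python
import random


def mean(values):
    return sum(values) / len(values)


def ols_slope(y, x):
    """Slope from a simple regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var


def wald_iv_estimate(y, x, z):
    """IV estimate of the effect of x on y, using a binary instrument z.

    Equivalent to two-stage least squares with one regressor and
    one binary instrument.
    """
    y1 = mean([yi for yi, zi in zip(y, z) if zi == 1])
    y0 = mean([yi for yi, zi in zip(y, z) if zi == 0])
    x1 = mean([xi for xi, zi in zip(x, z) if zi == 1])
    x0 = mean([xi for xi, zi in zip(x, z) if zi == 0])
    return (y1 - y0) / (x1 - x0)


# Simulated data; all numbers below are hypothetical.
random.seed(0)
TRUE_EFFECT = 2.0  # assumed causal effect of keywords on visits

update, log_keywords, log_visits = [], [], []
for _ in range(20000):
    z = random.randint(0, 1)          # 1 = observation after a core update
    quality = random.gauss(0, 1)      # unobserved site characteristic
    # Keywords depend on the update and on the unobserved characteristic.
    k = -0.5 * z + 0.8 * quality + random.gauss(0, 1)
    # Visits depend on keywords and, directly, on the unobserved characteristic.
    v = TRUE_EFFECT * k + 1.5 * quality + random.gauss(0, 1)
    update.append(z)
    log_keywords.append(k)
    log_visits.append(v)

ols_estimate = ols_slope(log_visits, log_keywords)
iv_estimate = wald_iv_estimate(log_visits, log_keywords, update)

print(f"OLS estimate: {ols_estimate:.2f}")
print(f"IV estimate:  {iv_estimate:.2f}")
```

In this simulation the simple regression overstates the effect because the unobserved characteristic raises both keywords and visits, while the instrumented estimate stays close to the assumed value. The sketch relies on the update affecting visits only through keywords, which is the exclusion restriction the instrument requires.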
Our study draws on a rich data set obtained from SimilarWeb containing information for 606 news outlets in 15 European countries. This data set includes daily information about direct, search, and social network visits to news outlets and distinguishes between desktop and mobile traffic. We complement these data with information on the distribution of keyword rankings from Ahrefs.
Two main conclusions follow from our estimations. First, Google’s core algorithm updates have a significant effect on the number of keywords that news outlets obtain in top search results. The nine core updates rolled out in the 2018-2020 period affected news outlets in different directions and magnitudes, but they had an overall negative effect on the number of keywords that news outlets have in top search results. The core update with the largest negative impact on news outlets took place in March 2019, but this was later offset by the update of June 2019, and especially by the update of September 2019, which increased the number of keywords in top 10 positions by more than 2%. Second, the number of keywords that news outlets have in the top search positions has a positive effect on their desktop search visits and on their total desktop and total mobile visits. Specifically, we find that a 1% increase in the number of keywords in top 10 search results generates a 6.3% increase in the number of search visits and a 3.8% increase in the total number of desktop and mobile visits. These results are robust when we replicate the analysis for different types of news outlets (national, regional, business, sports, tv/radio) or when we group them according to different features (national rank, domestic traffic, traffic from Google). Results are less conclusive when we examine national markets separately, due to the heterogeneous effect of core updates across countries.
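To give a sense of scale, the estimates above can be applied to a given change in keywords. The sketch below is only back-of-the-envelope arithmetic: it assumes that the reported effects scale linearly with the size of the keyword change, which is an approximation that is reasonable only for small changes. The function name and the 2% input (the lower bound reported for the September 2019 update) are chosen for illustration.

```python
# Estimates reported in the text: % change in visits per 1% change
# in the number of keywords in top 10 search results.
SEARCH_VISIT_EFFECT = 6.3
TOTAL_VISIT_EFFECT = 3.8


def implied_visit_change(keyword_change_pct, effect_per_pct):
    """Implied % change in visits, assuming the effect scales linearly.

    The linear scaling is an assumption made for this illustration.
    """
    return keyword_change_pct * effect_per_pct


# Hypothetical scenario: a 2% increase in top-10 keywords.
search_change = implied_visit_change(2.0, SEARCH_VISIT_EFFECT)
total_change = implied_visit_change(2.0, TOTAL_VISIT_EFFECT)

print(f"Implied change in search visits: {search_change:.1f}%")
print(f"Implied change in total visits:  {total_change:.1f}%")
```

The same function handles losses: a negative keyword change yields a negative implied change in visits.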
Taken together, our findings suggest that in the period 2018-2020 Google’s core updates reduced the visibility of news outlets in Google’s results pages. Note that this occurred in a period in which Google searches were increasing and total visits to news outlets were expanding, especially from mobile devices (see Figure 2).
Core updates and market concentration
We have seen so far that one consequence of Google’s recent core updates is the reduction of news outlets’ keywords in top search results. The second contribution of the project has been to examine whether this reduction has been more important for large news outlets than for small ones. Specifically, we analyzed whether Google core updates are reinforcing the skewness of the distribution of search traffic across news outlets or if they are making the “long tail” thicker.
Our analysis showed that Google’s core updates are sufficiently important to modify competition in the media market, although each specific core update has affected national markets in different directions. We classified the nine core updates rolled out in this period into two groups. On the one hand, we considered the three “biggest” core updates in terms of traffic impact, according to SEO experts. On the other hand, we considered the remaining “non-big” updates. We find that the “big” updates released in this period implied a 1% reduction in market concentration. This effect, however, was mostly compensated by a 0.08% increase in market concentration due to the effect of the “non-big” core updates. In addition, we find that non-big updates increased the market concentration of search visits for generalist national news outlets and reduced the concentration for sports news outlets. Finally, when we considered the effects of the updates at the country level, we found that the results are quite heterogeneous. Big core updates reduced market concentration in Finland, Germany, and Greece but increased it in Portugal. Non-big core updates increased concentration in Finland and the Netherlands.
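To make the notion of market concentration concrete, the sketch below computes a Herfindahl-Hirschman index (HHI) over outlets’ shares of search visits. The HHI is used here only as a common, illustrative measure; the text above does not specify which concentration index the study relies on. The outlet names and visit counts are hypothetical.

```python
def visit_shares(visits):
    """Convert visit counts per outlet into market shares that sum to 1."""
    total = sum(visits.values())
    return {outlet: count / total for outlet, count in visits.items()}


def hhi(visits):
    """Herfindahl-Hirschman index: sum of squared market shares.

    Ranges from 1/N (N equally sized outlets) to 1 (a single outlet).
    """
    return sum(share ** 2 for share in visit_shares(visits).values())


# Hypothetical search visits before and after an update that shifts
# traffic from the largest outlet to the smallest one.
before = {"outlet_a": 500, "outlet_b": 300, "outlet_c": 200}
after = {"outlet_a": 400, "outlet_b": 300, "outlet_c": 300}

hhi_before = hhi(before)
hhi_after = hhi(after)
change_pct = 100 * (hhi_after - hhi_before) / hhi_before

print(f"HHI before: {hhi_before:.3f}")
print(f"HHI after:  {hhi_after:.3f}")
print(f"Change:     {change_pct:.1f}%")
```

In this toy market, traffic moving from the largest outlet to the smallest lowers the index, which corresponds to a thicker long tail. A shift in the opposite direction would raise it.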
Figure 2: Desktop and Mobile Daily Visits (January 2018 - November 2020)
Search engines are a very important channel for accessing the content of traditional news outlets, and changes in their indexation algorithms can alter competition in the media market. In recent years, policy concerns have emerged about the growing market power of digital platforms that are based on indexation or recommendation algorithms. It is unclear how these platforms can affect market competition and whether they can bias the algorithms to their own benefit. Google Search has been subject to intense antitrust scrutiny from the U.S. and European competition authorities. At the beginning of the 2010s, the U.S. Federal Trade Commission (FTC) investigated several antitrust allegations, including the use of bias in search results, but the FTC ultimately closed its investigation. In 2015, the European Commission (EC) also investigated Google, alleging search bias, and in 2017, the EC fined Google $2.7 billion for abuse of dominance in Google Shopping.
Of particular importance is the role of search engines in media markets. Several studies have analyzed the effects of digitalization on competition in the media market and on the development of democratic institutions. Very little is known, however, about the effects that search engines and social networks might have on the development and future prospects of media markets. Our project constitutes an exploratory first step in that direction.