Research

Can AI Make Economic Predictions by Reading the Newspaper?

Media coverage is full of snippets of information about the current state of the economy. Can those fragments be assembled into a usable forecast? In a new study, a team led by Yale SOM researchers devised a way to distill the text of the Wall Street Journal into numerical indicators, which could help policymakers predict how the business cycle will unfold over the coming months and years.

Sean David Williams

Leland Bybee
PhD Student, Financial Economics
Bryan T. Kelly
Professor of Finance & Associate Director, International Center for Finance

Written by Roberta Kwok

November 28, 2022

Researchers have come up with plenty of indicators to track various aspects of the economy, such as GDP, employment, and industrial production. But extracting a clear picture from these signals, and predicting whether the country is heading into a recession, is still a challenge.

“We have a data problem,” says Leland Bybee, a PhD student in financial economics at Yale SOM. “It’s very difficult to get really good, high-quality data to understand what is going on in the economy at any given point in time.”

In a recent study, Bybee, Yale SOM professor of finance Bryan Kelly, and their co-authors explored another possible source of information: news articles. The researchers ran software to scour hundreds of thousands of Wall Street Journal stories and quantify the amount of media attention paid to specific topics.

They found that the prevalence of recession-related articles, in particular, performed well at predicting certain measures of economic performance months or years later. In other words, when reporters wrote more about recession, troubled economic times were more likely to follow.

“We should be paying attention to the information that’s contained in news text,” Bybee says.

Of course, policymakers already read the Wall Street Journal to get this type of information. But if they’re arguing for a new policy, they need a rigorous quantification of news content rather than simply relying on the latest headlines, Bybee suggests. A policymaker wouldn’t want to say, “‘I read that the market’s not doing great, so we should start pumping out a bunch of stimulus checks,’” he says. “You want to have some sort of statistical basis for that.”

Bybee’s team reasoned that the news might contain valuable predictive data because the editors’ job is to give readers information about the state of the economy. The Wall Street Journal is supposed to be a “one-stop shop” for learning about big-picture measures such as GDP, specific industry news, people’s concerns about economic problems, and experts’ opinions.

The editors’ incentives are “to report the news that matters for people who care about the economy,” Bybee says. “As a subscriber to the Wall Street Journal, that’s what I pay for.”

Bybee, Kelly, and their team—which also included Asaf Manela of Washington University in St. Louis and Dacheng Xiu of the University of Chicago—wondered if they could quantify that information and turn it into a new economic indicator. Policymakers could use that metric as another piece of evidence to make decisions, such as whether to take steps to stimulate the economy.

The researchers gathered about 763,000 WSJ articles, published from 1984 to 2017, and ran software to count the number of times that specific one- or two-word terms were used in each article. A machine-learning algorithm then identified broad “topics”: clusters of terms that often appeared together. The team examined the word clusters manually and labeled each topic.

For instance, one set of words included “Greenspan,” “Yellen,” “federal-funds rate,” “raise rate,” and so on; the topic the software had identified was clearly the Federal Reserve. Other topics included health insurance, China, natural disasters, airlines, and elections.
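The term-counting step can be illustrated with a short Python sketch. This is not the researchers’ software: the tokenizer is a simplifying assumption, and the function name count_terms and the sample sentence are invented for this example. It tallies one-word and two-word terms for a single article; the later step, in which an algorithm groups co-occurring terms into topics, is not shown.

```python
import re
from collections import Counter


def count_terms(text):
    """Count one-word and two-word terms in a single article.

    Assumption: tokens are lowercase runs of letters, with internal
    hyphens kept (so "federal-funds" stays one token).
    """
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    unigrams = Counter(tokens)
    # Adjacent token pairs; this simple version also pairs words
    # across sentence boundaries.
    bigrams = Counter(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return unigrams + bigrams


# Invented sample text, not taken from any real article.
article = "The Fed may raise rates. Analysts expect the Fed to raise rates again."
counts = count_terms(article)
print(counts["fed"], counts["raise rates"])
```

Running this kind of count over every article would produce the article-by-term table that a topic-finding algorithm takes as input.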

Next, the researchers measured the amount of “news attention” paid to each topic, defined as the percentage of words in the WSJ about that topic each month. They could analyze how attention waxed and waned over time.
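The definition of “news attention” amounts to a simple share calculation, sketched below in Python. The sketch assumes each article has already been broken down into word counts per topic; the function name monthly_attention and all of the numbers are made up for illustration.

```python
from collections import defaultdict


def monthly_attention(articles):
    """Return {month: {topic: percent of that month's words}}.

    `articles` is an iterable of (month, {topic: word_count}) pairs.
    """
    totals = defaultdict(float)
    by_topic = defaultdict(lambda: defaultdict(float))
    for month, topic_words in articles:
        for topic, n_words in topic_words.items():
            by_topic[month][topic] += n_words
            totals[month] += n_words
    return {
        month: {t: 100.0 * n / totals[month] for t, n in topics.items()}
        for month, topics in by_topic.items()
    }


# Hypothetical per-article word counts by topic.
sample = [
    ("2008-09", {"recession": 120, "airlines": 80}),
    ("2008-09", {"recession": 180, "elections": 20}),
    ("2017-06", {"recession": 10, "airlines": 90}),
]
attention = monthly_attention(sample)
print(attention["2008-09"]["recession"], attention["2017-06"]["recession"])
```

In this toy data, the “recession” topic takes up a much larger share of the words in the first month than in the second, which is the kind of rise and fall the researchers tracked.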

The news attention metrics seemed to track well with existing economic indicators. For instance, when attention to the “recession” topic went up, industrial production growth and employment growth tended to go down.

And surprisingly, news attention to topics such as “recession” and “problems” (a general category that included terms such as “big problem,” “major problem,” “mess,” “debacle,” and so on) could explain 25% of the variation in stock market returns. In contrast, a set of 101 other economic measures could explain only 9%.

The reason might be that “what’s happening with the market and what’s happening with the Wall Street Journal are very similar,” Bybee says. “They’re both ways of aggregating an extremely rich set of information.” In other words, market returns on a given day tell you a lot about the state of the economy, as does the content of the newspaper.

But these analyses only checked whether news attention was correlated with other indicators at the same point in time; the articles could simply be describing recent events. The researchers wanted to know if their new metrics could help predict the economy’s future performance.

So they analyzed whether “recession” news attention was linked to changes in industrial production and employment over the next three years. The team controlled for those two indicators and other standard metrics such as the S&P 500 index and the federal funds rate. In other words, could news attention predict changes beyond what those existing indicators provided?

They found that an increase in the “recession” attention measure, from the 5th to the 95th percentile, was associated with a 1.99% drop in industrial production 17 months later and a 0.92% drop in employment 20 months later. Shorter-term forecasts worked too; for instance, two months after the bump in news attention, industrial production dropped by about 0.3%.
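The logic of “predicting beyond existing indicators” and of the 5th-to-95th-percentile comparison can be shown with a toy regression. The Python sketch below is a simplified stand-in, not the study’s statistical model: it uses ordinary least squares with a single control, and every series in it is synthetic, built from a known rule so the fit can be checked. The names ols, HORIZON, attention, and production are invented for this example.

```python
import statistics


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


HORIZON = 2    # months ahead (assumed for the example)
N_MONTHS = 60  # length of the synthetic sample

# Synthetic "recession attention" series, invented for illustration.
attention = [float((7 * t) % 11) for t in range(N_MONTHS)]

# Synthetic outcome: its future value depends on today's attention
# and on its own current value (the control).
production = [1.0] * HORIZON
for t in range(N_MONTHS - HORIZON):
    production.append(0.5 - 0.2 * attention[t] + 0.3 * production[t])

# Regress the outcome HORIZON months ahead on a constant,
# current attention, and the current outcome.
X = [[1.0, attention[t], production[t]] for t in range(N_MONTHS - HORIZON)]
y = [production[t + HORIZON] for t in range(N_MONTHS - HORIZON)]
intercept, beta_attention, beta_production = ols(X, y)

# Scale the coefficient by the gap between the 5th and 95th percentiles.
cuts = statistics.quantiles(attention, n=20)  # 19 cut points: 5%, 10%, ..., 95%
spread = cuts[-1] - cuts[0]
implied_change = beta_attention * spread
print(round(beta_attention, 3), round(implied_change, 3))
```

Because the control sits in the regression alongside attention, the attention coefficient captures only what attention adds once the outcome’s current level is accounted for. Multiplying it by the percentile gap turns it into a statement of the form “a move from low to high attention goes with a change of this size.”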

The “recession” attention metric wasn’t just redundant with the information in stock market prices; it appeared to provide additional forecasting ability. “There’s stuff in the news beyond what the market captures,” Bybee says.

The team also devised a way to identify the most critical articles that policymakers should read. When the forecasting software makes a prediction—for instance, that employment will tank in the next several months—it also pulls out the WSJ articles that devoted the most attention to the “recession” topic. Policymakers who are overwhelmed with massive amounts of information could focus on those stories to get the most relevant details.

This method filters out the noise to extract “the stuff that actually matters,” Bybee says. “It gives you a tool to process that information.”
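Pulling out the articles that devote the most attention to a topic is, at its simplest, a ranking problem. The Python sketch below assumes each article already carries a per-topic attention share; the function name top_articles, the record layout, and the placeholder headlines are all hypothetical, and the study’s own selection procedure may differ.

```python
import heapq


def top_articles(articles, topic, k=3):
    """Return the k articles with the highest attention share for `topic`."""
    return heapq.nlargest(
        k, articles, key=lambda a: a["attention"].get(topic, 0.0)
    )


# Placeholder records; shares are fractions of each article's words.
articles = [
    {"headline": "Article A", "attention": {"recession": 0.62, "airlines": 0.05}},
    {"headline": "Article B", "attention": {"elections": 0.70}},
    {"headline": "Article C", "attention": {"recession": 0.35}},
    {"headline": "Article D", "attention": {"recession": 0.81}},
]
reading_list = [a["headline"] for a in top_articles(articles, "recession", k=2)]
print(reading_list)
```

Articles that never touch the topic are treated as having zero attention, so they fall to the bottom of the ranking.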
