# An Implementation of Lazy Prices

Lazy Prices, a recently updated paper by Cohen, Malloy, and Nguyen, offers a novel way to use the regularly scheduled reports of publicly traded companies to invest. A link to the paper on SSRN is here, and the abstract with citation is below.

> Using the complete history of regular quarterly and annual filings by U.S. corporations from 1995-2014, we show that when firms make an active change in their reporting practices, this conveys an important signal about future firm operations. Changes to the language and construction of financial reports also have strong implications for firms’ future returns: a portfolio that shorts “changers” and buys “non-changers” earns up to 188 basis points in monthly alphas (over 22% per year) in the future. Changes in language referring to the executive (CEO and CFO) team, regarding litigation, or in the risk factor section of the documents are especially informative for future returns. We show that changes to the 10-Ks predict future earnings, profitability, future news announcements, and even future firm-level bankruptcies. Unlike typical underreaction patterns in asset prices, we find no announcement effect associated with these changes–with returns only accruing when the information is later revealed through news, events, or earnings–suggesting that investors appear to be ignoring these simple changes across the universe of public firms.
>
> Cohen, Lauren and Malloy, Christopher J. and Nguyen, Quoc, Lazy Prices (February 22, 2018). Available at SSRN: https://ssrn.com/abstract=1658471 or http://dx.doi.org/10.2139/ssrn.1658471

Annual and quarterly reports are a critical source of information for investors. They hold financials and other information that many find useful when deciding whether to invest in a publicly traded company. These reports are filled with text, making them a popular target for data scientists and quants who are comfortable with natural language processing (NLP). Most applications of NLP to filings that I have seen focus on the sentiment of the most recent report, with trading signals derived from the interpreted sentiment. Usually, these are some variant of buying stocks whose reports show positive sentiment and selling stocks whose reports show negative sentiment.

The approach in Lazy Prices takes the most recent filing, compares it to the same filing from the year before (Q1 2018 vs. Q1 2017), and measures the degree of change in specific sections. The authors find that reports with large changes are associated with lower forward returns than reports that change less. In this post, I develop a basic implementation of the paper’s approach in R.

The code to pull reports, separate them, and construct similarity measures can be found here.

# Getting Data

The data for this experiment consists of two parts. The first is asset return data, which I obtain using the getSymbols() function in quantmod. By default, this pulls from Yahoo Finance. I use the prices to derive monthly returns.
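The monthly return calculation is just the percentage change between consecutive month-end prices. A minimal sketch of that arithmetic in Python (the post's actual pipeline is in R, where quantmod handles this; the prices below are hypothetical):

```python
def monthly_returns(month_end_prices):
    """Simple returns between consecutive month-end prices."""
    return [
        curr / prev - 1.0
        for prev, curr in zip(month_end_prices, month_end_prices[1:])
    ]

# Hypothetical month-end closes for one stock
prices = [100.0, 104.0, 98.8]
print([round(r, 4) for r in monthly_returns(prices)])  # [0.04, -0.05]
```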

In addition, we need annual and quarterly reports. Thanks to the SEC’s EDGAR site, we can go to one central location for all financial reports. I use a handy package called edgarWebR to obtain and process these reports.

edgarWebR allows me to:

• Search for all SEC filings for a stock ticker
• Extract the 10-K or 10-Q text from the full submission text file
• Parse the text of the 10-K or 10-Q

As a starting point, I obtain 4 years of returns and quarterly/annual reports for a subset of the blue-chip stocks in the Dow 30: MMM, AXP, AAPL, CAT, CSCO, KO, DIS, XOM, GE, GS, HD, IBM, INTC, JNJ, MCD, NKE, PFE, PG, TRV, UTX, UNH, VZ, and WMT. The paper is more comprehensive, but in the interest of time, I am starting with a small selection.

# Processing Data

A lot of cleaning is necessary to go from full submission text file to usable text. I’ll use edgarWebR to take care of most of the cleaning/parsing/separating.


```r
library(edgarWebR)
library(dplyr)

# Get the reports
filings <- company_filings("WMT", type = "10-", count = 20)

# Isolate the 10-Ks and 10-Qs
filings <- filings[filings$type == "10-K" | filings$type == "10-Q", ]

# Get useful information from each report URL (filing date, period date, ...)
filing_infos <- purrr::map_df(filings$href, filing_information)
filings <- bind_cols(filings, filing_infos)

# Take a URL from the list and pull the report documents
docs <- filing_documents(filings$href[1])

# Get the complete submission
doc <- docs[docs$description == "Complete submission text file", ]

# Parse it
parsed_docs <- parse_submission(doc$href)

# Extract the text of the 10-Q or 10-K from all the documents
doc <- parse_filing(parsed_docs[which(stringr::str_detect(parsed_docs$TYPE, "10-")), "TEXT"])
```

A parsed 10-K/10-Q using edgarWebR results in a data frame with three columns:

• text: A paragraph, line, or section (separations determined by HTML) after removing all tags and non-text elements
• part.name: Identification of which part of the report the paragraph belongs to, for example, PART I. FINANCIAL INFORMATION
• item.name: Identification of which item of the report the paragraph belongs to, for example, Item 4. Controls and Procedures

This is very convenient. As report contents can differ significantly from company to company, the authors suggest isolating certain common items from each report to measure change across. Using the item.name column, I can separate and extract the text for the sections “Management’s Discussion and Analysis”, “Risk Factors”, “Legal Proceedings”, “Quantitative and Qualitative Disclosures about Market Risk”, “Controls and Procedures”, and “Other Information”.

```r
# MDA
MDA <- doc[grepl("management", doc$item.name, ignore.case = TRUE), ]
```

Then, as an additional step, I strip out any paragraphs in which more than 15% of the characters are numeric. These are likely to be tables, short section indicators, or other elements that aren't critical to the text of the report.

```r
MDA <- MDA[(stringr::str_count(MDA$text, pattern = '[0-9]') / nchar(MDA$text)) < .15, 1]
```

Then I can combine all paragraphs of each section together. The result is a string of text for each section, for each report, for each stock. We are ready to compare.

# Measuring Change

Lazy Prices uses several different metrics to measure the change in sections of annual reports year over year. I will focus on two here: cosine similarity and Jaccard similarity.

### Cosine Similarity

This is a similarity measure computed for two documents. Each document is defined as a set of terms, and we can create a vocabulary, which is the union of the two sets. With the vocabulary, we create a term frequency vector for each document. These vectors contain counts of each vocabulary word that appears in the document, and 0s where a vocabulary term does not appear. Normalizing these vectors and taking their dot product gives a value from 0 to 1; the closer to 1, the more similar the documents.

The equation for cosine similarity is:

$Sim\_Cos = \dfrac{D_A^{TF} \cdot D_B^{TF}}{\lVert D_A^{TF} \rVert \times \lVert D_B^{TF} \rVert}$

where $D_A^{TF}$ and $D_B^{TF}$ are the term frequency vectors of documents A and B.

### Jaccard Similarity

This similarity measure uses the documents' terms as well, but in a different fashion. The Jaccard similarity takes the intersection of the two term sets and divides it by their union: words in both divided by words in either.

The equation for Jaccard similarity is:

$Sim\_Jaccard = \dfrac{\lvert D_A^{TF} \cap D_B^{TF} \rvert}{\lvert D_A^{TF} \cup D_B^{TF} \rvert}$

where $D_A^{TF}$ and $D_B^{TF}$ are the term frequency vectors of documents A and B.
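The same toy documents in a Python sketch (illustrative only, since the post works in R): 5 shared words out of 7 distinct words overall.

```python
def jaccard_similarity(doc_a, doc_b):
    """Jaccard similarity between two documents given as lists of terms."""
    set_a, set_b = set(doc_a), set(doc_b)
    return len(set_a & set_b) / len(set_a | set_b)

a = "we expect revenue growth to continue".split()
b = "we expect revenue growth to slow".split()
print(round(jaccard_similarity(a, b), 3))  # 0.714, i.e. 5/7
```

Note that Jaccard ignores term frequencies entirely: repeating a word changes the cosine score but not the Jaccard score.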

Intuitive examples using real text are shown in the paper. The authors also use two variants of edit distance as alternative similarity measures. For brevity’s sake, I only use the two above.

# Generating Signals

The authors propose a portfolio that shorts “big changers” and buys “little to no changers” by sorting similarity scores, buying the 5th quintile (highest similarity), and selling the 1st quintile (lowest similarity). The authors don’t seem to mention whether this is based on an aggregate similarity measure, an average across sections, or some weighting. I decided to rank similarity (low to high) within each report item, sum the ranks across sections, and determine quartiles based on the sums of ranks. Quartiles work nicely with the 24 stocks I’ve selected for this simple analysis.

The paper specifies that stocks enter the portfolio the month after the release of their latest report and are held for three months. I will take a simpler approach. At the start of each quarter, I will analyze reports released within the last quarter against the reports from the year prior, create quartiles, and hold for a quarter. The authors found little to no announcement effect when analyzing the phenomenon, and that the price movement happens over a period of nearly 6 months. Therefore, I am not concerned with some delay in investing.

In Summary:

1. With each report, calculate the Jaccard and cosine similarity for the items below for this report, and the same report from last year (Q1 2018 vs Q1 2017)
• Management’s Discussion and Analysis
• Risk Factors
• Legal Proceedings
• Quantitative and Qualitative Disclosures about Market Risk
• Controls and Procedures
• Other Information
2. Identify reports released over the last quarter
3. Rank the similarity measures for each item (choose either cosine or Jaccard here)
4. Sum the ranks
5. Separate the ranks into quartiles
6. Mark the high similarity quartile for longs and mark the low similarity quartile for shorts
7. Obtain returns for these positions over the next quarter
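Steps 3 through 6 can be sketched as follows (a Python illustration with hypothetical names and toy scores; the actual work is done in R): rank each item's similarity scores, sum the ranks per stock, and cut the sums into quartiles.

```python
def rank(values):
    """Ranks from 1 (lowest similarity) to n (highest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def quartile_signals(similarity_by_item):
    """similarity_by_item: {item: [score per stock]} -> (longs, shorts)."""
    n = len(next(iter(similarity_by_item.values())))
    rank_sums = [0] * n
    for scores in similarity_by_item.values():
        for i, r in enumerate(rank(scores)):
            rank_sums[i] += r
    # Sort stocks by summed rank; the top quartile (most similar,
    # i.e. least changed) is long, the bottom quartile is short
    order = sorted(range(n), key=lambda i: rank_sums[i])
    cut = n // 4
    shorts = set(order[:cut])
    longs = set(order[-cut:])
    return longs, shorts

# Toy example: 8 stocks, cosine similarities for two report items
sims = {
    "MDA":          [0.99, 0.80, 0.95, 0.70, 0.90, 0.85, 0.60, 0.97],
    "Risk Factors": [0.98, 0.75, 0.96, 0.65, 0.88, 0.83, 0.55, 0.99],
}
longs, shorts = quartile_signals(sims)  # stock indices to buy / sell
```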

# Results

Given my far smaller universe, shorter time horizon, use of only two similarity metrics, and assumptions about similarity ranking, I expected different results than the authors. Since the stocks mentioned above are all blue chips, I also expected the effect to be smaller, as more attention is paid to these reports. I still saw some fair returns, but nothing special, as the Dow rose around 40% during this period.

Looking beyond a L/S portfolio, I want to compare how these quartiles perform relative to one another, and whether more similarity does seem to lead to higher performance.

| Quartile | Mean of Quarterly Ret. | Std. Dev. |
|----------|------------------------|-----------|
| Fourth   | 0.0287                 | 0.0441    |
| Third    | 0.0193                 | 0.0397    |
| Second   | 0.0124                 | 0.0493    |
| First    | 0.0101                 | 0.0502    |

This is encouraging: the stocks from our small selection with higher year-over-year similarity tend to outperform the others. This could be a useful added screen for long-only investors. Let’s expand the universe, use quintiles instead of quartiles as the authors do, and see if our findings hold.

Note: the odd number of mid-cap stocks I reference is due to the parser I use. For certain reports (I believe size is a factor), the parser crashes as the system’s stack grows too big for R to allow. In addition, some stocks did not have reports available for the full date range.

Again, the L/S portfolio is weak but the longs perform well.

| Quintile | Mean of Quarterly Ret. | Std. Dev. |
|----------|------------------------|-----------|
| Fifth    | 0.0392                 | 0.0741    |
| Fourth   | 0.0343                 | 0.0759    |
| Third    | 0.0342                 | 0.0729    |
| Second   | 0.0276                 | 0.0876    |
| First    | 0.0285                 | 0.0787    |

A look at performance by quintile shows that our highest quintile performs about 50 bps better than the next-highest quintile on average, and more than 1% better on average than the first quintile. Recall that the fifth quintile changes the least, and the first quintile changes the most.

The charts thus far all reference data that compares similarities across the 6 items mentioned in Generating Signals. To make a brief comparison, I’ll plot the cumulative returns of the fifth and first quintiles, each determined by only one of the six items (using cosine similarity).

### Fifth Quintile

| Item             | Mean of Quarterly Ret. | Std. Dev. |
|------------------|------------------------|-----------|
| Control          | 0.0379                 | 0.0787    |
| Legal            | 0.0365                 | 0.077     |
| Quant. and Qual. | 0.0365                 | 0.0783    |
| Risk Factors     | 0.0406                 | 0.0809    |
| Other            | 0.0385                 | 0.0754    |
| MDA              | 0.041                  | 0.0734    |

### First Quintile

| Item             | Mean of Quarterly Ret. | Std. Dev. |
|------------------|------------------------|-----------|
| Control          | 0.0305                 | 0.0781    |
| Legal            | 0.0371                 | 0.0681    |
| Quant. and Qual. | 0.0285                 | 0.0798    |
| Risk Factors     | 0.0329                 | 0.0749    |
| Other            | 0.0393                 | 0.0817    |
| MDA              | 0.0318                 | 0.0792    |

Looking at both charts, what draws my eye is the difference in the spread of performances. The fifth quintile stays tightly grouped, while the first quintile shows deviation. This is fairly intuitive: a report that changes little from year to year can easily stay in the fifth quintile no matter the ranking criterion, whereas a stock that makes a big change in one section will show up in that item’s first quintile but not in the others’, leading to the deviation. The “Other” first quintile performs well but is more volatile than its fifth-quintile counterpart, which performs less than 10 bps worse.

After examining these preliminary results, it seems entirely plausible that investors could eke out some alpha by using this metric as a screen, a weight, or a factor in a larger stock selection process.

# Items for Future Research

• Experiment with a larger universe
• Small, mid and large-cap together
• Experiment with more similarity measures
• Experiment with value-weighting portfolios
• Experiment with different holding periods
• Experiment with starting investment the month after earnings information is released (as the paper states)
• Fine-tune the parser
• This is an important point for anyone wanting to implement this strategy. The parser I used is essentially a black box to me. I’d rather build one from scratch, or at least look under the hood of edgarWebR’s processing, to understand its mechanisms. Unfortunately, a promising footnote that alludes to the parsing methodology includes a dead link. Still, this package was excellent for getting the ball rolling.
• Experiment with combinations of Jaccard and cosine similarity
• Incorporate sentiment measures, and weight changes based on the change in sentiment