## Apr 30

### Piled Higher and Deeper

The business press is reporting on a recently published paper, Quantifying Trading Behavior in Financial Markets using Google Trends, by Tobias Preis, Helen Susannah Moat, and H. Eugene Stanley. This paper has not had as much impact as that of Bollen et al., probably because it does not make such outlandish claims, but likely also because Google Trends is not as sexy as Twitter.

The Preis et al. paper has the dubious distinction of being the worst paper I’ve read in the last month. Here are the problems I found with this paper before giving up on it:

1. It is not entirely clear that Google Trends data is causal: the historical data you retrieve now may not represent what one would have (or even could have) observed at that point in time. Google’s help pages make some vague reference to data normalization, but neither confirm nor deny causality. If time trends are removed using all the data, the entire exercise is utterly pointless.
2. The authors do not understand how shorting works! They claim that the changes in ‘cumulative returns’ from a short position are $$\log(p(t)) - \log(p(t+1))$$. Under this formulation, a short position can experience unlimited gains but limited losses, when, in fact, the opposite is the case. The proper expression is $$\log(2 - p(t+1)/p(t))$$, which could be undefined if $$p(t+1)/p(t)$$ is two or larger. This bungled backtest accounting introduces a positive bias of order $$(1 - p(t+1)/p(t))^2$$, which can be large for the weekly hold periods considered in the paper. The upshot is that short-biased strategies get a tailwind which is pure ‘backtest arb’.
3. I am unable to replicate the backtest presented in Figure 2. Note the paper is ambiguous about how one should act when the change in trend data is exactly zero (this occurs around 5% of the time for ‘debt’ data, using a three week normalization window), but breaking the tie in any of the three obvious ways, and backtesting with both the ‘corrupt’ method for shorts and a correct method, never gives the 326% cumulative return quoted in the paper. The ‘corrupt’ method does indeed boost total returns and Sharpe ratio. However, under none of the tested configurations, including the suspect ones, does the Sharpe ratio achieve 95% significance.
4. If I am to understand Figure 3 correctly, the ‘debt’ signal achieves returns which are 2.31 standard deviations above the returns of a ‘random strategy’. Presumably the random strategies do not have the shorting bias that the ‘debt’ signal does. However, given that approximately 100 different search terms are tested, a 2.3 sigma event is not statistically significant when a Bonferroni correction is applied.
5. Since the authors (or the paper’s reviewers, if there indeed were any) are apparently aware of the pitfalls of multiple hypothesis testing, they do not draw much attention to the 2.3 sigma event. Rather, they compute the mean ‘Sharpe’ over the 98 strategies, then quote the t-statistic (a whopping 8.6) and p-value. Back in the eighties when professional statisticians bemoaned the coming availability of statistical software which would allow hoi polloi to misuse statistical techniques, this is what they were warning us about. Because the search term time series measure latent ‘interest’ with correlated errors, and because they are all backtested on the same Dow Jones time history, the errors in the 98 backtests’ returns are correlated. One cannot perform a t-test on the aggregate statistics without dealing with this correlation, otherwise one is rejecting the (composite) null for the wrong reason: i.e. because independence of errors is violated. (The Leung and Wong test for paired Sharpe seems more appropriate in this case.)
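To see the size of the bias from the bungled short accounting in point 2, here is a quick sketch using a made-up price path (not data from the paper): the corrupt expression $$\log(p(t)) - \log(p(t+1))$$ always exceeds the correct $$\log(2 - p(t+1)/p(t))$$ by a term of order $$(1 - p(t+1)/p(t))^2$$, regardless of direction.

```python
import math

# Hypothetical weekly price path (not from the paper): a 10% drop, then a recovery.
prices = [100.0, 90.0, 99.0]

def corrupt_short(p0, p1):
    # The paper's accounting for a short position: log(p(t)) - log(p(t+1))
    return math.log(p0) - math.log(p1)

def correct_short(p0, p1):
    # A plain short position loses p1/p0 - 1, so its log return is
    # log(2 - p1/p0), which is undefined once p1/p0 >= 2.
    return math.log(2.0 - p1 / p0)

for p0, p1 in zip(prices, prices[1:]):
    bias = corrupt_short(p0, p1) - correct_short(p0, p1)
    # The bias is non-negative either way the price moves,
    # and is of order (1 - p1/p0)^2.
    print(f"ratio={p1/p0:.3f}  corrupt={corrupt_short(p0, p1):+.4f}  "
          f"correct={correct_short(p0, p1):+.4f}  bias={bias:+.5f}")
```

With a 10% move in either direction the bias is about 0.01 per period, which compounds badly over a long backtest of weekly holds.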

In all, this paper teaches me nothing about the world other than the low standards of the journal ‘Scientific Reports’, which, I am horrified to find, is somehow associated with the journal ‘Nature’.

Disclaimer The information provided does not constitute investment advice.

## Mar 11

### Did you ever think your tweets might predict the future?

Not wanting to be left behind on all this ‘social media’ stuff, Fox Business News trotted out Johan Bollen for an interview regarding his research. Bollen notes that his system is designed for hedge funds looking for a little extra alpha, not retail clients. This displays shrewd market positioning on his part, since Derwent’s experiment with bringing social media trading to the masses appears to have deflated—their recent ‘innovative’ self auction earned a non-binding bid of 120K GBP for the company, a ROI of perhaps negative 65 percent on the initial 350K invested.

I would like to believe that Bollen is giving me a shoutout at 2:11, when he notes:

It’s absolutely clear that there’s communities out there whose purpose is simply to spread misinformation or to … throw a wrench into … the gears of this algorithm.

Disclaimer The information provided does not constitute investment advice.

## Feb 07

### The Sentiment Trading Platform is for Sale

Derwent Capital, the former hedge fund turned retail broker, announced that they are auctioning themselves to the highest bidder. At the moment, the highest bid is 100K GBP, far lower than the 350K over/under number for profitability, according to Paul Hawtin, Derwent’s CEO. The ‘guidance figure’ (read: anchor) is 5M GBP, and as part of the deal you take ownership of the ‘Sentiball’ trademark.

As Hawtin notes:

The beauty of an auction is that you get a true valuation of the company.

And so I will be greatly amused for the next ten days.

Disclaimer The information provided does not constitute investment advice.

## Jul 25

### You had me at the third significant digit

I have, in the past, been rather harsh on Bollen, Mao and Zeng for their Twitter paper, which boggles the imagination with its naïveté. However, to their credit, theirs is not clearly the most ridiculous ‘quant’ paper I have ever seen. A recent contender for that distinction is Limited Attention, Salience, and Stock Returns, by A. Subrahmanyam, J. Wei, and H-Y. Yu, dated March 25, 2012. Here is the abstract:

We show that a long-short portfolio based on stocks that have just arrived to and left from extreme winner and loser deciles materially outperforms a conventional momentum portfolio. A 6-month-ranking and 6-month-holding portfolio based on the standard Jegadeesh and Titman (1993, 2001) momentum strategy commands an average monthly return of 1.20% and a Sharpe ratio of 0.262 over the past four decades; the corresponding numbers for our long-short portfolio are 10.30% and 1.035, respectively. For the 2001-2010 period, our monthly return is even higher at 16.38%, compounding to an annual return of 517.36%. The sheer size of these profits poses a further, significant challenge to the asset pricing literature and the market efficiency hypothesis. We propose that arrival to an extreme decile is a salient signal that attracts retail investor attention, and stimulates strong buying, boosting returns. Supporting this explanation, we show that there is significantly abnormal buying pressure in extreme decile arrivals that reverses in the longer run.

This paper was formerly posted at SSRN, but was mysteriously removed less than two weeks after the publication date (and after receiving some attention at CXO Advisory).

Some relevant facts about their purported “challenge” to the Efficient Markets Hypothesis which are omitted from the abstract: their strategy rebalances monthly; they delay their signal by a month; the quoted Sharpe ratio numbers are monthly; no mention is made of leverage.

So as a recap, the claim is that if one trades once a month, on a month-old signal, based on a 12 month moving average of publicly available price and volume data, on U. S. equities, one can capture a Sharpe around 3.5$$\mbox{yr}^{-1/2}$$ and annualized returns over 500 percent. Moreover, the returns have been measured with no fewer than five significant digits.

If your error-checking sense is not properly calibrated, you should have goosebumps right now. If so, I am going to marsh your mellow by revealing that these results are, indeed, too good to be true. There is no conceivable way such a large effect could have lurked, unnoticed, within the landscape of technical strategies for five years, much less for four decades. Moreover, to suggest that the returns of U.S. equities could be predicted with such certainty based on a month-old highly autocorrelated signal is ludicrous.

Luckily for the world, someone must have notified the authors of their mistake, and the paper went down the memory hole. The alternative explanation is that Derwent inked a deal with Subrahmanyam, Wei, and Yu to license their technology, and they went into stealth mode.

## Jul 11

### Converting Timing Edge to Sharpe

Let $$x_t$$ be the time series of relative returns of some instrument. As a very rudimentary market timing model, suppose you have a signal $$s_{t-1}$$ which equals $$\mbox{sign}\left(x_t\right)$$ with probability $$p = \frac{1}{2} + g$$, and otherwise equals $$-\mbox{sign}\left(x_t\right)$$. Here $$g \in \left[-0.5,0.5\right]$$ is one’s timing edge over a coin flip. Note that I find this model somewhat weird because the probability of correctly guessing tomorrow’s returns is independent of the absolute magnitude of the return. This is, however, the model evidently envisioned by Bollen, Mao and Zeng, in their Twitter market timing study.

Previously I had performed a Monte Carlo experiment to convert $$g$$, which Bollen et al. claim equals $$11/30$$ for their model, into an annualized Sharpe ratio. There is a ‘direct’ computation, however, which only requires us to estimate a normalized ‘spread’ of the market returns.

Suppose that you hold a long position in the instrument when your signal is positive, and otherwise hold a short position. The expected return of your portfolio is, after a little math, $$2g\mathbf{E}\left[|x|\right]$$. Because one is always either entirely long or entirely short the market, the second moment of one’s returns is $$\mathbf{E}\left[x^2\right]$$. Then the true, population, Sharpe ratio of one’s strategy is $\psi = \frac{2g\mathbf{E}\left[|x|\right]}{\sqrt{\mathbf{E}\left[x^2\right] - 4g^2\mathbf{E}\left[|x|\right]^2}} = \frac{g}{\sqrt{\frac{\mathbf{E}\left[x^2\right]}{4\mathbf{E}\left[|x|\right]^2} - g^2}} = \frac{g}{\sqrt{\kappa - g^2}}.$

It remains only to compute or estimate $$\kappa$$ for the given market returns. Note that when $$g$$ is reasonably smaller than $$\sqrt{\kappa}$$, a linear approximation is pretty good. Assuming we trade daily, the annualized Sharpe ratio has the linear approximation $\psi \approx \sqrt{\frac{253}{\kappa}} g \mbox{ yr}^{-1/2}.$

Here is a table showing, for a few different distributions, the computed, or estimated, value of $$\kappa$$, as well as the annualized linear approximation constant, $$\sqrt{253 / \kappa}$$. In the last column, I give the annualized Sharpe ratio corresponding to an edge of 0.367, the value claimed by Bollen for the ‘Twitter predictor’. The distributions are: zero mean Gaussian (the variance does not matter), a Student t with 10 d.f., a Student t with 4 d.f., and the daily relative returns of DJIA from 1930-01-02 to 2012-07-10.

| distribution | $$\kappa$$ | $$\sqrt{253/\kappa}$$ | SR for $$g = 0.367$$ |
|---|---|---|---|
| Gaussian | 0.39 | 25 | 11 $$\mbox{yr}^{-1/2}$$ |
| t(10) | 0.42 | 25 | 11 $$\mbox{yr}^{-1/2}$$ |
| t(4) | 0.5 | 22 | 9.6 $$\mbox{yr}^{-1/2}$$ |
| DJIA from 1930-01-02 to 2012-07-10 | 0.59 | 21 | 8.6 $$\mbox{yr}^{-1/2}$$ |
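These $$\kappa$$ values are easy to sanity-check by simulation. The sketch below (my own, not from any of the papers under discussion) estimates $$\kappa$$ by Monte Carlo for the three parametric distributions and applies the exact Sharpe formula above; for the Gaussian, $$\kappa = \pi/8 \approx 0.39$$ in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

def kappa(x):
    # kappa = E[x^2] / (4 E[|x|]^2), estimated from a sample
    return np.mean(x**2) / (4.0 * np.mean(np.abs(x))**2)

def sharpe(g, kap, days=253):
    # exact annualized Sharpe: sqrt(days) * g / sqrt(kappa - g^2)
    return np.sqrt(days) * g / np.sqrt(kap - g**2)

samples = {
    "Gaussian": rng.standard_normal(n),
    "t(10)": rng.standard_t(10, n),
    "t(4)": rng.standard_t(4, n),
}
for name, x in samples.items():
    k = kappa(x)
    print(f"{name:8s} kappa={k:.2f}  slope={np.sqrt(253 / k):.0f}  "
          f"SR(g=0.367)={sharpe(0.367, k):.1f}")
```

The Monte Carlo estimates land on the tabulated values; note the fatter-tailed t(4) has larger $$\kappa$$ and hence a lower Sharpe for the same edge.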

Some takeaways:

1. The ‘blackjack’ rule of thumb for market timing strategies of this type is: Annualized Sharpe is twenty-one times the daily edge. This confirms your suspicions that a five percent edge is pretty good.
2. The annualized Sharpe for an edge of 0.367 is around 9$$\mbox{yr}^{-1/2}$$, consistent with the previous Monte Carlo findings. I do not believe a Sharpe as high as 2.5$$\mbox{yr}^{-1/2}$$ has ever been realized for a market timing strategy (assuming at least three years of trading). Bollen’s strategy, were it real and not a statistical ghost, would represent the greatest quantitative strategy ever discovered.
3. Fatter tailed distributions appear to have larger $$\kappa$$, and thus a constant edge has lower Sharpe in these markets.

## A More Powerful Yeti Detector

Given $$n$$ observations of such a market timing signal, along with the leading market returns, the standard error on the estimated edge, assuming the edge is near zero, is around $$\left(4n\right)^{-1/2}$$. Using the linear approximation to convert edge into a daily Sharpe ratio, the standard error on Sharpe should be around $$\left(4n\kappa\right)^{-1/2}$$.

If, on the other hand, you backtested your timing strategy and computed the sample Sharpe ratio, the standard error1 is around $$n^{-1/2}$$. Noting that, by Jensen’s Inequality, $$\kappa \ge 0.25$$, it appears that the standard error on Sharpe as computed via edge estimation is smaller than that computed by a backtest. Which is to say you should get somewhat tighter estimates (modulo the uncertainty in $$\kappa$$!) of Sharpe based on the edge method.
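A quick numerical comparison of the two standard errors, using the DJIA estimate $$\kappa \approx 0.59$$ from the table above and n = 253 days (one trading year); the numbers are illustrative, not from any backtest.

```python
# Standard error on the daily Sharpe over n observed days:
# via edge estimation it is (4 n kappa)^(-1/2); via a direct backtest
# it is roughly n^(-1/2).
n, kappa = 253, 0.59  # one trading year; kappa estimated from DJIA history

se_edge = (4 * n * kappa) ** -0.5
se_backtest = n ** -0.5
print(f"edge-based se: {se_edge:.4f}, backtest se: {se_backtest:.4f}")
# Jensen's inequality gives kappa >= 1/4, so the edge-based se
# can never exceed the backtest se.
```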

Unfortunately, this is something like having a more powerful Yeti detector: If there were Yetis in the wild, you would be looking pretty smart; however, since the incidence rate is effectively zero, you’re just making type I errors. So it goes with market timing.

1. This assumes that the daily Sharpe is modest, less than 0.15, say, and the market is not terribly skewed. These are rough comparisons.

## Jul 05

Bollen, who I lump with his co-authors, is an academic. In the long term, his reputation is, or should be, worth more than a short term deal with a hedge fund. He would be better served, and less harmed, than Derwent, by admitting that ‘mistakes were made’ and moving on. After all, this is how the scientific process is supposed to work: when a theory is inconsistent with facts, we throw it away. We should applaud Bollen as a real scientist if he retracts his paper.

On the other hand, sentiment analysis is Derwent Capital’s raison d’être; they have no other gimmick to set themselves apart. They simply cannot renounce Bollen’s paper or ‘Twitter market sentiment’.

I imagine that the sentiment analysis ‘advice’ that Derwent provides on their platform is given in a way that is consistent with British securities law1. My guess is that stock tips from sentiment analysis are no worse, and probably no better, than stock tips one might get from a human broker, and this latter process is allowed in the United States, subject to certain limitations. The overstated certainty of Bollen’s claims, if advertised by Derwent, might be grounds for a fraud case, but I am no lawyer.

I am, however, a statistician, working as a ‘quant’ at a hedge fund. Bollen’s paper, in my opinion, has been harmful to the field of quantitative finance for many reasons:

1. The appalling statistical and logical errors in this highly visible publication make a mockery of what little standards the field has.
2. His ‘results’ raise a false bar. If I were to go to my employer with a market timing model that I honestly believed2 had a predictive accuracy of 56%, why would my employer take that over Bollen’s 3 day old tweets model?
3. Bollen’s paper is often cited as ‘proof’ that sentiment analysis presents profitable trading strategies. The result has been a massive misallocation of time, money, and human capital into chasing a statistical ghost.

I also rather suspect that gambling on Twitter sentiment will make a whole lot of people slightly less wealthy on average, while making a fair amount of money in fees for the brokers and sentiment peddlers. I want to believe that, deep down, Bollen is a nice guy, and would rather not have somebody’s Aunt Tilly lose her pension on his mistake.

1. I am not familiar enough with either Derwent’s platform or British securities law to say for certain.

2. This is a huge hypothetical; I do not, in general, believe in market timing strategies.

## Jul 03

nancefinance asked: do you have the screenshots for the 2012 managed account reports at derwent? thanks.

Yes, I do. And now the internets have it too:

## Jul 01

### An open letter to Johan Bollen

Johan,

You may be wondering why you are not living on your own Caribbean island by now. I had the same feeling once, a long time ago, after my first hedge fund launch. You will get over it. I am guessing that Paul Hawtin is no longer returning your calls, since Derwent somehow fumbled in implementing your ideas. Well, you do not need them: I am going to offer you a chance to redeem your research.

Your paper claims that using two- or three-day old, publicly available data from Twitter feeds allows one to predict the “daily up and down changes in the closing values of the DJIA” with “87.6% accuracy.” I remain skeptical of this claim; however, I will allow you to prove, to me and the world, the validity of your findings. Here is my proposal: over 80 market days, a 4 month period, you will send to me, before the start of every market session, a file containing the predictions for that day’s movement of the DJIA. This should be a simple “up” or “down”, unambiguously coded. We will define ‘daily movement’ in such a way that one could act on the information in a meaningful way: you are to tell me the future, not the past. You may encrypt the file if you wish, but you must provide the key at the end of the 4 month period. At the end of that period, we will determine the accuracy of your method.

To make this project worth your time, I will wager you the sum of ten thousand U.S. dollars1, at even odds, that your predictions are correct for no more than 57 of the 80 market days. That is, to win you must achieve at least 72.5% accuracy (58 of 80 days). This should be fairly easy given an “87.6% accuracy”: the probability that you would lose this wager is less than 0.1%. However, if you are flipping a fair, or nearly fair, coin, you will very likely lose.
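The binomial arithmetic behind the wager is easy to check; this snippet (mine, not part of the letter’s terms) computes both tail probabilities with the exact binomial CDF.

```python
from math import comb

def binom_cdf(k, n, p):
    # P(X <= k) for X ~ Binomial(n, p)
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n = 80  # market days in the proposed trial

# If the claimed 87.6% accuracy is real, the chance of scoring 57 or
# fewer correct days (and losing the wager):
p_lose_if_real = binom_cdf(57, n, 0.876)
# If the predictions are fair coin flips, the chance of scoring 58 or
# more (and winning the wager):
p_win_if_coin = 1.0 - binom_cdf(57, n, 0.5)

print(f"P(lose | 87.6% accuracy) = {p_lose_if_real:.2e}")
print(f"P(win  | fair coin)      = {p_win_if_coin:.2e}")
```

Both tails are comfortably below one in a thousand, which is what makes 58-of-80 a reasonable dividing line between the two hypotheses.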

Of course, we would have to formalize this wager by setting the terms very clearly, agreeing on how we publicize the results, appointing a third party for dispute resolution, defining the definitive source of DJIA marks, putting the money in an escrow account, etc. You can save face by claiming that you have neither the time, the money, nor the lawyers for this kind of charade; that it would taint the integrity of your work; that you have nothing to prove, etc.

Thus I expect that you will decline, or more likely, ignore my offer. Sadly, then, I must plow on in my crusade to discredit your paper. While that means continuing my rants on this obscure blog, I will be forced to escalate my campaign. I am drafting a letter2 to the editors of the Journal of Computational Science urging them, in no uncertain terms, to retract your paper3.

The choice is yours, Johan: we can test your hypothesis in public, or I publicize your failings.

You can contact me on Twitter, although I predict you will not.

1. Yes, I stole this idea from Mitt Romney.

2. Readers of this blog: I am looking for co-signers. Contact me on twitter, if you are interested.

3. This is maybe not as bad as it sounds, since there is evidence that retracted papers never really die.

## Jun 22

### Derwent closes shop

In May the Financial Times reported that Derwent Capital, the hedge fund that partnered with Johan Bollen and Huina Mao to trade the “Twitter Predictor” strategy, had “shut down”. The official story is that Derwent Capital Markets’ Absolute Return fund opened for investments in July 2011, and shuttered after a single month, with reported returns of 1.86%.

There are a few oddities here:

1. Why is the FT reporting in May 2012 that a hedge fund closed in August 2011?1 It would seem this is no longer news. To confirm this is not an error on the part of the Financial Times, I quote a ‘weekly sentiment email’ sent by Derwent Capital on June 6, 2012: “Some of you may have read about our Hedge Fund closing last year in press articles this week.” What? I just caught up on the news of this ‘moon landing’, and now you’re telling me there are more events happening in the world?
2. As late as the end of March 2012, Derwent was posting performance numbers for managed accounts on their webpage. The reported performance was generally positive, but not consistent with the spectacular performance promised by Johan Bollen. This period of Derwent’s existence has gone down the memory hole.
3. Derwent’s founder, Paul Hawtin, speaking in the FT, claimed that, “… [a hedge fund] is a very difficult product to market and there’s a very small clientele who can even know about it.”2 If we take Bollen’s research at face value, however, Derwent is sitting on a gold mine; they do not need clients; rather, they need a loan, and even a payday loan will do. As long as they are paying less than 400 percent annually, they should borrow money and plow it into the ‘Twitter Predictor’.

Note that the main thrust of the FT story is that Derwent is re-inventing itself as a retail trading platform with ‘sentiment measures’ baked into it somehow. That is, they are democratizing the process of gambling on Johan Bollen’s faulty statistical practice.

1. And why am I re-reporting it a month later? Because I have been busy.

2. Note, however, that a Google search for “Derwent Capital” gives 28K links, and a news search yields several dozen stories (or, rather, echoes of stories) linking Derwent to Twitter.

## May 03

### The ‘Twitter Hedge Fund’ has an out-of-sample experience.

Derwent Capital, the hedge fund which is working with Johan Bollen and Huina Mao to implement their ‘Twitter Predictor’ strategy, had, until recently, been publishing their monthly returns on the web. This is fairly irregular: hedge funds typically do not release this data due to regulatory concerns and performance anxiety. Even more irregular, as of May 3rd, 2012, the monthly returns were removed from the trading performance webpage. The page now states, “We are going through some exciting changes…more soon,” as does Derwent’s homepage; they no longer appear to be taking new investments.

You can see the last published monthly return values, as of April 27, 2012, in the google cache. I replicate the data here1:

| Period | Jan 12 | Feb 12 | Mar 12 | Apr 12 |
|---|---|---|---|---|
| Return | 2.04% | 3.18% | 1.89% | Na |

I added the Na for April 2012. My suspicion is that April was a down month and Derwent panicked, but I concede there are numerous alternative explanations. The total return over the period is 7.3%. While this is better than a sharp stick in the eye, is it consistent with the spectacular performance implied by the claims of Bollen’s original paper?

## “That 86.7% figure is widely quoted in the media. What people forget is that … you might lose all your money in the 13 or 15% where you’re wrong”

Here is the (composite) null hypothesis that I intend to reject the hell out of:

Derwent is trading the model described by Bollen et al., longing and shorting the DJIA at the close, at unit leverage or greater, capturing the indicative value of the index, not incurring costs, and the ‘Twitter Predictor’ model has the advertised forecast accuracy of 86.7%.

If we reject this null hypothesis, we must conclude one of the following:

1. Derwent is not trading the Twitter Predictor model because of technical difficulties. Given that they inked a deal with Bollen over a year ago and could be trading on a single ETF once daily, on a signal based on three day old tweets, they would have to be hopelessly incompetent. Since Bollen’s claims imply that Derwent is sitting on a gold mine, one would expect them to spare no expense bringing the strategy to market. How they have failed to do so is inexplicable.
2. Derwent is not trading the Twitter Predictor model because they do not believe it is profitable. This seems dishonest to me, since stories on the web link Derwent to Bollen’s paper. Although Derwent’s offering materials probably allow them to trade whatever the hell they want, it would be odd indeed if Bollen’s business partner didn’t buy his story, while licensing his technology (and renting his reputation).
3. Derwent is trading the Twitter Predictor model, but at less than unit leverage, i.e. they are keeping some part of their money in a ‘risk-free’ instrument, like cash. This would imply that Derwent is experiencing less risk than the strawman market-timer analyzed below. This is very odd behaviour for a hedge fund, especially one collecting only performance fees. Moreover, since backtests indicate little chance of massive drawdowns and astronomical upside potential for the Twitter Predictor, Derwent should be borrowing money to trade at leverage if possible.
4. Derwent is trading the Twitter Predictor model and the forecast accuracy quoted in Bollen’s paper did not materialize in the real world.

None of these possibilities is attractive for Bollen; they all point to the ‘Twitter Predictor’ strategy being a statistical phantom, the product of smoke, mirrors and data dredging, amplified by a credulous media which has drunk deep the social media Kool-Aid on Twitter.

## Testing the Null

It would be difficult to infer much from only 3 months of data even if we were given the daily marks, much less from the monthly marks. However, under the null hypothesis quoted above, we have a toehold. There are a number of technical tricks we can employ (and I do so below), but the simplest test is to compare the live performance to that of a simulated market-timer trading on DJIA with Bollen’s fabled forecast accuracy.

### Monte Carlo simulations

This is a rerun of the previous simulations, but applied to the live trading period. To recap, for one realization of the experiment, I simulate a trader who correctly guesses the sign of the log return of DJIA tomorrow with probability 86.7%, and trades at the close, at unit leverage, long or short based on that guess, over the period 2012-01-01 to 2012-03-31. I perform 10000 such Monte Carlo realizations with different random seeds, recording the total returns of each experiment.

Of the 10000 simulations, exactly 8 did as ‘poorly’ as Derwent, approximately 0.08%. The worst simulation experienced a total return over the period of 3.24%, compared with Derwent’s achieved total return of 7.3%. The median total return over the 10000 simulations is 20.83%.
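For readers who want to reproduce the flavor of this experiment, here is a minimal sketch. One caveat: I draw synthetic Gaussian daily returns at a guessed volatility rather than shipping the actual DJIA series, so the numbers will not match the 20.83% median quoted above; plug in the real returns for the period to replicate.

```python
import numpy as np

rng = np.random.default_rng(42)
n_days, n_sims = 62, 10_000   # roughly the Jan-Mar 2012 trading days
accuracy = 0.867

# Assumption: synthetic daily log returns at a plausible scale stand in
# for the actual DJIA series over the live trading period.
daily = rng.normal(0.0, 0.008, size=(n_sims, n_days))

# The timer is long when it guesses tomorrow's sign correctly (with the
# advertised probability), and short otherwise, at unit leverage.
correct = rng.random((n_sims, n_days)) < accuracy
signs = np.where(correct, 1.0, -1.0)
total = np.exp((signs * np.abs(daily)).sum(axis=1)) - 1.0

print(f"median total return over {n_sims} sims: {np.median(total):.1%}")
print(f"fraction doing as 'poorly' as 7.3%: {(total < 0.073).mean():.2%}")
```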

### Estimating the forecast accuracy

Assuming the null hypothesis, we know the absolute value of Derwent’s simple returns on every day in the trading period, although not the sign. We know their total log return over the period, and so we know their mean daily log return, which allows us to estimate their experienced forecast accuracy.

Suppose that, for $$i=1,2,\ldots,n$$, $$w_i$$ are the absolute values of the daily returns of DJIA in the period in question. Let $$s_i$$ be a $$\pm 1$$ random variable that equals 1 with probability equal to the forecast accuracy. The total return experienced over the period would then be $r_t = \prod_i (1 + s_i w_i).$ The fact that simple returns compound in this way is rather an annoyance. We know their total return, thus their total log return, and use this as an estimate of the sum of daily returns as follows: $\log r_t = \log \prod_i (1 + s_i w_i) = \sum_i \log\left(1 + s_i w_i\right) \approx \sum_i s_i w_i.$ This follows because for $$x$$ small we have $$\log (1+x) \approx x$$.

The sum on the right above could be used to estimate the forecast accuracy $$p$$, which is the probability that the random variable $$s$$ equals 1. This is basically just a weighted mean computation, conditional on the weights $$w_i$$ being fixed: $\mathbf{E}\left(\frac{\sum_i w_i s_i}{\sum_i w_i}\right) = 2p - 1.$ This means the statistic $\hat{p} = \frac{1}{2} + \frac{\log r_t}{2 \sum_i w_i}$ can be used to estimate $$p$$.
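Here is the estimator in code, checked on synthetic data where the true accuracy is known; the $$w_i$$ are stand-ins for the DJIA absolute returns, not the actual series.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p_true = 62, 0.867

# Synthetic absolute daily returns standing in for the DJIA series w_i.
w = np.abs(rng.normal(0.0, 0.008, n))
# The signal is correct (s_i = +1) with probability p_true.
s = np.where(rng.random(n) < p_true, 1.0, -1.0)

# In practice only the total log return is observed.
log_rtot = np.sum(np.log1p(s * w))

# The estimator from above: p_hat = 1/2 + log(r_t) / (2 sum w_i)
p_hat = 0.5 + log_rtot / (2.0 * w.sum())
print(f"p_hat = {p_hat:.2f} (true accuracy {p_true})")
```

With only 62 days the sampling noise on $$\hat{p}$$ is substantial, which is exactly why the confidence interval below is so wide.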

Performing this calculation, I get the estimate $$\hat{p} = 0.64$$. A fairly rough 95% confidence interval for the true forecast accuracy is $$\left[0.47,0.8\right]$$. Note that at this type I rate we cannot reject the possibility that $$p = 0.5$$, i.e. Derwent has no market timing ability. We can reject the hypothesis that $$p = 0.87$$.

### Sharpe Ratio

If we can estimate the volatility of Derwent’s daily returns, we can compute their achieved Sharpe ratio, based on log returns. Under the null hypothesis, based on historical simulations, this Sharpe ratio should be on the order of $$9\mbox{yr}^{-1/2}$$.

Let $$r_i$$ be Derwent’s daily simple returns on each day in the period in question, and let $$l_i = \log(1 + r_i)$$ be the log returns. We do not have Derwent’s daily returns, so we cannot compute their log returns on each day. However, under the null hypothesis we assume that $$|r_i| = w_i,$$ the absolute simple returns of DJIA on each day.

We can also get a lower bound on the volatility of log returns as follows. First note that $\log(1 + |r_i|) \le |\log(1 + r_i)| = |l_i|.$ The sample variance, $$\hat{\sigma}^2$$, of the log returns can then be bounded by $(n-1) \hat{\sigma}^2 = \sum_i l_i^2 - n \left(\frac{\sum_i l_i}{n}\right)^2 \ge \sum_i \left(\log(1 + |r_i|)\right)^2 - n \left(\frac{\sum_i l_i}{n}\right)^2.$ Note that we have Derwent’s mean returns, $$\sum_i l_i / n$$ by transforming their total returns and using the ‘telescoping property.’
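A sketch of this bound on synthetic data (the actual calculation uses DJIA absolute returns under the null); the inequality guarantees the bound never exceeds the true sum of squared deviations.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 62
# Synthetic daily simple returns r_i; in the real calculation only |r_i|
# (assumed equal to DJIA's absolute returns) and the total return are known.
r = rng.normal(0.002, 0.008, n)
l = np.log1p(r)              # true daily log returns (unobservable here)
mean_l = l.sum() / n         # recoverable from the total return (telescoping)

# Lower bound uses log(1 + |r_i|) <= |log(1 + r_i)|:
lower = (np.log1p(np.abs(r))**2).sum() - n * mean_l**2
actual = (l**2).sum() - n * mean_l**2
print(f"bound on (n-1) var: {lower:.6f} <= actual: {actual:.6f}")
```

Dividing by the bounded volatility then gives the upper bound on the Sharpe ratio used below.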

Using this lower bound on volatility, we get an upper bound on their Sharpe ratio, calculated on log returns. The value I get is $$3.3\mbox{yr}^{-1/2}$$, with 95% confidence interval $$[-0.8\mbox{yr}^{-1/2},7.2\mbox{yr}^{-1/2}]$$. So I am confident that Derwent does not have a Sharpe ratio of $$9\mbox{yr}^{-1/2}$$, and I reject the null hypothesis.

Note that while $$3.3\mbox{yr}^{-1/2}$$ should be considered ‘very good’, the sample size is so small we cannot be sure this was not just a fluke.

## “We are going through some exciting changes…”

Based on the Monte Carlo simulations and the analysis based on inferred volatility, we can soundly reject the null; Derwent is not trading the Twitter Predictor, or is massively underlevered, or the forecast accuracy failed to materialize.

Note that this says relatively little about whether Derwent is a ‘good’ investment or not. If we cannot assume they are trading on the DJIA in the simple way outlined in the null hypothesis, then the Monte Carlo experiments, the forecast accuracy and Sharpe estimations become meaningless, and we are left with only 3 monthly returns numbers, from which we can infer very little. My goal is merely to show that the forecast accuracy touted in Bollen’s paper has yet to be seen in the real world.

Disclaimer The information provided does not constitute investment advice.

Disclosure The author has no holdings in Twitter, but holds broad market ETFs which intersect with the DJIA.

1. I believe these returns are gross, i.e. do not reflect performance fees.