Several Key PerformanceAnalytics Functions From R Now In Python (special thanks to Vijay Vaidyanathan)

So, thanks to my former boss, and head of direct indexing at BNY Mellon, Vijay Vaidyanathan, and his Coursera course, along with the usual assistance from chatGPT (I officially see it as a pseudo programming language), I have some more software for the Python community now released to my github. As wordpress now makes it very difficult to paste formatted code, I’ll be linking more often to github files.

In today’s post, the code in question is a python file containing quite a few functions from R’s PerformanceAnalytics library outside of Return.portfolio (or Return_portfolio, as it may be in Python, because the . operator is reserved in Python…ugh).

So, here is the “library” of new functions, for now called edhec_perfa.py, as a nod to Vijay Vaidyanathan’s Coursera course(s).

Here’s the link to the latest github file.

Now, while a lot of the functions should be pretty straightforward (annualized return, VaR, CVaR, tracking error/annualized tracking error, etc.), two functions that I think are particularly interesting that I ported over from R with the help of chatGPT (seriously, when prompted correctly, an LLM is quite helpful in translating from one programming language to another, provided the programmer has enough background in both languages to do various debugging) are charts_PerformanceSummary, which does exactly what you might think it does from seeing me use the R function here on the blog, and table_Drawdowns, which gives the highest N drawdowns by depth.

Here’s a quick little Python demo (once you import my file from the github):

import yfinance as yf
import pandas as pd
import numpy as np

symbol = "SPY"
start_date = "1990-01-01"
end_date = pd.Timestamp.today().strftime("%Y-%m-%d")
df_list = []

data = pd.DataFrame(yf.download(symbol, start=start_date, end=end_date))
data = data["Adj Close"]
data.name = symbol
data = pd.DataFrame(data)

returns = Return_calculate(data)

charts_PerformanceSummary(returns)
table_Drawdowns(returns)

This results in the following:

And the following table:

The NaT in the “To” column means that the drawdown is current, as does the NaN in the Recovery–they’re basically Python’s equivalent of NAs, though NaT means an NA on *time*, while an NaN means an NA on a numerical value. So that’s interesting in that Python gives you a little more information regarding your column data types, which is kind of interesting.

Another function included in this library, as a utility function is the infer_trading_periods function. Simply, plug in a time series, and it will tell you the periodicity of the data. Pandas was supposed to have a function that did this, but due to differences in frequency of daily data caused by weekends and holidays, this actually doesn’t work on financial data, so I worked with chatGPT to write a new function that works more generally. R’s checkData function is also in this library (translated over to Python), but I’m not quite sure where it was intended to be used, but left it in for those that want it.

So, for those interested, just feel free to read through some of the functions, and use whichever ones you find applicable to your workflow. That said, if you *do* wind up using it, please let me know, since there’s more value I can add.

Lastly, I’m currently in the job market *and* have a volatility trading signal subscription available to offer for those interested. You can email me at ilya.kipnis@gmail.com or find me on my LinkedIn profile here.

Thanks for reading.

This function VITAL for portfolio backtesting is now in Python, written with the help of chatGPT4

So, it’s been a little while. But after a couple of years of some grunt work analytics jobs *and* consulting for a $1B AUM fund, I’ve decided that I had a bit more in the tank to share as far as quant content creation–quantent creation (?)–goes.

And a function I’ve searched for in Python for a long time now, but never finding it in a proper capacity is one that we’ve seen used time and again on this blog–R’s Return.portfolio.

For those uninitiated, Return.portfolio in R is essentially the workhorse of portfolio-level asset allocation backtests. What it does can be summarized in a sentence: given an xts/Python series dataframe of weights with m (E.G. 10) assets, and an xts/Python series dataframe with returns of m assets, Return.portfolio will compute drifted portfolio returns, along with beginning of period weights, end of period weights, beginning/end of period values, contribution to returns, and in the case of the Python variant that I wrote, two-way turnover.

Now, why is this function vital to have in Python, in my opinion?

Simple: because onto my experience with the language, if you want to do any financial research with Python, you need to learn some arcane and convoluted syntax such as zipline or quantconnect, and in some cases, that depends on working through another website altogether (in the case of quantconnect), or in the case of zipline, a library developed by a company that no longer exists (quantopian), so in either case, using either of these libraries may not be the best of ideas, even *if* you put in the who-knows-how-many-hours to learn a bunch of extra syntax fighting for real estate in your head.

Instead, what porting R’s Return.portfolio into Python does is to allow the same exact numpy/scipy/pandas ye-olde-classic-data-science syntax to not only work with backtesting, but also to seamlessly link up with whatever other Python library you may want to use, such as cvxpy for your portfolio optimization solutions (having used this library on past consulting arrangements, I am *very* impressed).

Now, here’s the link to my github for the code for Return.portfolio in Python–or, rather, since the dot operator has a special use in Python, the Return_portfolio function.

But for the record, here’s the Python code, including the endpoints function.

def endpoints(df, on = "M", offset = 0):
    """
    Returns index of endpoints of a time series analogous to R's endpoints
    function.
    Takes in:
        df -- a dataframe/series with a date index
         
        on -- a string specifying frequency of endpoints
         
        (E.G. "M" for months, "Q" for quarters, and so on)
         
        offset -- to offset by a specified index on the original data
        (E.G. if the data is daily resolution, offset of 1 offsets by a day)
        This is to allow for timing luck analysis. Thank Corey Hoffstein.
    """
     
    # to allow for familiarity with R
    # "months" becomes "M" for resampling
    if len(on) > 3:
        on = on[0].capitalize()
     
    # get index dates of formal endpoints
    ep_dates = pd.Series(df.index, index = df.index).resample(on).max()
     
    # get the integer indices of dates that are the endpoints
    date_idx = np.where(df.index.isin(ep_dates))
     
    # append last day to match R's endpoints function
    # remember, Python is indexed at 0, not 1
    #date_idx = np.insert(date_idx, 0, 0)
    date_idx = np.append(date_idx, df.shape[0]-1)
    if offset != 0:
        date_idx = date_idx + offset
        date_idx[date_idx < 0] = 0
        date_idx[date_idx > df.shape[0]-1] = df.shape[0]-1
    out = np.unique(date_idx)
    return out  

def compute_weights_and_returns(subset_returns, weights):
  rep_weights = np.tile(weights, (len(subset_returns), 1))
  cum_subset_weights = np.cumprod(1 + subset_returns, axis=0) * rep_weights
  EOP_Weight = cum_subset_weights / np.sum(cum_subset_weights, axis=1).to_numpy()[:, np.newaxis]
  cum_subset_weights_bop = cum_subset_weights / (1 + subset_returns)
  BOP_Weight = cum_subset_weights_bop / np.sum(cum_subset_weights_bop, axis=1).to_numpy()[:, np.newaxis]
  portf_returns_subset = pd.DataFrame(np.sum(subset_returns.values * BOP_Weight, axis=1), 
                                        index=subset_returns.index, columns=['Portfolio.Returns'])
  return [portf_returns_subset, BOP_Weight, EOP_Weight]


# Return.portfolio.geometric from R
def Return_portfolio(R, weights=None, verbose=True, rebalance_on='months'):
  """

    Parameters
    ----------
    R : a pandas series of asset returns
    weights : a vector or pandas series of asset weights.
    verbose : a boolean specifying a verbose output containing:
        portfolio returns,
        beginning of period weights and values, end of period weights and values,
        asset contribution to returns, and two-way turnover calculation
    rebalance_on : a string specifying rebalancing frequency if weights are passed in as a vector.

    Raises
    ------
    ValueError
        Number of asset weights must be equal to the number of assets.

    Returns
    -------
    TYPE
        See verbose parameter for True value, otherwise just portfolio returns.

    """
  
  # make sure original object isn't overridden
  R = R.copy()
  weights = weights.copy()
  
  # impute NAs in returns
  if R.isna().sum().sum() > 0:
    R.fillna(0, inplace=True)
    wn.warn("NAs detected in returns. Imputing with zeroes.")
    
    # if no weights provided, create equal weight vector
  if weights is None:
    weights = np.repeat(1/R.shape[1], R.shape[1])
    wn.warn("Weights not provided, assuming equal weight for rebalancing periods.")
    
    # if weights aren't passed in as a data frame (they're probably a list)
    # turn them into a 1 x num_assets data frame
  if type(weights) != pd.DataFrame:
    weights = pd.DataFrame(weights)
    weights = weights.transpose()
    
    # error checking for same number of weights as assets
  if weights.shape[1] != R.shape[1]:
        raise ValueError("Number of weights is unequal to number of assets. Correct this.")
    
    
  # if there's a row vector of weights, create a data frame with  the desired 
  # rebalancing schedule --  also add in the very first date into the schedule
  if weights.shape[0] == 1:
        if rebalance_on is not None:
            ep = endpoints(R, on = rebalance_on)
            first_weights = pd.DataFrame(np.array(weights), index=[R.index[0]], columns=R.columns)
            weights = pd.concat([first_weights, pd.DataFrame(np.tile(weights, 
                        (len(ep), 1)), index=R.index[ep], columns=R.columns)], axis=0)
            #weights = pd.DataFrame(np.tile(weights, (len(ep), 1)), 
            #                       index = ep.index, columns = R.columns)
            weights.index.name = R.index.name
        else:
            weights = pd.DataFrame(weights, index=[R.index[0]], columns=R.columns)
    
  original_weight_dim = weights.shape[1] # save this for computing two-way turnover

    
  if weights.isna().sum().sum() > 0:
    weights.fillna(0, inplace=True)
    wn.warn("NAs detected in weights. Imputing with zeroes.")
    
  residual_weights = 1 - weights.sum(axis=1)
  if abs(residual_weights).sum() != 0:
    print("One or more periods do not have investment equal to 1. Creating residual weights.")
    weights["Residual"] = residual_weights
    R["Residual"] = 0
    
  if weights.shape[0] != 1:
    portf_returns, bop_weights, eop_weights = [], [], []
    for i in range(weights.shape[0]-1):
      subset = R.loc[(R.index >= weights.index[i]) & (R.index <= weights.index[i+1])]
      if i >= 1: # first period include all data
                # otherwise, final period of previous period is included, so drop it
        subset = subset.iloc[1:]
      subset_out = compute_weights_and_returns(subset, weights.iloc[i,:])
      subset_out[0].columns = ["Portfolio.Returns"]
      portf_returns.append(subset_out[0])
      bop_weights.append(subset_out[1])
      eop_weights.append(subset_out[2])
    portf_returns = pd.concat(portf_returns, axis=0)
    bop_weights = pd.concat(bop_weights, axis=0)
    eop_weights = pd.concat(eop_weights, axis=0)
  else: # only one weight allocation at the beginning and just drift the portfolio
    out = compute_weights_and_returns(R, weights)
    portf_returns = out[0]; portf_returns.columns = ['Portfolio.Returns']
    bop_weights = out[1]
    eop_weights = out[2]
    
    
  pct_contribution = R * bop_weights
  cum_returns = (1 + portf_returns).cumprod()
  eop_value = eop_weights * pd.DataFrame(np.tile(cum_returns, (1, eop_weights.shape[1])), 
                                        index=eop_weights.index, columns=eop_weights.columns)
  bop_value = bop_weights * pd.DataFrame(np.tile(cum_returns/(1+portf_returns), (1, bop_weights.shape[1])), 
                                        index=bop_weights.index, columns=bop_weights.columns)

  turnover = (np.abs(bop_weights.iloc[:, :(original_weight_dim-1)] - eop_weights.iloc[:, :(original_weight_dim-1)].shift(1))).sum(axis=1).dropna()
  turnover = pd.DataFrame(turnover, index=eop_weights.index[1:], columns=['Two-way turnover'])

  out = [portf_returns, pct_contribution, bop_weights, eop_weights, bop_value, eop_value, turnover]
  out = {k: v for k, v in zip(['returns', 'contribution', 'BOP.Weight', 'EOP.Weight', 'BOP.Value', 'EOP.Value', 'Two.Way.Turnover'], out)}
  if verbose:
        return out
  else:
        return portf_returns

Rather than go into it line by line, those interested can read the code, but ultimately, a lot of it is basically seeing if the weights were passed in as a data frame of individually customized weights (E.G. on January, my weights were such and such, and on February, they shifted to some other such and such), or if the weights were passed in as a vector, or Python list, if someone just wants to rebalance a portfolio (E.G. a classic buy-and-hold 60/40), and if some portfolio weights don’t add up to 1. Essentially, a fair bit of bookkeeping. (Speaking of, I’m not sure what happened to the programming language customization block here on wordpress, so, I sincerely encourage readers to check my github for this code).

Now, here’s the interesting part which motivated this post:

I didn’t write most of this code in Python. Or rather, not initially.

I wrote it in R, first, to understand it, with R’s vectorization, where I’m a bit more comfortable. (As an aside, as far as languages for quantitative finance go, I think R has the much more advanced buy-side libraries such as PortfolioAnalytics or Quantstrat compared to Python, as they were written by high-level industry practitioners, whereas Python’s finance ecosystem seems to be…fairly threadbare, consisting of various islands of libraries that don’t really play together all that well.)

Here’s the R code that I did to rewrite Peter Carl’s Return.portfolio.geometric (the underlying function that runs Return.portfolio that I’ve used all these years):

compute_weights_and_returns <- function(subset, weights) {
  rep_weights <- matrix(nrow=nrow(subset), ncol = ncol(subset), weights, byrow = TRUE)
  cum_subset_weights <- cumprod(1+subset) * rep_weights
  EOP.Weight <- cum_subset_weights/rowSums(cum_subset_weights)
  cum_subset_weights_bop <- cum_subset_weights/(1+subset)
  BOP.Weight <- cum_subset_weights_bop/rowSums(cum_subset_weights_bop)
  portf_returns_subset <- xts(rowSums(subset * BOP.Weight), order.by=index(subset))
  return(list(portf_returns_subset, BOP.Weight, EOP.Weight))
}

Return.portfolio.geometric.ilya <- function(R, weights, verbose = TRUE, rebalance_on = 'months') {
  
  if(sum(is.na(R)) > 0) {
    R[is.na(R)] <- 0
    warning("NAs detected in returns. Imputing with zeroes.")
  }
  
  if(missing(weights)) {
    weights <- rep(1/ncol(R), ncol(R))
    warning("Weights not provided, assuming equal weight for rebalancing periods.")
  }
  
  # if weights passed in as vector
  if(is.null(dim(weights))) {
    if(!is.null(rebalance_on)) {
      ep <- endpoints(R)
      first_weights <- xts(t(matrix(weights)), order.by=index(R)[1])
      weights <- rbind(first_weights,
                           xts(matrix(nrow=length(index(R)[ep]), 
                                      ncol = length(weights), weights, byrow=TRUE),
                             order.by = index(R)[ep]))
      # get the first day of all the weights, make sure it's unique
      # weights <- weights[!duplicated(index(weights)),]
    } else {
      weights <- xts(t(matrix(weights)), order.by=index(R)[1])
    }
    colnames(weights) <- colnames(R)
  } 
  
  original_weight_dim <- ncol(weights) # save this for computing two-way turnover
  
  if(original_weight_dim != ncol(R)) {
    stop("Number of weights is unequal to number of assets. Correct this.")
  }
  
  if(sum(is.na(weights)) > 0) {
    weights[is.na(weights)] <- 0
    warning("NAs detected in weights. Imputing with zeroes.")
  }
  
  residual_weights <- 1-rowSums(weights)
  if(sum(abs(residual_weights)) != 0) {
    warning("One or more periods do not have investment equal to 1. Creating residual weights.")
    weights$Residual <-  residual_weights
    R$Residual <- 0
  }
  if(nrow(weights) != 1) {
    portf_returns <- bop_weights <- eop_weights <- list()
    for(i in 1:(nrow(weights)-1)) {
      subset <- R[paste((index(weights)[i]),index(weights)[i+1], sep = "::"),]
      if(i >= 2) { # first period include all data
        # otherwise, final period of previous period is included, so drop it
        subset <- subset[-1,]
      }
      subset_out <- compute_weights_and_returns(subset, weights[i,])
      colnames(subset_out[[1]]) <- "Portfolio.Returns"
      portf_returns[[i]] <- subset_out[[1]]
      bop_weights[[i]] <- subset_out[[2]]
      eop_weights[[i]] <- subset_out[[3]]
    }
    portf_returns <- do.call(rbind, portf_returns)
    bop_weights <- do.call(rbind, bop_weights)
    eop_weights <- do.call(rbind, eop_weights)
  } else { # only one weight allocation at the beginning and just drift the portfolio
    out <- compute_weights_and_returns(R, weights)
    portf_returns <- out[[1]]; colnames(portf_returns) <- "Portfolio.Returns"
    bop_weights <- out[[2]]
    eop_weights <- out[[3]]
  }
  pct_contribution <- R * bop_weights
  cum_returns <- cumprod(1+portf_returns)
  eop_value <- eop_weights * matrix(nrow=nrow(eop_weights), ncol = ncol(eop_weights), 
                                    cum_returns, byrow= FALSE) 
  bop_value <- bop_weights * matrix(nrow=nrow(bop_weights), ncol = ncol(bop_weights),
                                    cum_returns/(1+portf_returns), byrow = FALSE)
  
  # add turnover computation because that's what this whole exercise is about
  turnover <- na.omit(xts(rowSums(abs(bop_weights[,1:original_weight_dim] - 
                                        lag(eop_weights[,1:original_weight_dim]))), 
                          order.by = index(eop_weights)))
  
  colnames(turnover) <- "Two-way turnover"
  
  out <- list(portf_returns, pct_contribution, bop_weights, 
              eop_weights, bop_value, eop_value, turnover)
  names(out) <- c("returns", "contribution", "BOP.Weight", 
                  "EOP.Weight", "BOP.Value", "EOP.Value", "Two.Way.Turnover")
  if(verbose) {
    return(out)
  } else {
    return(portf_returns)
  }
}

Now, how did I go from the above to the code at the beginning of this post?

ChatGPT4.

That is, I translated my R code into Python code using ChatGPT4 by simply having it translate, block by block, my R code into Python.

Now, was it perfect? No, actually. I had to run the code block by block, and do a little bit of debugging here and there–including adding the functionality for the Return_portfolio function to make deep copies of the returns and weights, instead of modifying the original data structure passed into the function, since Python by default is a pass by reference language, not pass by value (ugh). However, this sort of “Rosetta Stone” functionality inherent in ChatGPT4 is occasionally useful.

Now, when I first tried prompting chatGPT4 to just translate R’s functions, without guiding it step by step, it was a total disaster. Contrary to popular belief, ChatGPT4 is *not* some all-powerful, all-perfect, all-knowing, and all-wise (as the late, great, George Carlin once said) system. But occasionally, it can be useful.

Furthermore, as someone that’s used AI to try my hand at image generation, I also consider those systems (such as StableDiffusion, or Leonardo.AI) to be…somewhat impressive, if in their infancy right now (it gets much more difficult to have multiple subjects interact on one image–I.E. if you want to describe one person in an image, that’s one thing, but if you want to describe two, it isn’t like you can make an image of one–say, save it with a name, such as Alice.jpeg, make an image of a second, save it to another name, such as Bob.jpeg, and then say “Alice being hugged by Bob”). Of course, I’ve definitely been aware about the copyright hubbub brewing over the image generating AI space, but as someone that’s given away plenty of code over the past 8 or so years, including code that has been augmented to run $87 million at a $1B AUM fund, I’m pretty squarely on the open source side, even if seeing my code being responsible for running more than $100 million in AUM, and not getting paid residuals on the management fees, has discouraged me from continuing to share strategy research (I was always of the opinion that anything I shared using freely available Yahoo data was something that large investment houses had already passed over and run a more sophisticated variant of–apparently, this isn’t the case, so giving away strategy code for free nowadays seems like a very suspect thing to do).

In any case, using image-generating AI has allowed me to make artwork the likes of which I never dreamed of creating before, so I’m of the opinion that a few people taking the side of IP hoarding corporations (such as big pharma hoarding insulin patents, or Disney continuing to milk Mickey Mouse) can go kick rocks, and that IP law has long overreached. That said, I already have StableDiffusion running on my machine, coupled with some more sophisticated models built on top of it (Dreamshaper v5, Deliberate v2, etc.), and still have a lot to learn about it–though I’ll probably need a more powerful machine than a gaming laptop to really crank images out.

Now, my thoughts on AI? So far, I don’t think that AI alone is going to pull a South Park and DERK YERRR JERBZ. At least if you’re a skilled professional.

Not alone, anyway. And for what it’s worth, as a quantitative research analyst, I found that I’d fire ChatGPT4 within a few hours, owing to the artifact of AI hallucinations. As the phrase goes, trust *but verify*. I think we’re still far, far away from ChatGPT4 being able to read a strategy paper on Quantpedia, replicating it in R or Python, running the code, and replicating the results. For that matter, I think that any strategy paper submission or blog post is incomplete without data and code, for immediate replication, ready to go. It’s the 21st century. Github or git out. A minimum reproducible example (MRE) as a demo of the concept (even a toy one), in my opinion, is vastly more useful than 25 pages of mathematical derivation. Though that’s just me speaking given my background of hands-on engineering and actually *doing things* rather than merely theorizing about them.

Okay, so…this post seems to have gotten long in the tooth. In other news, I’m in the job market, currently, and as I’ve shown, I have a fairly solid command of Python, having several years of off and on experience with it. And I’m also wondering if a written blog is the best way to actually communicate so many of my findings, or if YouTube is the way to go (though I don’t know much about editing videos).

So…yeah. Follow if you like the post, and most importantly:

Try to lose money…less than I do.

Thanks for reading.

My LinkedIn.

A quick example on using next day open-to-open returns for Tactical Asset Allocation.

First off, for the hiring managers out there, after about a one-year contracting role at Bank of America doing some analytical reporting coding for them in Python, I am on the job market. Feel free to find my LinkedIn here.

This post will cover how to make tactical asset allocation strategies a bit more realistic with regards to execution. That is, by using next-day open-to-open rather than observe-the-close-get-the-close, it’s possible to see how much this change affects a strategy, and potentially, something that I think a site like AllocateSmartly could implement to actively display in their simulations (E.G. a checkbox that says “display next-day open to open returns”).

Now, onto the idea of the post: generally, when doing tactical asset allocation backtests, we tend to just use one set of daily returns. Get the adjusted close data, and at the end of every month, allocate to selected holdings. That is, most TAA backtests we’ve seen generally operate under the assumption of:

“Run the program at 3:50 PM EST. on the last day of the month, scrape the prices for the assets, allocate assets in the next ten minutes.”

This…can generally work for small accounts, but for institutions interested in scaling these strategies, maybe not so much.

So, the trick here, first, in English is this:

Compute open-to-open returns. Lag them by negative one. While this may sound really spooky at first, because a lag of negative one implies the potential for lookahead bias, in this case, it’s carefully done to use future returns. That is, the statement in English is (for example): “given my weights for data at the close of June 30, what would my open-to-open returns have been if I entered on the open of July 1st, instead of the June 30 close?”. Of course, this sets the return dates off by one day, so you’ll have to lag the total portfolio returns by a positive one to readjust for that.

This is how it works in code, using my KDA asset allocation algorithm. :


require(quadprog)

# compute strategy statistics
stratStats <- function(rets) {
  stats <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets))
  stats[5,] <- stats[1,]/stats[4,]
  stats[6,] <- stats[1,]/UlcerIndex(rets)
  rownames(stats)[4] <- "Worst Drawdown"
  rownames(stats)[5] <- "Calmar Ratio"
  rownames(stats)[6] <- "Ulcer Performance Index"
  return(stats)
}

# required libraries
require(quantmod)
require(PerformanceAnalytics)
require(tseries)

# symbols
symbols <- c("SPY", "VGK",   "EWJ",  "EEM",  "VNQ",  
"RWX",  "IEF",  "TLT",  "DBC",  "GLD", "VWO", "BND")  


# get data
rets <- list()
prices <- list()
op_rets <- list()
for(i in 1:length(symbols)) {
  tmp <- getSymbols(symbols[i], from = '1990-01-01', auto.assign = FALSE, use.adjusted = TRUE)
  price <- Ad(tmp)
  returns <- Return.calculate(Ad(tmp))
  op_ret <- Return.calculate(Op(tmp))
  colnames(returns) <- colnames(price) <- colnames(op_ret) <- symbols[i]
  prices[[i]] <- price
  rets[[i]] <- returns
  op_rets[[i]] <- op_ret
}
rets <- na.omit(do.call(cbind, rets))
prices <- na.omit(do.call(cbind, prices))
op_rets <- na.omit(do.call(cbind, op_rets))

# algorithm
KDA <- function(rets, offset = 0, leverageFactor = 1, momWeights = c(12, 4, 2, 1),
                op_rets = NULL, use_op_rets = FALSE) {
  
  # get monthly endpoints, allow for offsetting ala AllocateSmartly/Newfound Research
  ep <- endpoints(rets) + offset
  ep[ep < 1] <- 1
  ep[ep > nrow(rets)] <- nrow(rets)
  ep <- unique(ep)
  epDiff <- diff(ep)
  if(last(epDiff)==1) { # if the last period only has one observation, remove it
    ep <- ep[-length(ep)]
  }
  
  # initialize vector holding zeroes for assets
  emptyVec <- data.frame(t(rep(0, 10)))
  colnames(emptyVec) <- symbols[1:10]
  
  
  allWts <- list()
  # we will use the 13612F filter
  for(i in 1:(length(ep)-12)) {
    
    # 12 assets for returns -- 2 of which are our crash protection assets
    retSubset <- rets[c((ep[i]+1):ep[(i+12)]),]
    epSub <- ep[i:(i+12)]
    sixMonths <- rets[(epSub[7]+1):epSub[13],]
    threeMonths <- rets[(epSub[10]+1):epSub[13],]
    oneMonth <- rets[(epSub[12]+1):epSub[13],]
    
    # computer 13612 fast momentum
    moms <- Return.cumulative(oneMonth) * momWeights[1] + Return.cumulative(threeMonths) * momWeights[2] + 
      Return.cumulative(sixMonths) * momWeights[3] + Return.cumulative(retSubset) * momWeights[4]
    assetMoms <- moms[,1:10] # Adaptive Asset Allocation investable universe
    cpMoms <- moms[,11:12] # VWO and BND from Defensive Asset Allocation
    
    # find qualifying assets
    highRankAssets <- rank(assetMoms) >= 6 # top 5 assets
    posReturnAssets <- assetMoms > 0 # positive momentum assets
    selectedAssets <- highRankAssets & posReturnAssets # intersection of the above
    
    # perform mean-variance/quadratic optimization
    investedAssets <- emptyVec
    if(sum(selectedAssets)==0) {
      investedAssets <- emptyVec
    } else if(sum(selectedAssets)==1) {
      investedAssets <- emptyVec + selectedAssets 
    } else {
      idx <- which(selectedAssets)
      # use 1-3-6-12 fast correlation average to match with momentum filter  
      cors <- (cor(oneMonth[,idx]) * momWeights[1] + cor(threeMonths[,idx]) * momWeights[2] + 
                 cor(sixMonths[,idx]) * momWeights[3] + cor(retSubset[,idx]) * momWeights[4])/sum(momWeights)
      vols <- StdDev(oneMonth[,idx]) # use last month of data for volatility computation from AAA
      covs <- t(vols) %*% vols * cors
      
      # do standard min vol optimization
      minVolRets <- t(matrix(rep(1, sum(selectedAssets))))
      
      n.col = ncol(covs)
      zero.mat <- array(0, dim = c(n.col, 1))
      one.zero.diagonal.a <- cbind(1, diag(n.col), 1 * diag(n.col), -1 * diag(n.col))
      min.wgt <- rep(.05, n.col)
      max.wgt <- rep(1, n.col)
      bvec.1.vector.a <- c(1, rep(0, n.col), min.wgt, -max.wgt)
      meq.1 <- 1
      
      mv.port.noshort.a <- solve.QP(Dmat = covs, dvec = zero.mat, Amat = one.zero.diagonal.a,
                                    bvec = bvec.1.vector.a, meq = meq.1)
      
      min_vol_wt <- mv.port.noshort.a$solution
      names(min_vol_wt) <- rownames(covs)
      
      #minVolWt <- portfolio.optim(x=minVolRets, covmat = covs)$pw
      #names(minVolWt) <- colnames(covs)
      investedAssets <- emptyVec
      investedAssets[,selectedAssets] <- min_vol_wt
    }
    
    # crash protection -- between aggressive allocation and crash protection allocation
    pctAggressive <- mean(cpMoms > 0)
    investedAssets <- investedAssets * pctAggressive 
    
    pctCp <- 1-pctAggressive
    
    # if IEF momentum is positive, invest all crash protection allocation into it
    # otherwise stay in cash for crash allocation
    if(assetMoms["IEF"] > 0) {
      investedAssets["IEF"] <- investedAssets["IEF"] + pctCp
    }
    
    # leverage portfolio if desired in cases when both risk indicator assets have positive momentum
    if(pctAggressive == 1) {
      investedAssets = investedAssets * leverageFactor
    }
    
    # append to list of monthly allocations
    wts <- xts(investedAssets, order.by=last(index(retSubset)))
    allWts[[i]] <- wts
    
  }
  
  # put all weights together and compute cash allocation
  allWts <- do.call(rbind, allWts)
  allWts$CASH <- 1-rowSums(allWts)
  
  # add cash returns to universe of investments
  investedRets <- rets[,1:10]
  
  investedRets$CASH <- 0
  
  # compute portfolio returns
  out <- Return.portfolio(R = investedRets, weights = allWts)
  if(use_op_rets) {
    if(is.null(op_rets)) {
      stop("You didn't provide open returns.")
    } else {
      # cbind a cash return of 0 -- may not be necessary in current iterations of PerfA
      investedRets <- cbind(lag(op_rets[,1:10], -1), 0)
      out <- lag(Return.portfolio(R = investedRets, weights = allWts))
    }
  }
  return(list(allWts, out))
}


Essentially, the salient part of the code is at the start, around line 32, when the algorithm gets the data from Yahoo, in that it creates a new set of returns using open adjusted data, and at the end, at around line 152, when the code lags the open returns by -1–I.E. lag(op_rets[,1:10], -1), and then lags the returns again to realign the correct dates–out <- lag(Return.portfolio(R=investedRets, weights = allWts))

And here is the code for the results:

KDA_100 <- KDA(rets, leverageFactor = 1)
KDA_100_open <- KDA(rets, leverageFactor = 1, op_rets = op_rets, use_op_rets = TRUE)

compare <- na.omit(cbind(KDA_100[[2]], KDA_100_open[[2]]))
charts.PerformanceSummary(KDA_100[[2]])
compare <- na.omit(cbind(KDA_100[[2]], KDA_100_open[[2]]))
colnames(compare) <- c("Obs_Close_Buy_Close", "Buy_Open_Next_Day")
charts.PerformanceSummary(compare)
stratStats(compare)

With the following results:

> stratStats(compare)
                          Obs_Close_Buy_Close Buy_Open_Next_Day
Annualized Return                   0.1069000        0.08610000
Annualized Std Dev                  0.0939000        0.09130000
Annualized Sharpe (Rf=0%)           1.1389000        0.94300000
Worst Drawdown                      0.0830598        0.09694208
Calmar Ratio                        1.2870245        0.88815920
Ulcer Performance Index             3.8222753        2.41802066

As one can see, there’s definitely a bit of a performance deterioration to the tune of about 2% per year. While the strategy may still be solid, a loss of 20% of the CAGR means that the other risk/reward statistics suffer proportionally as well. In other words, this is a strategy that is fairly sensitive to the exact execution due to its fairly high turnover.

However, the good news is that there is a way to considerably reduce turnover as suggested by AllocateSmartly, which would be to reduce the impact of relative momentum on turnover. A new post on how to do *that* will be forthcoming in the near future.

One last thing–as I did pick up some Python skills, as evidenced by the way I ported the endpoints function into Python, and the fact that I completed an entire Python data science bootcamp in four instead of six months, I am also trying to port over the Return.portfolio function into Python as well, since that would allow a good way to compute turnover statistics as well. However, as far as I’ve seen with Python, I’m not sure there’s a well-maintained library similar to PerformanceAnalytics, and I do not know that there are libraries in Python that compare with rugarch, PortfolioAnalytics, and Quantstrat, though if anyone wants to share the go-to generally-accepted Python libraries to use beyond the usual numpy/pandas/matplotlib/cvxpy/sklearn (AKA the usual data science stack).

Thanks for reading.

NOTE: Lastly, some other news: late last year, I was contacted by the NinjaTrader folks to potentially become a vendor/give lectures on the NinjaTrader site about how to create quantitative/systematic strategies. While I’m not a C# coder, I can potentially give lectures on how to use R (and maybe Python in the future) to implement them.

Finally, if you wish to get in touch with me, my email is ilya.kipnis@gmail.com, and I can be found on my LinkedIn. Additionally, if you wish to subscribe to my volatility strategy, that I’ve been successfully trading since 2017, feel free to check it out here.

A Review of Modern Asset Allocation For Wealth Management, by David M. Berns, PhD

This post will be a review of the book Modern Asset Allocation for Wealth Management, by Dr. David Berns, PhD. The long story short is that I think the book is a must-read for a new and different perspective on asset management, though there are some things I’d like to see that could be very easily covered with a second edition.

In my opinion, rather than provide a single how-to portfolio like some other books, such as Meb Faber’s Ivy Portfolio, or Adaptive Asset Allocation (both of which are fairly good reads), MAAWM submits a completely new way of thinking about portfolio construction–namely by incorporating a quantitative way to gauge a prospective investor’s risk appetite, and to incorporate behavioral finance into systematic portfolio construction. To me, who’s completely quantitative, the idea of incorporating something that is much more nebulous to quantify, such as individual risk preference, was something I’ve never even thought about, but after reading this book, think should be mandatory for any financial adviser to think about. (Note: as I was never part of a client-facing role at a buy-side firm, I was never sponsored for a series 65, so I’m not an official financial adviser.) The three types of behavioral risk traits are risk aversion (would you accept a bet that paid $3.85 or $0.10 with 50% chance each? 60/40? 70/30? 80/20? 90/10? Any of them?), loss aversion (would you accept a lottery with 50% chance to lose you $3 or win $6? What about if the loss was $4? $5? $6?), and reflection (would you rather take a sure $20 or a 1/3rd chance at $60? What about if it was a loss? A guaranteed loss of $20? Or a 1/3rd chance to lose $60?). Depending on how a prospective client answers such questions (with dollar amounts scaled in proportion to their annual income), one can formulate a multi-parameter utility function with a more nuanced shape than a simple log-scaling utility function as a function of gain or loss, in order to incorporate more subtle potential client risk preferences. For me, this is the first time I’ve seen the idea of quantitatively incorporating behavioral finance into systematic portfolio construction. I also think this is an absolutely fantastic insight. If someone’s younger and just wants the highest expected return, that’s a much different client than one who can’t risk a large drawdown, and if there are multiple offered strategies, such measurements mean a much more customized approach for different individual clients.

Beyond that, the book also brings to mind the idea of higher portfolio moments that can affect a client’s utility function–namely, skewness and kurtosis. Namely the idea that ideally, one wants to maximize portfolio skew (namely, cutting losers and letting winners run–which is one reason among others that I swear by momentum and trend-following), while minimizing kurtosis (tail risk is painful!). The idea is that a simple mean-variance portfolio optimizer doesn’t account for these higher moments, and that they’re important. However, this book doesn’t really present a mathematical way to tie the empirical calculation and optimization of skewness and kurtosis back into the 3-parameter utility function, so much as just building up the idea that these moments are important, and for very good reason. While I fully agree with the assertion of the importance of higher moments, unto my experience, incorporating the third and fourth moments into portfolio optimization is *hard*.

Whereas a mean-variance optimization (or rather, momentum selection and minimum variance optimization) backtests are relatively easy to run in terms of computing time, incorporating co-skewness and co-kurtosis calculations gets *very* messy, *very* quickly, and if I recall correctly, demands global optimizers such as those found in R’s PortfolioAnalytics package (I could be mistaken here). This has very real computational costs. That is, performing optimization on third and fourth portfolio moments *ex ante* is most likely much more computationally expensive in runtime than deploying a heuristic on it *ex post*. To understand just how complicated co-skewness and co-kurtosis get, I recommend looking at the paper “Estimation and Decomposition of Downside Risk for Portfolios with Non-Normal Returns”, written by Kris Boudt, Brian Peterson, and Christophe Croux. It’s a *lot* more complicated than minimizing a matrix product of weights and a covariance matrix subject to full investment and long-only weights for each asset, which means far fewer simulations to measure portfolio robustness to perturbations in parameter settings (EG lookback periods, rebalance dates, etc.)

Next, the book talks about asset selection. Here, it presents yet *another* fantastic idea I’ve seen nowhere else. The idea of the Mimicking Portfolio Tracking Error (MPTE). That is, given your current universe of assets, can an asset you’re considering adding to the universe be replicated by a combination of the others? The way to check that is with *constrained* least-squares regression, with the constraint of the weights of the other assets adding up to 100%. I’ve never seen this done before, but it seems both R and Python have ways to do this. A critique I have of this chapter, though, is that the assets in question aren’t ETFs that one can go and buy on the open market during trading hours, but rather, academic asset index classes from places like Kenneth French’s data website. And while that’s certainly fantastic as far as getting more data for analysis for further back in time, that there isn’t translation for these academic/illustrative asset classes to ETFs and mutual fund proxies for longer histories felt slightly disappointing.

The punchline here is that if the tracking error is lower than 5% for the asset considered to be added to the universe as expressed through a linear combination of assets in the existing universe, then there’s probably a great deal of collinearity between the asset in question, and the assets in the existing universe, which has a chance to confuse an optimization algorithm. The other fantastic idea presented here is the idea of performance assets (high return at the cost of high volatility, low skew, high kurtosis), and diversifying assets (lower return but that smoothen the portfolio trajectory by reducing overall portfolio volatility and kurtosis). Basically, high returns don’t mean much if a client can’t stick with them because of the emotional roller coaster ride that the portfolio is on, so a better risk-adjusted portfolio is better, especially if one can leverage the portfolio up to get better returns.

Next comes the idea with which I have a philosophical disagreement with–the idea of being able to assess returns by testing for stationarity on several decades of monthly return data using the Kolmogorov-Smirnov test. That is, the idea that what an asset class has done over a very long period of time (about a single investing lifetime), it should continue to do in aggregate for another prolonged period of time, in aggregate.

By far the best counterexample I can think of is the idea that for 50 years, coming out of the end of the second World War, the US stock market (I.E. the S&P 500) was on a tear (after all, the rest of the world was rebuilding). Then, if you invested in the S&P at the top of the dot-com boom, right as George W. Bush took office, for the next ten years, your actual return was a *negative* annualized 1% (I.E. a CAGR of -1%). For trivia’s sake, had one invested in the S&P at the top before the crash of the great depression, it would have taken until 1954 just to recover one’s initial stake. Had one invested at the top of the dot com bubble, aside from a minor new equity high in 2007, it would have taken until basically 2013 or 2014 to make new significant equity highs.

That’s…painful, to say the least, when U.S. equities are supposed to be a return-driving asset. Again, to those that extol the virtues of a permanent portfolio type of buy/hold/rebalance approach, the approach in this book is second to none. However, for those that have read my blog since the beginning, you’ll know that I swear by momentum and trend-following trading. My volatility trading strategy is a trend-follower (I simply think that there’s no other way to trade volatility, since events like Feb. 5, 2018, that caused XIV to lose 95% in a day, and be terminated, mandate it), and various tactical asset allocation strategies I’ve blogged about on this blog are *also* momentum-based trading strategies (all of which can be found on AllocateSmartly). In fact, for all the rightful condemnation that the theoretical maximum Sharpe Ratio Markowitz portfolio gets, a global minimum variance optimization on a set of assets selected *by* momentum is the practical form of this, which is the Adaptive Asset Allocation algorithm. So, if one swears by buy and hold, I think the approach outlined in this book is terrific, but if active trading is more one’s speed, then take just this one chapter with a grain of salt and understand that the approach this book recommends is more of a permanent portfolio style buy, hold, and rebalance approach with the belief that asset returns will work themselves out over a long enough time horizon. In my opinion, rebalancing a five-asset portfolio like in Adaptive Asset Allocation (or KDA, which uses the same universe) isn’t too much to ask for once a month (though should be done in a tax-free account if possible).

Speaking of which, one last issue this part of the book touches upon is taxes, which very few asset allocation books consider. I certainly don’t consider it in the formulation of my strategies, as I assume that an asset allocation firm knows how to legally avoid taxes, place their clients’ funds into various tax-free/less taxable retirement accounts, and so on. I’m a strategist, not a tax accountant, so, not an expert there. However, this book does touch upon the topic, so if an individual is running a one-man office, well, this is most likely required reading.

The last topic the book touches on is to combine everything into optimized portfolios. Prospect theory and risk tolerance parameters were established, assets selected, returns estimated, then there are various portfolios recommended for given risk profiles. In my opinion, this section of the book was slightly lacking in that there wasn’t an appendix that had a portfolio allocation for all 60 permutations of the three risk parameters presented earlier in the book, but that’s a very minor nitpick.

So, that’s the book. Long story short, there are quite a few groundbreaking ideas presented here that make this book a must-read, no questions asked. If someone’s a quantitative strategist, they should *also* read this book immediately. That said, it does lack an out-of-the-box “if you want an easy-to-implement ETF-based translation of this strategy, here’s what you do”, which I’d love to see in a second edition of this book (ideally with mutual fund proxies for backtesting). Furthermore, there may be some ideas that can be taken with a grain of salt.

All in all, a wholeheartedly recommended fantastic read that any modern-day advisor should pick up, and a source for some very interesting ideas for quantitative strategists.

Thanks for reading.

NOTE: I am always looking to hear about interesting opportunities that can make use of my skillset. To contact me, feel free to reach out to me on my LinkedIn.

Two Different Methods to Apply Some Corey Hoffstein Analysis to your TAA

So, first off: I just finished a Thinkful data science in python bootcamp program that was supposed to take six months, in about four months. All of my capstone projects I applied to volatility trading; long story short, the more advanced data science techniques underperformed more quant-specific techniques. Is there a place for data science in Python in the world? Of course. Some firms swear by it. However, R currently has many more libraries developed specifically for quantitative finance, such as PerformanceAnalytics, quantstrat, PortfolioAnalytics, and so on. Even for more basic portfolio management tasks, I use functions such as Return.Portfolio and charts.PerformanceSummary in R, the equivalent for which I have not seen in Python. While there are some websites with their own dialects built on top of Python, such as quantConnect and quantopian, I think that’s more their own special brand of syntax, as opposed to being able to create freeform portfolio backtesting strategies from pandas.

In any case, here’s my Python portfolio from the bootcamp I completed. Also, for those that wish to run my supervised and unsupervised learning notebooks, please replace the Yahoo data fetching block with the one from my final capstone, because at some point, Yahoo stopped providing the SHORTVOL index data before 2020. You can look at the notebooks to see exactly what I tried, but to cut to the chase, none of the techniques worked. Random forests, SVMs, XG boosting, UMAP…they don’t really apply to predicting returns. The features I used were those I use in my own trading strategy, at least some of them, so it wasn’t a case of “garbage in, garbage out”. And the more advanced the technique, the worse the results. In the words of one senior quant trading partner: “Auto-ML = auto-bankrupt”. So when people say “we use AI and machine learning to generate superior returns”, keep in mind that they’re probably not using supervised learning models to predict returns (because there are multiple obstacles to that). After all, even linear regression can be thought of as a learning model.

Even taking PCAs of various term structure features did a worse job than my base volatility trading strategy. Of course, it’s gotten better since then as I added more risk management to the strategy, and caught a nice chunk of the coronavirus long vol move in March. You can subscribe to it here.

So yes, I code in Python now (if the previous post wasn’t any indication, so those who need some Python development for quant work, if it uses the usual numpy/scipy/pandas stack, feel free to reach out to me).

Anyway, this post is about adding some Corey Hoffstein style analysis to asset allocation strategies, this time in R, because this is a technique I used for a very recent freelance project for an asset allocation firm that I currently freelance for (off and on). I call it Corey Hoffstein style, because on twitter, he’s always talking about analyzing the impact of timing luck. His blog at Newfound Research is terrific for thinking about elements one doesn’t see in many other places, such as analyzing trend-following strategies in the context of option payoffs, the impact of timing luck and various parameters of lookback windows, and so on.

The quick idea is this: when you rebalance a portfolio every month, you want to know how changing the various trading day affects your results. This is what Walter does over at AllocateSmartly.

But a more interesting question is what happens when a portfolio is rebalanced on longer timeframes–that is, what happens when you rebalance a portfolio only once a quarter, once every six months, or once a year? What if instead of rebalancing quarterly on January, April, and so on, you rebalance instead on February, May, etc.?

This is a piece of code (in R, so far) that does exactly this:

offset_monthly_endpoints <- function(returns, k, offset) {
  
  # because the first endpoint is indexed to 0 and is the first index, add 1 to offset
  mod_offset = (offset+1)%%k # make sure we don't have 7 month offset on 6 month rebalance--that's just 1.
  eps <- endpoints(returns, on = 'months') # get monthly endpoints
  indices <- (1:length(eps)) # create indices from 1 to number of endpoints
  selected_eps <- eps[indices%%k == mod_offset] # only select endpoints that have proper offset when modded by k
  selected_eps <- unique(c(0, selected_eps, nrow(returns))) # append start and end of data
  return(selected_eps)
}

Essentially, the idea behind this function is fairly straightforward: given that we want to subset on monthly endpoints at some interval (that is, k = 3 for quarterly, k = 6 for every 6 months, k = 12 for annual endpoints), we want to be able to offset those by some modulo, we use a modulo operator to say “hey, if you want to offset by 4 but rebalance every 3 months, that’s just the same thing as offsetting by 1 month”. One other thing to note is that since R is a language that starts at index 1 (rather than 0), there’s a 1 added to the offset, so that offsetting by 0 will get the first monthly endpoint. Beyond that, it’s simply creating an index going from 1 to the length of the endpoints (that is, if you have around 10 years of data, you have ~120 monthly endpoints), then simply seeing which endpoints fit the criteria of being every first, second, or third month in three.

So here’s how it works, with some sample data:

require(quantmod)
require(PerformanceAnalytics)

getSymbols('SPY', from = '1990-01-01')
> head(SPY[offset_monthly_endpoints(Return.calculate(Ad(SPY)), 3, 1)])
           SPY.Open SPY.High  SPY.Low SPY.Close SPY.Volume SPY.Adjusted
1993-01-29 43.96875 43.96875 43.75000  43.93750    1003200     26.29929
1993-04-30 44.12500 44.28125 44.03125  44.03125      88500     26.47986
1993-07-30 45.09375 45.09375 44.78125  44.84375      75300     27.15962
1993-10-29 46.81250 46.87500 46.78125  46.84375      80700     28.54770
1994-01-31 48.06250 48.31250 48.00000  48.21875     313800     29.58682
1994-04-29 44.87500 45.15625 44.81250  45.09375     481900     27.82893
> head(SPY[offset_monthly_endpoints(Return.calculate(Ad(SPY)), 3, 2)])
           SPY.Open SPY.High  SPY.Low SPY.Close SPY.Volume SPY.Adjusted
1993-02-26 44.43750 44.43750 44.18750  44.40625      66200     26.57987
1993-05-28 45.40625 45.40625 45.00000  45.21875      79100     27.19401
1993-08-31 46.40625 46.56250 46.34375  46.56250      66500     28.20059
1993-11-30 46.28125 46.56250 46.25000  46.34375     230000     28.24299
1994-02-28 46.93750 47.06250 46.81250  46.81250     333000     28.72394
1994-05-31 45.73438 45.90625 45.65625  45.81250     160000     28.27249
> head(SPY[offset_monthly_endpoints(Return.calculate(Ad(SPY)), 3, 3)])
           SPY.Open SPY.High  SPY.Low SPY.Close SPY.Volume SPY.Adjusted
1993-03-31 45.34375 45.46875 45.18750  45.18750     111600     27.17521
1993-06-30 45.12500 45.21875 45.00000  45.06250     437600     27.29210
1993-09-30 46.03125 46.12500 45.84375  45.93750      99300     27.99539
1993-12-31 46.93750 47.00000 46.56250  46.59375     312900     28.58971
1994-03-31 44.46875 44.68750 43.53125  44.59375     788800     27.52037
1994-06-30 44.82812 44.84375 44.31250  44.46875     271900     27.62466

Notice how we get different quarterly rebalancing end dates. This also works with semi-annual, annual, and so on. The one caveat to this method, however, is that when doing tactical asset allocation analysis in R, I subset by endpoints. And since I usually use monthly endpoints in intervals of one (that is, every monthly endpoint), it’s fairly simple for me to incorporate measures of momentum over any monthly lookback period. That is, 1 month, 3 month, etc. are all fairly simple when rebalancing every month. However, for instance, if one were to rebalance every quarter, and take only quarterly endpoints, then getting a one-month momentum measure every quarter would take a bit more work, and if one wanted to do quarterly rebalancing, tranche it every month, but also not simply rebalance at the end of the month, but rebalanace multiple times *throughout* the month, that would require even more meticulousness.

However, one sort of second, “kludge-y” method to go about this, would be to run the backtest to find all the weights, and then apply a similar coding methodology to the *weights*. For instance, if you have a time series of monthly weights, just create an index ranging from 1 to the length of the weights, then depending on how often you want to rebalance, subset for every mod 3 == 0, 1, or 2. More generally, if you rebalance once every k months, you create an index ranging from 1 to the length of your index if the language is base 1 (R), or 0 to length n-1, if Python. Then, you simply see which indices give a remainder of 0 to k-1 when taking the modulo K, and that’s it. This will allow you to get k different rebalancing tranches by taking the indices of those endpoints. And you can still offset those endpoints daily as well. The caveat here, of course, is that you need to run the backtest for all of the individual months, and if you have a complex optimization routine, this can take an unnecessarily long time. So which method you use depends on the task at hand. This second method, however, is what I would use as a wrapper to a monthly rebalancing algorithm that already exists, such as my KDA asset allocation algorithm.

That’s it for this post. In terms of things I want to build going forward: I’d like to port over some basic R functionality to Python, such as Return.Portfolio, and charts.PerformanceSummary, and once I can get that working, to demonstrate how to do a lot of the same asset allocation work I’ve done in R…in Python, as well.

Thanks for reading.

NOTE: I am currently searching for a full-time role to make use of my R (and now Python) skills. If you are hiring, or know of someone that is, don’t hesitate to reach out to me on my LinkedIn.

A Python Investigation of a New Proposed Short Vol ETF–SVIX

This post will be about analyzing SVIX–a proposed new short vol ETF that aims to offer the same short vol exposure as XIV used to–without the downside of, well, blowing up in 20 minutes due to positive feedback loops. As I’m currently enrolled in a Python bootcamp, this was one of my capstone projects on A/B testing, so, all code will be in Python (again).

So, first off, with those not familiar, there was an article about this proposed ETF published about a month ago. You can read it here. The long story short is that this ETF is created by one Stuart Barton, who also manages InvestInVol. From conversations with Stuart, I can vouch for the fact that he strikes me as very knowledgeable in the vol space, and, if I recall correctly, was one of the individuals that worked on the original VXX ETF at Barclay’s. So when it comes to creating a newer, safer vehicle for trading short-term short vol, I’d venture to think he’s about as good as any.

In any case, here’s a link to my Python notebook, ahead of time, which I will now discuss here, on this post.

So first off, we’ll start by getting the data, and in case anyone forgot what XIV did in 2018, here’s a couple of plots.

import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib.pyplot as plt
from pandas_datareader import data
import datetime as dt
from datetime import datetime

# get XIV from a public dropbox -- XIV had a termination event Feb. 5 2018, so this is archived data.

xiv = pd.read_csv("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT", parse_dates=True, index_col=0)

# get SVXY data from Yahoo finance
svxy = data.DataReader('SVXY', 'yahoo', '2016-01-01')
#yahoo_xiv = data.DataReader('XIV', 'yahoo', '1990-01-01')

# yahoo no longer carries XIV because the instrument blew up, need to find it from historical sources
xiv_returns = xiv['Close'].pct_change()
svxy_returns = svxy['Close'].pct_change()

xiv['Close'].plot(figsize=(20,10))
plt.show()
xiv['2016':'2018']['Close'].plot(figsize=(20,10))

Yes, for those new to the blog, that event actually happened, and in the span of 20 minutes (my trading system got to the sideline about a week before, and even had I been in–which I wasn’t–I would have been in ZIV), during which time XIV blew up in after-hours trading. Immediately following, SVXY (which survived), deleveraged to a 50% exposure.

In any case, here’s the code to get SVIX data from my dropbox, essentially to the end of 2019, after I manually did some work on it because the CBOE has it in a messy format, and then to combine it with the combined XIV + SVXY returns data. (For the record, the SVIX hypothetical performance can be found here on the CBOE website)

# get formatted SVIX data from my dropbox (CBOE has it in a mess)

svix = pd.read_csv("https://www.dropbox.com/s/u8qiz7rh3rl7klw/SHORTVOL_Data.csv?raw=1", header = 0, parse_dates = True, index_col = 0)
svix.columns = ["Open", "High", "Low", "Close"]
svix_rets = svix['Close'].pct_change()

# put data set together

xiv_svxy = pd.concat([xiv_returns[:'2018-02-07'],svxy_returns['2018-02-08':]], axis = 0)
xiv_svxy_svix = pd.concat([xiv_svxy, svix_rets], axis = 1).dropna()
xiv_svxy_svix.tail()

final_data = xiv_svxy_svix
final_data.columns = ["XIV_SVXY", "SVIX"]

One thing that can be done right off the bat (which is a formality) is check if the distributions of XIV+SVXY or SVIX are normal in nature.

print(stats.describe(final_data['XIV_SVXY']))
print(stats.describe(final_data['SVIX']))
print(stats.describe(np.random.normal(size=10000)))

Which gives the following output:

DescribeResult(nobs=3527, minmax=(-0.9257575757575758, 0.1635036496350366), mean=0.0011627123490346562, variance=0.0015918321320673623, skewness=-4.325358554250933, kurtosis=85.06927230848028)
DescribeResult(nobs=3527, minmax=(-0.3011955533480766, 0.16095949898733686), mean=0.0015948970447533636, variance=0.0015014216189676208, skewness=-1.0811171524703087, kurtosis=4.453114992142524)
DescribeResult(nobs=10000, minmax=(-4.024990382591559, 4.017237262611489), mean=-0.012317646021121993, variance=0.9959681097965573, skewness=0.00367629631713188, kurtosis=0.0702696931810931)

Essentially, both of them are very non-normal (obviously), so any sort of statistical comparison using t-tests isn’t really valid. That basically leaves the Kruskal-Wallis test and Wilcoxon signed rank test to see if two data sets are different. From a conceptual level, the idea is fairly straightforward: the Kruskal-Wallis test is analogous to a two-sample independent t-test to see if one group differs from another, while the Wilcoxon signed rank test is analogous to a t-test of differences, except both use ranks of the observations rather than the actual values themselves.

Here’s the code for that:

stats.kruskal(final_data['SVIX'], final_data['XIV_SVXY'])
stats.wilcoxon(final_data['SVIX'], final_data['XIV_SVXY'])

With the output:

KruskalResult(statistic=0.8613306385456933, pvalue=0.3533665896055551)
WilcoxonResult(statistic=2947901.0, pvalue=0.0070668195307847575)

Essentially, when seen as two completely independent instruments, there isn’t enough statistical evidence to reject the idea that SVIX has no difference in terms of the ranks of its returns compared to XIV + SVXY, which would make a lot of sense, considering that for both, Feb. 5, 2018 was their worst day, and there wasn’t much of a difference between the two instruments prior to Feb. 5. In contrast, when considering the two instruments from the perspective of SVIX becoming the trading vehicle for what XIV used to be, and then comparing the differences against a 50% leveraged SVXY, then SVIX is the better instrument with differences that are statistically significant at the 1% level.

Basically, SVIX accomplishes its purpose of being an improved take on XIV/SVXY, because it was designed to be just that, with statistical evidence of exactly this.

One other interesting question to ask is when exactly did the differences in the Wilcoxon signed rank test start appearing? After all, SVIX is designed to have been identical to XIV prior to the crash and SVXY’s deleveraging. For this, we can use the endpoints function for Python I wrote in the last post.

# endpoints function

def endpoints(df, on = "M", offset = 0):
    """
    Returns index of endpoints of a time series analogous to R's endpoints
    function. 
    Takes in: 
        df -- a dataframe/series with a date index
         
        on -- a string specifying frequency of endpoints
         
        (E.G. "M" for months, "Q" for quarters, and so on)
         
        offset -- to offset by a specified index on the original data
        (E.G. if the data is daily resolution, offset of 1 offsets by a day)
        This is to allow for timing luck analysis. Thank Corey Hoffstein.
    """
     
    # to allow for familiarity with R
    # "months" becomes "M" for resampling
    if len(on) > 3:
        on = on[0].capitalize()
     
    # get index dates of formal endpoints
    ep_dates = pd.Series(df.index, index = df.index).resample(on).max()
     
    # get the integer indices of dates that are the endpoints
    date_idx = np.where(df.index.isin(ep_dates))
     
    # append zero and last day to match R's endpoints function
    # remember, Python is indexed at 0, not 1
    date_idx = np.insert(date_idx, 0, 0)
    date_idx = np.append(date_idx, df.shape[0]-1)
    if offset != 0:
        date_idx = date_idx + offset
        date_idx[date_idx < 0] = 0
        date_idx[date_idx > df.shape[0]-1] = df.shape[0]-1
    out = np.unique(date_idx)
    return out   

ep = endpoints(final_data)

dates = []
pvals = []
for i in range(0, (len(ep)-12)):
  data_subset = final_data.iloc[(ep[i]+1):ep[i+12]]
  pval = stats.wilcoxon(data_subset['SVIX'], data_subset['XIV_SVXY'])[1]
  date = data_subset.index[-1]
  dates.append(date)
  pvals.append(pval)
wilcoxTS = pd.Series(pvals, index = dates)
wilcoxTS.plot(figsize=(20,10))

wilcoxTS.tail(30)

The last 30 points in this monthly time series looks like this:

2017-11-29    0.951521
2017-12-28    0.890546
2018-01-30    0.721118
2018-02-27    0.561795
2018-03-28    0.464851
2018-04-27    0.900470
2018-05-30    0.595646
2018-06-28    0.405771
2018-07-30    0.228674
2018-08-30    0.132506
2018-09-27    0.085125
2018-10-30    0.249457
2018-11-29    0.230020
2018-12-28    0.522734
2019-01-30    0.224727
2019-02-27    0.055854
2019-03-28    0.034665
2019-04-29    0.019178
2019-05-30    0.065563
2019-06-27    0.071348
2019-07-30    0.056757
2019-08-29    0.129120
2019-09-27    0.148046
2019-10-30    0.014340
2019-11-27    0.006139
2019-12-26    0.000558
dtype: float64

And the corresponding chart looks like this:

Essentially, about six months after Feb. 5, 2018–I.E. about six months after SVXY deleveraged, we see the p-value for yearly rolling Wilcoxon signed rank tests (measured monthly) plummet and stay there.

So, the long story short is: once SVIX starts to trade, it should be the way to place short-vol, near-curve bets, as opposed to the 50% leveraged SVXY that traders must avail themselves with currently (or short VXX, with all of the mechanical and transaction risks present in that regard).

That said, from having tested SVIX with my own volatility trading strategy (which those interested can subscribe to, though in fair disclosure, this should be a strategy that diversifies a portfolio, and it’s a trend follower that was backtested in a world without Orange Twitler creating price jumps every month), the performance improves from backtesting with the 50% leveraged SVXY, but as I *dodged* Feb. 5, 2018, the results are better, but the risk is amplified as well, because there wasn’t really a protracted sideways market the likes of which we’ve seen the past couple of years for a long while.

In any case, thanks for reading.

NOTE: I am currently seeking a full-time opportunity either in the NYC or Philadelphia area (or remotely). Feel free to reach out to me on my LinkedIn, or find my resume here.

A Tale of an Edgy Panda and some Python Reviews

This post will be a quickie detailing a rather annoying…finding about the pandas package in Python.

For those not in the know, I’ve been taking some Python courses, trying to port my R finance skills into Python, because Python is more popular as far as employers go. (If you know of an opportunity, here’s my resume.) So, I’m trying to get my Python skills going, hopefully sooner rather than later.

However, for those that think Python is all that and a bag of chips, I hope to be able to disabuse people of that.

First and foremost, as far as actual accessible coursework goes on using Python, just a quick review of courses I’ve seen so far (at least as far as DataCamp goes):

The R/Finance courses (of which I teach one, on quantstrat, which is just my Nuts and Bolts series of blog posts with coding exercises) are of…reasonable quality, actually. I know for a fact that I’ve used Ross Bennett’s PortfolioAnalytics course teachings in a professional consulting manner before, quantstrat is used in industry, and I was explicitly told that my course is now used as a University of Washington Master’s in Computational Finance prerequisite.

In contrast, DataCamp’s Python for Finance courses have not particularly impressed me. While a course in basic time series manipulation is alright, I suppose, there is one course that just uses finance as an intro to numpy. There’s another course that tries to apply machine learning methodology to finance by measuring the performance of prediction algorithms with R-squareds, and saying it’s good when the R-squared values go from negative to zero, without saying anything of more reasonable financial measures, such as Sharpe Ratio, drawdown, and so on and so forth. There are also a couple of courses on the usual risk management/covariance/VaR/drawdown/etc. concepts that so many reading this blog are familiar with. The most interesting python for finance course I found there, was actually Dakota Wixom’s (a former colleague of mine, when I consulted for Yewno) on financial concepts, which covers things like time value of money, payback periods, and a lot of other really relevant concepts which deal with longer-term capital project investments (I know that because I distinctly remember an engineering finance course covering things such as IRR, WACC, and so on, with a bunch of real-life examples written by Lehigh’s former chair of the Industrial and Systems Engineering Department).

However, rather than take multiple Python courses not particularly focused on quant finance, I’d rather redirect any reader to just *one*, that covers all the concepts found in, well, just about all of the DataCamp finance courses–and more–in its first two (of four) chapters that I’m self-pacing right now.

This one!

It’s taught by Lionel Martellini of the EDHEC school as far as concepts go, but the lion’s share of it–the programming, is taught by the CEO of Optimal Asset Management, Vijay Vaidyanathan. I worked for Vijay in 2013 and 2014, and essentially, he made my R coding (I didn’t use any spaces or style in my code.) into, well, what allow you, the readers, to follow along with my ideas in code. In fact, I started this blog shortly after I left Optimal. Basically, I view that time in my career as akin to a second master’s degree. Everyone that praises any line of code on this blog…you have Vijay to thank for that. So, I’m hoping that his courses on Python will actually get my Python skills to the point that they get me more job opportunities (hopefully quickly).

However, if people think that Python is as good as R as far as finance goes, well…so far, the going isn’t going to be easy. Namely, I’ve found that working on finance in R is much easier than in Python thanks to R’s fantastic libraries written by Brian Peterson, Josh Ulrich, Jeff Ryan, and the rest of the R/Finance crew (I wonder if I’m part of it considering I taught a course like they did).

In any case, I’ve been trying to replicate the endpoints function from R in Python, because I always use it to do subsetting for asset allocation, and because I think that being able to jump between yearly, quarterly, monthly, and even daily indices to account for timing luck–(EG if you rebalance a portfolio quarterly on Mar/Jun/Sep/Dec, does it have a similar performance to a portfolio rebalanced Jan/Apr/Jul/Oct, or how does a portfolio perform depending on the day of month it’s rebalanced, and so on)–is something fairly important that should be included in the functionality of any comprehensively-built asset allocation package. You have Corey Hoffstein of Think Newfound to thank for that, and while I’ve built in daily offsets into a generalized asset allocation function I’m working on, my last post shows that there are a lot of devils hiding in the details of how one chooses–or even measures–lookbacks and rebalancing periods.

Moving on, here’s an edge case in Python’s Pandas package, regarding how Python sees weeks. That is, I dub it–an edgy panda. Basically, imagine a panda in a leather vest with a mohawk. The issue is that in some cases, the very end of one year is seen as the start of a next one, and thus the week count is seen as 1 rather than 52 or 53, which makes finding the last given day of a week not exactly work in some cases.

So, here’s some Python code to get our usual Adaptive Asset Allocation universe.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas_datareader import data
import datetime as dt
from datetime import datetime

tickers = ["SPY", "VGK",   "EWJ",  "EEM",  "VNQ",  "RWX",  "IEF",  "TLT",  "DBC",  "GLD"]

# We would like all available data from 01/01/2000 until 12/31/2016.
start_date = '1990-01-01'
end_date = dt.datetime.today().strftime('%Y-%m-%d')

# Uses pandas_reader.data.DataReader to load the desired data. As simple as that.

adj_prices = []
for ticker in tickers:
    tickerData = data.DataReader(ticker, 'yahoo', start_date)
    adj_etf = tickerData.loc[:,'Adj Close']
    adj_prices.append(adj_etf)

adj_prices = pd.concat(adj_prices, axis = 1)
adj_prices.columns = tickers
adj_prices = adj_prices.dropna()
rets = adj_prices.pct_change().dropna()

df = rets

Anyhow, here’s something I found interesting, when trying to port over R’s endpoints function. Namely, in that while looking for a way to get the monthly endpoints, I found the following line on StackOverflow:

tmp = df.reset_index().groupby([df.index.year,df.index.month],as_index=False).last().set_index('Date')

Which gives the following ouptut:

tmp.head()
Out[59]: 
                 SPY       VGK       EWJ  ...       TLT       DBC       GLD
Date                                      ...                              
2006-12-29 -0.004149 -0.003509  0.001409  ... -0.000791  0.004085  0.004928
2007-01-31  0.006723  0.005958 -0.004175  ...  0.008408  0.010531  0.009499
2007-02-28  0.010251  0.010942 -0.001353  ... -0.004528  0.015304  0.016358
2007-03-30  0.000211  0.001836 -0.006817  ... -0.001923 -0.014752  0.001371
2007-04-30 -0.008293 -0.003852 -0.007644  ...  0.010475 -0.008915 -0.006957

So far, so good. Right? Well, here’s an edgy panda that pops up when I try to narrow the case down to weeks. Why? Because endpoints in R has that functionality, so for the sake of meticulousness, I simply decided to change up the line from monthly to weekly. Here’s *that* input and output.

tmp = df.reset_index().groupby([df.index.year, df.index.week],as_index=False).last().set_index('Date')

tmp.head()
Out[61]: 
                 SPY       VGK       EWJ  ...       TLT       DBC       GLD
Date                                      ...                              
2006-12-22 -0.006143 -0.002531  0.003551  ... -0.007660  0.007736  0.004399
2006-12-29 -0.004149 -0.003509  0.001409  ... -0.000791  0.004085  0.004928
2007-12-31 -0.007400 -0.010449  0.002262  ...  0.006055  0.001269 -0.006506
2007-01-12  0.007598  0.005913  0.012978  ... -0.004635  0.023400  0.025400
2007-01-19  0.001964  0.010903  0.007097  ... -0.002720  0.015038  0.011886

[5 rows x 10 columns]

Notice something funny? Instead of 2007-01-07, we get 2007-12-31. I even asked some people that use Python as their bread and butter (of which, hopefully, I will be one of soon) what was going on, and after some back and forth, it was found that the ISO standard has some weird edge cases relating to the final week of some years, and that the output is, apparently, correct, in that 2007-12-31 is apparently the first week of 2008 according to some ISO standard. Generally, when dealing with such edge cases in pandas (hence, edgy panda!), I look for another work-around. Thanks to help from Dr. Vaidyanathan, I got that workaround with the following input and output.

tmp = pd.Series(df.index,index=df.index).resample('W').max()
tmp.head(6)
Out[62]: 
Date
2006-12-24   2006-12-22
2006-12-31   2006-12-29
2007-01-07   2007-01-05
2007-01-14   2007-01-12
2007-01-21   2007-01-19
2007-01-28   2007-01-26
Freq: W-SUN, Name: Date, dtype: datetime64[ns]

Now, *that* looks far more reasonable. With this, we can write a proper endpoints function.

def endpoints(df, on = "M", offset = 0):
    """
    Returns index of endpoints of a time series analogous to R's endpoints
    function. 
    Takes in: 
        df -- a dataframe/series with a date index
        
        on -- a string specifying frequency of endpoints
        
        (E.G. "M" for months, "Q" for quarters, and so on)
        
        offset -- to offset by a specified index on the original data
        (E.G. if the data is daily resolution, offset of 1 offsets by a day)
        This is to allow for timing luck analysis. Thank Corey Hoffstein.
    """
    
    # to allow for familiarity with R
    # "months" becomes "M" for resampling
    if len(on) > 3:
        on = on[0].capitalize()
    
    # get index dates of formal endpoints
    ep_dates = pd.Series(df.index, index = df.index).resample(on).max()
    
    # get the integer indices of dates that are the endpoints
    date_idx = np.where(df.index.isin(ep_dates))
    
    # append zero and last day to match R's endpoints function
    # remember, Python is indexed at 0, not 1
    date_idx = np.insert(date_idx, 0, 0)
    date_idx = np.append(date_idx, df.shape[0]-1)
    if offset != 0:
        date_idx = date_idx + offset
        date_idx[date_idx < 0] = 0
        date_idx[date_idx > df.shape[0]-1] = df.shape[0]-1
    out = np.unique(date_idx)
    return out    

Essentially, the function takes in 3 arguments: first, your basic data frame (or series–which is essentially just a time-indexed data frame in Python to my understanding).


Next, it takes the “on” argument, which can take either a string such as “months”, or just a one-letter term for immediate use with Python’s resample function (I forget all the abbreviations, but I do know that there’s W, M, Q, and Y for weekly, monthly, quarterly, and yearly), which the function will convert a longer string into. That way, for those coming from R, this function will be backwards compatible.


Lastly, because Corey Hoffstein makes a big deal about it and I respect his accomplishments, the offset argument, which offsets the endpoints by the amount specified, at the frequency of the original data. That is, if you take quarterly endpoints using daily frequency data, the function won’t read your mind and offset the quarterly endpoints by a month, which *is* functionality that probably should be *somewhere*, but currently exists neither in R nor in Python, at least not in the public sphere, so I suppose I’ll have to write it…eventually.

Anyway, here’s how the function works (now in Python!) using the data in this post:

endpoints(rets, on = "weeks")[0:20]
Out[98]: 
array([ 0,  2,  6,  9, 14, 18, 23, 28, 33, 38, 42, 47, 52, 57, 62, 67, 71,
       76, 81, 86], dtype=int64)

endpoints(rets, on = "weeks", offset = 2)[0:20]
Out[99]: 
array([ 2,  4,  8, 11, 16, 20, 25, 30, 35, 40, 44, 49, 54, 59, 64, 69, 73,
       78, 83, 88], dtype=int64)

endpoints(rets, on = "months")
Out[100]: 
array([   0,    6,   26,   45,   67,   87,  109,  130,  151,  174,  193,
        216,  237,  257,  278,  298,  318,  340,  361,  382,  404,  425,
        446,  469,  488,  510,  530,  549,  571,  592,  612,  634,  656,
        677,  698,  720,  740,  762,  781,  800,  823,  844,  864,  886,
        907,  929,  950,  971,  992, 1014, 1034, 1053, 1076, 1096, 1117,
       1139, 1159, 1182, 1203, 1224, 1245, 1266, 1286, 1306, 1328, 1348,
       1370, 1391, 1412, 1435, 1454, 1475, 1496, 1516, 1537, 1556, 1576,
       1598, 1620, 1640, 1662, 1684, 1704, 1727, 1747, 1768, 1789, 1808,
       1829, 1850, 1871, 1892, 1914, 1935, 1956, 1979, 1998, 2020, 2040,
       2059, 2081, 2102, 2122, 2144, 2166, 2187, 2208, 2230, 2250, 2272,
       2291, 2311, 2333, 2354, 2375, 2397, 2417, 2440, 2461, 2482, 2503,
       2524, 2544, 2563, 2586, 2605, 2627, 2649, 2669, 2692, 2712, 2734,
       2755, 2775, 2796, 2815, 2836, 2857, 2879, 2900, 2921, 2944, 2963,
       2986, 3007, 3026, 3047, 3066, 3087, 3108, 3130, 3150, 3172, 3194,
       3214, 3237, 3257, 3263], dtype=int64)

endpoints(rets, on = "months", offset = 10)
Out[101]: 
array([  10,   16,   36,   55,   77,   97,  119,  140,  161,  184,  203,
        226,  247,  267,  288,  308,  328,  350,  371,  392,  414,  435,
        456,  479,  498,  520,  540,  559,  581,  602,  622,  644,  666,
        687,  708,  730,  750,  772,  791,  810,  833,  854,  874,  896,
        917,  939,  960,  981, 1002, 1024, 1044, 1063, 1086, 1106, 1127,
       1149, 1169, 1192, 1213, 1234, 1255, 1276, 1296, 1316, 1338, 1358,
       1380, 1401, 1422, 1445, 1464, 1485, 1506, 1526, 1547, 1566, 1586,
       1608, 1630, 1650, 1672, 1694, 1714, 1737, 1757, 1778, 1799, 1818,
       1839, 1860, 1881, 1902, 1924, 1945, 1966, 1989, 2008, 2030, 2050,
       2069, 2091, 2112, 2132, 2154, 2176, 2197, 2218, 2240, 2260, 2282,
       2301, 2321, 2343, 2364, 2385, 2407, 2427, 2450, 2471, 2492, 2513,
       2534, 2554, 2573, 2596, 2615, 2637, 2659, 2679, 2702, 2722, 2744,
       2765, 2785, 2806, 2825, 2846, 2867, 2889, 2910, 2931, 2954, 2973,
       2996, 3017, 3036, 3057, 3076, 3097, 3118, 3140, 3160, 3182, 3204,
       3224, 3247, 3263], dtype=int64)

endpoints(rets, on = "quarters")
Out[102]: 
array([   0,    6,   67,  130,  193,  257,  318,  382,  446,  510,  571,
        634,  698,  762,  823,  886,  950, 1014, 1076, 1139, 1203, 1266,
       1328, 1391, 1454, 1516, 1576, 1640, 1704, 1768, 1829, 1892, 1956,
       2020, 2081, 2144, 2208, 2272, 2333, 2397, 2461, 2524, 2586, 2649,
       2712, 2775, 2836, 2900, 2963, 3026, 3087, 3150, 3214, 3263],
      dtype=int64)

endpoints(rets, on = "quarters", offset = 10)
Out[103]: 
array([  10,   16,   77,  140,  203,  267,  328,  392,  456,  520,  581,
        644,  708,  772,  833,  896,  960, 1024, 1086, 1149, 1213, 1276,
       1338, 1401, 1464, 1526, 1586, 1650, 1714, 1778, 1839, 1902, 1966,
       2030, 2091, 2154, 2218, 2282, 2343, 2407, 2471, 2534, 2596, 2659,
       2722, 2785, 2846, 2910, 2973, 3036, 3097, 3160, 3224, 3263],
      dtype=int64)

So, that’s that. Endpoints, in Python. Eventually, I’ll try and port over Return.portfolio and charts.PerformanceSummary as well in the future.

Thanks for reading.

NOTE: I am currently enrolled in Thinkful’s python/PostGresSQL data science bootcamp while also actively looking for full-time (or long-term contract) opportunities in New York, Philadelphia, or remotely. If you know of an opportunity I may be a fit for, please don’t hesitate to contact me on my LinkedIn or just feel free to take my resume from my DropBox (and if you’d like, feel free to let me know how I can improve it).

How You Measure Months Matters — A Lot. A Look At Two Implementations of KDA

This post will detail a rather important finding I found while implementing a generalized framework for momentum asset allocation backtests. Namely, that when computing momentum (and other financial measures for use in asset allocation, such as volatility and correlations), measuring formal months, from start to end, has a large effect on strategy performance.

So, first off, I am in the job market, and am actively looking for a full-time role (preferably in New York City, or remotely), or a long-term contract. Here is my resume, and here is my LinkedIn profile. Furthermore, I’ve been iterating on my volatility strategy, and given that I’ve seen other services with large drawdowns, or less favorable risk/reward profiles charge $50/month, I think following my trades can be a reasonable portfolio diversification tool. Read about it and subscribe here. I believe that my body of work on this blog speaks to the viability of employing me, though I am also learning Python to try and port over my R skills over there, as everyone seems to want Python, and R much less so, hence the difficulty transferring between opportunities.

Anyhow, one thing I am working on is a generalized framework for tactical asset allocation (TAA) backtests. Namely, those that take the form of “sort universe by momentum, apply diversification weighting scheme”–namely, the kinds of strategies that the folks over at AllocateSmartly deal in. I am also working on this framework and am happy to announce that as of the time of this writing, I will happily work with individuals that want more customized TAA backtests, as the AllocateSmartly FAQs state that AllocateSmartly themselves do not do custom backtests. The framework I am currently in the process of implementing is designed to do just that. However, after going through some painstaking efforts to compare apples to apples, I came across a very important artifact. Namely, that there is a fairly large gulf in performance between measuring months from their formal endpoints, as opposed to simply approximating months with 21-day chunks (E.G. 21 days for 1 month, 63 for 3, and so on).

Here’s the code I’ve been developing recently–the long story short, is that the default options essentially default to Adaptive Asset Allocation, but depending on the parameters one inputs, it’s possible to get to something as simple as dual momentum (3 assets, invest in top 1), or as complex as KDA, with options to fine-tune it even further, such as to account for the luck-based timing that Corey Hoffstein at Newfound Research loves to write about (speaking of whom, he and the awesome folks at ReSolve Asset Management have launched a new ETF called ROMO–Robust Momentum–I recently bought a bunch in my IRA because a buy-it-and-forget-it TAA ETF is pretty fantastic as far as buy-and-hold investments go). Again, I set a bunch of defaults in the parameters so that most of them can be ignored.

require(PerformanceAnalytics)
require(quantmod)
require(tseries)
require(quadprog)

stratStats <- function(rets) {
  stats <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets))
  stats[5,] <- stats[1,]/stats[4,]
  stats[6,] <- stats[1,]/UlcerIndex(rets)
  rownames(stats)[4] <- "Worst Drawdown"
  rownames(stats)[5] <- "Calmar Ratio"
  rownames(stats)[6] <- "Ulcer Performance Index"
  return(stats)
}


getYahooReturns <- function(symbols, return_column = "Ad") {
  returns <- list()
  for(symbol in symbols) {
    getSymbols(symbol, from = '1990-01-01', adjustOHLC = TRUE)
    if(return_column == "Ad") {
      return <- Return.calculate(Ad(get(symbol)))
      colnames(return) <- gsub("\\.Adjusted", "", colnames(return))
    } else {
      return <- Return.calculate(Op(get(symbol)))
      colnames(return) <- gsub("\\.Open", "", colnames(return))
      
    }
    returns[[symbol]] <- return
  }
  returns <- na.omit(do.call(cbind, returns))
  return(returns)
}

symbols <- c("SPY", "VGK",   "EWJ",  "EEM",  "VNQ",  "RWX",  "IEF",  "TLT",  "DBC",  "GLD")  

returns <- getYahooReturns(symbols)
canary <- getYahooReturns(c("VWO", "BND"))

# offsets endpoints by a certain amount of days (I.E. 1-21)
dailyOffset <- function(ep, offset = 0) {
  
  ep <- ep + offset
  ep[ep < 1] <- 1
  ep[ep > nrow(returns)] <- nrow(returns)
  ep <- unique(ep)
  epDiff <- diff(ep)
  if(last(epDiff)==1) { 
    # if the last period only has one observation, remove it
    ep <- ep[-length(ep)]
  }
  return(ep)
}

# computes total weighted momentum and penalizes new assets (if desired)
compute_total_momentum <- function(yearly_subset, 
                                   momentum_lookbacks, momentum_weights,
                                   old_weights, new_asset_mom_penalty) {
  
  empty_vec <- data.frame(t(rep(0, ncol(yearly_subset)))) 
  colnames(empty_vec) <- colnames(yearly_subset)
  
  total_momentum <- empty_vec
  for(j in 1:length(momentum_lookbacks)) {
    momentum_subset <- tail(yearly_subset, momentum_lookbacks[j])
    total_momentum <- total_momentum + Return.cumulative(momentum_subset) * 
      momentum_weights[j]  
  }
  
  # if asset returns are negative, penalize by *increasing* negative momentum
  # this algorithm assumes we go long only
  total_momentum[old_weights == 0] <- total_momentum[old_weights==0] * 
    (1-new_asset_mom_penalty * sign(total_momentum[old_weights==0]))
  
  return(total_momentum)
}

# compute weighted correlation matrix
compute_total_correlation <- function(data, cor_lookbacks, cor_weights) {
  
  # compute total correlation matrix
  total_cor <- matrix(nrow=ncol(data), ncol=ncol(data), 0)
  rownames(total_cor) <- colnames(total_cor) <- colnames(data)
  for(j in 1:length(cor_lookbacks)) {
    total_cor = total_cor + cor(tail(data, cor_lookbacks[j])) * cor_weights[j]
  }
  
  return(total_cor)
}

# computes total weighted volatility
compute_total_volatility <- function(data, vol_lookbacks, vol_weights) {
  empty_vec <- data.frame(t(rep(0, ncol(data))))
  colnames(empty_vec) <- colnames(data)
  
  # normalize weights if not already normalized
  if(sum(vol_weights) != 1) {
    vol_weights <- vol_weights/sum(vol_weights)
  }
  
  # compute total volrelation matrix
  total_vol <- empty_vec
  for(j in 1:length(vol_lookbacks)) {
    total_vol = total_vol + StdDev.annualized(tail(data, vol_lookbacks[j])) * vol_weights[j]
  }
  
  return(total_vol)
}

check_valid_parameters <- function(mom_lookbacks, cor_lookbacks, mom_weights, cor_weights, vol_weights, vol_lookbacks) {
  if(length(mom_weights) != length(mom_lookbacks)) {
    stop("Momentum weight length must be equal to momentum lookback length.") }
  
  if(length(cor_weights) != length(cor_lookbacks)) {
    stop("Correlation weight length must be equal to correlation lookback length.")
  }
  
  if(length(vol_weights) != length(vol_lookbacks)) {
    stop("Volatility weight length must be equal to volatility lookback length.")
  }
}


# computes weights as a function proportional to the inverse of total variance
invVar <- function(returns, lookbacks, lookback_weights) {
  var <- compute_total_volatility(returns, lookbacks, lookback_weights)^2
  invVar <- 1/var
  return(invVar/sum(invVar))
}

# computes weights as a function proportional to the inverse of total volatility
invVol <- function(returns, lookbacks, lookback_weights) {
  vol <- compute_total_volatility(returns, lookbacks, lookback_weights)
  invVol <- 1/vol
  return(invVol/sum(invVol))
}

# computes equal weight portfolio
ew <- function(returns) {
  return(StdDev(returns)/(StdDev(returns)*ncol(returns)))
}

# computes minimum volatility portfolio
minVol <- function(returns, cor_lookbacks, cor_weights, vol_lookbacks, vol_weights) {
  vols <- compute_total_volatility(returns, vol_lookbacks, vol_weights)
  cors <- compute_total_correlation(returns, cor_lookbacks, cor_weights)
  covs <- t(vols) %*% as.numeric(vols) * cors
  min_vol_rets <- t(matrix(rep(1, ncol(covs))))
  
  n.col = ncol(covs)
  zero.mat <- array(0, dim = c(n.col, 1))
  one.zero.diagonal.a <- cbind(1, diag(n.col), 1 * diag(n.col), -1 * diag(n.col))
  min.wgt <- rep(0, n.col)
  max.wgt <- rep(1, n.col)
  bvec.1.vector.a <- c(1, rep(0, n.col), min.wgt, -max.wgt)
  meq.1 <- 1
  
  mv.port.noshort.a <- solve.QP(Dmat = covs, dvec = zero.mat, Amat = one.zero.diagonal.a,
                                bvec = bvec.1.vector.a, meq = meq.1)
  
  
  min_vol_wt <- mv.port.noshort.a$solution
  names(min_vol_wt) <- rownames(covs)
  return(min_vol_wt)
}
asset_allocator <- function(returns, 
                           canary_returns = NULL, # canary assets for KDA algorithm and similar
                           
                           mom_threshold = 0, # threshold momentum must exceed
                           mom_lookbacks = 126, # momentum lookbacks for custom weights (EG 1-3-6-12)
                           
                           # weights on various momentum lookbacks (EG 12/19, 4/19, 2/19, 1/19)
                           mom_weights = rep(1/length(mom_lookbacks), 
                                             length(mom_lookbacks)), 
                           
                           # repeat for correlation weights
                           cor_lookbacks = mom_lookbacks, # correlation lookback
                           cor_weights = rep(1/length(mom_lookbacks), 
                                             length(mom_lookbacks)),
                           
                           vol_lookbacks = 20, # volatility lookback
                           vol_weights = rep(1/length(vol_lookbacks), 
                                             length(vol_lookbacks)),
                           
                           # number of assets to hold (if all above threshold)
                           top_n = floor(ncol(returns)/2), 
                           
                           # diversification weight scheme (ew, invVol, invVar, minVol, etc.)
                           weight_scheme = "minVol",
                           
                           # how often holdings rebalance
                           rebalance_on = "months",
                           
                           # how many days to offset rebalance period from end of month/quarter/year
                           offset = 0, 
                           
                           # penalize new asset mom to reduce turnover
                           new_asset_mom_penalty = 0, 
                           
                           # run Return.Portfolio, or just return weights?
                           # for use in robust momentum type portfolios
                           compute_portfolio_returns = TRUE,
                           verbose = FALSE,
                           
                           # crash protection asset
                           crash_asset = NULL,
                           ...
                           ) {
  
  # normalize weights
  mom_weights <- mom_weights/sum(mom_weights)
  cor_weights <- cor_weights/sum(cor_weights)
  vol_weights <- vol_weights/sum(vol_weights)
  
  # if we have canary returns (I.E. KDA strat), align both time periods
  if(!is.null(canary_returns)) {
   smush <- na.omit(cbind(returns, canary_returns))
   returns <- smush[,1:ncol(returns)]
   canary_returns <- smush[,-c(1:ncol(returns))]
   empty_canary_vec <- data.frame(t(rep(0, ncol(canary_returns))))
   colnames(empty_canary_vec) <- colnames(canary_returns)
  }
  
  # get endpoints and offset them
  ep <- endpoints(returns, on = rebalance_on)
  ep <- dailyOffset(ep, offset = offset)
  
  # initialize vector holding zeroes for assets
  empty_vec <- data.frame(t(rep(0, ncol(returns))))
  colnames(empty_vec) <- colnames(returns)
  weights <- empty_vec
  
  # initialize list to hold all our weights
  all_weights <- list()
  
  # get number of periods per year
  switch(rebalance_on,
         "months" = { yearly_periods = 12},
         "quarters" = { yearly_periods = 4},
         "years" = { yearly_periods = 1})
  
  for(i in 1:(length(ep) - yearly_periods)) {
    
    # remember old weights for the purposes of penalizing momentum of new assets
    old_weights <- weights
    
    # subset one year of returns, leave off first day 
    return_subset <- returns[c((ep[i]+1):ep[(i+yearly_periods)]),]

    # compute total weighted momentum, penalize potential new assets if desired
    momentums <- compute_total_momentum(return_subset,  
                                        momentum_lookbacks = mom_lookbacks,
                                        momentum_weights = mom_weights,
                                        old_weights = old_weights, 
                                        new_asset_mom_penalty = new_asset_mom_penalty)
    
    # rank negative momentum so that best asset is ranked 1 and so on
    momentum_ranks <- rank(-momentums)
    selected_assets <- momentum_ranks <= top_n & momentums > mom_threshold
    selected_subset <- return_subset[, selected_assets]
    
    # case of 0 valid assets
    if(sum(selected_assets)==0) {
      weights <- empty_vec
    } else if (sum(selected_assets)==1) {
      
      # case of only 1 valid asset -- invest everything into it
      weights <- empty_vec + selected_assets
      
    } else {
      # apply a user-selected weighting algorithm
      # modify this portion to select more weighting schemes
      if (weight_scheme == "ew") {
        weights <- ew(selected_subset)
      } else if (weight_scheme == "invVol") {
        weights <- invVol(selected_subset, vol_lookbacks, vol_weights)
      } else if (weight_scheme == "invVar"){
        weights <- invVar(selected_subset, vol_lookbacks, vol_weights)
      } else if (weight_scheme == "minVol") {
        weights <- minVol(selected_subset, cor_lookbacks, cor_weights,
                          vol_lookbacks, vol_weights)
      } 
    }
    
    # include all assets
    wt_names <- names(weights) 
    if(is.null(wt_names)){wt_names <- colnames(weights)}
    zero_weights <- empty_vec
    zero_weights[wt_names] <- weights
    weights <- zero_weights
    weights <- xts(weights, order.by=last(index(return_subset)))
    
    # if there's a canary universe, modify weights by fraction with positive momentum
    # if there's a safety asset, allocate the crash protection modifier to it.
    if(!is.null(canary_returns)) {
      canary_subset <- canary_returns[c(ep[i]:ep[(i+yearly_periods)]),]
      canary_subset <- canary_subset[-1,]
      canary_mom <- compute_total_momentum(canary_subset, 
                                           mom_lookbacks, mom_weights,
                                           empty_canary_vec, 0)
      canary_mod <- mean(canary_mom > 0)
      weights <- weights * canary_mod
      if(!is.null(crash_asset)) {
        if(momentums[crash_asset] > mom_threshold) {
          weights[,crash_asset] <- weights[,crash_asset] + (1-canary_mod)
        }
      }
    }
    
    all_weights[[i]] <- weights
  }
  
  # combine weights
  all_weights <- do.call(rbind, all_weights)
  if(compute_portfolio_returns) {
    strategy_returns <- Return.portfolio(R = returns, weights = all_weights, verbose = verbose)
    return(list(all_weights, strategy_returns))
  }
  return(all_weights)
  
}

#out <- asset_allocator(returns, offset = 0)
kda <- asset_allocator(returns = returns, canary_returns = canary, 
                       mom_lookbacks = c(21, 63, 126, 252),
                       mom_weights = c(12, 4, 2, 1),
                       cor_lookbacks = c(21, 63, 126, 252),
                       cor_weights = c(12, 4, 2, 1), vol_lookbacks = 21,
                       weight_scheme = "minVol",
                       crash_asset = "IEF")


The one thing that I’d like to focus on, however, are the lookback parameters. Essentially, assuming daily data, they’re set using a *daily lookback*, as that’s what AllocateSmartly did when testing my own KDA Asset Allocation algorithm. Namely, the salient line is this:

“For all assets across all three universes, at the close on the last trading day of the month, calculate a “momentum score” as follows:(12 * (p0 / p21 – 1)) + (4 * (p0 / p63 – 1)) + (2 * (p0 / p126 – 1)) + (p0 / p252 – 1)Where p0 = the asset’s price at today’s close, p1 = the asset’s price at the close of the previous trading day and so on. 21, 63, 126 and 252 days correspond to 1, 3, 6 and 12 months.”

So, to make sure I had apples to apples when trying to generalize KDA asset allocation, I compared the output of my new algorithm, asset_allocator (or should I call it allocate_smartly ?=] ) to my formal KDA asset allocation algorithm.

Here are the results:

                            KDA_algo KDA_approximated_months
Annualized Return         0.10190000              0.08640000
Annualized Std Dev        0.09030000              0.09040000
Annualized Sharpe (Rf=0%) 1.12790000              0.95520000
Worst Drawdown            0.07920336              0.09774612
Calmar Ratio              1.28656163              0.88392257
Ulcer Performance Index   3.78648873              2.62691398

Essentially, the long and short of it is that I modified my original KDA algorithm until I got identical output to my asset_allocator algorithm, then went back to the original KDA algorithm. The salient difference is this part:

# computes total weighted momentum and penalizes new assets (if desired)
compute_total_momentum <- function(yearly_subset, 
                                   momentum_lookbacks, momentum_weights,
                                   old_weights, new_asset_mom_penalty) {
  
  empty_vec <- data.frame(t(rep(0, ncol(yearly_subset)))) 
  colnames(empty_vec) <- colnames(yearly_subset)
  
  total_momentum <- empty_vec
  for(j in 1:length(momentum_lookbacks)) {
    momentum_subset <- tail(yearly_subset, momentum_lookbacks[j])
    total_momentum <- total_momentum + Return.cumulative(momentum_subset) * 
      momentum_weights[j]  
  }
  
  # if asset returns are negative, penalize by *increasing* negative momentum
  # this algorithm assumes we go long only
  total_momentum[old_weights == 0] <- total_momentum[old_weights==0] * 
    (1-new_asset_mom_penalty * sign(total_momentum[old_weights==0]))
  
  return(total_momentum)
}

Namely, the part that further subsets the yearly subset by the lookback period, in terms of days, rather than monthly endpoints. Essentially, the difference in the exact measurement of momentum–that is, the measurement that explicitly selects *which* instruments the algorithm will allocate to in a particular period, unsurprisingly, has a large impact on the performance of the algorithm.

And lest anyone think that this phenomena no longer applies, here’s a yearly performance comparison.

                KDA_algo KDA_approximated_months
2008-12-31  0.1578348930             0.062776766
2009-12-31  0.1816957178             0.166017499
2010-12-31  0.1779839604             0.160781537
2011-12-30  0.1722014474             0.149143148
2012-12-31  0.1303019332             0.103579674
2013-12-31  0.1269207487             0.134197066
2014-12-31  0.0402888320             0.071784979
2015-12-31 -0.0119459453            -0.028090873
2016-12-30  0.0125302658             0.002996917
2017-12-29  0.1507895287             0.133514924
2018-12-31  0.0747520266             0.062544709
2019-11-27  0.0002062636             0.008798310

Of note: the variant that formally measures momentum from monthly endpoints consistently outperforms the one using synthetic monthly measurements.

So, that will do it for this post. I hope to have a more thorough walk-through of the asset_allocator function in the very near future before moving onto Python-related matters (hopefully), but I thought that this artifact, and just how much it affects outcomes, was too important not to share.

An iteration of the algorithm capable of measuring momentum with proper monthly endpoints should be available in the near future.

Thanks for reading.

KDA–Robustness Results

This post will display some robustness results for KDA asset allocation.

Ultimately, the two canary instruments fare much better using the original filter weights in Defensive Asset Allocation than in other variants of the weights for the filter. While this isn’t as worrying (the filter most likely was created that way and paired with those instruments by design), what *is* somewhat more irritating is that the strategy is dependent upon the end-of-month phenomenon, meaning this strategy cannot be simply tranched throughout an entire trading month.

So first off, let’s review the code from last time:

# KDA asset allocation 
# KDA stands for Kipnis Defensive Adaptive (Asset Allocation).

# compute strategy statistics
stratStats <- function(rets) {
  stats <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets))
  stats[5,] <- stats[1,]/stats[4,]
  stats[6,] <- stats[1,]/UlcerIndex(rets)
  rownames(stats)[4] <- "Worst Drawdown"
  rownames(stats)[5] <- "Calmar Ratio"
  rownames(stats)[6] <- "Ulcer Performance Index"
  return(stats)
}

# required libraries
require(quantmod)
require(PerformanceAnalytics)
require(tseries)

# symbols
symbols <- c("SPY", "VGK",   "EWJ",  "EEM",  "VNQ",  "RWX",  "IEF",  "TLT",  "DBC",  "GLD", "VWO", "BND")  


# get data
rets <- list()
for(i in 1:length(symbols)) {
  
  returns <- Return.calculate(Ad(get(getSymbols(symbols[i], from = '1990-01-01'))))
  colnames(returns) <- symbols[i]
  rets[[i]] <- returns
}
rets <- na.omit(do.call(cbind, rets))


# algorithm
KDA <- function(rets, offset = 0, leverageFactor = 1.5, momWeights = c(12, 4, 2, 1)) {
  
  # get monthly endpoints, allow for offsetting ala AllocateSmartly/Newfound Research
  ep <- endpoints(rets) + offset
  ep[ep < 1] <- 1
  ep[ep > nrow(rets)] <- nrow(rets)
  ep <- unique(ep)
  epDiff <- diff(ep)
  if(last(epDiff)==1) { # if the last period only has one observation, remove it
    ep <- ep[-length(ep)]
  }
  
  # initialize vector holding zeroes for assets
  emptyVec <- data.frame(t(rep(0, 10)))
  colnames(emptyVec) <- symbols[1:10]
  
  
  allWts <- list()
  # we will use the 13612F filter
  for(i in 1:(length(ep)-12)) {
    
    # 12 assets for returns -- 2 of which are our crash protection assets
    retSubset <- rets[c((ep[i]+1):ep[(i+12)]),]
    epSub <- ep[i:(i+12)]
    sixMonths <- rets[(epSub[7]+1):epSub[13],]
    threeMonths <- rets[(epSub[10]+1):epSub[13],]
    oneMonth <- rets[(epSub[12]+1):epSub[13],]
    
    # computer 13612 fast momentum
    moms <- Return.cumulative(oneMonth) * momWeights[1] + Return.cumulative(threeMonths) * momWeights[2] + 
      Return.cumulative(sixMonths) * momWeights[3] + Return.cumulative(retSubset) * momWeights[4]
    assetMoms <- moms[,1:10] # Adaptive Asset Allocation investable universe
    cpMoms <- moms[,11:12] # VWO and BND from Defensive Asset Allocation
    
    # find qualifying assets
    highRankAssets <- rank(assetMoms) >= 6 # top 5 assets
    posReturnAssets <- assetMoms > 0 # positive momentum assets
    selectedAssets <- highRankAssets & posReturnAssets # intersection of the above
    
    # perform mean-variance/quadratic optimization
    investedAssets <- emptyVec
    if(sum(selectedAssets)==0) {
      investedAssets <- emptyVec
    } else if(sum(selectedAssets)==1) {
      investedAssets <- emptyVec + selectedAssets 
    } else {
      idx <- which(selectedAssets)
      # use 1-3-6-12 fast correlation average to match with momentum filter  
      cors <- (cor(oneMonth[,idx]) * momWeights[1] + cor(threeMonths[,idx]) * momWeights[2] + 
                 cor(sixMonths[,idx]) * momWeights[3] + cor(retSubset[,idx]) * momWeights[4])/sum(momWeights)
      vols <- StdDev(oneMonth[,idx]) # use last month of data for volatility computation from AAA
      covs <- t(vols) %*% vols * cors
      
      # do standard min vol optimization
      minVolRets <- t(matrix(rep(1, sum(selectedAssets))))
      minVolWt <- portfolio.optim(x=minVolRets, covmat = covs)$pw
      names(minVolWt) <- colnames(covs)
      investedAssets <- emptyVec
      investedAssets[,selectedAssets] <- minVolWt
    }
    
    # crash protection -- between aggressive allocation and crash protection allocation
    pctAggressive <- mean(cpMoms > 0)
    investedAssets <- investedAssets * pctAggressive 
    
    pctCp <- 1-pctAggressive
    
    # if IEF momentum is positive, invest all crash protection allocation into it
    # otherwise stay in cash for crash allocation
    if(assetMoms["IEF"] > 0) {
      investedAssets["IEF"] <- investedAssets["IEF"] + pctCp
    }
    
    # leverage portfolio if desired in cases when both risk indicator assets have positive momentum
    if(pctAggressive == 1) {
      investedAssets = investedAssets * leverageFactor
    }
    
    # append to list of monthly allocations
    wts <- xts(investedAssets, order.by=last(index(retSubset)))
    allWts[[i]] <- wts
    
  }
  
  # put all weights together and compute cash allocation
  allWts <- do.call(rbind, allWts)
  allWts$CASH <- 1-rowSums(allWts)
  
  # add cash returns to universe of investments
  investedRets <- rets[,1:10]
  investedRets$CASH <- 0
  
  # compute portfolio returns
  out <- Return.portfolio(R = investedRets, weights = allWts)
  return(list(allWts, out))
}

So, the idea is that we take the basic Adaptive Asset Allocation algorithm, and wrap it in a canary universe from Defensive Asset Allocation (see previous post for links to both), which we use to control capital allocation, ranging from 0 to 1 (or beyond, in cases where leverage applies).

One of the ideas was to test out different permutations of the parameters belonging to the canary filter–a 1, 3, 6, 12 weighted filter focusing on the first month.

There are two interesting variants of this–equal weighting on the filter (both for momentum and the safety assets), and reversing the weights (that is, 1 * 1, 3 * 2, 6 * 4, 12 * 12). Here are the results of that experiment:


# different leverages
KDA_100 <- KDA(rets, leverageFactor = 1)
KDA_EW <- KDA(rets, leverageFactor = 1, momWeights = c(1,1,1,1))
KDA_rev <- KDA(rets, leverageFactor = 1, momWeights = c(1, 2, 4, 12))
# KDA_150 <- KDA(rets, leverageFactor = 1.5)
# KDA_200 <- KDA(rets, leverageFactor = 2)

# compare
compare <- na.omit(cbind(KDA_100[[2]], KDA_EW[[2]], KDA_rev[[2]]))
colnames(compare) <- c("KDA_base", "KDA_EW", "KDA_rev")
charts.PerformanceSummary(compare, colorset = c('black', 'purple', 'gold'), 
                          main = "KDA AA with various momentum weights")

stratStats(compare)
apply.yearly(compare, Return.cumulative)

With the following results:

> stratStats(compare)
                            KDA_base    KDA_EW   KDA_rev
Annualized Return         0.10990000 0.0879000 0.0859000
Annualized Std Dev        0.09070000 0.0900000 0.0875000
Annualized Sharpe (Rf=0%) 1.21180000 0.9764000 0.9814000
Worst Drawdown            0.07920363 0.1360625 0.1500333
Calmar Ratio              1.38756275 0.6460266 0.5725396
Ulcer Performance Index   3.96188378 2.4331636 1.8267448

> apply.yearly(compare, Return.cumulative)
              KDA_base       KDA_EW    KDA_rev
2008-12-31  0.15783690  0.101929228 0.08499664
2009-12-31  0.18169281 -0.004707164 0.02403531
2010-12-31  0.17797930  0.283216782 0.27889530
2011-12-30  0.17220203  0.161001680 0.03341651
2012-12-31  0.13030215  0.081280035 0.09736187
2013-12-31  0.12692163  0.120902015 0.09898799
2014-12-31  0.04028492  0.047381890 0.06883301
2015-12-31 -0.01621646 -0.005016891 0.01841095
2016-12-30  0.01253209  0.020960805 0.01580218
2017-12-29  0.15079063  0.148073455 0.18811112
2018-12-31  0.06583962  0.029804042 0.04375225
2019-02-20  0.01689700  0.003934044 0.00962020

So, one mea culpa: after comparing AllocateSmartly, my initial code (which I’ve since edited, most likely owing to getting some logic mixed up when I added functionality to lag the day of month to trade) had some sort of bug in it which gave a slightly better than expected 2015 return. Nevertheless, the results are very similar. What is interesting to note is that in the raging bull market that was essentially from 2010 onwards, the equal weight and reverse weight filters don’t perform too badly, though the reverse weight filter has a massive drawdown in 2011, but in terms of capitalizing in awful markets, the original filter as designed by Keller and TrendXplorer works best, both in terms of making money during the recession, and does better near the market bottom in 2009.

Now that that’s out of the way, the more interesting question is how does the strategy work when not trading at the end of the month? Long story short, the best time to trade it is in the last week of the month. Once the new month rolls around, hands off. If you’re talking about tranching this strategy, then you have about a week’s time to get your positions in, so I’m not sure the actual dollar volume this strategy can manage, as it’s dependent on the month-end effect (I know that one of my former managers–a brilliant man, by all accounts–said that this phenomena no longer existed, but I feel these empirical results refute that assertion in this particular instance). Here are these results:

lagCompare <- list()
for(i in 1:21) {
  offRets <- KDA(rets, leverageFactor = 1, offset = i)
  tmp <- offRets[[2]]
  colnames(tmp) <- paste0("Lag", i)
  lagCompare[[i]] <- tmp
}
lagCompare <- do.call(cbind, lagCompare)
lagCompare <- na.omit(cbind(KDA_100[[2]], lagCompare))
colnames(lagCompare)[1] <- "Base"

charts.PerformanceSummary(lagCompare, colorset=c("orange", rep("gray", 21)))
stratStats(lagCompare)

With the results:

> stratStats(lagCompare)
                                Base      Lag1      Lag2      Lag3      Lag4      Lag5      Lag6      Lag7      Lag8
Annualized Return         0.11230000 0.0584000 0.0524000 0.0589000 0.0319000 0.0319000 0.0698000 0.0790000 0.0912000
Annualized Std Dev        0.09100000 0.0919000 0.0926000 0.0945000 0.0975000 0.0957000 0.0943000 0.0934000 0.0923000
Annualized Sharpe (Rf=0%) 1.23480000 0.6357000 0.5654000 0.6229000 0.3270000 0.3328000 0.7405000 0.8460000 0.9879000
Worst Drawdown            0.07920363 0.1055243 0.1269207 0.1292193 0.1303246 0.1546962 0.1290020 0.1495558 0.1227749
Calmar Ratio              1.41786439 0.5534272 0.4128561 0.4558141 0.2447734 0.2062107 0.5410771 0.5282311 0.7428230
Ulcer Performance Index   4.03566328 1.4648618 1.1219982 1.2100649 0.4984094 0.5012318 1.3445786 1.4418132 2.3277271
                               Lag9     Lag10     Lag11     Lag12     Lag13     Lag14     Lag15     Lag16     Lag17
Annualized Return         0.0854000 0.0863000 0.0785000 0.0732000 0.0690000 0.0862000 0.0999000 0.0967000 0.1006000
Annualized Std Dev        0.0906000 0.0906000 0.0900000 0.0913000 0.0906000 0.0909000 0.0923000 0.0947000 0.0949000
Annualized Sharpe (Rf=0%) 0.9426000 0.9524000 0.8722000 0.8023000 0.7617000 0.9492000 1.0825000 1.0209000 1.0600000
Worst Drawdown            0.1278059 0.1189949 0.1197596 0.1112761 0.1294588 0.1498408 0.1224511 0.1290538 0.1274083
Calmar Ratio              0.6682006 0.7252411 0.6554796 0.6578231 0.5329880 0.5752771 0.8158357 0.7493000 0.7895878
Ulcer Performance Index   2.3120919 2.6415855 2.4441605 1.9248615 1.8096134 2.2378207 2.8753265 2.9092448 3.0703542
                             Lag18     Lag19     Lag20     Lag21
Annualized Return         0.097100 0.0921000 0.1047000 0.1019000
Annualized Std Dev        0.092900 0.0903000 0.0958000 0.0921000
Annualized Sharpe (Rf=0%) 1.044900 1.0205000 1.0936000 1.1064000
Worst Drawdown            0.100604 0.1032067 0.1161583 0.1517104
Calmar Ratio              0.965170 0.8923835 0.9013561 0.6716747
Ulcer Performance Index   3.263040 2.7159601 3.0758230 3.0414002

Essentially, the trade at the very end of the month is the only one with a Calmar ratio above 1, though the Calmar ratio from lag15 to lag 21 is about .8 or higher, with a Sharpe ratio of 1 or higher. So, there’s definitely a window of when to trade, and when not to–namely, the lag 1 through 5 variations have the worst performances by no small margin. Therefore, I strongly suspect that the 1-3-6-12 filter was designed around the idea of the end-of-month effect, or at least, not stress-tested for different trading days within the month (and given that longer-dated data is only monthly, this is understandable).

Nevertheless, I hope this does answer some people’s questions from the quant finance universe. I know that Corey Hoffstein of Think Newfound (and wow that blog is good from the perspective of properties of trend-following) loves diversifying across every bit of the process, though in this case, I do think there’s something to be said about “diworsification”.

In any case, I do think there are some future research venues for further research here.

Thanks for reading.

Right Now It’s KDA…Asset Allocation.

This post will introduce KDA Asset Allocation. KDA — I.E. Kipnis Defensive Adaptive Asset Allocation is a combination of Wouter Keller’s and TrendXplorer’s Defensive Asset Allocation, along with ReSolve Asset Management’s Adaptive Asset Allocation. This is an asset allocation strategy with a profile unlike most tactical asset allocation strategies I’ve seen before (namely, it barely loses any money in 2015, which was generally a brutal year for tactical asset allocation strategies).

So, the idea for this strategy came from reading an excellent post from TrendXplorer on the idea of a canary universe–using a pair of assets to determine when to increase exposure to risky/aggressive assets, and when to stay in cash. Rather than gauge it on the momentum of the universe itself, the paper by Wouter Keller and TrendXplorer instead uses proxy assets VWO and BND as a proxy universe. Furthermore, in which situations say to take full exposure to risky assets, the latest iteration of DAA actually recommends leveraging exposure to risky assets, which will also be demonstrated. Furthermore, I also applied the idea of the 1-3-6-12 fast filter espoused by Wouter Keller and TrendXplorer–namely, the sum of the 12 * 1-month momentum, 4 * 3-month momentum, 2 * 6-month momentum, and the 12 month momentum (that is, month * some number = 12). This puts a large emphasis on the front month of returns, both for the risk on/off assets, and the invested assets themselves.

However, rather than adopt the universe of investments from the TrendXplorer post, I decided to instead defer to the well-thought-out universe construction from Adaptive Asset Allocation, along with their idea to use a mean variance optimization approach for actually weighting the selected assets.

So, here are the rules:

Take the investment universe–SPY, VGK, EWJ, EEM, VNQ, RWX, IEF, TLT, DBC, GLD, and compute the 1-3-6-12 momentum filter for them (that is, the sum of 12 * 1-month momentum, 4 * 3-month momentum, 2* 6-month momentum and 12 month momentum), and rank them. The selected assets are those with a momentum above zero, and that are in the top 5.

Use a basic quadratic optimization algorithm on them, feeding in equal returns (as they passed the dual momentum filter), such as the portfolio.optim function from the tseries package.

From adaptive asset allocation, the covariance matrix is computed using one-month volatility estimates, and a correlation matrix that is the weighted average of the same parameters used for the momentum filter (that is, 12 * 1-month correlation + 4 * 3-month correlation + 2 * 6-month correlation + 12-month correlation, all divided by 19).

Next, compute your exposure to risky assets by which percentage of the two canary assets–VWO and BND–have a positive 1-3-6-12 momentum. If both assets have a positive momentum, leverage the portfolio (if desired). Reapply this algorithm every month.

All of the allocation not made to risky assets goes towards IEF (which is in the pool of risky assets as well, so some months may have a large IEF allocation) if it has a positive 1-3-6-12 momentum, or just stay in cash if it does not.

The one somewhat optimistic assumption made is that the strategy observes the close on a day, and enters at the close as well. Given a holding period of a month, this should not have a massive material impact as compared to a strategy which turns over potentially every day.

Here’s the R code to do this:

# KDA asset allocation 
# KDA stands for Kipnis Defensive Adaptive (Asset Allocation).

# compute strategy statistics
stratStats <- function(rets) {
  stats <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets))
  stats[5,] <- stats[1,]/stats[4,]
  stats[6,] <- stats[1,]/UlcerIndex(rets)
  rownames(stats)[4] <- "Worst Drawdown"
  rownames(stats)[5] <- "Calmar Ratio"
  rownames(stats)[6] <- "Ulcer Performance Index"
  return(stats)
}

# required libraries
require(quantmod)
require(PerformanceAnalytics)
require(tseries)

# symbols
symbols <- c("SPY", "VGK",   "EWJ",  "EEM",  "VNQ",  "RWX",  "IEF",  "TLT",  "DBC",  "GLD", "VWO", "BND")  


# get data
rets <- list()
for(i in 1:length(symbols)) {
  
    returns <- Return.calculate(Ad(get(getSymbols(symbols[i], from = '1990-01-01'))))
  colnames(returns) <- symbols[i]
  rets[[i]] <- returns
}
rets <- na.omit(do.call(cbind, rets))


# algorithm
KDA <- function(rets, offset = 0, leverageFactor = 1.5) {
  
  # get monthly endpoints, allow for offsetting ala AllocateSmartly/Newfound Research
  ep <- endpoints(rets) + offset
  ep[ep < 1] <- 1
  ep[ep > nrow(rets)] <- nrow(rets)
  ep <- unique(ep)
  epDiff <- diff(ep)
  if(last(epDiff)==1) { # if the last period only has one observation, remove it
    ep <- ep[-length(ep)]
  }
  # initialize vector holding zeroes for assets
  emptyVec <- data.frame(t(rep(0, 10)))
  colnames(emptyVec) <- symbols[1:10]
  
  
  allWts <- list()
  # we will use the 13612F filter
  for(i in 1:(length(ep)-12)) {
    
    # 12 assets for returns -- 2 of which are our crash protection assets
    retSubset <- rets[c((ep[i]+1):ep[(i+12)]),]
    epSub <- ep[i:(i+12)]
    sixMonths <- rets[(epSub[7]+1):epSub[13],]
    threeMonths <- rets[(epSub[10]+1):epSub[13],]
    oneMonth <- rets[(epSub[12]+1):epSub[13],]
    
    # computer 13612 fast momentum
    moms <- Return.cumulative(oneMonth) * 12 + Return.cumulative(threeMonths) * 4 + 
      Return.cumulative(sixMonths) * 2 + Return.cumulative(retSubset)
    assetMoms <- moms[,1:10] # Adaptive Asset Allocation investable universe
    cpMoms <- moms[,11:12] # VWO and BND from Defensive Asset Allocation
    
    # find qualifying assets
    highRankAssets <- rank(assetMoms) >= 6 # top 5 assets
    posReturnAssets <- assetMoms > 0 # positive momentum assets
    selectedAssets <- highRankAssets & posReturnAssets # intersection of the above
    
    # perform mean-variance/quadratic optimization
    investedAssets <- emptyVec
    if(sum(selectedAssets)==0) {
      investedAssets <- emptyVec
    } else if(sum(selectedAssets)==1) {
      investedAssets <- emptyVec + selectedAssets 
    } else {
      idx <- which(selectedAssets)
      # use 1-3-6-12 fast correlation average to match with momentum filter  
      cors <- (cor(oneMonth[,idx]) * 12 + cor(threeMonths[,idx]) * 4 + 
                 cor(sixMonths[,idx]) * 2 + cor(retSubset[,idx]))/19
      vols <- StdDev(oneMonth[,idx]) # use last month of data for volatility computation from AAA
      covs <- t(vols) %*% vols * cors
      
      # do standard min vol optimization
      minVolRets <- t(matrix(rep(1, sum(selectedAssets))))
      minVolWt <- portfolio.optim(x=minVolRets, covmat = covs)$pw
      names(minVolWt) <- colnames(covs)
      investedAssets <- emptyVec
      investedAssets[,selectedAssets] <- minVolWt
    }
    
    # crash protection -- between aggressive allocation and crash protection allocation
    pctAggressive <- mean(cpMoms > 0)
    investedAssets <- investedAssets * pctAggressive 
    
    pctCp <- 1-pctAggressive
    
    # if IEF momentum is positive, invest all crash protection allocation into it
    # otherwise stay in cash for crash allocation
    if(assetMoms["IEF"] > 0) {
      investedAssets["IEF"] <- investedAssets["IEF"] + pctCp
    }
    
    # leverage portfolio if desired in cases when both risk indicator assets have positive momentum
    if(pctAggressive == 1) {
      investedAssets = investedAssets * leverageFactor
    }
    
    # append to list of monthly allocations
    wts <- xts(investedAssets, order.by=last(index(retSubset)))
    allWts[[i]] <- wts
    
  }
  
  # put all weights together and compute cash allocation
  allWts <- do.call(rbind, allWts)
  allWts$CASH <- 1-rowSums(allWts)
  
  # add cash returns to universe of investments
  investedRets <- rets[,1:10]
  investedRets$CASH <- 0
  
  # compute portfolio returns
  out <- Return.portfolio(R = investedRets, weights = allWts)
  return(out)
}

# different leverages
KDA_100 <- KDA(rets, leverageFactor = 1)
KDA_150 <- KDA(rets, leverageFactor = 1.5)
KDA_200 <- KDA(rets, leverageFactor = 2)

# compare
compare <- na.omit(cbind(KDA_100, KDA_150, KDA_200))
colnames(compare) <- c("KDA_base", "KDA_lev_150%", "KDA_lev_200%")
charts.PerformanceSummary(compare, colorset = c('black', 'purple', 'gold'), 
                          main = "KDA AA with various offensive leverages")

And here are the equity curves and statistics:

What appeals to me about this strategy, is that unlike most tactical asset allocation strategies, this strategy comes out relatively unscathed by the 2015-2016 whipsaws that hurt so many other tactical asset allocation strategies. However this strategy isn’t completely flawless, as sometimes, it decides that it’d be a great time to enter full risk-on mode and hit a drawdown, as evidenced by the drawdown curve. Nevertheless, the Calmar ratios are fairly solid for a tactical asset allocation rotation strategy, and even in a brutal 2018 that decimated all risk assets, this strategy managed to post a very noticeable *positive* return. On the downside, the leverage plan actually seems to *negatively* affect risk/reward characteristics in this strategy–that is, as leverage during aggressive allocations increases, characteristics such as the Sharpe and Calmar ratio actually *decrease*.

Overall, I think there are different aspects to unpack here–such as performances of risky assets as a function of the two canary universe assets, and a more optimal leverage plan. This was just the first attempt at combining two excellent ideas and seeing where the performance goes. I also hope that this strategy can have a longer backtest over at AllocateSmartly.

Thanks for reading.