library(tidyverse)
library(quantmod)
library(PerformanceAnalytics)
library(PortfolioAnalytics)
library(timetk)
library(tibbletime)
library(xts)
library(zoo)
library(tidyquant)
In this post, we will explore some finance topics— portfolio optimization and computing portfolio returns. My goal is to apply what I’ve learned in portfolio theory using R as the main tool of analysis. There are many advantages to using a GUI like MS Excel, but R has an amazing data analysis work flow— a sort of one-stop solution from initial data importation to data wrangling and transformation to computations and analysis and then finally to visualizing and reporting results. We will be using functions from several R packages— xts
, PerformanceAnalytics
, PortfolioAnalytics
, tidyquant
, tidyverse
. In particular, the tidyquant
package is a recent development that has markedly enriched R’s financial analytics infrastructure, enhancing its usability in finance. While I cover some theory in this post, my main focus will be on the implementation of these topics in R.
For more readings on the theory, I recommend Essentials of Investments and Practical Portfolio Performance Measurement and Attribution. To learn more about analyzing financial data in R, there is Reproducible Finance with R, which is a very practical book with a strong emphasis on application.
The following packages will be used in this post:
Optimal Weights for a five-asset portfolio (Minimum Variance)
We will employ Markowitz’s Mean-Variance model as the framework for computing optimal weights, essentially treating the task as an “unconstrained” optimization problem. The objective of this optimization problem is one of minimization:
\[\begin{align*} \text{Minimize}\hspace{2mm}(\sigma^{2}=\vec{W}^{T}\sum\vec{W})) \end{align*}\]
subject to the sum of weights constraint and the box constraint: \[ \sum_{i=1}^{N} W_{i}=1 \quad \text { and } \hspace{3mm} \varepsilon_{i} \leq W_{i} \leq \delta_{i} \] where
- where \(\varepsilon_{i}=0.05 \hspace{3mm} \delta_{i}=0.6\) are the lower and upper bounds for the weights \(W_{i}\)
The minimization problem is a quadratic problem with linear constraints, since the variance formula is a quadratic function and the constraints are linear functions; this type of problem is well suited to be solved using a quadratic programming solver. The PortfolioAnalytics
package uses ROI.plugin.quadprog
, a plug-in for the “R” Optimization Infrastructure, to solve the problem. The solver can be specified with the optimize_method argument in optimize.portfolio()
. If optimize_method = “ROI” is specified, a default solver will be selected based on the optimization problem. The glpk
solver is the default solver for LP and MILP optimization problems. The quadprog
solver is the default solver for QP optimization problems.
Implementation in R
For this task, we will import our data from Yahoo Finance. The five assets under examination are Exchange Traded Funds, which are funds that can be traded on an exchange like a stock. Exchange-traded funds are a type of investment fund that offers the best attributes of two popular assets: they have the diversification benefits of mutual funds and the ease with which stocks are traded. However, before we import any data, we must answer the following question: In what form do we want our data to be? Since we are in the world of finance, times series is the most common type of data. R has the xts
package to handle data that are indexed by date. Our task, therefore, reduces to the following:
Import daily prices from Yahoo Finance
Convert daily prices to monthly return
Because we will be aggregating monthly returns to form a portfolio, we will need to compute the simple returns
# Create a vector of ticker symbols
<- c("SPY", "EFA", "IJS", "EEM", "AGG")
symbols # Load data from 2012 to today
# Specify the "to = " argument to specify an end date
<- getSymbols(
prices Symbols = symbols,
src = "yahoo",
from = "2012-12-31",
auto.assign = TRUE,
warnings = FALSE
%>%
) # The map function takes an anonymous function and will return a list of five
# The function Ad() extracts the daily adjusted price series
map(.f = ~ Ad(get(x = .x))) %>%
# Use reduce() to merge the elements of .x interactively
reduce(.f = merge) %>%
# Use a replacement function to set column names to ticker symbols
# This function is in prefix form
# It is equivalent to colnames(x = prices) <- value
`colnames<-`(value = symbols)
# Keep only the last reading of each month
# We could have chosen to keep only the first reading of each month
<- to.monthly(
asset_returns_xts x = prices,
drop.time = TRUE,
indexAt = "lastof",
OHLC = FALSE
|>
) # Compute simple returns
# Log returns are time-additive but not portfolio additive
Return.calculate(method = "discrete") |>
# Drop the first row since we lose 12/31/2012
na.omit()
# Keep only the xts returns, ticker symbols, and the prices series
rm(list = setdiff(x = ls(), y = c("symbols", "prices", "asset_returns_xts")))
- Create Portfolio object
# Examine the monthly simple returns for our five ETF's
head(x = asset_returns_xts, 5)
SPY EFA IJS EEM AGG
2013-01-31 0.05119054 0.03728456 0.053516298 -0.002931048 -0.0062109942
2013-02-28 0.01275881 -0.01288583 0.016306771 -0.022840678 0.0059082976
2013-03-31 0.03797125 0.01305404 0.041079860 -0.010182812 0.0009850652
2013-04-30 0.01921193 0.05018626 0.001222809 0.012158337 0.0096861394
2013-05-31 0.02360979 -0.03019048 0.042869447 -0.048279242 -0.0200112432
# Create Portfolio object which is essentially a list object
<- portfolio.spec(assets = symbols)
min_var_portfolio typeof(min_var_portfolio)
[1] "list"
- Add constraints to the portfolio object
# Add the full investment constraint that specifies that the weights must sum to 1
<- add.constraint(
min_var_portfolio portfolio = min_var_portfolio,
type = "full_investment"
)# Examine the constraint element by extracting min_var_portfolio[["constraints"]][[1]]
str(pluck(.x = min_var_portfolio, "constraints", 1))
List of 6
$ type : chr "full_investment"
$ enabled: logi TRUE
$ message: logi FALSE
$ min_sum: num 1
$ max_sum: num 1
$ call : language add.constraint(portfolio = min_var_portfolio, type = "full_investment")
- attr(*, "class")= chr [1:2] "weight_sum_constraint" "constraint"
# Add the box constraint that ensure the weights are between 0.1 and 0.6
<- add.constraint(
min_var_portfolio portfolio = min_var_portfolio,
type = "box", min = 0.05, max = 0.6
)# Examine the constraint element by extracting min_var_portfolio[["constraints"]][[2]]
str(pluck(.x = min_var_portfolio, "constraints", 2))
List of 5
$ type : chr "box"
$ enabled: logi TRUE
$ min : Named num [1:5] 0.05 0.05 0.05 0.05 0.05
..- attr(*, "names")= chr [1:5] "SPY" "EFA" "IJS" "EEM" ...
$ max : Named num [1:5] 0.6 0.6 0.6 0.6 0.6
..- attr(*, "names")= chr [1:5] "SPY" "EFA" "IJS" "EEM" ...
$ call : language add.constraint(portfolio = min_var_portfolio, type = "box", min = 0.05, max = 0.6)
- attr(*, "class")= chr [1:2] "box_constraint" "constraint"
- Add objective function
# Add objective to minimize variance
<- add.objective(
min_var_portfolio portfolio = min_var_portfolio,
# Minimize risk
type = "risk",
# A character corresponding to a function name, var()
name = "var"
)
- Optimization
# Run the optimization
<- optimize.portfolio(
global_min_portfolio R = asset_returns_xts,
portfolio = min_var_portfolio,
# This defaults to the "quadprog" solver
optimize_method = "quadprog",
# Return additional information on the path or portfolios searched
trace = TRUE
)
# Examine returned portfolio list object
global_min_portfolio
***********************************
PortfolioAnalytics Optimization
***********************************
Call:
optimize.portfolio(R = asset_returns_xts, portfolio = min_var_portfolio,
optimize_method = "quadprog", trace = TRUE)
Optimal Weights:
SPY EFA IJS EEM AGG
0.243 0.050 0.050 0.057 0.600
Objective Measure:
StdDev
0.02117
Optimal Weights for the five-asset portfolio (Maximum Expected Return)
Now that we found the global minimum variance portfolio, we may be interested in finding the maximal expected return portfolio. The objective is one of maximization:
\[\begin{align*} \text{Maximize}\hspace{2mm}(\vec{\mu}^{T}\vec{W}) \end{align*}\]
subject to the sum of weights constraint and the box constraint: \[ \sum_{i=1}^{N} W_{i}=1 \quad \text { and } \hspace{3mm} \varepsilon_{i} \leq W_{i} \leq \delta_{i} \]
The optimization problem in this case is a linear programming problem, since the portfolio expected return formula is a linear function. This is best tackled using a linear programming solver. The package PortfolioAnalytics
uses the ROI package with the glpk plugin
, the GNU Linear Programming toolkit of R’s Optimization Infrastructure.
Implementation in R
- Create portfolio object
# Create Portfolio object
<- portfolio.spec(assets = symbols) max_exp_return_portfolio
- Add constraints to the object
# Add the full investment constraint that specifies the weights must sum to 1
<- add.constraint(
max_exp_return_portfolio portfolio = max_exp_return_portfolio,
type = "full_investment"
)# Add the box constraint that ensure the weights are between 0.1 and 0.6
<- add.constraint(
max_exp_return_portfolio portfolio = max_exp_return_portfolio,
type = "box", min = 0.05, max = 0.6
)
- Add objective function
# Add objective to maximize mean returns
<- add.objective(
max_exp_return_portfolio portfolio = max_exp_return_portfolio,
# Maximize expected returns
type = "return",
# A character corresponding to a function name, mean()
name = "mean"
)
- Optimization
# Run the optimization
<- optimize.portfolio(
global_max_portfolio R = asset_returns_xts,
portfolio = max_exp_return_portfolio,
# This defaults to the "glpk" solver
optimize_method = "glpk",
# Return additional information on the path or portfolios searched
trace = TRUE
)# Examine returned portfolio list object
global_max_portfolio
***********************************
PortfolioAnalytics Optimization
***********************************
Call:
optimize.portfolio(R = asset_returns_xts, portfolio = max_exp_return_portfolio,
optimize_method = "glpk", trace = TRUE)
Optimal Weights:
SPY EFA IJS EEM AGG
0.60 0.05 0.25 0.05 0.05
Objective Measure:
mean
0.01033
Building a portfolio
We have found two sets of optimal weights that yield portfolios that offer the lowest possible risk and the high possible expected return given two basic constraints. Our next task is to aggregate the monthly returns of the individual ETF’s to find the monthly returns of our five-asset portfolio.
# Set optimal weights
<- pluck(.x = global_max_portfolio, "weights")
weights # Check if the weights and symbols align
tibble(weights, symbols)
# A tibble: 5 × 2
weights symbols
<dbl> <chr>
1 0.6 SPY
2 0.05 EFA
3 0.250 IJS
4 0.05 EEM
5 0.05 AGG
# Ensure that the weights vector sums up to 1
tibble(weights, symbols) |>
summarize(total_weight = sum(weights))
# A tibble: 1 × 1
total_weight
<dbl>
1 1
The portfolio return in month, \(t\), is given by:
\[\begin{align*} r_{\text{portfolio,t}}=\sum_{i=1}^{5}W_{i}r_{i,t} \end{align*}\]
Monthly portfolio returns by hand
We can compute portfolio monthly returns using a brute force method:
# Compute by hand
<-
portfolio_returns_by_hand 1]] * asset_returns_xts[, 1]) +
(weights[[2]] * asset_returns_xts[, 2]) +
(weights[[3]] * asset_returns_xts[, 3]) +
(weights[[4]] * asset_returns_xts[, 4]) +
(weights[[5]] * asset_returns_xts[, 5])
(weights[[# Name the series
<- `names<-`(portfolio_returns_by_hand, "Monthly portfolio returns")
portfolio_returns_by_hand # Examine
head(portfolio_returns_by_hand, 5)
Monthly portfolio returns
2013-01-31 0.04550053
2013-02-28 0.01024107
2013-03-31 0.03324553
2013-04-30 0.01543440
2013-05-31 0.01995919
Monthly portfolio returns in xts
Another way to compute portfolio monthly returns is to use functions from the PerformanceAnalytics
package, which works well with xts
objects. We also adopt monthly re-balancing as a strategy. The re-balancing of investments is the action of keeping the portfolio weights consistent with the optimal weights. Note that re-balancing the weights every month may be unrealistic; in reality, the convention is often to re-balance quarterly or annually. For this example, however, we will re-balance monthly to be consistent with our brute force computation.
# Compute monthly portfolio returns
<-
portfolio_returns_xts_rebalanced_monthly Return.portfolio(
R = asset_returns_xts,
weights = weights,
# Monthly re-balancing
reblance_on = "months",
# Use simple/arithmetic chaining to aggregate returns
geometric = FALSE
|>
) `colnames<-`("Monthly_portfolio_returns")
# Examine
head(portfolio_returns_xts_rebalanced_monthly, 5)
Monthly_portfolio_returns
2013-01-31 0.04550053
2013-02-28 0.01024107
2013-03-31 0.03324553
2013-04-30 0.01543440
2013-05-31 0.01995919
- The function
Return.portfolio(R, weights = NULL, wealth.index = FALSE, contribution = FALSE, geometric = TRUE, rebalance_on = c(NA, "years", "quarters", "months", "weeks", "days"), value = 1, verbose = FALSE, ...)
calculates the returns of a portfolio with the same periodicity of the returns data.
Monthly portfolio returns in the tidyverse
The tidyverse
is a collection of R packages designed with the same underlying philosophy, grammar, and data structures. Simply put, the “tidy” data structure that works well with tidyverse functions is one where every row is an observation and every column is a variable. If we re-examine the xts
object called “asset_returns_xts,” we see that every column is a returns series for a particular asset. This is inconsistent with the tidyverse data structure, and so we must convert the xts object to a tidy format for our computations. Ideally, we would like to have one column called “asset” that specifies the names of the ETF instead of having five individual columns of returns data. This idea will become clearer once we convert our xts object to a tibble.
Now, examine the “tidy” data structure and compare it to the xts object created earlier:
::headtail(asset_returns_long, n = 5) FSA
date asset returns
1 2013-01-31 SPY 0.051190543
2 2013-01-31 EFA 0.037284562
3 2013-01-31 IJS 0.053516298
4 2013-01-31 EEM -0.002931048
5 2013-01-31 AGG -0.006210994
706 2024-10-31 SPY 0.007372386
707 2024-10-31 EFA -0.046992711
708 2024-10-31 IJS -0.015511850
709 2024-10-31 EEM -0.013737485
710 2024-10-31 AGG -0.023404604
Since our returns data is in a “tidy” format, computing portfolio monthly returns is very a straight forward. For those who are familiar with SQL, we are essentially using the CASE statement here.
# Use vectorized if else statements to assign weights according to the asset column
<- asset_returns_long |>
potfolio_returns_dplyr_byhand group_by(asset) |>
mutate(
weights = dplyr::case_when(
== symbols[[1]] ~ weights[[1]],
asset == symbols[[2]] ~ weights[[2]],
asset == symbols[[3]] ~ weights[[3]],
asset == symbols[[4]] ~ weights[[4]],
asset == symbols[[5]] ~ weights[[5]]
asset
),weighted_returns = weights * returns
|>
) # Group by date
# We need to group by date so that the aggregate sum() function is carried out by month
# For each date, the original series has 5 weighted returns, one for each ETF
# The results should be 1 portfolio return (the sum of the 5 weighted returns) for each month
group_by(date) |>
# Compute monthly portfolio returns
summarize(Monthly_portfolio_returns = sum(weighted_returns))
# Examine the data
head(potfolio_returns_dplyr_byhand, 5)
# A tibble: 5 × 2
date Monthly_portfolio_returns
<date> <dbl>
1 2013-01-31 0.0455
2 2013-02-28 0.0102
3 2013-03-31 0.0332
4 2013-04-30 0.0154
5 2013-05-31 0.0200
- The function
summarize(.data, ..., .groups = NULL)
creates a new data frame. It will have one (or more) rows for each combination of grouping variables.
Monthly portfolio returns in tidyquant
The tidyquant
package gives us two ways to compute the portfolio monthly returns.
# Use function from the tidyquant
<- asset_returns_long |>
portfolio_returns_tq_rebalanced_monthly_method_1 tq_portfolio(
assets_col = asset,
returns_col = returns,
weights = weights,
col_rename = "Monthly_portfolio_returns",
rebalance_on = "months"
)# Examine
head(portfolio_returns_tq_rebalanced_monthly_method_1, 5)
# A tibble: 5 × 2
date Monthly_portfolio_returns
<date> <dbl>
1 2013-01-31 0.0455
2 2013-02-28 0.0102
3 2013-03-31 0.0332
4 2013-04-30 0.0154
5 2013-05-31 0.0200
- The function
tq_portfolio(data, assets_col, returns_col, weights = NULL, col_rename = NULL, ...)
aggregates a group of returns by asset into portfolio returns.
An alternative way to load data is to use the tidyquant
wrapper function, which automatically coerces the returns into a long tidy format:
# Load data
<- tq_get(
asset_returns_tq x = symbols,
get = "stock.prices",
from = "2012-12-31"
|>
) # The asset column is named symbol by default (see body(tidyquant::tq_get))
group_by(symbol) |>
# Select adjusted daily prices
tq_transmute(
select = adjusted,
col_rename = "returns",
# This function is from quantmod
mutate_fun = periodReturn,
# These arguments are passed along to the mutate function quantmod::periodReturn
period = "monthly",
# Simple returns
type = "arithmetic",
# Do not return leading period data
leading = FALSE,
# This argument is passed along to xts::to.period, which is wrapped by quantmod::periodReturn
# We use the last reading of each month to find percentage changes
indexAt = "lastof"
|>
) rename(asset = symbol) |>
na.omit()
- The function
tq_get(x, get = "stock.prices", complete_cases = TRUE, ...)
gets data in tibble format. The most important argument is perhaps the dot-dot-dot, since these are the arguments passed to the underlying functions from the other packages thattq_get()
uses. There could be multiple layers of wrapper functions, and so it is best practice to read the documentations carefully.
A possibly more useful method of aggregating asset returns to portfolio returns is then to map a tibble of ticker symbols and weights to the tibble of ticker symbols and monthly returns:
# Create a tibble of weights
<- tibble(
weights_tibble asset = symbols,
weights = weights
)head(weights_tibble, 5)
# A tibble: 5 × 2
asset weights
<chr> <dbl>
1 SPY 0.6
2 EFA 0.05
3 IJS 0.250
4 EEM 0.05
5 AGG 0.05
# Map the weights to the returns column using the asset column as the identifier
<- asset_returns_tq |>
portfolio_returns_tq_rebalanced_monthly_method_2 tq_portfolio(
assets_col = asset,
returns_col = returns,
weights = weights_tibble,
col_rename = "Monthly_portfolio_returns",
rebalance_on = "months"
)head(portfolio_returns_tq_rebalanced_monthly_method_2, 5)
# A tibble: 5 × 2
date Monthly_portfolio_returns
<date> <dbl>
1 2013-01-31 0.0455
2 2013-02-28 0.0102
3 2013-03-31 0.0332
4 2013-04-30 0.0154
5 2013-05-31 0.0200
Compare and Contrast the four methods
We have covered a variety of different methods for aggregating asset monthly returns to portfolio monthly returns. As a sanity check, we want to ensure that these methods yield consistent results:
|>
potfolio_returns_dplyr_byhand # Rename the column
rename(tidyverse_method = Monthly_portfolio_returns) |>
# Create three new columns that contain results for other methods
mutate(
"equation_byhand" = zoo::coredata(x = portfolio_returns_by_hand),
"tq_method_1" = portfolio_returns_tq_rebalanced_monthly_method_1[["Monthly_portfolio_returns"]],
"tq_method_2" = portfolio_returns_tq_rebalanced_monthly_method_2[["Monthly_portfolio_returns"]],
"xts_method" = zoo::coredata(portfolio_returns_xts_rebalanced_monthly)
|>
) modify_if(.p = is.numeric, .f = ~ round(x = .x, digits = 7)) |>
select(-date) |>
# Examine
head(10)
# A tibble: 10 × 5
tidyverse_method equation_byhand[,"Monthly portfoli…¹ tq_method_1 tq_method_2
<dbl> <dbl> <dbl> <dbl>
1 0.0455 0.0455 0.0455 0.0455
2 0.0102 0.0102 0.0102 0.0102
3 0.0332 0.0332 0.0332 0.0332
4 0.0154 0.0154 0.0154 0.0154
5 0.0200 0.0200 0.0200 0.0200
6 -0.0131 -0.0131 -0.0131 -0.0131
7 0.0509 0.0509 0.0509 0.0509
8 -0.0292 -0.0292 -0.0292 -0.0292
9 0.0436 0.0436 0.0436 0.0436
10 0.0406 0.0406 0.0406 0.0406
# ℹ abbreviated name: ¹equation_byhand[,"Monthly portfolio returns"]
# ℹ 1 more variable: xts_method <dbl[,1]>
- The function
coredata(x, ...)
is a generic function for extracting the core data contained in, a (more complex) object and replacing it. The replacement function iscoredata<-
.
As can be seen, the results are all consistent. Sigh of relief. For the upcoming posts, I also plan on tackling other topics in portfolio theory and to build off of what I’ve covered in this post. In particular, Python also has many great libraries— numpy, scipy, and yahoo finance— all of which contain great tools for financial analysis.