Merged
36 commits
f081b26
touchup images
robertmartin8 Aug 21, 2018
5abc46a
updated versioning
robertmartin8 Aug 26, 2018
cacd569
test semicovariance
robertmartin8 Aug 26, 2018
7d1b5cf
semicov, expcov (minus docs)
robertmartin8 Aug 26, 2018
630bf15
bug: volatility was actually std. fixed
robertmartin8 Sep 11, 2018
9d98850
first commit for CVaR opt
robertmartin8 Sep 11, 2018
3db090a
all opt now inherits from base optimizer
robertmartin8 Sep 11, 2018
dc7e9c8
inherit from baseoptimizer
robertmartin8 Sep 11, 2018
6eb8ed3
fixed volatility/variance mixup
robertmartin8 Sep 14, 2018
77c4957
exponential covariance
robertmartin8 Sep 20, 2018
3b3994d
test cvar objective
robertmartin8 Sep 20, 2018
ca5cc72
test VaR
robertmartin8 Sep 20, 2018
0110e2f
test exp cov
robertmartin8 Sep 20, 2018
e868859
added tests on base optimiser
robertmartin8 Sep 20, 2018
6a144aa
fixed tests
robertmartin8 Sep 20, 2018
618040b
cvar objective
robertmartin8 Sep 20, 2018
880882b
minor formatting
robertmartin8 Sep 20, 2018
dbe76d2
Added documentation
robertmartin8 Sep 20, 2018
bf18a1c
basic HRP test
robertmartin8 Sep 23, 2018
f36e1e4
basic HRP test
robertmartin8 Sep 23, 2018
308cc56
minor typo fix
robertmartin8 Sep 23, 2018
8d85eab
test custom objective
robertmartin8 Sep 23, 2018
1c3c97e
changed custom objective api
robertmartin8 Sep 23, 2018
cb0f794
updated docs for v0.2.0
robertmartin8 Sep 23, 2018
aafb0ea
updated roadmap and changelog
robertmartin8 Sep 23, 2018
5dcd6b1
updated docs v0.2.0
robertmartin8 Sep 23, 2018
3d1d625
updated docs v0.2.0
robertmartin8 Sep 23, 2018
dd23d46
updated docs v0.2.0
robertmartin8 Sep 23, 2018
8ffee30
added docs for otheroptimisers
robertmartin8 Sep 23, 2018
0b094eb
added link
robertmartin8 Sep 23, 2018
2702c5c
updated readme
robertmartin8 Sep 23, 2018
2c4e9f3
minor refactors
robertmartin8 Sep 23, 2018
37abf8d
added noisyopt dependency
robertmartin8 Sep 23, 2018
28292c1
added v0.2.0 examples
robertmartin8 Sep 23, 2018
b6c7fcc
changed caution notes
robertmartin8 Sep 23, 2018
50dddba
added test for utility objective
robertmartin8 Sep 23, 2018
62 changes: 37 additions & 25 deletions README.md
@@ -8,7 +8,7 @@
<img src="https://img.shields.io/badge/python-v3-brightgreen.svg"
alt="python"></a> &nbsp;
<a href="https://pypi.org/project/PyPortfolioOpt/">
<img src="https://img.shields.io/badge/pypi-v0.1.1-brightgreen.svg"
<img src="https://img.shields.io/badge/pypi-v0.2.0-brightgreen.svg"
alt="pypi"></a> &nbsp;
<a href="https://opensource.org/licenses/MIT">
<img src="https://img.shields.io/badge/license-MIT-brightgreen.svg"
@@ -42,14 +42,14 @@ Head over to the [documentation on ReadTheDocs](https://pyportfolioopt.readthedo
- [Getting started](#getting-started)
- [For development](#for-development)
- [A quick example](#a-quick-example)
- [Project principles and design decisions](#project-principles-and-design-decisions)
- [Advantages over existing implementations](#advantages-over-existing-implementations)
- [An overview of classical portfolio optimisation methods](#an-overview-of-classical-portfolio-optimisation-methods)
- [Features](#features)
- [Expected returns](#expected-returns)
- [Covariance](#covariance)
- [Risk models (covariance)](#risk-models-covariance)
- [Objective functions](#objective-functions)
- [Optional parameters](#optional-parameters)
- [Advantages over existing implementations](#advantages-over-existing-implementations)
- [Project principles and design decisions](#project-principles-and-design-decisions)
- [Roadmap](#roadmap)
- [Testing](#testing)
- [Contributing](#contributing)
@@ -156,24 +156,6 @@ Funds remaining: $12.15

*Disclaimer: nothing about this project constitutes investment advice, and the author bears no responsibility for your subsequent investment decisions. Please refer to the [license](https://github.com/robertmartin8/PyPortfolioOpt/blob/master/LICENSE.txt) for more information.*

## Project principles and design decisions

- It should be easy to swap out individual components of the optimisation process with the user's proprietary improvements.
- User-friendliness is **everything**.
- There is no point in portfolio optimisation unless it can be practically applied to real asset prices.
- Everything that has been implemented should be tested.
- Inline documentation is good: dedicated (separate) documentation is better. The two are not mutually exclusive.
- Formatting should never get in the way of good code: because of this I have deferred **all** formatting decisions to [Black](https://github.com/ambv/black). Initially, some of its decisions irritated me, but it is extremely consistent and actually quite elegant.

## Advantages over existing implementations

- Includes both classical methods (Markowitz 1952), and more recent developments (covariance shrinkage), as well as experimental features such as L2-regularised weights.
- Native support for pandas dataframes: easily input your daily prices data.
- Clear and comprehensive [documentation](https://pyportfolioopt.readthedocs.io/en/latest/), hosted on ReadTheDocs.
- Extensive practical tests, which use real-life data.
- Easy to combine with your own proprietary strategies and models.
- Robust to missing data, and price-series of different lengths (e.g FB data only goes back to 2012, whereas AAPL data goes back to 1980).

## An overview of classical portfolio optimisation methods

Harry Markowitz's 1952 paper is the undeniable classic, which turned portfolio optimisation from an art into a science. The key insight is that by combining assets with different expected returns and volatilities, one can decide on a mathematically optimal allocation which minimises the risk for a target return – the set of all such optimal portfolios is referred to as the **efficient frontier**.
@@ -202,7 +184,7 @@ A far more comprehensive version of this can be found on [ReadTheDocs](https://p
- similar to mean historical returns, except it gives exponentially more weight to recent prices
- it is likely the case that an asset's most recent returns hold more weight than returns from 10 years ago when it comes to estimating future returns.

### Covariance
### Risk models (covariance)

The covariance matrix encodes not just the volatility of an asset, but also how it is correlated with other assets. This is important because in order to reap the benefits of diversification (and thus increase return per unit risk), the assets in the portfolio should be as uncorrelated as possible.

@@ -211,6 +193,8 @@ The covariance matrix encodes not just the volatility of an asset, but also how
- relatively easy to compute
- the de facto standard for many years
- however, it has a high estimation error, which is particularly dangerous in mean-variance optimisation because the optimiser is likely to give excess weight to these erroneous estimates.
- Semicovariance: a measure of risk that focuses on downside variation.
- Exponential covariance: an improvement over sample covariance that gives more weight to recent data
- Covariance shrinkage: techniques that involve combining the sample covariance matrix with a structured estimator, in order to reduce the effect of erroneous weights. PyPortfolioOpt provides wrappers around the efficient vectorised implementations provided by `sklearn.covariance`.
- manual shrinkage
- Ledoit Wolf shrinkage, which chooses an optimal shrinkage parameter
@@ -221,10 +205,11 @@ The covariance matrix encodes not just the volatility of an asset, but also how

### Objective functions

- Maximum Sharpe ratio: this is also called the *tangency portfolio* because on a graph of returns vs risk, this portfolio corresponds to the tangent of the efficient frontier that has a y-intercept equal to the risk-free rate. This is the default option because it finds the optimal return per unit risk.
- Maximum Sharpe ratio: this results in a *tangency portfolio* because on a graph of returns vs risk, this portfolio corresponds to the tangent of the efficient frontier that has a y-intercept equal to the risk-free rate. This is the default option because it finds the optimal return per unit risk.
- Minimum volatility. This may be useful if you're trying to get an idea of how low the volatility *could* be, but in practice it makes a lot more sense to me to use the portfolio that maximises the Sharpe ratio.
- Efficient return, a.k.a. the Markowitz portfolio, which minimises risk for a given target return – this was the main focus of Markowitz 1952
- Efficient risk: the Sharpe-maximising portfolio for a given target risk.
- Conditional value-at-risk: a measure of tail loss

### Optional parameters

@@ -254,6 +239,31 @@ ef = EfficientFrontier(mu, S, gamma=1)
ef.max_sharpe()
```

## Advantages over existing implementations

- Includes both classical methods (Markowitz 1952) and suggested best practices
(e.g. covariance shrinkage), along with many recent developments and novel
features, like L2 regularisation and hierarchical risk parity.
- Native support for pandas dataframes: easily input your daily prices data.
- Extensive practical tests, which use real-life data.
- Easy to combine with your own proprietary strategies and models.
- Robust to missing data, and price-series of different lengths (e.g. FB data
only goes back to 2012 whereas AAPL data goes back to 1980).

## Project principles and design decisions

- It should be easy to swap out individual components of the optimisation process
with the user's proprietary improvements.
- Usability is everything: it is better to be self-explanatory than consistent.
- There is no point in portfolio optimisation unless it can be practically
applied to real asset prices.
- Everything that has been implemented should be tested.
- Inline documentation is good: dedicated (separate) documentation is better.
The two are not mutually exclusive.
- Formatting should never get in the way of good code: because of this,
I have deferred **all** formatting decisions to [Black](https://github.com/ambv/black).

## Roadmap

Feel free to raise an issue requesting any new features – here are some of the things I want to implement:
@@ -284,7 +294,9 @@ PyPortfolioOpt provides a test dataset of daily returns for 20 tickers:
- different performances and volatilities
- different amounts of data to test robustness

Currently, the tests have not explored all of the edge cases, however I have investigated the experimental features like L2 regularisation. Additionally, the tests currently have not satisfactorily tested all combinations of objective function and options.
Currently, the tests have not explored all of the edge cases and combinations
of objective functions and parameters. However, each method and parameter has
been tested to work as intended.

## Contributing

48 changes: 46 additions & 2 deletions docs/EfficientFrontier.rst
@@ -11,6 +11,10 @@ between the optimisation objective and the actual optimisation method – if we
wanted to use something other than mean-variance optimisation via quadratic programming,
these objective functions would still be applicable.

It should be noted that while efficient frontier optimisation is technically a very
specific method, I tend to use it as a blanket term (interchangeably with mean-variance
optimisation) to refer to anything similar, such as minimising variance.

Optimisation
============

@@ -29,6 +33,7 @@ magnitude, I will definitely consider switching.

.. autoclass:: EfficientFrontier
:members:
:exclude-members: custom_objective

.. automethod:: __init__

@@ -65,7 +70,7 @@ if you need a certain number of assets in your portfolio.
In order to coerce the efficient frontier optimiser to produce more non-negligible
weights, I have added what can be thought of as a "small weights penalty" to all
of the objective functions, parameterised by :math:`\gamma` (``gamma``). Considering
the minimum volatility objective for instance, we have:
the minimum variance objective for instance, we have:

.. math::
\underset{w}{\text{minimise}} ~ \left\{w^T \Sigma w \right\} ~~~ \longrightarrow ~~~
@@ -78,7 +83,8 @@ negligible weights, because it has a minimum value when all weights are
equally distributed, and maximum value in the limiting case where the entire portfolio
is allocated to one asset. I refer to it as **L2 regularisation** because it has
exactly the same form as the L2 regularisation term in machine learning, though
a slightly different purpose (in ML it is used to keep weights small).
a slightly different purpose (in ML it is used to keep weights small while here it is
used to make them larger).

.. note::

@@ -87,3 +93,41 @@ a slightly different purpose (in ML it is used to keep weights small).
(less than 20 assets), then ``gamma=1`` is a good starting point. For larger
universes, or if you want more non-negligible weights in the final portfolio,
increase ``gamma``.
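
To see why the penalty term discourages negligible weights, the sketch below evaluates :math:`\gamma w^T w` for an equal-weight portfolio and for a fully concentrated one. It is plain Python written for illustration only, not code from PyPortfolioOpt; the name ``l2_penalty`` is hypothetical.

```python
def l2_penalty(weights, gamma=1.0):
    """Return gamma * sum(w_i^2), the term added to the objective."""
    return gamma * sum(w * w for w in weights)


n_assets = 4
equal_weights = [1 / n_assets] * n_assets      # fully diversified
concentrated = [1.0] + [0.0] * (n_assets - 1)  # everything in one asset

# The penalty equals gamma / n for equal weights and gamma when the whole
# portfolio sits in a single asset, so minimising it spreads the weights out.
print(l2_penalty(equal_weights))
print(l2_penalty(concentrated))
```

Any allocation between these two extremes gives a penalty between the two printed values, which is why a larger ``gamma`` pushes the optimiser towards more evenly spread weights.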


Custom objectives
=================

Though it is simple enough to modify ``objective_functions.py`` to implement
a custom objective (indeed, this is the recommended approach for long-term use),
I understand that most users would find it much more convenient to pass a
custom objective into the optimiser without having to edit the source files.

Thus, v0.2.0 introduces a simple API within the ``EfficientFrontier`` object for
optimising your own objective function.

The first step is to define the objective function, which must take an array
of weights as input (with optional additional arguments), and return a single
float corresponding to the cost. As an example, we will pretend that L2
regularisation is not built-in and re-implement it below:


.. code:: python

    import numpy as np

    def my_objective_function(weights, cov_matrix, k):
        # Portfolio variance w^T Σ w plus the L2 penalty k * w^T w
        variance = np.dot(weights.T, np.dot(cov_matrix, weights))
        return variance + k * (weights ** 2).sum()

Next, we instantiate the ``EfficientFrontier`` object and pass the objective
function (and all required arguments) into ``custom_objective()``:

.. code:: python

    ef = EfficientFrontier(mu, S)
    weights = ef.custom_objective(my_objective_function, ef.cov_matrix, 0.3)


.. caution::
It is assumed that the objective function you define will be solvable
by sequential quadratic programming. If this isn't the case, you may
experience silent failure.
27 changes: 17 additions & 10 deletions docs/ExpectedReturns.rst
@@ -4,25 +4,32 @@
Expected Returns
################

Mean-variance optimisation requires knowledge of the mean returns. In practice, these
are rather difficult to know with any certainty. Thus the best we can do is to come up
with estimates; one way is to extrapolate historical price data. This is where the main
flaw in efficient frontier lies – the optimisation procedure is sound, and provides
Mean-variance optimisation requires knowledge of the expected returns. In practice,
these are rather difficult to know with any certainty. Thus the best we can do is to
come up with estimates, for example by extrapolating historical data. This is where the
main flaw in efficient frontier lies – the optimisation procedure is sound, and provides
strong mathematical guarantees, *given the correct inputs*. This is one of the reasons
why I have emphasised modularity: users should be able to come up with their own
superior models and feed them into the optimiser.

.. caution::

In my experience, supplying expected returns often does more harm than good. If
predicting stock returns were as easy as calculating the mean historical return,
we'd all be rich! For most use-cases, I would suggest that you focus your efforts
on choosing an appropriate risk model (see :ref:`risk-models`).

.. automodule:: pypfopt.expected_returns

.. autofunction:: mean_historical_return

This is probably the default textbook approach. It is intuitive and easily interpretable,
however the estimates are unlikely to be accurate. That being said, one of the advantages
of efficient frontier is that the estimation error is reduced by having multiple assets, so
perhaps this inaccuracy is not such an issue. In some informal backtests, I've found
that vanilla efficient frontier portfolios (using mean historical returns and sample covariance)
actually do have a statistically significant outperformance over the S&P500 (in the order of
3-5%). At some stage, I may redo these backtests rigorously and add them to the repo
however the estimates are unlikely to be accurate. This is a problem especially in the
context of a quadratic optimiser, which will maximise the erroneous inputs. In some informal
backtests, I've found that vanilla efficient frontier portfolios (using mean historical
returns and sample covariance) actually do have a statistically significant outperformance
over the S&P500 (in the order of 3-5%), though the same isn't true for cryptoasset portfolios.
At some stage, I may redo these backtests rigorously and add them to the repo
(see the :ref:`roadmap` page for more).
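
As a rough illustration of the textbook estimator described above, the sketch below annualises the arithmetic mean of simple daily returns for a single price series. It is plain Python rather than the library's pandas implementation; the function name and the 252-trading-day default are assumptions made for this example.

```python
def mean_historical_return_sketch(prices, frequency=252):
    """Annualised arithmetic mean of simple daily returns for one asset."""
    # Simple daily return: today's price relative to yesterday's, minus one.
    daily_returns = [
        later / earlier - 1 for earlier, later in zip(prices, prices[1:])
    ]
    # Scale the average daily return by the assumed number of periods per year.
    return sum(daily_returns) / len(daily_returns) * frequency


prices = [100.0, 110.0, 121.0]  # two consecutive daily gains of 10%
annualised = mean_historical_return_sketch(prices)
print(annualised)
```

A handful of unusually good or bad days can move this estimate a long way, which is one way to see why the caution above recommends spending more effort on the risk model.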


90 changes: 90 additions & 0 deletions docs/OtherOptimisers.rst
@@ -0,0 +1,90 @@
.. _other-optimisers:

################
Other Optimisers
################

In addition to optimisers that rely on the covariance matrix in the style of
Markowitz, recent developments in portfolio optimisation have seen a number
of alternative optimisation schemes.

Value-at-Risk
=============

The value-at-risk is a measure of tail risk that estimates how much a portfolio
will lose in a day with a given probability. Alternatively, it is the maximum
loss with a confidence of :math:`\beta`. In fact, a more useful measure is the
**expected shortfall**, or **conditional value-at-risk** (CVaR), which is the
mean of all losses so severe that they only occur with a probability
:math:`1-\beta`.

.. math::

    CVaR_\beta = \frac{1}{1-\beta} \int_0^{1-\beta} VaR_\gamma(X) \, d\gamma

To approximate the CVaR for a portfolio, we will follow these steps:

1. Generate the portfolio returns, i.e. the weighted sum of individual asset returns.
2. Fit a Gaussian KDE to these returns, then resample.
3. Compute the value-at-risk as the :math:`1-\beta` quantile of sampled returns.
4. Calculate the mean of all the sample returns that are below the value-at-risk.
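
The four steps above can be sketched in plain Python as follows. This is an illustrative approximation, not the code behind ``CVAROpt``: the function name, the rule-of-thumb bandwidth, the sample count and the synthetic data are all assumptions. Returns are signed, so losses are negative numbers and both VaR and CVaR come out negative.

```python
import random
import statistics


def cvar_sketch(asset_returns, weights, beta=0.95, n_samples=10_000, seed=0):
    """Approximate the VaR and CVaR of a portfolio following the four steps."""
    rng = random.Random(seed)

    # 1. Portfolio returns: weighted sum of the asset returns on each day.
    portfolio = [
        sum(w * r for w, r in zip(weights, day)) for day in asset_returns
    ]

    # 2. Resample from a Gaussian KDE: pick an observation at random and add
    #    Gaussian noise scaled by the bandwidth (rule-of-thumb choice, assumed).
    bandwidth = 1.06 * statistics.stdev(portfolio) * len(portfolio) ** -0.2
    samples = sorted(
        rng.choice(portfolio) + rng.gauss(0.0, bandwidth)
        for _ in range(n_samples)
    )

    # 3. Value-at-risk: the (1 - beta) quantile of the sampled returns.
    cutoff = max(1, int((1 - beta) * n_samples))
    var = samples[cutoff - 1]

    # 4. CVaR: mean of the sampled returns at or below the value-at-risk.
    tail = samples[:cutoff]
    return var, sum(tail) / len(tail)


# Synthetic daily returns for two hypothetical assets (500 days).
data_rng = random.Random(1)
asset_returns = [
    [data_rng.gauss(0.0005, 0.01), data_rng.gauss(0.0003, 0.02)]
    for _ in range(500)
]
var, cvar = cvar_sketch(asset_returns, [0.6, 0.4])
print(var, cvar)
```

Because the estimate depends on random resampling, it changes with the seed; an optimiser that minimises it is therefore working with a noisy objective.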

Though CVaR optimisation can be transformed into a linear programming problem [1]_, I
have opted to keep things simple using the `NoisyOpt <https://noisyopt.readthedocs.io/en/latest/>`_
library, which is suited for optimising noisy functions.

.. warning::
Caveat emptor: this functionality is still experimental. Although I have
used the CVaR optimisation, I've noticed that it is very inconsistent
(which to some extent is expected because of its stochastic nature).
However, the optimiser doesn't always find a minimum, and it fails
silently. Additionally, the weight bounds are not treated as hard bounds.


.. automodule:: pypfopt.value_at_risk

.. autoclass:: CVAROpt
:members:

.. automethod:: __init__

.. caution::
Currently, we have not implemented any performance function. If you
would like to calculate the actual CVaR of the resulting portfolio,
please import the function from ``objective_functions``.


Hierarchical Risk Parity (HRP)
==============================

Hierarchical Risk Parity is a novel portfolio optimisation method developed by
Marcos Lopez de Prado [2]_. Though a detailed explanation can be found in the
linked paper, here is a rough overview of how HRP works:


1. From a universe of assets, form a distance matrix based on the correlation
of the assets.
2. Using this distance matrix, cluster the assets into a tree via hierarchical
clustering.
3. Within each branch of the tree, form the minimum variance portfolio (normally
between just two assets).
4. Iterate over each level, optimally combining the mini-portfolios at each node.


The advantages of this are that it does not require inversion of the covariance
matrix as with traditional quadratic optimisers, and seems to produce diverse
portfolios that perform well out of sample.
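
Two of the building blocks in this overview have simple closed forms, sketched below in plain Python. The correlation distance is the one used in the paper [2]_; the two-asset minimum-variance weights are the standard textbook result. Neither function is taken from ``hrp_portfolio``, and both names are hypothetical.

```python
import math


def correlation_distance(rho):
    """Map a correlation in [-1, 1] to a distance in [0, 1]."""
    return math.sqrt(0.5 * (1.0 - rho))


def two_asset_min_variance(var_a, var_b, cov_ab=0.0):
    """Closed-form minimum-variance weights for a two-asset portfolio."""
    # Undefined when var_a + var_b == 2 * cov_ab (perfectly correlated
    # assets with equal variance); that case is not handled in this sketch.
    denom = var_a + var_b - 2.0 * cov_ab
    w_a = (var_b - cov_ab) / denom
    return w_a, 1.0 - w_a


# Perfectly correlated assets are at distance 0, uncorrelated ones further apart.
print(correlation_distance(1.0), correlation_distance(0.0))

# The less volatile asset receives the larger weight.
w_a, w_b = two_asset_min_variance(var_a=0.04, var_b=0.01)
print(w_a, w_b)
```

With zero covariance the weights reduce to inverse-variance weighting, so neither step needs the full covariance matrix to be inverted.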


.. automodule:: pypfopt.hierarchical_risk_parity

.. autofunction:: hrp_portfolio

.. note::
Because the HRP functionality doesn't inherit from ``BaseOptimizer``, you will
have to implement pre-processing and post-processing methods on your own.

References
==========

.. [1] Rockafellar and Uryasev (2000) `Optimization of conditional value-at-risk <http://www.ise.ufl.edu/uryasev/files/2011/11/CVaR1_JOR.pdf>`_.
.. [2] López de Prado, M. (2016). `Building Diversified Portfolios that Outperform Out of Sample <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2708678>`_. The Journal of Portfolio Management, 42(4), 59–69.
2 changes: 2 additions & 0 deletions docs/Postprocessing.rst
@@ -1,3 +1,5 @@
.. _post-processing:

#######################
Post-processing weights
#######################