diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 9dda4aff..5e5072c8 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -2,7 +2,7 @@ Some of the things that I'd love for people to help with: -- Improve performance of existing code (but not at the cost of readability) – I realise that most optimisation projects in python use `cvxopt` rather than `scipy.optimize`, but the latter is far cleaner and much more readable. If it transpires that performance differs by orders of magnitude, I will definitely consider switching. +- Improve performance of existing code (but not at the cost of readability) – are there any nice numpy hacks I've missed? - Add new optimisation objectives. For example, if you think that the best performance metric has not been included, write an optimiser! (or suggest it in [Issues](https://github.com/robertmartin8/PyPortfolioOpt/issues) and I will have a go). - Help me write more tests! If you are someone learning about quant finance and/or unit testing in python, what better way to practice than to write some tests on an open-source project! Feel free to check for edge cases, or test performance on a dataset with more stocks. diff --git a/README.md b/README.md index ac45590b..94b9012f 100755 --- a/README.md +++ b/README.md @@ -8,7 +8,7 @@ python   - pypi   PyPortfolioOpt is a library that implements portfolio optimisation methods, including -classical efficient frontier techniques and Black-Litterman allocation, as well as more +classical mean-variance optimisation techniques and Black-Litterman allocation, as well as more recent developments in the field like shrinkage and Hierarchical Risk Parity, along with some novel experimental features like exponentially-weighted covariance matrices. -It is **extensive** yet easily **extensible**, and can be useful for both the casual investor and the serious -practitioner. Whether you are a fundamentals-oriented investor who has identified a +It is **extensive** yet easily **extensible**, and can be useful for both the casual investor and the serious practitioner. Whether you are a fundamentals-oriented investor who has identified a handful of undervalued picks, or an algorithmic trader who has a basket of -interesting signals, PyPortfolioOpt can help you combine your alpha-generators +interesting signals, PyPortfolioOpt can help you combine your alpha streams in a risk-efficient way. Head over to the [documentation on ReadTheDocs](https://pyportfolioopt.readthedocs.io/en/latest/) to get an in-depth look at the project, or continue below to check out some examples.
- +
## Table of contents @@ -54,7 +53,7 @@ Head over to the [documentation on ReadTheDocs](https://pyportfolioopt.readthedo - [Expected returns](#expected-returns) - [Risk models (covariance)](#risk-models-covariance) - [Objective functions](#objective-functions) - - [Efficient Frontier hyperparameters](#efficient-frontier-hyperparameters) + - [Adding constraints or different objectives](#adding-constraints-or-different-objectives) - [Black-Litterman allocation](#black-litterman-allocation) - [Other optimisers](#other-optimisers) - [Advantages over existing implementations](#advantages-over-existing-implementations) @@ -72,7 +71,8 @@ This project is available on PyPI, meaning that you can just: pip install PyPortfolioOpt ``` -However, I have since been converted to `poetry`, so my current recommendation is to get yourself set up with [poetry](https://github.com/sdispater/poetry) then just run +However, it is best practice to use a dependency manager within a virtual environment. +My current recommendation is to get yourself set up with [poetry](https://github.com/sdispater/poetry) then just run ```bash poetry add PyPortfolioOpt @@ -104,7 +104,7 @@ Here is an example on real life stock data, demonstrating how easy it is to find ```python import pandas as pd -from pypfopt.efficient_frontier import EfficientFrontier +from pypfopt import EfficientFrontier from pypfopt import risk_models from pypfopt import expected_returns @@ -153,7 +153,7 @@ Annual volatility: 21.7% Sharpe Ratio: 1.43 ``` -Instead of just stopping here, PyPortfolioOpt provides a method which allows you to convert the above continuous weights to an actual allocation that you could buy. Just enter the most recent prices, and the desired portfolio size ($10000 in this example): +This is interesting but not useful in itself. However, PyPortfolioOpt provides a method which allows you to convert the above continuous weights to an actual allocation that you could buy. Just enter the most recent prices, and the desired portfolio size ($10,000 in this example): ```python from pypfopt.discrete_allocation import DiscreteAllocation, get_latest_prices @@ -180,14 +180,22 @@ Funds remaining: $8.42 Harry Markowitz's 1952 paper is the undeniable classic, which turned portfolio optimisation from an art into a science. The key insight is that by combining assets with different expected returns and volatilities, one can decide on a mathematically optimal allocation which minimises the risk for a target return – the set of all such optimal portfolios is referred to as the **efficient frontier**. -Although much development has been made in the subject, more than half a century later, Markowitz's core ideas are still fundamentally important, and see daily use in many portfolio management firms. The main drawback of mean-variance optimisation is that the theoretical treatment requires knowledge of the expected returns and the future risk-characteristics (covariance) of the assets. Obviously, if we knew the expected returns of a stock life would be much easier, but the whole game is that stock returns are notoriously hard to forecast. As a substitute, we can derive estimates of the expected return and covariance based on historical data – though we do lose the theoretical guarantees provided by Markowitz, the closer our estimates are to the real values, the better our portfolio will be. +
+ +
+ +Although much development has been made in the subject, more than half a century later, Markowitz's core ideas are still fundamentally important, and see daily use in many portfolio management firms. +The main drawback of mean-variance optimisation is that the theoretical treatment requires knowledge of the expected returns and the future risk-characteristics (covariance) of the assets. Obviously, if we knew the expected returns of a stock life would be much easier, but the whole game is that stock returns are notoriously hard to forecast. As a substitute, we can derive estimates of the expected return and covariance based on historical data – though we do lose the theoretical guarantees provided by Markowitz, the closer our estimates are to the real values, the better our portfolio will be. Thus this project provides four major sets of functionality (though of course they are intimately related) - Estimate of expected returns -- Estimate of the covariance of assets +- Estimate of risk (i.e covariance of asset returns) - Objective functions to be optimised -- Parameters for the efficient frontier +- Optimisers. + +A key design goal of PyPortfolioOpt is **modularity** – the user should be able to swap in their +components while still making use of the framework that PyPortfolioOpt provides. ## Features @@ -224,14 +232,21 @@ The covariance matrix encodes not just the volatility of an asset, but also how - a robust estimate of the covariance - implemented in `sklearn.covariance` +

+ +

+ +(This plot was generated using `risk_models.correlation_plot`) + ### Objective functions - Maximum Sharpe ratio: this results in a *tangency portfolio* because on a graph of returns vs risk, this portfolio corresponds to the tangent of the efficient frontier that has a y-intercept equal to the risk-free rate. This is the default option because it finds the optimal return per unit risk. - Minimum volatility. This may be useful if you're trying to get an idea of how low the volatility *could* be, but in practice it makes a lot more sense to me to use the portfolio that maximises the Sharpe ratio. - Efficient return, a.k.a. the Markowitz portfolio, which minimises risk for a given target return – this was the main focus of Markowitz 1952 - Efficient risk: the Sharpe-maximising portfolio for a given target risk. +- Maximum qudratic utility. You can provide your own risk-aversion level annd compute the appropriate portfolio. -### Efficient Frontier hyperparameters +### Adding constraints or different objectives - Long/short: by default all of the mean-variance optimisation methods in PyPortfolioOpt are long-only, but they can be initialised to allow for short positions by changing the weight bounds: @@ -252,10 +267,14 @@ ef.efficient_return(target_return=0.2, market_neutral=True) ef = EfficientFrontier(mu, S, weight_bounds=(0, 0.1)) ``` -- L2 Regularisation: this is a novel experimental feature which can be used to reduce the number of negligible weights for any of the objective functions. Essentially, it adds a penalty (parameterised by `gamma`) on small weights, with a term that looks just like L2 regularisation in machine learning. It may be necessary to trial a number of `gamma` values to achieve the desired number of non-negligible weights. For the test portfolio of 20 securities, `gamma ~ 1` is sufficient +One issue with mean-variance optimisation is that it leads to many zero-weights. While these are +"optimal" in-sample, there is a large body of research showing that this characteristic leads +mean-variance portfolios to underperform out-of-sample. To that end, I have introduced an +objective function that can reduce the number of negligible weights for any of the objective functions. Essentially, it adds a penalty (parameterised by `gamma`) on small weights, with a term that looks just like L2 regularisation in machine learning. It may be necessary to trial a number of `gamma` values to achieve the desired number of non-negligible weights. For the test portfolio of 20 securities, `gamma ~ 1` is sufficient ```python -ef = EfficientFrontier(mu, S, gamma=1) +ef = EfficientFrontier(mu, S) +ef.add_objective(objective_functions.L2_reg, gamma=1) ef.max_sharpe() ``` @@ -272,11 +291,14 @@ S = risk_models.sample_cov(df) viewdict = {"AAPL": 0.20, "BBY": -0.30, "BAC": 0, "SBUX": -0.2, "T": 0.131321} bl = BlackLittermanModel(S, absolute_views=viewdict) rets = bl.bl_returns() + +ef = EfficientFrontier(rets, S) +ef.max_sharpe() ``` ### Other optimisers -The features above mostly pertain to efficient frontier optimisation via quadratic programming. However, we offer different optimisers as well: +The features above mostly pertain to solving efficient frontier optimisation problems via quadratic programming (though this is taken care of by `cvxpy`). However, we offer different optimisers as well: - Hierarchical Risk Parity, using clustering algorithms to choose uncorrelated assets - Markowitz's critical line algorithm (CLA) @@ -285,7 +307,7 @@ Please refer to the [documentation](https://pyportfolioopt.readthedocs.io/en/lat ## Advantages over existing implementations -- Includes both classical methods (Markowitz 1952), suggested best practices +- Includes both classical methods (Markowitz 1952 and Black-Litterman), suggested best practices (e.g covariance shrinkage), along with many recent developments and novel features, like L2 regularisation, shrunk covariance, hierarchical risk parity. - Native support for pandas dataframes: easily input your daily prices data. @@ -304,19 +326,21 @@ Please refer to the [documentation](https://pyportfolioopt.readthedocs.io/en/lat - Everything that has been implemented should be tested. - Inline documentation is good: dedicated (separate) documentation is better. The two are not mutually exclusive. -- Formatting should never get in the way of good code: because of this, +- Formatting should never get in the way of coding: because of this, I have deferred **all** formatting decisions to [Black](https://github.com/ambv/black). ## Roadmap Feel free to raise an issue requesting any new features – here are some of the things I want to implement: -- Custom utility functions, including risk aversion -- Plotting the efficient frontier. -- More optimisation goals, including the Calmar Ratio, Sortino Ratio, etc. +- Optimising for higher moments (i.e skew and kurtosis) +- Factor modelling: doable but not sure if it fits within the API. +- Proper CVaR optimisation – remove NoisyOpt and use linear programming +- More objective functions, including the Calmar Ratio, Sortino Ratio, etc. - Monte Carlo optimisation with custom distributions -- Black-Litterman portfolio selection -- Improved CVaR optimisation using linear programming. +- Open-source backtests using either `Backtrader `_ or + `Zipline `_. +- Further support for different risk/return models ## Testing diff --git a/docs/EfficientFrontier.rst b/docs/EfficientFrontier.rst index ca23aa29..17114f92 100644 --- a/docs/EfficientFrontier.rst +++ b/docs/EfficientFrontier.rst @@ -4,34 +4,63 @@ Efficient Frontier Optimisation ############################### -The implementation of efficient frontier optimisation in PyPortfolioOpt is separated -into the :py:mod:`objective_functions` and :py:mod:`efficient_frontier` modules. It -was designed this way because in my mind there is a clear conceptual separation -between the optimisation objective and the actual optimisation method – if we -wanted to use something other than mean-variance optimisation via quadratic programming, -these objective functions would still be applicable. - -It should be noted that while efficient frontier optimisation is technically a very +Mathematical optimisation is a very difficult problem in general, particularly when we are dealing +with complex objectives and constraints. However, **convex optimisation** problems are a well-understood +class of problems, which happen to be incredibly useful for finance. A convex problem has the following form: + +.. math:: + + \begin{equation*} + \begin{aligned} + & \underset{\mathbf{x}}{\text{minimise}} & & f(\mathbf{x}) \\ + & \text{subject to} & & g_i(\mathbf{x}) \leq 0, i = 1, \ldots, m\\ + &&& A\mathbf{x} = b,\\ + \end{aligned} + \end{equation*} + +where :math:`\mathbf{x} \in \mathbb{R}^n`, and :math:`f(\mathbf{x}), g_i(\mathbf{x})` are convex functions. [1]_ + +Fortunately, portfolio optimisation problems (with standard and objective constraints) are convex. This +allows us to immediately apply the vast body of theory as well as the refined solving routines -- accordingly, +the main difficulty is inputting our specific problem into a solver. + +PyPortfolioOpt aims to do the hard work for you, allowing for one-liners like ``ef.min_volatility()`` +to generate a portfolio that minimises the volatility, while at the same time allowing for more +complex problems to be built up from modular units. This is all possible thanks to +`cvxpy `_, the *fantastic* python-embedded modelling +language for convex optimisation upon which PyPortfolioOpt's efficient frontier functionality lies. + +As a brief aside, I should note that while "efficient frontier" optimisation is technically a very specific method, I tend to use it as a blanket term (interchangeably with mean-variance optimisation) to refer to anything similar, such as minimising variance. -Optimisation -============ +Structure +========= -PyPortfolioOpt uses `scipy.optimize `_. -I realise that most python optimisation projects use `cvxopt `_ -instead, but I do think that scipy.optimize is far cleaner and much more readable -(as per the Zen of Python, "Readability counts"). That being said, scipy.optimize -arguably has worse documentation, though ultimately I felt that it was intuitive -enough to justify the lack of explained examples. Because they are both based on -`LAPACK `_, I don't see why performance should -differ significantly, but if it transpires that cvxopt is faster by an order of -magnitude, I will definitely consider switching. +As shown in the definition of a convex problem, there are essentially two things we need to specify: +the optimisation objective, and the optimisation constraints. For example, the classic portfolio +optimisation problem is to **minimise risk** subject to a **return constraint** (i.e the portfolio +must return more than a certain amount). From an implementation perspective, however, there is +not much difference between an objective and a constraint. Consider a similar problem, which is to +**maximize return** subject to a **risk constraint** -- now, the role of risk and return have swapped. -.. tip:: +To that end, PyPortfolioOpt defines an :py:mod:`objective_functions` module that contains objective functions +(which can also act as constraints, as we have just seen). The actual optimisation occurs in the :py:class:`efficient_frontier.EfficientFrontier` class. +This class provides straightforward methods for optimising different objectives (all documented below). + +However, PyPortfolioOpt was designed so that you can easily add new constraints or objective terms to an existing problem. +For example, adding a regularisation objective (explained below) to a minimum volatility objective is as simple as:: - If you would like to plot the efficient frontier, take a look at the :ref:`cla`. + ef = EfficientFrontier(expected_returns, cov_matrix) # setup + ef.add_objective(objective_functions.L2_reg) # add a secondary objective + ef.min_volatility() # find the portfolio that minimises volatility and L2_reg + +.. tip:: + + If you would like to plot the efficeint frontier, take a look at the :ref:`cla`. +Basic Usage +=========== .. automodule:: pypfopt.efficient_frontier @@ -46,18 +75,25 @@ magnitude, I will definitely consider switching. As of v0.5.0, you can pass a collection (list or tuple) of (min, max) pairs representing different bounds for different assets. + .. tip:: + + If you want to generate short-only portfolios, there is a quick hack. Multiply + your expected returns by -1, then optimise a long-only portfolio. + + .. automethod:: max_sharpe - .. note:: + .. caution:: + + Because ``max_sharpe()`` makes a variable substitution, additional objectives may + not work as intended. - If you want to generate short-only portfolios, there is a quick hack. Multiply - your expected returns by -1, then maximise a long-only portfolio. - .. automethod:: max_unconstrained_utility + .. automethod:: max_quadratic_utility .. note:: - pypfopt.BlackLitterman provides a method for calculating the market-implied + ``pypfopt.black_litterman`` provides a method for calculating the market-implied risk-aversion parameter, which gives a useful estimate in the absence of other information! @@ -67,19 +103,34 @@ magnitude, I will definitely consider switching. :py:meth:`efficient_return`, the optimiser will fail silently and return weird weights. *Caveat emptor* applies! +Adding objectives and constraints +================================= + +EfficientFrontier inherits from the BaseConvexOptimizer class. In particular, the functions to +add constraints and objectives are documented below: + + + .. class:: pypfopt.base_optimizer.BaseConvexOptimizer + + .. automethod:: add_constraint + + .. automethod:: add_objective + + Objective functions =================== .. automodule:: pypfopt.objective_functions :members: + One of the experimental features implemented in PyPortfolioOpt is the L2 regularisation parameter ``gamma``, which is discussed below. .. _L2-Regularisation: -L2 Regularisation -================= +More on L2 Regularisation +========================= As has been discussed in the :ref:`user-guide`, efficient frontier optimisation often results in many weights being negligible, i.e the efficient portfolio does not end up @@ -87,7 +138,7 @@ including most of the assets. This is expected behaviour, but it may be undesira if you need a certain number of assets in your portfolio. In order to coerce the efficient frontier optimiser to produce more non-negligible -weights, I have added what can be thought of as a "small weights penalty" to all +weights, we add what can be thought of as a "small weights penalty" to all of the objective functions, parameterised by :math:`\gamma` (``gamma``). Considering the minimum variance objective for instance, we have: @@ -113,36 +164,31 @@ used to make them larger). universes, or if you want more non-negligible weights in the final portfolio, increase ``gamma``. -.. _custom-objectives: +.. _custom-optimisation: -Custom objectives -================= +Custom optimisation problems +============================ -Though it is simple enough to modify ``objective_functions.py`` to implement -a custom objective (indeed, this is the recommended approach for long-term use), -I understand that most users would find it much more convenient to pass a -custom objective into the optimiser without having to edit the source files. +Previously we described an API for adding constraints and objectives to one of the core +optimisation problems in the ``EfficientFrontier`` class. However, what if you aren't interested +in anything related to ``max_sharpe()``, ``min_volatility()``, ``efficient_risk()`` etc and want to +set up a completely new problem to optimise for some custom objective? -Thus, v0.2.0 introduces a simple API within the ``EfficientFrontier`` object for -optimising your own objective function. +The ``EfficientFrontier`` class inherits from the ``BaseConvexOptimizer``, which allows you to +define your own optimisation problem. You can either optimise some generic ``convex_objective`` +(which *must* be built using ``cvxpy`` atomic functions -- see `here `_) +or a ``nonconvex_objective``, which uses ``scipy.optimize`` as the backend and thus has a completely +different API. -The first step is to define the objective function, which must take an array -of weights as input (with optional additional arguments), and return a single -float corresponding to the cost. As an example, we will pretend that L2 -regularisation is not built-in and re-implement it below:: + .. class:: pypfopt.base_optimizer.BaseConvexOptimizer - def my_objective_function(weights, cov_matrix, k): - variance = np.dot(weights.T, np.dot(cov_matrix, weights)) - return variance + k * (weights ** 2).sum() + .. automethod:: convex_objective + + .. automethod:: nonconvex_objective -Next, we instantiate the ``EfficientFrontier`` object, and pass the objectives -function (and all required arguments) into ``custom_objective()``:: - ef = EfficientFrontier(mu, S) - weights = ef.custom_objective(my_objective_function, ef.cov_matrix, 0.3) +References +========== +.. [1] Boyd, S.; Vandenberghe, L. (2004). `Convex Optimization `_. -.. caution:: - It is assumed that the objective function you define will be solvable - by sequential quadratic programming. If this isn't the case, you may - experience silent failure. diff --git a/docs/ExpectedReturns.rst b/docs/ExpectedReturns.rst index 97e5819b..09708d0b 100755 --- a/docs/ExpectedReturns.rst +++ b/docs/ExpectedReturns.rst @@ -27,13 +27,8 @@ superior models and feed them into the optimiser. .. autofunction:: mean_historical_return This is probably the default textbook approach. It is intuitive and easily interpretable, - however the estimates are unlikely to be accurate. This is a problem especially in the - context of a quadratic optimiser, which will maximise the erroneous inputs, In some informal - backtests, I've found that vanilla efficient frontier portfolios (using mean historical - returns and sample covariance) actually do have a statistically significant outperformance - over the S&P500 (in the order of 3-5%), though the same isn't true for cryptoasset portfolios. - At some stage, I may redo these backtests rigorously and add them to the repo - (see the :ref:`roadmap` page for more). + however the estimates are subject to large uncertainty. This is a problem especially in the + context of a quadratic optimiser, which will maximise the erroneous inputs. .. autofunction:: ema_historical_return diff --git a/docs/OtherOptimisers.rst b/docs/OtherOptimisers.rst index 94df9e64..50503b0a 100644 --- a/docs/OtherOptimisers.rst +++ b/docs/OtherOptimisers.rst @@ -11,7 +11,7 @@ though please note that the implementations may be slightly unstable. .. note:: As of v0.4, these other optimisers now inherit from ``BaseOptimizer`` or - ``BaseScipyOptimizer``, so you no longer have to implement pre-processing and + ``BaseConvexOptimizer``, so you no longer have to implement pre-processing and post-processing methods on your own. You can thus easily swap out, say, ``EfficientFrontier`` for ``HRPOpt``. @@ -36,8 +36,12 @@ The advantages of this are that it does not require inversion of the covariance matrix as with traditional quadratic optimisers, and seems to produce diverse portfolios that perform well out of sample. +.. image:: ../media/dendrogram.png + :width: 80% + :align: center -.. automodule:: pypfopt.hierarchical_risk_parity + +.. automodule:: pypfopt.hierarchical_portfolio .. autoclass:: HRPOpt :members: @@ -54,6 +58,10 @@ that is especially advantageous when we apply linear inequalities. Unlike generi the CLA is specially designed for portfolio optimisation. It is guaranteed to converge after a certain number of iterations, and can efficiently derive the entire efficient frontier. +.. image:: ../media/cla_plot.png + :width: 80% + :align: center + .. tip:: In general, unless you have specific requirements e.g you would like to efficiently compute the entire @@ -74,13 +82,13 @@ the same API, though as of v0.5.0 we only support ``max_sharpe()`` and ``min_vol Implementing your own optimiser =============================== -Please note that this is quite different to implementing :ref:`custom-objectives`, because in -that case we are still using the same quadratic optimiser. However, HRP and CVaR optimisation +Please note that this is quite different to implementing :ref:`custom-optimisation`, because in +that case we are still using the same quadratic optimiser. However, HRP and CLA optimisation have a fundamentally different optimisation method. In general, these are much more difficult to code up compared to custom objective functions. To implement a custom optimiser that is compatible with the rest of PyPortfolioOpt, just -extend ``BaseOptimizer`` (or ``BaseScipyOptimizer`` if you want to use scipy.optimize), +extend ``BaseOptimizer`` (or ``BaseConvexOptimizer`` if you want to use ``cvxpy``), both of which can be found in ``base_optimizer.py``. This gives you access to utility methods like ``clean_weights()``, as well as making sure that any output is compatible with ``portfolio_performance()`` and post-processing methods. @@ -92,7 +100,7 @@ with ``portfolio_performance()`` and post-processing methods. .. automethod:: __init__ - .. autoclass:: BaseScipyOptimizer + .. autoclass:: BaseConvexOptimizer :members: :private-members: @@ -151,4 +159,3 @@ References .. [1] López de Prado, M. (2016). `Building Diversified Portfolios that Outperform Out of Sample `_. The Journal of Portfolio Management, 42(4), 59–69. .. [2] Bailey and Loópez de Prado (2013). `An Open-Source Implementation of the Critical-Line Algorithm for Portfolio Optimization `_ -.. [3] Rockafellar and Uryasev (2011) `Optimization of conditional value-at-risk `_. diff --git a/docs/RiskModels.rst b/docs/RiskModels.rst index d14e4a05..ebfbc6df 100644 --- a/docs/RiskModels.rst +++ b/docs/RiskModels.rst @@ -5,18 +5,23 @@ Risk Models ########### In addition to the expected returns, mean-variance optimisation requires a -**risk model**, some way of quantifying asset risk. The most commonly-use risk model +**risk model**, some way of quantifying asset risk. The most commonly-used risk model is the covariance matrix, a statistical entity that describes the volatility of asset returns and how they vary with one another. This is important because one of the principles of diversification is that risk can be reduced by making many uncorrelated bets (and correlation is just normalised covariance). +.. image:: ../media/corrplot.png + :align: center + :width: 60% + + In many ways, the subject of risk models is far more important than that of expected returns because historical variance is generally a much more predictive statistic than mean historical returns. In fact, research by Kritzman et -al. (2010) [1]_ suggests that minimum variance portfolios, which neglect to -provide expected returns, actually perform much better out of sample. +al. (2010) [1]_ suggests that minimum variance portfolios, formed by optimising +wthout providing expected returns, actually perform much better out of sample. The problem, however, is that in practice we do not have access to the covariance matrix (in the same way that we don't have access to expected returns) – the only @@ -25,15 +30,15 @@ approach is to just calculate the **sample covariance matrix** based on historic returns, but relatively recent (post-2000) research indicates that there are much more robust statistical estimators of the covariance matrix. In addition to providing a wrapper around the estimators in ``sklearn``, PyPortfolioOpt -provides some novel alternatives such as semicovariance and exponentially weighted +provides some experimental alternatives such as semicovariance and exponentially weighted covariance. .. attention:: Estimation of the covariance matrix is a very deep and actively-researched topic that involves statistics, econometrics, and numerical/computational - approaches. Please note that I am not an expert, but I have made an effort - to familiarise myself with the seminal papers in the field. + approaches. I have made an effort to familiarise myself with the seminal papers in the field + and implement a few basic options, but there are many more advanced models that could be used. .. automodule:: pypfopt.risk_models @@ -79,7 +84,7 @@ covariance. recent data when calculating covariance, in the same way that the exponential moving average price is often preferred to the simple average price. For a full explanation of how this estimator works, please refer to the - `blog post `_ + `blog post `_ on my academic website. .. autofunction:: min_cov_determinant @@ -89,13 +94,21 @@ covariance. :py:mod:`sklearn.covariance` module, which is based on the algorithm presented in Rousseeuw 1999 [4]_. + .. caution:: + + Some of my tests have shown that ``min_cov_determinant`` does not always + result in positive definite matrices. Please use ``risk_models._is_positive_semidefinite()`` + to check your covariance matrix before optimising portfolios. + .. autofunction:: cov_to_corr - .. note:: - This is especially useful when it comes to visualise the 'correlation matrices' that - are associated with (shrunk) covariance matrices, using Matplotlib's ``imshow`` or - Seaborn's ``heatmap``. + .. autofunction:: correlation_plot + + An example of a correlation plot is shown at the top of the page. + + + Shrinkage estimators ==================== diff --git a/docs/Roadmap.rst b/docs/Roadmap.rst index b7728f3a..372829de 100644 --- a/docs/Roadmap.rst +++ b/docs/Roadmap.rst @@ -14,12 +14,32 @@ have any other feature requests, please raise them using GitHub - Optimising for higher moments (i.e skew and kurtosis) - Factor modelling: doable but not sure if it fits within the API. -- Proper CVaR optimisation – remove NoisyOpt and use proper linear programming +- Proper CVaR optimisation – remove NoisyOpt and use linear programming - Monte Carlo optimisation with custom distributions - Open-source backtests using either `Backtrader `_ or `Zipline `_. - Further support for different risk/return models +1.0.0 +===== + +- Migrated backend from ``scipy`` to ``cvxpy`` and made significant breaking changes to the API + + - PyPortfolioOpt is now significantly more robust and numerically stable. + - These changes will not affect basic users, who can still access features like ``max_sharpe()``. + - However, additional objectives and constraints (including L2 regularisation) are now + explicitly added before optimising some 'primary' objective. + +- Added basic plotting capabilities for the efficient frontier, hierarchical clusters, + and HRP dendrograms. +- Added a basic transaction cost objective. +- Made breaking changes to some modules and classes so that PyPortfolioOpt is easier to extend + in future: + + - Replaced ``BaseScipyOptimizer`` with ``BaseConvexOptimizer`` + - ``hierarchical_risk_parity`` was replaced by ``hierarchical_portfolios`` to leave the door open for other hierarchical methods. + - Sadly, removed CVaR optimisation for the time being until I can properly fix it. + 0.5.0 ===== @@ -115,7 +135,7 @@ fixing a bug in the arguments of a call to ``portfolio_performance``. 0.3.3 ----- -Migrated the project internally to use the ``poetry`` dependency manager. Will still keep ``setup.py`` and ``requirements.txt``, but ``poetry`` is now the recommended way to interact with ``PyPortfolioOpt`` +Migrated the project internally to use the ``poetry`` dependency manager. Will still keep ``setup.py`` and ``requirements.txt``, but ``poetry`` is now the recommended way to interact with PyPortfolioOpt. 0.3.4 ----- diff --git a/docs/UserGuide.rst b/docs/UserGuide.rst index 6021002d..faaa0c49 100644 --- a/docs/UserGuide.rst +++ b/docs/UserGuide.rst @@ -20,7 +20,7 @@ offered. But for now, we will continue with the Efficient Frontier. PyPortfolioOpt is designed with modularity in mind; the below flowchart sums up the current functionality and overall layout of PyPortfolioOpt. -.. image:: ../media/conceptual_flowchart_v1-grey.png +.. image:: ../media/conceptual_flowchart_v2-grey.png Processing historical prices ============================ @@ -211,38 +211,39 @@ From experience, I have found that efficient frontier optimisation often sets ma of the asset weights to be zero. This may not be ideal if you need to have a certain number of positions in your portfolio, for diversification purposes or otherwise. -To combat this, I have introduced an experimental feature, which borrows the idea of +To combat this, I have introduced an objective function which borrows the idea of regularisation from machine learning. Essentially, by adding an additional cost function to the objective, you can 'encourage' the optimiser to choose different weights (mathematical details are provided in the :ref:`L2-Regularisation` section). To use this feature, change the ``gamma`` parameter:: - ef = EfficientFrontier(mu, S, gamma=1) - ef.max_sharpe() + ef = EfficientFrontier(mu, S) + ef.add_objective(objective_functions.L2_reg, gamma=0.1) + w = ef.max_sharpe() print(ef.clean_weights()) The result of this has far fewer negligible weights than before:: - {'GOOG': 0.05664, - 'AAPL': 0.087, - 'FB': 0.1591, - 'BABA': 0.09784, - 'AMZN': 0.06986, + {'GOOG': 0.06366, + 'AAPL': 0.09947, + 'FB': 0.15742, + 'BABA': 0.08701, + 'AMZN': 0.09454, 'GE': 0.0, 'AMD': 0.0, - 'WMT': 0.03649, + 'WMT': 0.01766, 'BAC': 0.0, 'GM': 0.0, - 'T': 0.02204, + 'T': 0.00398, 'UAA': 0.0, 'SHLD': 0.0, - 'XOM': 0.04812, - 'RRC': 0.0045, - 'BBY': 0.06389, - 'MA': 0.16382, - 'PFE': 0.1358, + 'XOM': 0.03072, + 'RRC': 0.00737, + 'BBY': 0.07572, + 'MA': 0.1769, + 'PFE': 0.12346, 'JPM': 0.0, - 'SBUX': 0.05489} + 'SBUX': 0.06209} Post-processing weights ----------------------- @@ -260,38 +261,40 @@ further in :ref:`post-processing`, but we provide an example below:: These are the quantitites of shares that should be bought to have a $20,000 portfolio:: - {'GOOG': 1, - 'AAPL': 10, - 'FB': 19, - 'BABA': 11, - 'AMZN': 1, - 'WMT': 9, - 'T': 13, - 'XOM': 13, - 'BBY': 19, - 'MA': 19, - 'PFE': 76, - 'SBUX': 19} + {'AAPL': 2.0, + 'FB': 12.0, + 'BABA': 14.0, + 'GE': 18.0, + 'WMT': 40.0, + 'GM': 58.0, + 'T': 97.0, + 'SHLD': 1.0, + 'XOM': 47.0, + 'RRC': 3.0, + 'BBY': 1.0, + 'PFE': 47.0, + 'SBUX': 5.0} Improving performance ---------------------- +===================== -Let us say you have conducted backtests and the results aren't spectacular. What +Let's say you have conducted backtests and the results aren't spectacular. What should you try? -- Drop the expected returns. There is a large body of research that suggests that - minimum variance portfolios consistently outperform maximum Sharpe ratio portfolios - out-of-sample, because of the dififuclty of forecasting expected returns. +- Try the Hierarchical Risk Parity model (see :ref:`other-optimisers`) – which seems + to robustly outperform mean-variance optimisation out of sample. +- Use the Black-Litterman model to construct a more stable model of expected returns. + Alternatively, just drop the expected returns altogether!. There is a large body of research + that suggests that minimum variance portfolios (``ef.min_volatility()``) consistently outperform + maximum Sharpe ratio portfolios out-of-sample, because of the dififuclty of forecasting expected returns. - Try different risk models: different asset classes may require different risk models. -- Tune the L2 regularisation parameter to see how diversification affects the - performance. -- Try a different optimiser: see the :ref:`other-optimisers` section for some - possibilities. +- Add some new objective terms or constraints. Tune the L2 regularisation parameter to see how diversification + affects the performance. This concludes the guided tour. Head over to the appropriate sections in the sidebar to learn more about the parameters and theoretical details of the -different functionality offered by PyPortfolioOpt. If you have any questions, please +different models offered by PyPortfolioOpt. If you have any questions, please raise an issue on GitHub and I will try to respond promptly. diff --git a/docs/conf.py b/docs/conf.py index 36bb50d5..8114a22f 100755 --- a/docs/conf.py +++ b/docs/conf.py @@ -56,9 +56,9 @@ # built documents. # # The short X.Y version. -version = "0.5" +version = "1.0" # The full version, including alpha/beta/rc tags. -release = "0.5.5" +release = "1.0.0" # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. diff --git a/docs/index.rst b/docs/index.rst index 691fc0e2..ec63832c 100755 --- a/docs/index.rst +++ b/docs/index.rst @@ -10,7 +10,7 @@ python
  - python   =3.2)", "rst.linker (>=1.9)"] testing = ["jaraco.itertools", "func-timeout"] [extras] -noisyopt = [] +matplotlib = [] scikit-learn = [] [metadata] -content-hash = "43a19db2a64532808163facbd1ae1dad90128eb2005a6a70b4d46b32fa08dad5" +content-hash = "b610210272bfbe1a6f01415f73802e560f0661db58b0f98efcaf8a2694b2b6dd" python-versions = "^3.6.0" [metadata.files] @@ -673,27 +673,23 @@ multiprocess = [ {file = "multiprocess-0.70.9.tar.gz", hash = "sha256:9fd5bd990132da77e73dec6e9613408602a4612e1d73caf2e2b813d2b61508e5"}, ] numpy = [ - {file = "numpy-1.18.1-cp35-cp35m-macosx_10_6_intel.whl", hash = "sha256:20b26aaa5b3da029942cdcce719b363dbe58696ad182aff0e5dcb1687ec946dc"}, - {file = "numpy-1.18.1-cp35-cp35m-manylinux1_i686.whl", hash = "sha256:70a840a26f4e61defa7bdf811d7498a284ced303dfbc35acb7be12a39b2aa121"}, - {file = "numpy-1.18.1-cp35-cp35m-manylinux1_x86_64.whl", hash = "sha256:17aa7a81fe7599a10f2b7d95856dc5cf84a4eefa45bc96123cbbc3ebc568994e"}, - {file = "numpy-1.18.1-cp35-cp35m-win32.whl", hash = "sha256:f3d0a94ad151870978fb93538e95411c83899c9dc63e6fb65542f769568ecfa5"}, - {file = "numpy-1.18.1-cp35-cp35m-win_amd64.whl", hash = "sha256:1786a08236f2c92ae0e70423c45e1e62788ed33028f94ca99c4df03f5be6b3c6"}, - {file = "numpy-1.18.1-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:ae0975f42ab1f28364dcda3dde3cf6c1ddab3e1d4b2909da0cb0191fa9ca0480"}, - {file = "numpy-1.18.1-cp36-cp36m-manylinux1_i686.whl", hash = "sha256:cf7eb6b1025d3e169989416b1adcd676624c2dbed9e3bcb7137f51bfc8cc2572"}, - {file = "numpy-1.18.1-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:b765ed3930b92812aa698a455847141869ef755a87e099fddd4ccf9d81fffb57"}, - {file = "numpy-1.18.1-cp36-cp36m-win32.whl", hash = "sha256:2d75908ab3ced4223ccba595b48e538afa5ecc37405923d1fea6906d7c3a50bc"}, - {file = "numpy-1.18.1-cp36-cp36m-win_amd64.whl", hash = "sha256:9acdf933c1fd263c513a2df3dceecea6f3ff4419d80bf238510976bf9bcb26cd"}, - {file = "numpy-1.18.1-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:56bc8ded6fcd9adea90f65377438f9fea8c05fcf7c5ba766bef258d0da1554aa"}, - {file = "numpy-1.18.1-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:e422c3152921cece8b6a2fb6b0b4d73b6579bd20ae075e7d15143e711f3ca2ca"}, - {file = "numpy-1.18.1-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:b3af02ecc999c8003e538e60c89a2b37646b39b688d4e44d7373e11c2debabec"}, - {file = "numpy-1.18.1-cp37-cp37m-win32.whl", hash = "sha256:d92350c22b150c1cae7ebb0ee8b5670cc84848f6359cf6b5d8f86617098a9b73"}, - {file = "numpy-1.18.1-cp37-cp37m-win_amd64.whl", hash = "sha256:77c3bfe65d8560487052ad55c6998a04b654c2fbc36d546aef2b2e511e760971"}, - {file = "numpy-1.18.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:c98c5ffd7d41611407a1103ae11c8b634ad6a43606eca3e2a5a269e5d6e8eb07"}, - {file = "numpy-1.18.1-cp38-cp38-manylinux1_i686.whl", hash = "sha256:9537eecf179f566fd1c160a2e912ca0b8e02d773af0a7a1120ad4f7507cd0d26"}, - {file = "numpy-1.18.1-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:e840f552a509e3380b0f0ec977e8124d0dc34dc0e68289ca28f4d7c1d0d79474"}, - {file = "numpy-1.18.1-cp38-cp38-win32.whl", hash = "sha256:590355aeade1a2eaba17617c19edccb7db8d78760175256e3cf94590a1a964f3"}, - {file = "numpy-1.18.1-cp38-cp38-win_amd64.whl", hash = "sha256:39d2c685af15d3ce682c99ce5925cc66efc824652e10990d2462dfe9b8918c6a"}, - {file = "numpy-1.18.1.zip", hash = "sha256:b6ff59cee96b454516e47e7721098e6ceebef435e3e21ac2d6c3b8b02628eb77"}, + {file = "numpy-1.18.2-cp35-cp35m-macosx_10_9_x86_64.whl", hash = "sha256:a1baa1dc8ecd88fb2d2a651671a84b9938461e8a8eed13e2f0a812a94084d1fa"}, + {file = "numpy-1.18.2-cp35-cp35m-manylinux1_i686.whl", hash = "sha256:a244f7af80dacf21054386539699ce29bcc64796ed9850c99a34b41305630286"}, + {file = "numpy-1.18.2-cp35-cp35m-manylinux1_x86_64.whl", hash = "sha256:6fcc5a3990e269f86d388f165a089259893851437b904f422d301cdce4ff25c8"}, + {file = "numpy-1.18.2-cp35-cp35m-win32.whl", hash = "sha256:b5ad0adb51b2dee7d0ee75a69e9871e2ddfb061c73ea8bc439376298141f77f5"}, + {file = "numpy-1.18.2-cp35-cp35m-win_amd64.whl", hash = "sha256:87902e5c03355335fc5992a74ba0247a70d937f326d852fc613b7f53516c0963"}, + {file = "numpy-1.18.2-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:9ab21d1cb156a620d3999dd92f7d1c86824c622873841d6b080ca5495fa10fef"}, + {file = "numpy-1.18.2-cp36-cp36m-manylinux1_i686.whl", hash = "sha256:cdb3a70285e8220875e4d2bc394e49b4988bdb1298ffa4e0bd81b2f613be397c"}, + {file = "numpy-1.18.2-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:6d205249a0293e62bbb3898c4c2e1ff8a22f98375a34775a259a0523111a8f6c"}, + {file = "numpy-1.18.2-cp36-cp36m-win32.whl", hash = "sha256:a35af656a7ba1d3decdd4fae5322b87277de8ac98b7d9da657d9e212ece76a61"}, + {file = "numpy-1.18.2-cp36-cp36m-win_amd64.whl", hash = "sha256:1598a6de323508cfeed6b7cd6c4efb43324f4692e20d1f76e1feec7f59013448"}, + {file = "numpy-1.18.2-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:deb529c40c3f1e38d53d5ae6cd077c21f1d49e13afc7936f7f868455e16b64a0"}, + {file = "numpy-1.18.2-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:cd77d58fb2acf57c1d1ee2835567cd70e6f1835e32090538f17f8a3a99e5e34b"}, + {file = "numpy-1.18.2-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:b1fe1a6f3a6f355f6c29789b5927f8bd4f134a4bd9a781099a7c4f66af8850f5"}, + {file = "numpy-1.18.2-cp37-cp37m-win32.whl", hash = "sha256:2e40be731ad618cb4974d5ba60d373cdf4f1b8dcbf1dcf4d9dff5e212baf69c5"}, + {file = "numpy-1.18.2-cp37-cp37m-win_amd64.whl", hash = "sha256:4ba59db1fcc27ea31368af524dcf874d9277f21fd2e1f7f1e2e0c75ee61419ed"}, + {file = "numpy-1.18.2-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:59ca9c6592da581a03d42cc4e270732552243dc45e87248aa8d636d53812f6a5"}, + {file = "numpy-1.18.2-cp38-cp38-manylinux1_i686.whl", hash = "sha256:1b0ece94018ae21163d1f651b527156e1f03943b986188dd81bc7e066eae9d1c"}, ] osqp = [ {file = "osqp-0.6.1-cp27-cp27m-macosx_10_6_intel.whl", hash = "sha256:72bca52b84c7ea7762c42e2fd801b301eb2990b7f033887c997fd1b91623d110"}, diff --git a/pypfopt/__init__.py b/pypfopt/__init__.py index e69de29b..725a8e0e 100755 --- a/pypfopt/__init__.py +++ b/pypfopt/__init__.py @@ -0,0 +1,22 @@ +from .black_litterman import ( + market_implied_prior_returns, + market_implied_risk_aversion, + BlackLittermanModel, +) +from .cla import CLA +from .discrete_allocation import get_latest_prices, DiscreteAllocation +from .efficient_frontier import EfficientFrontier +from .hierarchical_portfolio import HRPOpt +from .risk_models import CovarianceShrinkage + +__all__ = [ + "market_implied_prior_returns", + "market_implied_risk_aversion", + "BlackLittermanModel", + "CLA", + "get_latest_prices", + "DiscreteAllocation", + "EfficientFrontier", + "HRPOpt", + "CovarianceShrinkage", +] diff --git a/pypfopt/base_optimizer.py b/pypfopt/base_optimizer.py index aa4b7359..114aa10e 100644 --- a/pypfopt/base_optimizer.py +++ b/pypfopt/base_optimizer.py @@ -1,7 +1,7 @@ """ -The ``base_optimizer`` module houses the parent classes ``BaseOptimizer`` and -``BaseScipyOptimizer``, from which all optimisers will inherit. The later is for -optimisers that use the scipy solver. +The ``base_optimizer`` module houses the parent classes ``BaseOptimizer`` from which all +optimisers will inherit. ``BaseConvexOptimizer`` is the base class for all ``cvxpy`` (and ``scipy``) +optimisation. Additionally, we define a general utility function ``portfolio_performance`` to evaluate return and risk for a given set of portfolio weights. @@ -10,7 +10,10 @@ import json import numpy as np import pandas as pd +import cvxpy as cp +import scipy.optimize as sco from . import objective_functions +from . import exceptions class BaseOptimizer: @@ -96,18 +99,30 @@ def save_weights_to_file(self, filename="weights.csv"): f.write(str(clean_weights)) -class BaseScipyOptimizer(BaseOptimizer): +class BaseConvexOptimizer(BaseOptimizer): """ + The BaseConvexOptimizer contains many private variables for use by + ``cvxpy``. For example, the immutable optimisation variable for weights + is stored as self._w. Interacting directly with these variables is highly + discouraged. + Instance variables: - ``n_assets`` - int - ``tickers`` - str list - ``weights`` - np.ndarray - - ``bounds`` - float tuple OR (float tuple) list - - ``initial_guess`` - np.ndarray - - ``constraints`` - dict list - - ``opt_method`` - the optimisation algorithm to use. Defaults to SLSQP. + + Public methods: + + - ``add_objective()`` adds a (convex) objective to the optimisation problem + - ``add_constraint()`` adds a (linear) constraint to the optimisation problem + - ``convex_objective()`` solves for a generic convex objective with linear constraints + - ``nonconvex_objective()`` solves for a generic nonconvex objective using the scipy backend. + This is prone to getting stuck in local minima and is generally *not* recommended. + - ``set_weights()`` creates self.weights (np.ndarray) from a weights dict + - ``clean_weights()`` rounds the weights and clips near-zeros. + - ``save_weights_to_file()`` saves the weights to csv, json, or txt. """ def __init__(self, n_assets, tickers=None, weight_bounds=(0, 1)): @@ -118,50 +133,219 @@ def __init__(self, n_assets, tickers=None, weight_bounds=(0, 1)): :type weight_bounds: tuple OR tuple list, optional """ super().__init__(n_assets, tickers) - self.bounds = self._make_valid_bounds(weight_bounds) - # Optimisation parameters - self.initial_guess = np.array([1 / self.n_assets] * self.n_assets) - self.constraints = [{"type": "eq", "fun": lambda x: np.sum(x) - 1}] - self.opt_method = "SLSQP" - def _make_valid_bounds(self, test_bounds): + # Optimisation variables + self._w = cp.Variable(n_assets) + self._objective = None + self._additional_objectives = [] + self._additional_constraints_raw = [] + self._constraints = [] + self._lower_bounds = None + self._upper_bounds = None + self._map_bounds_to_constraints(weight_bounds) + + def _map_bounds_to_constraints(self, test_bounds): """ - Private method: process input bounds into a form acceptable by scipy.optimize, - and check the validity of said bounds. + Process input bounds into a form acceptable by cvxpy and add to the constraints list. :param test_bounds: minimum and maximum weight of each asset OR single min/max pair - if all identical, defaults to (0, 1). - :type test_bounds: tuple OR list/tuple of tuples. - :raises ValueError: if ``test_bounds`` is not a tuple of length two OR a collection - of pairs. - :raises ValueError: if the lower bound is too high - :return: a tuple of bounds, e.g ((0, 1), (0, 1), (0, 1) ...) - :rtype: tuple of tuples + if all identical OR pair of arrays corresponding to lower/upper bounds. defaults to (0, 1). + :type test_bounds: tuple OR list/tuple of tuples OR pair of np arrays + :raises TypeError: if ``test_bounds`` is not of the right type + :return: bounds suitable for cvxpy + :rtype: tuple pair of np.ndarray """ # If it is a collection with the right length, assume they are all bounds. if len(test_bounds) == self.n_assets and not isinstance( test_bounds[0], (float, int) ): - bounds = test_bounds + bounds = np.array(test_bounds, dtype=np.float) + self._lower_bounds = np.nan_to_num(bounds[:, 0], nan=-np.inf) + self._upper_bounds = np.nan_to_num(bounds[:, 1], nan=np.inf) else: - if len(test_bounds) != 2 or not isinstance(test_bounds, tuple): - raise ValueError( - "test_bounds must be a tuple of (lower bound, upper bound) " - "OR collection of bounds for each asset" + # Otherwise this must be a pair. + if len(test_bounds) != 2 or not isinstance(test_bounds, (tuple, list)): + raise TypeError( + "test_bounds must be a pair (lower bound, upper bound) " + "OR a collection of bounds for each asset" ) - bounds = (test_bounds,) * self.n_assets + lower, upper = test_bounds + + # Replace None values with the appropriate +/- 1 + if np.isscalar(lower) or lower is None: + lower = -1 if lower is None else lower + self._lower_bounds = np.array([lower] * self.n_assets) + upper = 1 if upper is None else upper + self._upper_bounds = np.array([upper] * self.n_assets) + else: + self._lower_bounds = np.nan_to_num(lower, nan=-1) + self._upper_bounds = np.nan_to_num(upper, nan=1) + + self._constraints.append(self._w >= self._lower_bounds) + self._constraints.append(self._w <= self._upper_bounds) + + def _solve_cvxpy_opt_problem(self): + """ + Helper method to solve the cvxpy problem and check output, + once objectives and constraints have been defined + + :raises exceptions.OptimizationError: if problem is not solvable by cvxpy + """ + try: + opt = cp.Problem(cp.Minimize(self._objective), self._constraints) + opt.solve() + except (TypeError, cp.DCPError): + raise exceptions.OptimizationError + if opt.status != "optimal": + raise exceptions.OptimizationError + self.weights = self._w.value.round(16) + 0.0 # +0.0 removes signed zero + + def add_objective(self, new_objective, **kwargs): + """ + Add a new term into the objective function. This term must be convex, + and built from cvxpy atomic functions. + + Example:: + + def L1_norm(w, k=1): + return k * cp.norm(w, 1) + + ef.add_objective(L1_norm, k=2) + + :param new_objective: the objective to be added + :type new_objective: cp.Expression (i.e function of cp.Variable) + """ + self._additional_objectives.append(new_objective(self._w, **kwargs)) + + def add_constraint(self, new_constraint): + """ + Add a new constraint to the optimisation problem. This constraint must be linear and + must be either an equality or simple inequality. + + Examples:: + + ef.add_constraint(lambda x : x[0] == 0.02) + ef.add_constraint(lambda x : x >= 0.01) + ef.add_constraint(lambda x: x <= np.array([0.01, 0.08, ..., 0.5])) - # Ensure lower bound is not too high - if sum((0 if b[0] is None else b[0]) for b in bounds) > 1: - raise ValueError( - "Lower bound is too high. Impossible to construct valid portfolio" + :param new_constraint: the constraint to be added + :type constraintfunc: lambda function + """ + if not callable(new_constraint): + raise TypeError("New constraint must be provided as a lambda function") + + # Save raw constraint (needed for e.g max_sharpe) + self._additional_constraints_raw.append(new_constraint) + # Add constraint + self._constraints.append(new_constraint(self._w)) + + def convex_objective(self, custom_objective, weights_sum_to_one=True, **kwargs): + """ + Optimise a custom convex objective function. Constraints should be added with + ``ef.add_constraint()``. Optimiser arguments must be passed as keyword-args. Example:: + + # Could define as a lambda function instead + def logarithmic_barrier(w, cov_matrix, k=0.1): + # 60 Years of Portfolio Optimisation, Kolm et al (2014) + return cp.quad_form(w, cov_matrix) - k * cp.sum(cp.log(w)) + + w = ef.convex_objective(logarithmic_barrier, cov_matrix=ef.cov_matrix) + + :param custom_objective: an objective function to be MINIMISED. This should be written using + cvxpy atoms Should map (w, **kwargs) -> float. + :type custom_objective: function with signature (cp.Variable, **kwargs) -> cp.Expression + :param weights_sum_to_one: whether to add the default objective, defaults to True + :type weights_sum_to_one: bool, optional + :raises OptimizationError: if the objective is nonconvex or constraints nonlinear. + :return: asset weights for the efficient risk portfolio + :rtype: dict + """ + # custom_objective must have the right signature (w, **kwargs) + self._objective = custom_objective(self._w, **kwargs) + + for obj in self._additional_objectives: + self._objective += obj + + if weights_sum_to_one: + self._constraints.append(cp.sum(self._w) == 1) + + self._solve_cvxpy_opt_problem() + return dict(zip(self.tickers, self.weights)) + + def nonconvex_objective( + self, + custom_objective, + objective_args=None, + weights_sum_to_one=True, + constraints=None, + solver="SLSQP", + ): + """ + Optimise some objective function using the scipy backend. This can + support nonconvex objectives and nonlinear constraints, but often gets stuck + at local minima. This method is not recommended – caveat emptor. Example:: + + # Market-neutral efficient risk + constraints = [ + {"type": "eq", "fun": lambda w: np.sum(w)}, # weights sum to zero + { + "type": "eq", + "fun": lambda w: target_risk ** 2 - np.dot(w.T, np.dot(ef.cov_matrix, w)), + }, # risk = target_risk + ] + ef.nonconvex_objective( + lambda w, mu: -w.T.dot(mu), # min negative return (i.e maximise return) + objective_args=(ef.expected_returns,), + weights_sum_to_one=False, + constraints=constraints, ) - return bounds + :param objective_function: an objective function to be MINIMISED. This function + should map (weight, args) -> cost + :type objective_function: function with signature (np.ndarray, args) -> float + :param objective_args: arguments for the objective function (excluding weight) + :type objective_args: tuple of np.ndarrays + :param weights_sum_to_one: whether to add the default objective, defaults to True + :type weights_sum_to_one: bool, optional + :param constraints: list of constraints in the scipy format (i.e dicts) + :type constraints: dict list + :param solver: which SCIPY solver to use, e.g "SLSQP", "COBYLA", "BFGS". + User beware: different optimisers require different inputs. + :type solver: string + :return: asset weights that optimise the custom objective + :rtype: dict + """ + # Sanitise inputs + if not isinstance(objective_args, tuple): + objective_args = (objective_args,) + + # Make scipy bounds + bound_array = np.vstack((self._lower_bounds, self._upper_bounds)).T + bounds = list(map(tuple, bound_array)) + + initial_guess = np.array([1 / self.n_assets] * self.n_assets) + + # Construct constraints + final_constraints = [] + if weights_sum_to_one: + final_constraints.append({"type": "eq", "fun": lambda x: np.sum(x) - 1}) + if constraints is not None: + final_constraints += constraints + + result = sco.minimize( + custom_objective, + x0=initial_guess, + args=objective_args, + method=solver, + bounds=bounds, + constraints=final_constraints, + ) + self.weights = result["x"] + return dict(zip(self.tickers, self.weights)) def portfolio_performance( - expected_returns, cov_matrix, weights, verbose=False, risk_free_rate=0.02 + weights, expected_returns, cov_matrix, verbose=False, risk_free_rate=0.02 ): """ After optimising, calculate (and optionally print) the performance of the optimal @@ -169,9 +353,9 @@ def portfolio_performance( :param expected_returns: expected returns for each asset. Set to None if optimising for volatility only. - :type expected_returns: pd.Series, list, np.ndarray + :type expected_returns: np.ndarray or pd.Series :param cov_matrix: covariance of returns for each asset - :type cov_matrix: pd.DataFrame or np.array + :type cov_matrix: np.array or pd.DataFrame :param weights: weights or assets :type weights: list, np.array or dict, optional :param verbose: whether performance should be printed, defaults to False @@ -199,12 +383,25 @@ def portfolio_performance( new_weights = np.asarray(weights) else: raise ValueError("Weights is None") - sigma = np.sqrt(objective_functions.volatility(new_weights, cov_matrix)) - mu = new_weights.dot(expected_returns) - sharpe = -objective_functions.negative_sharpe( - new_weights, expected_returns, cov_matrix, risk_free_rate=risk_free_rate + sigma = np.sqrt(objective_functions.portfolio_variance(new_weights, cov_matrix)) + mu = objective_functions.portfolio_return( + new_weights, expected_returns, negative=False + ) + # new_weights.dot(expected_returns) + + # sharpe = -objective_functions.negative_sharpe( + # new_weights, expected_returns, cov_matrix, risk_free_rate=risk_free_rate + # ) + + sharpe = objective_functions.sharpe_ratio( + new_weights, + expected_returns, + cov_matrix, + risk_free_rate=risk_free_rate, + negative=False, ) + if verbose: print("Expected annual return: {:.1f}%".format(100 * mu)) print("Annual volatility: {:.1f}%".format(100 * sigma)) diff --git a/pypfopt/black_litterman.py b/pypfopt/black_litterman.py index cce163fb..be649b72 100644 --- a/pypfopt/black_litterman.py +++ b/pypfopt/black_litterman.py @@ -87,7 +87,7 @@ class BlackLittermanModel(base_optimizer.BaseOptimizer): - Inputs: - - ``cov_matrix`` - pd.DataFrame + - ``cov_matrix`` - np.ndarray - ``n_assets`` - int - ``tickers`` - str list - ``Q`` - np.ndarray @@ -341,9 +341,9 @@ def portfolio_performance(self, verbose=False, risk_free_rate=0.02): if self.posterior_cov is None: self.posterior_cov = self.bl_cov() return base_optimizer.portfolio_performance( + self.weights, self.posterior_rets, self.posterior_cov, - self.weights, verbose, risk_free_rate, ) diff --git a/pypfopt/cla.py b/pypfopt/cla.py index 4451789c..e2317476 100644 --- a/pypfopt/cla.py +++ b/pypfopt/cla.py @@ -11,18 +11,6 @@ from . import base_optimizer -def _infnone(x): - """ - Helper method to map None to float infinity. - - :param x: argument - :type x: float - :return: infinity if the argmument was None otherwise x - :rtype: float - """ - return float("-inf") if x is None else x - - class CLA(base_optimizer.BaseOptimizer): """ @@ -33,8 +21,8 @@ class CLA(base_optimizer.BaseOptimizer): - ``n_assets`` - int - ``tickers`` - str list - ``mean`` - np.ndarray - - ``cov_matrix`` - pd.DataFrame - - ``expected_returns`` - pd.Series + - ``cov_matrix`` - np.ndarray + - ``expected_returns`` - np.ndarray - ``lb`` - np.ndarray - ``ub`` - np.ndarray @@ -45,13 +33,17 @@ class CLA(base_optimizer.BaseOptimizer): - ``g`` - float list - ``f`` - float list list - - Outputs: ``weights`` - np.ndarray + - Outputs: + + - ``weights`` - np.ndarray + - ``frontier_values`` - (float list, float list, np.ndarray list) Public methods: - ``max_sharpe()`` optimises for maximal Sharpe ratio (a.k.a the tangency portfolio) - ``min_volatility()`` optimises for minimum volatility - ``efficient_frontier()`` computes the entire efficient frontier + - ``plot_efficient_frontier()`` to plot the efficient frontier. - ``portfolio_performance()`` calculates the expected return, volatility and Sharpe ratio for the optimised portfolio. - ``set_weights()`` creates self.weights (np.ndarray) from a weights dict @@ -98,12 +90,26 @@ def __init__(self, expected_returns, cov_matrix, weight_bounds=(0, 1)): self.g = [] # gammas self.f = [] # free weights + self.frontier_values = None # result of computing efficient frontier + if isinstance(expected_returns, pd.Series): tickers = list(expected_returns.index) else: tickers = list(range(len(self.mean))) super().__init__(len(tickers), tickers) + @staticmethod + def _infnone(x): + """ + Helper method to map None to float infinity. + + :param x: argument + :type x: float + :return: infinity if the argmument was None otherwise x + :rtype: float + """ + return float("-inf") if x is None else x + def _init_algo(self): # Initialize the algo # 1) Form structured array @@ -193,13 +199,13 @@ def _reduce_matrix(matrix, listX, listY): # Reduce a matrix to the provided list of rows and columns if len(listX) == 0 or len(listY) == 0: return - matrix_ = matrix[:, listY[0]: listY[0] + 1] + matrix_ = matrix[:, listY[0] : listY[0] + 1] for i in listY[1:]: - a = matrix[:, i: i + 1] + a = matrix[:, i : i + 1] matrix_ = np.append(matrix_, a, 1) - matrix__ = matrix_[listX[0]: listX[0] + 1, :] + matrix__ = matrix_[listX[0] : listX[0] + 1, :] for i in listX[1:]: - a = matrix_[i: i + 1, :] + a = matrix_[i : i + 1, :] matrix__ = np.append(matrix__, a, 0) return matrix__ @@ -313,7 +319,7 @@ def _solve(self): l, bi = self._compute_lambda( covarF_inv, covarFB, meanF, wB, j, [self.lB[i], self.uB[i]] ) - if _infnone(l) > _infnone(l_in): + if CLA._infnone(l) > CLA._infnone(l_in): l_in, i_in, bi_in = l, i, bi j += 1 # 2) case b): Free one bounded weight @@ -331,7 +337,9 @@ def _solve(self): meanF.shape[0] - 1, self.w[-1][i], ) - if (self.ls[-1] is None or l < self.ls[-1]) and l > _infnone(l_out): + if (self.ls[-1] is None or l < self.ls[-1]) and l > CLA._infnone( + l_out + ): l_out, i_out = l, i if (l_in is None or l_in < 0) and (l_out is None or l_out < 0): # 3) compute minimum variance solution @@ -341,7 +349,7 @@ def _solve(self): meanF = np.zeros(meanF.shape) else: # 4) decide lambda - if _infnone(l_in) > _infnone(l_out): + if CLA._infnone(l_in) > CLA._infnone(l_out): self.ls.append(l_in) f.remove(i_in) w[i_in] = bi_in # set value at the correct boundary @@ -364,7 +372,12 @@ def _solve(self): self._purge_excess() def max_sharpe(self): - """Get the max Sharpe ratio portfolio""" + """ + Maximise the sharpe ratio. + + :return: asset weights for the volatility-minimising portfolio + :rtype: dict + """ if not self.w: self._solve() # 1) Compute the local max SR portfolio between any two neighbor turning points @@ -376,12 +389,17 @@ def max_sharpe(self): a, b = self._golden_section(self._eval_sr, 0, 1, **kargs) w_sr.append(a * w0 + (1 - a) * w1) sr.append(b) - # return max(sr), w_sr[sr.index(max(sr))] + self.weights = w_sr[sr.index(max(sr))].reshape((self.n_assets,)) return dict(zip(self.tickers, self.weights)) def min_volatility(self): - """Get the minimum variance solution""" + """ + Minimise volatility. + + :return: asset weights for the volatility-minimising portfolio + :rtype: dict + """ if not self.w: self._solve() var = [] @@ -419,8 +437,66 @@ def efficient_frontier(self, points=100): weights.append(np.copy(w)) mu.append(np.dot(w.T, self.mean)[0, 0]) sigma.append(np.dot(np.dot(w.T, self.cov_matrix), w)[0, 0] ** 0.5) + + self.frontier_values = (mu, sigma, weights) return mu, sigma, weights + def plot_efficient_frontier( + self, points=100, show_assets=True, filename=None, showfig=True + ): + """ + Plot the efficient frontier + + :param points: number of points to plot, defaults to 100 + :type points: int, optional + :param show_assets: whether we should plot the asset risks/returns also, defaults to True + :type show_assets: bool, optional + :param filename: name of the file to save to, defaults to None (doesn't save) + :type filename: str, optional + :param showfig: whether to plt.show() the figure, defaults to True + :type showfig: bool, optional + :raises ImportError: if matplotlib is not installed + :return: matplotlib axis + :rtype: matplotlib.axes object + """ + try: + import matplotlib.pyplot as plt + except (ModuleNotFoundError, ImportError): + raise ImportError("Please install matplotlib via pip or poetry") + + optimal_ret, optimal_risk, _ = self.portfolio_performance() + + if self.frontier_values is None: + self.efficient_frontier(points=points) + + mus, sigmas, _ = self.frontier_values + + fig, ax = plt.subplots() + ax.plot(sigmas, mus, label="Efficient frontier") + + if show_assets: + ax.scatter( + np.sqrt(np.diag(self.cov_matrix)), + self.expected_returns, + s=30, + color="k", + label="assets", + ) + + ax.scatter( + optimal_risk, optimal_ret, marker="x", s=100, color="r", label="optimal" + ) + ax.legend() + ax.set_xlabel("Volatility") + ax.set_ylabel("Return") + + if filename: + plt.savefig(fname=filename, dpi=300) + + if showfig: + plt.show() + return ax + def portfolio_performance(self, verbose=False, risk_free_rate=0.02): """ After optimising, calculate (and optionally print) the performance of the optimal @@ -435,9 +511,9 @@ def portfolio_performance(self, verbose=False, risk_free_rate=0.02): :rtype: (float, float, float) """ return base_optimizer.portfolio_performance( + self.weights, self.expected_returns, self.cov_matrix, - self.weights, verbose, risk_free_rate, ) diff --git a/pypfopt/discrete_allocation.py b/pypfopt/discrete_allocation.py index 3e5ac53a..98030c98 100644 --- a/pypfopt/discrete_allocation.py +++ b/pypfopt/discrete_allocation.py @@ -148,7 +148,8 @@ def greedy_portfolio(self, verbose=False): # Construct long-only discrete allocations for each short_val = self.total_portfolio_value * self.short_ratio - print("\nAllocating long sub-portfolio:") + if verbose: + print("\nAllocating long sub-portfolio...") da1 = DiscreteAllocation( longs, self.latest_prices[longs.keys()], @@ -156,7 +157,8 @@ def greedy_portfolio(self, verbose=False): ) long_alloc, long_leftover = da1.greedy_portfolio() - print("\nAllocating short sub-portfolio:") + if verbose: + print("\nAllocating short sub-portfolio...") da2 = DiscreteAllocation( shorts, self.latest_prices[shorts.keys()], @@ -263,7 +265,8 @@ def lp_portfolio(self, verbose=False): # Construct long-only discrete allocations for each short_val = self.total_portfolio_value * self.short_ratio - print("\nAllocating long sub-portfolio:") + if verbose: + print("\nAllocating long sub-portfolio:") da1 = DiscreteAllocation( longs, self.latest_prices[longs.keys()], @@ -271,7 +274,8 @@ def lp_portfolio(self, verbose=False): ) long_alloc, long_leftover = da1.lp_portfolio() - print("\nAllocating short sub-portfolio:") + if verbose: + print("\nAllocating short sub-portfolio:") da2 = DiscreteAllocation( shorts, self.latest_prices[shorts.keys()], diff --git a/pypfopt/efficient_frontier.py b/pypfopt/efficient_frontier.py index fb67e7bf..2aec5fcc 100644 --- a/pypfopt/efficient_frontier.py +++ b/pypfopt/efficient_frontier.py @@ -6,16 +6,18 @@ import warnings import numpy as np import pandas as pd -import scipy.optimize as sco +import cvxpy as cp + from . import objective_functions, base_optimizer -class EfficientFrontier(base_optimizer.BaseScipyOptimizer): +class EfficientFrontier(base_optimizer.BaseConvexOptimizer): """ - An EfficientFrontier object (inheriting from BaseScipyOptimizer) contains multiple + An EfficientFrontier object (inheriting from BaseConvexOptimizer) contains multiple optimisation methods that can be called (corresponding to different objective - functions) with various parameters. + functions) with various parameters. Note: a new EfficientFrontier object should + be instantiated if you want to make any change to objectives/constraints/bounds/parameters. Instance variables: @@ -24,14 +26,9 @@ class EfficientFrontier(base_optimizer.BaseScipyOptimizer): - ``n_assets`` - int - ``tickers`` - str list - ``bounds`` - float tuple OR (float tuple) list - - ``cov_matrix`` - pd.DataFrame - - ``expected_returns`` - pd.Series - - - Optimisation parameters: + - ``cov_matrix`` - np.ndarray + - ``expected_returns`` - np.ndarray - - ``initial_guess`` - np.ndarray - - ``constraints`` - dict list - - ``opt_method`` - the optimisation algorithm to use. Defaults to SLSQP. - Output: ``weights`` - np.ndarray @@ -39,9 +36,16 @@ class EfficientFrontier(base_optimizer.BaseScipyOptimizer): - ``max_sharpe()`` optimises for maximal Sharpe ratio (a.k.a the tangency portfolio) - ``min_volatility()`` optimises for minimum volatility - - ``custom_objective()`` optimises for some custom objective function + - ``max_quadratic_utility()`` maximises the quadratic utility, giiven some risk aversion. - ``efficient_risk()`` maximises Sharpe for a given target risk - ``efficient_return()`` minimises risk for a given target return + + - ``add_objective()`` adds a (convex) objective to the optimisation problem + - ``add_constraint()`` adds a (linear) constraint to the optimisation problem + - ``convex_objective()`` solves for a generic convex objective with linear constraints + - ``nonconvex_objective()`` solves for a generic nonconvex objective using the scipy backend. + This is prone to getting stuck in local minima and is generally *not* recommended. + - ``portfolio_performance()`` calculates the expected return, volatility and Sharpe ratio for the optimised portfolio. - ``set_weights()`` creates self.weights (np.ndarray) from a weights dict @@ -67,33 +71,89 @@ def __init__(self, expected_returns, cov_matrix, weight_bounds=(0, 1), gamma=0): :raises TypeError: if ``cov_matrix`` is not a dataframe or array """ # Inputs - self.cov_matrix = cov_matrix - if expected_returns is not None: - if not isinstance(expected_returns, (pd.Series, list, np.ndarray)): - raise TypeError("expected_returns is not a series, list or array") - if not isinstance(cov_matrix, (pd.DataFrame, np.ndarray)): - raise TypeError("cov_matrix is not a dataframe or array") - self.expected_returns = expected_returns + self.cov_matrix = EfficientFrontier._validate_cov_matrix(cov_matrix) + self.expected_returns = EfficientFrontier._validate_expected_returns( + expected_returns + ) + + # Labels if isinstance(expected_returns, pd.Series): tickers = list(expected_returns.index) elif isinstance(cov_matrix, pd.DataFrame): tickers = list(cov_matrix.columns) - else: + else: # use integer labels tickers = list(range(len(expected_returns))) + if cov_matrix.shape != (len(expected_returns), len(expected_returns)): + raise ValueError("Covariance matrix does not match expected returns") + super().__init__(len(tickers), tickers, weight_bounds) - if not isinstance(gamma, (int, float)): - raise ValueError("gamma should be numeric") - if gamma < 0: - warnings.warn("in most cases, gamma should be positive", UserWarning) - self.gamma = gamma + @staticmethod + def _validate_expected_returns(expected_returns): + if expected_returns is None: + raise ValueError("expected_returns must be provided") + elif isinstance(expected_returns, pd.Series): + return expected_returns.values + elif isinstance(expected_returns, list): + return np.array(expected_returns) + elif isinstance(expected_returns, np.ndarray): + return expected_returns.ravel() + else: + raise TypeError("expected_returns is not a series, list or array") + + @staticmethod + def _validate_cov_matrix(cov_matrix): + if cov_matrix is None: + raise ValueError("cov_matrix must be provided") + elif isinstance(cov_matrix, pd.DataFrame): + return cov_matrix.values + elif isinstance(cov_matrix, np.ndarray): + return cov_matrix + else: + raise TypeError("cov_matrix is not a series, list or array") + + def _market_neutral_bounds_check(self): + """ + Helper method to make sure bounds are suitable for a market neutral + optimisation. + """ + portfolio_possible = np.any(self._lower_bounds < 0) + if not portfolio_possible: + warnings.warn( + "Market neutrality requires shorting - bounds have been amended", + RuntimeWarning, + ) + self._map_bounds_to_constraints((-1, 1)) + # Delete original constraints + del self._constraints[0] + del self._constraints[0] + + def min_volatility(self): + """ + Minimise volatility. + + :return: asset weights for the volatility-minimising portfolio + :rtype: dict + """ + self._objective = objective_functions.portfolio_variance( + self._w, self.cov_matrix + ) + for obj in self._additional_objectives: + self._objective += obj + + self._constraints.append(cp.sum(self._w) == 1) + + self._solve_cvxpy_opt_problem() + return dict(zip(self.tickers, self.weights)) def max_sharpe(self, risk_free_rate=0.02): """ Maximise the Sharpe Ratio. The result is also referred to as the tangency portfolio, - as it is the tangent to the efficient frontier curve that intercepts the risk-free - rate. + as it is the portfolio for which the capital market line is tangent to the efficient frontier. + + This is a convex optimisation problem after making a certain variable substitution. See + `Cornuejols and Tutuncu (2006) `_ for more. :param risk_free_rate: risk-free rate of borrowing/lending, defaults to 0.02. The period of the risk-free rate should correspond to the @@ -106,150 +166,114 @@ def max_sharpe(self, risk_free_rate=0.02): if not isinstance(risk_free_rate, (int, float)): raise ValueError("risk_free_rate should be numeric") - args = (self.expected_returns, self.cov_matrix, self.gamma, risk_free_rate) - result = sco.minimize( - objective_functions.negative_sharpe, - x0=self.initial_guess, - args=args, - method=self.opt_method, - bounds=self.bounds, - constraints=self.constraints, - ) - self.weights = result["x"] - return dict(zip(self.tickers, self.weights)) + # max_sharpe requires us to make a variable transformation. + # Here we treat w as the transformed variable. + self._objective = cp.quad_form(self._w, self.cov_matrix) + k = cp.Variable() - def min_volatility(self): - """ - Minimise volatility. - - :return: asset weights for the volatility-minimising portfolio - :rtype: dict - """ - args = (self.cov_matrix, self.gamma) - result = sco.minimize( - objective_functions.volatility, - x0=self.initial_guess, - args=args, - method=self.opt_method, - bounds=self.bounds, - constraints=self.constraints, - ) - self.weights = result["x"] + # Note: objectives are not scaled by k. Hence there are subtle differences + # between how these objectives work for max_sharpe vs min_volatility + if len(self._additional_objectives) > 0: + warnings.warn( + "max_sharpe transforms the optimisation problem so additional objectives may not work as expected." + ) + for obj in self._additional_objectives: + self._objective += obj + + # Overwrite original constraints with suitable constraints + # for the transformed max_sharpe problem + self._constraints = [ + (self.expected_returns - risk_free_rate).T * self._w == 1, + cp.sum(self._w) == k, + k >= 0, + ] + #  Rebuild original constraints with scaling factor + for raw_constr in self._additional_constraints_raw: + self._constraints.append(raw_constr(self._w / k)) + # Sharpe ratio is invariant w.r.t scaled weights, so we must + # replace infinities and negative infinities + # new_lower_bound = np.nan_to_num(self._lower_bounds, neginf=-1) + # new_upper_bound = np.nan_to_num(self._upper_bounds, posinf=1) + self._constraints.append(self._w >= k * self._lower_bounds) + self._constraints.append(self._w <= k * self._upper_bounds) + + self._solve_cvxpy_opt_problem() + # Inverse-transform + self.weights = (self._w.value / k.value).round(16) + 0.0 return dict(zip(self.tickers, self.weights)) - def max_unconstrained_utility(self, risk_aversion=1): + def max_quadratic_utility(self, risk_aversion=1, market_neutral=False): r""" - Solve for weights in the unconstrained maximisation problem: + Maximise the given quadratic utility, i.e: .. math:: \max_w w^T \mu - \frac \delta 2 w^T \Sigma w - This has an analytic solution, so scipy.optimize is not needed. - Note: this method ignores most of the parameters passed in the - constructor, including bounds and gamma. Because this is unconstrained, - resulting weights may be negative or greater than 1. It is completely up - to the user to decide how the resulting weights should be normalised. - :param risk_aversion: risk aversion parameter (must be greater than 0), defaults to 1 :type risk_aversion: positive float + :param market_neutral: whether the portfolio should be market neutral (weights sum to zero), + defaults to False. Requires negative lower weight bound. + :param market_neutral: bool, optional + :return: asset weights for the maximum-utility portfolio + :rtype: dict """ if risk_aversion <= 0: raise ValueError("risk aversion coefficient must be greater than zero") - A = risk_aversion * self.cov_matrix - b = self.expected_returns - self.weights = np.linalg.solve(A, b) - return dict(zip(self.tickers, self.weights)) - - def custom_objective(self, objective_function, *args): - """ - Optimise some objective function. While an implicit requirement is that the function - can be optimised via a quadratic optimiser, this is not enforced. Thus there is a - decent chance of silent failure. - :param objective_function: function which maps (weight, args) -> cost - :type objective_function: function with signature (np.ndarray, args) -> float - :return: asset weights that optimise the custom objective - :rtype: dict - """ - result = sco.minimize( - objective_function, - x0=self.initial_guess, - args=args, - method=self.opt_method, - bounds=self.bounds, - constraints=self.constraints, + self._objective = objective_functions.quadratic_utility( + self._w, self.expected_returns, self.cov_matrix, risk_aversion=risk_aversion ) - self.weights = result["x"] + for obj in self._additional_objectives: + self._objective += obj + + if market_neutral: + self._market_neutral_bounds_check() + self._constraints.append(cp.sum(self._w) == 0) + else: + self._constraints.append(cp.sum(self._w) == 1) + + self._solve_cvxpy_opt_problem() return dict(zip(self.tickers, self.weights)) - def efficient_risk(self, target_risk, risk_free_rate=0.02, market_neutral=False): + def efficient_risk(self, target_volatility, market_neutral=False): """ - Calculate the Sharpe-maximising portfolio for a given volatility (i.e max return - for a target risk). + Maximise return for a target risk. - :param target_risk: the desired volatility of the resulting portfolio. - :type target_risk: float - :param risk_free_rate: risk-free rate of borrowing/lending, defaults to 0.02. - The period of the risk-free rate should correspond to the - frequency of expected returns. - :type risk_free_rate: float, optional + :param target_volatility: the desired volatility of the resulting portfolio. + :type target_volatility: float :param market_neutral: whether the portfolio should be market neutral (weights sum to zero), defaults to False. Requires negative lower weight bound. :param market_neutral: bool, optional - :raises ValueError: if ``target_risk`` is not a positive float - :raises ValueError: if no portfolio can be found with volatility equal to ``target_risk`` + :raises ValueError: if ``target_volatility`` is not a positive float + :raises ValueError: if no portfolio can be found with volatility equal to ``target_volatility`` :raises ValueError: if ``risk_free_rate`` is non-numeric :return: asset weights for the efficient risk portfolio :rtype: dict """ - if not isinstance(target_risk, float) or target_risk < 0: - raise ValueError("target_risk should be a positive float") - if not isinstance(risk_free_rate, (int, float)): - raise ValueError("risk_free_rate should be numeric") + if not isinstance(target_volatility, float) or target_volatility < 0: + raise ValueError("target_volatility should be a positive float") + + self._objective = objective_functions.portfolio_return( + self._w, self.expected_returns + ) + variance = objective_functions.portfolio_variance(self._w, self.cov_matrix) + + for obj in self._additional_objectives: + self._objective += obj + + self._constraints.append(variance <= target_volatility ** 2) - args = (self.expected_returns, self.cov_matrix, self.gamma, risk_free_rate) - target_constraint = { - "type": "eq", - "fun": lambda w: target_risk ** 2 - - objective_functions.volatility(w, self.cov_matrix), - } # The equality constraint is either "weights sum to 1" (default), or # "weights sum to 0" (market neutral). if market_neutral: - portfolio_possible = any(b[0] < 0 for b in self.bounds if b[0] is not None) - if not portfolio_possible: - warnings.warn( - "Market neutrality requires shorting - bounds have been amended", - RuntimeWarning, - ) - self.bounds = self._make_valid_bounds((-1, 1)) - constraints = [ - {"type": "eq", "fun": lambda x: np.sum(x)}, - target_constraint, - ] + self._market_neutral_bounds_check() + self._constraints.append(cp.sum(self._w) == 0) else: - constraints = self.constraints + [target_constraint] - - result = sco.minimize( - objective_functions.negative_sharpe, - x0=self.initial_guess, - args=args, - method=self.opt_method, - bounds=self.bounds, - constraints=constraints, - ) - self.weights = result["x"] - - if not np.isclose( - objective_functions.volatility(self.weights, self.cov_matrix), - target_risk ** 2, - ): - raise ValueError( - "Optimisation was not succesful. Please increase target_risk" - ) + self._constraints.append(cp.sum(self._w) == 1) + self._solve_cvxpy_opt_problem() return dict(zip(self.tickers, self.weights)) def efficient_return(self, target_return, market_neutral=False): @@ -268,42 +292,36 @@ def efficient_return(self, target_return, market_neutral=False): """ if not isinstance(target_return, float) or target_return < 0: raise ValueError("target_return should be a positive float") + if target_return > self.expected_returns.max(): + raise ValueError( + "target_return must be lower than the largest expected return" + ) + + self._objective = objective_functions.portfolio_variance( + self._w, self.cov_matrix + ) + ret = objective_functions.portfolio_return( + self._w, self.expected_returns, negative=False + ) + + self.objective = cp.quad_form(self._w, self.cov_matrix) + ret = self.expected_returns.T @ self._w + + for obj in self._additional_objectives: + self._objective += obj + + self._constraints.append(ret >= target_return) - args = (self.cov_matrix, self.gamma) - target_constraint = { - "type": "eq", - "fun": lambda w: w.dot(self.expected_returns) - target_return, - } # The equality constraint is either "weights sum to 1" (default), or # "weights sum to 0" (market neutral). if market_neutral: - portfolio_possible = any(b[0] < 0 for b in self.bounds if b[0] is not None) - if not portfolio_possible: - warnings.warn( - "Market neutrality requires shorting - bounds have been amended", - RuntimeWarning, - ) - self.bounds = self._make_valid_bounds((-1, 1)) - constraints = [ - {"type": "eq", "fun": lambda x: np.sum(x)}, - target_constraint, - ] + self._market_neutral_bounds_check() + self._constraints.append(cp.sum(self._w) == 0) else: - constraints = self.constraints + [target_constraint] - - result = sco.minimize( - objective_functions.volatility, - x0=self.initial_guess, - args=args, - method=self.opt_method, - bounds=self.bounds, - constraints=constraints, - ) - self.weights = result["x"] - if not np.isclose(self.weights.dot(self.expected_returns), target_return): - raise ValueError( - "Optimisation was not succesful. Please reduce target_return" - ) + self._constraints.append(cp.sum(self._w) == 1) + + self._solve_cvxpy_opt_problem() + return dict(zip(self.tickers, self.weights)) def portfolio_performance(self, verbose=False, risk_free_rate=0.02): @@ -322,9 +340,9 @@ def portfolio_performance(self, verbose=False, risk_free_rate=0.02): :rtype: (float, float, float) """ return base_optimizer.portfolio_performance( + self.weights, self.expected_returns, self.cov_matrix, - self.weights, verbose, risk_free_rate, ) diff --git a/pypfopt/exceptions.py b/pypfopt/exceptions.py index 0b585231..bb08aa0d 100644 --- a/pypfopt/exceptions.py +++ b/pypfopt/exceptions.py @@ -12,7 +12,9 @@ class OptimizationError(Exception): """ def __init__(self, *args, **kwargs): - default_message = "Please check your constraints or use a different solver." + default_message = ( + "Please check your objectives/constraints or use a different solver." + ) if not (args or kwargs): args = (default_message,) diff --git a/pypfopt/hierarchical_risk_parity.py b/pypfopt/hierarchical_portfolio.py similarity index 68% rename from pypfopt/hierarchical_risk_parity.py rename to pypfopt/hierarchical_portfolio.py index c7c6d623..63445809 100644 --- a/pypfopt/hierarchical_risk_parity.py +++ b/pypfopt/hierarchical_portfolio.py @@ -1,15 +1,22 @@ """ -The ``hierarchical_risk_parity`` module implements the HRP portfolio from Marcos Lopez de Prado. -It has the same interface as ``EfficientFrontier``. Call the ``hrp_portfolio()`` method -to generate a portfolio. +The ``hierarchical_portfolio`` module seeks to implement one of the recent advances in +portfolio optimisation – the application of hierarchical clustering models in allocation. -The code has been reproduced with modification from Lopez de Prado (2016). +All of the hierarchical classes have a similar API to ``EfficientFrontier``, though since +many hierarchical models currently don't support different objectives, the actual allocation +happens with a call to `optimize()`. + +Currently implemented: + +- ``HRPOpt`` implements the Hierarchical Risk Parity (HRP) portfolio. Code reproduced with + permission from Marcos Lopez de Prado (2016). """ import numpy as np import pandas as pd import scipy.cluster.hierarchy as sch import scipy.spatial.distance as ssd + from . import base_optimizer @@ -26,11 +33,15 @@ class HRPOpt(base_optimizer.BaseOptimizer): - ``tickers`` - str list - ``returns`` - pd.Series - - Output: ``weights`` - np.ndarray + - Output: + + - ``weights`` - np.ndarray + - ``clusters`` - linkage matrix corresponding to clustered assets. Public methods: - - ``hrp_portfolio()`` calculates weights using HRP + - ``optimize()`` calculates weights using HRP + - ``plot_dendrogram()`` plots the clusters - ``portfolio_performance()`` calculates the expected return, volatility and Sharpe ratio for the optimised portfolio. - ``set_weights()`` creates self.weights (np.ndarray) from a weights dict @@ -48,6 +59,7 @@ def __init__(self, returns): raise TypeError("returns are not a dataframe") self.returns = returns + self.clusters = None tickers = list(returns.columns) super().__init__(len(tickers), tickers) @@ -130,7 +142,7 @@ def _raw_hrp_allocation(cov, ordered_tickers): w[second_cluster] *= 1 - alpha # weight 2 return w - def hrp_portfolio(self): + def optimize(self): """ Construct a hierarchical risk parity portfolio @@ -143,18 +155,59 @@ def hrp_portfolio(self): # per https://stackoverflow.com/questions/18952587/ dist = ssd.squareform(((1 - corr) / 2) ** 0.5) - link = sch.linkage(dist, "single") - sort_ix = HRPOpt._get_quasi_diag(link) + self.clusters = sch.linkage(dist, "single") + sort_ix = HRPOpt._get_quasi_diag(self.clusters) ordered_tickers = corr.index[sort_ix].tolist() hrp = HRPOpt._raw_hrp_allocation(cov, ordered_tickers) weights = dict(hrp.sort_index()) self.set_weights(weights) return weights - def portfolio_performance(self, verbose=False, risk_free_rate=0.02): + def plot_dendrogram(self, show_tickers=True, filename=None, showfig=True): + """ + Plot the clusters in the form of a dendrogram. + + :param show_tickers: whether to use tickers as labels (not recommended for large portfolios), + defaults to True + :type show_tickers: bool, optional + :param filename: name of the file to save to, defaults to None (doesn't save) + :type filename: str, optional + :param showfig: whether to plt.show() the figure, defaults to True + :type showfig: bool, optional + :raises ImportError: if matplotlib is not installed + :return: matplotlib axis + :rtype: matplotlib.axes object + """ + + try: + import matplotlib.pyplot as plt + except (ModuleNotFoundError, ImportError): + raise ImportError("Please install matplotlib via pip or poetry") + + if self.clusters is None: + self.optimize() + + fig, ax = plt.subplots() + if show_tickers: + sch.dendrogram(self.clusters, labels=self.tickers, ax=ax) + plt.xticks(rotation=90) + plt.tight_layout() + else: + sch.dendrogram(self.clusters, ax=ax) + + if filename: + plt.savefig(fname=filename, dpi=300) + + if showfig: + plt.show() + + return ax + + def portfolio_performance(self, verbose=False, risk_free_rate=0.02, frequency=252): """ After optimising, calculate (and optionally print) the performance of the optimal - portfolio. Currently calculates expected return, volatility, and the Sharpe ratio. + portfolio. Currently calculates expected return, volatility, and the Sharpe ratio + assuming returns are daily :param verbose: whether performance should be printed, defaults to False :type verbose: bool, optional @@ -162,14 +215,17 @@ def portfolio_performance(self, verbose=False, risk_free_rate=0.02): The period of the risk-free rate should correspond to the frequency of expected returns. :type risk_free_rate: float, optional + :param frequency: number of time periods in a year, defaults to 252 (the number + of trading days in a year) + :type frequency: int, optional :raises ValueError: if weights have not been calcualted yet :return: expected return, volatility, Sharpe ratio. :rtype: (float, float, float) """ return base_optimizer.portfolio_performance( - self.returns.mean(), - self.returns.cov(), self.weights, + self.returns.mean() * frequency, + self.returns.cov() * frequency, verbose, risk_free_rate, ) diff --git a/pypfopt/objective_functions.py b/pypfopt/objective_functions.py index a35bc482..41b9c5c9 100644 --- a/pypfopt/objective_functions.py +++ b/pypfopt/objective_functions.py @@ -1,138 +1,172 @@ """ The ``objective_functions`` module provides optimisation objectives, including the actual objective functions called by the ``EfficientFrontier`` object's optimisation methods. -These methods are primarily designed for internal use during optimisation (via -scipy.optimize), and each requires a certain signature (which is why they have not been -factored into a class). For obvious reasons, any objective function must accept ``weights`` +These methods are primarily designed for internal use during optimisation and each requires +a different signature (which is why they have not been factored into a class). +For obvious reasons, any objective function must accept ``weights`` as an argument, and must also have at least one of ``expected_returns`` or ``cov_matrix``. -Because scipy.optimize only minimises, any objectives that we want to maximise must be -made negative. +The objective functions either compute the objective given a numpy array of weights, or they +return a cvxpy *expression* when weights are a ``cp.Variable``. In this way, the same objective +function can be used both internally for optimisation and externally for computing the objective +given weights. ``_objective_value()`` automatically chooses between the two behaviours. + +``objective_functions`` defaults to objectives for minimisation. In the cases of objectives +that clearly should be maximised (e.g Sharpe Ratio, portfolio return), the objective function +actually returns the negative quantity, since minimising the negative is equivalent to maximising +the positive. This behaviour is controlled by the ``negative=True`` optional argument. Currently implemented: -- negative mean return -- (regularised) negative Sharpe ratio -- (regularised) volatility -- negative quadratic utility -- negative CVaR (expected shortfall). Caveat emptor: this is very buggy. +- Portfolio variance (i.e square of volatility) +- Portfolio return +- Sharpe ratio +- L2 regularisation (minimising this reduces nonzero weights) +- Quadratic utility +- Transaction cost model (a simple one) """ import numpy as np -import scipy.stats +import cvxpy as cp -def negative_mean_return(weights, expected_returns): +def _objective_value(w, obj): + """ + Helper method to return either the value of the objective function + or the objective function as a cvxpy object depending on whether + w is a cvxpy variable or np array. + + :param w: weights + :type w: np.ndarray OR cp.Variable + :param obj: objective function expression + :type obj: cp.Expression + :return: value of the objective function OR objective function expression + :rtype: float OR cp.Expression + """ + if isinstance(w, np.ndarray): + if np.isscalar(obj): + return obj + elif np.isscalar(obj.value): + return obj.value + else: + return obj.value.item() + else: + return obj + + +def portfolio_variance(w, cov_matrix): """ - Calculate the negative mean return of a portfolio + Calculate the total portfolio variance (i.e square volatility). + + :param w: asset weights in the portfolio + :type w: np.ndarray OR cp.Variable + :param cov_matrix: covariance matrix + :type cov_matrix: np.ndarray + :return: value of the objective function OR objective function expression + :rtype: float OR cp.Expression + """ + variance = cp.quad_form(w, cov_matrix) + return _objective_value(w, variance) + - :param weights: asset weights of the portfolio - :type weights: np.ndarray +def portfolio_return(w, expected_returns, negative=True): + """ + Calculate the (negative) mean return of a portfolio + + :param w: asset weights in the portfolio + :type w: np.ndarray OR cp.Variable :param expected_returns: expected return of each asset - :type expected_returns: pd.Series + :type expected_returns: np.ndarray + :param negative: whether quantity should be made negative (so we can minimise) + :type negative: boolean :return: negative mean return :rtype: float """ - return -weights.dot(expected_returns) + sign = -1 if negative else 1 + mu = w @ expected_returns + return _objective_value(w, sign * mu) -def negative_sharpe( - weights, expected_returns, cov_matrix, gamma=0, risk_free_rate=0.02 -): +def sharpe_ratio(w, expected_returns, cov_matrix, risk_free_rate=0.02, negative=True): """ - Calculate the negative Sharpe ratio of a portfolio + Calculate the (negative) Sharpe ratio of a portfolio - :param weights: asset weights of the portfolio - :type weights: np.ndarray + :param w: asset weights in the portfolio + :type w: np.ndarray OR cp.Variable :param expected_returns: expected return of each asset - :type expected_returns: pd.Series - :param cov_matrix: the covariance matrix of asset returns - :type cov_matrix: pd.DataFrame - :param gamma: L2 regularisation parameter, defaults to 0. Increase if you want more - non-negligible weights - :type gamma: float, optional + :type expected_returns: np.ndarray + :param cov_matrix: covariance matrix + :type cov_matrix: np.ndarray :param risk_free_rate: risk-free rate of borrowing/lending, defaults to 0.02. The period of the risk-free rate should correspond to the frequency of expected returns. :type risk_free_rate: float, optional - :return: negative Sharpe ratio + :param negative: whether quantity should be made negative (so we can minimise) + :type negative: boolean + :return: (negative) Sharpe ratio :rtype: float """ - mu = weights.dot(expected_returns) - sigma = np.sqrt(np.dot(weights, np.dot(cov_matrix, weights.T))) - L2_reg = gamma * (weights ** 2).sum() - return -(mu - risk_free_rate) / sigma + L2_reg + mu = w @ expected_returns + sigma = cp.sqrt(cp.quad_form(w, cov_matrix)) + sign = -1 if negative else 1 + sharpe = (mu - risk_free_rate) / sigma + return _objective_value(w, sign * sharpe) -def volatility(weights, cov_matrix, gamma=0): - """ - Calculate the volatility of a portfolio. This is actually a misnomer because - the function returns variance, which is technically the correct objective - function when minimising volatility. - - :param weights: asset weights of the portfolio - :type weights: np.ndarray - :param cov_matrix: the covariance matrix of asset returns - :type cov_matrix: pd.DataFrame - :param gamma: L2 regularisation parameter, defaults to 0. Increase if you want more - non-negligible weights +def L2_reg(w, gamma=1): + r""" + L2 regularisation, i.e :math:`\gamma ||w||^2`, to increase the number of nonzero weights. + + :param w: asset weights in the portfolio + :type w: np.ndarray OR cp.Variable + :param gamma: L2 regularisation parameter, defaults to 1. Increase if you want more + non-negligible weights :type gamma: float, optional - :return: portfolio variance - :rtype: float + :return: value of the objective function OR objective function expression + :rtype: float OR cp.Expression """ - L2_reg = gamma * (weights ** 2).sum() - portfolio_volatility = np.dot(weights.T, np.dot(cov_matrix, weights)) - return portfolio_volatility + L2_reg + L2_reg = gamma * cp.sum_squares(w) + return _objective_value(w, L2_reg) -def negative_quadratic_utility( - weights, expected_returns, cov_matrix, risk_aversion, gamma=0 -): - """ - Calculate the (negative) quadratic utility of a portfolio. +def quadratic_utility(w, expected_returns, cov_matrix, risk_aversion, negative=True): + r""" + Quadratic utility function, i.e :math:`\mu - \frac 1 2 \delta w^T \Sigma w`. - :param weights: asset weights of the portfolio - :type weights: np.ndarray + :param w: asset weights in the portfolio + :type w: np.ndarray OR cp.Variable :param expected_returns: expected return of each asset - :type expected_returns: pd.Series - :param cov_matrix: the covariance matrix of asset returns - :type cov_matrix: pd.DataFrame - :param gamma: L2 regularisation parameter, defaults to 0. Increase if you want more - non-negligible weights - :type gamma: float, optional + :type expected_returns: np.ndarray + :param cov_matrix: covariance matrix + :type cov_matrix: np.ndarray + :param risk_aversion: risk aversion coefficient. Increase to reduce risk. + :type risk_aversion: float + :param negative: whether quantity should be made negative (so we can minimise). + :type negative: boolean + :return: value of the objective function OR objective function expression + :rtype: float OR cp.Expression """ - L2_reg = gamma * (weights ** 2).sum() - mu = weights.dot(expected_returns) - portfolio_volatility = np.dot(weights.T, np.dot(cov_matrix, weights)) - return -(mu - 0.5 * risk_aversion * portfolio_volatility) + L2_reg + sign = -1 if negative else 1 + mu = w @ expected_returns + variance = cp.quad_form(w, cov_matrix) + utility = mu - 0.5 * risk_aversion * variance + return _objective_value(w, sign * utility) -def negative_cvar(weights, returns, s=10000, beta=0.95, random_state=None): + +def transaction_cost(w, w_prev, k=0.001): """ - Calculate the negative CVaR. Though we want the "min CVaR portfolio", we - actually need to maximise the expected return of the worst q% cases, thus - we need this value to be negative. - - :param weights: asset weights of the portfolio - :type weights: np.ndarray - :param returns: asset returns - :type returns: pd.DataFrame or np.ndarray - :param s: number of bootstrap draws, defaults to 10000 - :type s: int, optional - :param beta: "significance level" (i. 1 - q), defaults to 0.95 - :type beta: float, optional - :param random_state: seed for random sampling, defaults to None - :type random_state: int, optional - :return: negative CVaR - :rtype: float + A very simple transaction cost model: sum all the weight changes + and multiply by a given fraction (default to 10bps). This simulates + a fixed percentage commission from your broker. + + :param w: asset weights in the portfolio + :type w: np.ndarray OR cp.Variable + :param w_prev: previous weights + :type w_prev: np.ndarray + :param k: fractional cost per unit weight exchanged + :type k: float + :return: value of the objective function OR objective function expression + :rtype: float OR cp.Expression """ - np.random.seed(seed=random_state) - # Calcualte the returns given the weights - portfolio_returns = (weights * returns).sum(axis=1) - # Sample from the historical distribution - dist = scipy.stats.gaussian_kde(portfolio_returns) - sample = dist.resample(s) - # Calculate the value at risk - var = portfolio_returns.quantile(1 - beta) - # Mean of all losses worse than the value at risk - return -sample[sample < var].mean() + return _objective_value(w, k * cp.norm(w - w_prev, 1)) diff --git a/pypfopt/risk_models.py b/pypfopt/risk_models.py index 4baaeed6..7e5b06f9 100644 --- a/pypfopt/risk_models.py +++ b/pypfopt/risk_models.py @@ -1,8 +1,6 @@ """ The ``risk_models`` module provides functions for estimating the covariance matrix given -historical returns. Because of the complexity of estimating covariance matrices -(and the importance of efficient computations), this module mostly provides a convenient -wrapper around the underrated `sklearn.covariance` module. +historical returns. The format of the data input is the same as that in :ref:`expected-returns`. @@ -19,6 +17,7 @@ - Oracle Approximating shrinkage - covariance to correlation matrix +- plot of the covariance matrix """ import warnings @@ -27,6 +26,24 @@ from .expected_returns import returns_from_prices +def _is_positive_semidefinite(matrix): + """ + Helper function to check if a given matrix is positive semidefinite. + Any method that requires inverting the covariance matrix will struggle + with a non-positive defininite matrix + + :param matrix: (covariance) matrix to test + :type matrix: np.ndarray, pd.DataFrame + :return: whether matrix is positive semidefinite + :rtype: bool + """ + try: + np.linalg.cholesky(matrix) + return True + except np.linalg.LinAlgError: + return False + + def sample_cov(prices, frequency=252): """ Calculate the annualised sample covariance matrix of (daily) asset returns. @@ -179,6 +196,49 @@ def cov_to_corr(cov_matrix): return pd.DataFrame(corr, index=cov_matrix.index, columns=cov_matrix.index) +def correlation_plot(cov_matrix, show_tickers=True, filename=None, showfig=True): + """ + Generate a basic plot of the correlation matrix, given a covariance matrix. + + :param cov_matrix: covariance matrix + :type cov_matrix: pd.DataFrame or np.ndarray + :param show_tickers: whether to use tickers as labels (not recommended for large portfolios), + defaults to True + :type show_tickers: bool, optional + :param filename: name of the file to save to, defaults to None (doesn't save) + :type filename: str, optional + :param showfig: whether to plt.show() the figure, defaults to True + :type showfig: bool, optional + :raises ImportError: if matplotlib is not installed + :return: matplotlib axis + :rtype: matplotlib.axes object + """ + try: + import matplotlib.pyplot as plt + except (ModuleNotFoundError, ImportError): + raise ImportError("Please install matplotlib via pip or poetry") + + corr = cov_to_corr(cov_matrix) + fig, ax = plt.subplots() + + cax = ax.imshow(corr) + fig.colorbar(cax) + + if show_tickers: + ax.set_xticks(np.arange(0, corr.shape[0], 1)) + ax.set_xticklabels(corr.index) + ax.set_yticks(np.arange(0, corr.shape[0], 1)) + ax.set_yticklabels(corr.index) + plt.xticks(rotation=90) + + if filename: + plt.savefig(fname=filename, dpi=300) + if showfig: + plt.show() + + return ax + + class CovarianceShrinkage: """ Provide methods for computing shrinkage estimates of the covariance matrix, using the @@ -214,7 +274,7 @@ def __init__(self, prices, frequency=252): self.S = self.X.cov().values self.delta = None # shrinkage constant - def format_and_annualise(self, raw_cov_array): + def _format_and_annualize(self, raw_cov_array): """ Helper method which annualises the output of shrinkage calculations, and formats the result into a dataframe @@ -247,7 +307,7 @@ def shrunk_covariance(self, delta=0.2): F = np.identity(N) * mu # Shrinkage shrunk_cov = delta * F + (1 - delta) * self.S - return self.format_and_annualise(shrunk_cov) + return self._format_and_annualize(shrunk_cov) def ledoit_wolf(self, shrinkage_target="constant_variance"): """ @@ -272,7 +332,7 @@ def ledoit_wolf(self, shrinkage_target="constant_variance"): else: raise NotImplementedError - return self.format_and_annualise(shrunk_cov) + return self._format_and_annualize(shrunk_cov) def _ledoit_wolf_single_factor(self): """ @@ -391,4 +451,4 @@ def oracle_approximating(self): """ X = np.nan_to_num(self.X.values) shrunk_cov, self.delta = self.sklearn.covariance.oas(X) - return self.format_and_annualise(shrunk_cov) + return self._format_and_annualize(shrunk_cov) diff --git a/pypfopt/value_at_risk.py b/pypfopt/value_at_risk.py deleted file mode 100644 index 2ac5feaa..00000000 --- a/pypfopt/value_at_risk.py +++ /dev/null @@ -1,99 +0,0 @@ -""" -The ``value_at_risk`` module allows for optimisation with a (conditional) -value-at-risk (CVaR) objective, which requires Monte Carlo simulation. -""" - -import pandas as pd -from . import base_optimizer -from . import objective_functions - -# Extra dependency -try: - import noisyopt -except (ModuleNotFoundError, ImportError): - raise ImportError("Please install noisyopt via pip or poetry") - - -class CVAROpt(base_optimizer.BaseScipyOptimizer): - - """ - A CVAROpt object (inheriting from BaseScipyOptimizer) provides a method for - optimising the CVaR (a.k.a expected shortfall) of a portfolio. - - Instance variables: - - - Inputs - - - ``tickers`` - str list - - ``returns`` - pd.DataFrame - - ``bounds`` - float tuple OR (float tuple) list - - - Optimisation parameters: - - - ``s`` - int (the number of Monte Carlo simulations) - - ``beta`` - float (the critical value) - - - Output: ``weights`` - np.ndarray - - Public methods: - - - ``min_cvar()`` - - ``normalize_weights()`` - - ``set_weights()`` creates self.weights (np.ndarray) from a weights dict - - ``clean_weights()`` rounds the weights and clips near-zeros. - - ``save_weights_to_file()`` saves the weights to csv, json, or txt. - """ - - def __init__(self, returns, weight_bounds=(0, 1)): - """ - :param returns: asset historical returns - :type returns: pd.DataFrame - :param weight_bounds: minimum and maximum weight of each asset OR single min/max pair - if all identical, defaults to (0, 1). Must be changed to (-1, 1) - for portfolios with shorting. - :type weight_bounds: tuple OR tuple list, optional - :raises TypeError: if ``returns`` is not a dataframe - """ - if not isinstance(returns, pd.DataFrame): - raise TypeError("returns are not a dataframe") - self.returns = returns - tickers = returns.columns - super().__init__(len(tickers), tickers, weight_bounds) - - @staticmethod - def _normalize_weights(raw_weights): - """ - Utility function to make all weights sum to 1 - - :param raw_weights: input weights which do not sum to 1 - :type raw_weights: np.array, pd.Series - :return: normalized weights - :rtype: np.array, pd.Series - """ - return raw_weights / raw_weights.sum() - - def min_cvar(self, s=10000, beta=0.95, random_state=None): - """ - Find the portfolio weights that minimises the CVaR, via - Monte Carlo sampling from the return distribution. - - :param s: number of bootstrap draws, defaults to 10000 - :type s: int, optional - :param beta: "significance level" (i. 1 - q), defaults to 0.95 - :type beta: float, optional - :param random_state: seed for random sampling, defaults to None - :type random_state: int, optional - :return: asset weights for the Sharpe-maximising portfolio - :rtype: dict - """ - args = (self.returns, s, beta, random_state) - result = noisyopt.minimizeSPSA( - objective_functions.negative_cvar, - args=args, - bounds=self.bounds, - x0=self.initial_guess, - niter=1000, - paired=False, - ) - self.weights = CVAROpt._normalize_weights(result["x"]) - return dict(zip(self.tickers, self.weights)) diff --git a/pyproject.toml b/pyproject.toml index f657a5f3..828a656a 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [tool.poetry] name = "PyPortfolioOpt" -version = "0.5.5" +version = "1.0.0" description = "Financial portfolio optimisation in python" license = "MIT" authors = ["Robert Andrew Martin "] @@ -16,6 +16,7 @@ classifiers=[ "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", + "Programming Language :: Python :: 3.8", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3 :: Only", @@ -31,11 +32,12 @@ packages = [ {include = "pypfopt"} ] [tool.poetry.dependencies] python = "^3.6.0" -numpy = "^=1.14.3" +numpy = "^=1.17.0" scipy = "^=1.1.0" pandas = "^0.25.3" cvxpy = "^1.0.28" + [tool.poetry.dev-dependencies] pytest = "^4.6" flake8 = "^3.7" @@ -44,8 +46,8 @@ rope = "^0.14.0" ipython = "^7.13.0" [tool.poetry.extras] -noisyopt = ["^=0.2.2"] scikit-learn = ["^0.22"] +matplotlib = ["^3.2.0"] [build-system] requires = ["poetry>=0.12"] build-backend = "poetry.masonry.api" diff --git a/requirements.txt b/requirements.txt index d371aa8a..c5c389ae 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,6 +1,6 @@ joblib==0.14.1 -noisyopt==0.2.2 -numpy==1.16.5 +matplotlib==3.2.0 +numpy==1.17.3 pandas==0.25.3 pyparsing==2.4.5 python-dateutil==2.8.1 diff --git a/setup.py b/setup.py index 595db885..7026dd9c 100755 --- a/setup.py +++ b/setup.py @@ -9,7 +9,7 @@ setup( name="PyPortfolioOpt", - version="0.5.5", + version="1.0.0", description="Financial portfolio optimisation in python", long_description=desc, long_description_content_type="text/markdown", @@ -26,6 +26,7 @@ "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", + "Programming Language :: Python :: 3.8", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3 :: Only", @@ -34,7 +35,7 @@ "Topic :: Scientific/Engineering :: Mathematics", ], keywords="portfolio finance optimization quant trading investing", - install_requires=["numpy", "pandas", "scipy"], + install_requires=["numpy", "pandas", "scipy", "cvxpy"], setup_requires=["pytest-runner"], tests_require=["pytest"], python_requires=">=3.6", diff --git a/tests/test_base_optimizer.py b/tests/test_base_optimizer.py index 384b7671..d64fd6cc 100644 --- a/tests/test_base_optimizer.py +++ b/tests/test_base_optimizer.py @@ -2,71 +2,114 @@ import os import numpy as np import pytest -from pypfopt.efficient_frontier import EfficientFrontier +from pypfopt import EfficientFrontier +from pypfopt import exceptions from tests.utilities_for_tests import get_data, setup_efficient_frontier -def test_custom_upper_bound(): +def test_custom_bounds(): ef = EfficientFrontier( - *setup_efficient_frontier(data_only=True), weight_bounds=(0, 0.10) + *setup_efficient_frontier(data_only=True), weight_bounds=(0.02, 0.13) ) - ef.max_sharpe() - ef.portfolio_performance() - assert ef.weights.max() <= 0.1 + ef.min_volatility() + np.testing.assert_allclose(ef._lower_bounds, np.array([0.02] * ef.n_assets)) + np.testing.assert_allclose(ef._upper_bounds, np.array([0.13] * ef.n_assets)) + + assert ef.weights.min() >= 0.02 + assert ef.weights.max() <= 0.13 np.testing.assert_almost_equal(ef.weights.sum(), 1) -def test_custom_lower_bound(): +def test_custom_bounds_different_values(): + bounds = [(0.01, 0.13), (0.02, 0.11)] * 10 ef = EfficientFrontier( - *setup_efficient_frontier(data_only=True), weight_bounds=(0.02, 1) + *setup_efficient_frontier(data_only=True), weight_bounds=bounds ) - ef.max_sharpe() - assert ef.weights.min() >= 0.02 + ef.min_volatility() + assert (0.01 <= ef.weights[::2]).all() and (ef.weights[::2] <= 0.13).all() + assert (0.02 <= ef.weights[1::2]).all() and (ef.weights[1::2] <= 0.11).all() np.testing.assert_almost_equal(ef.weights.sum(), 1) + bounds = ((0.01, 0.13), (0.02, 0.11)) * 10 + assert EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=bounds + ) -def test_custom_bounds_same(): + +def test_weight_bounds_minus_one_to_one(): ef = EfficientFrontier( - *setup_efficient_frontier(data_only=True), weight_bounds=(0.03, 0.13) + *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) ) - ef.max_sharpe() - assert ef.weights.min() >= 0.03 - assert ef.weights.max() <= 0.13 - np.testing.assert_almost_equal(ef.weights.sum(), 1) + assert ef.max_sharpe() + assert ef.min_volatility() + # TODO: fix + # assert ef.efficient_return(0.05) + # assert ef.efficient_risk(0.20) -def test_custom_bounds_different(): - bounds = [(0.01, 0.13), (0.02, 0.11)] * 10 + +def test_none_bounds(): + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(None, 0.3) + ) + ef.min_volatility() + w1 = ef.weights + + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 0.3) + ) + ef.min_volatility() + w2 = ef.weights + np.testing.assert_array_almost_equal(w1, w2) + + +def test_bound_input_types(): + bounds = [0.01, 0.13] ef = EfficientFrontier( *setup_efficient_frontier(data_only=True), weight_bounds=bounds ) - ef.max_sharpe() - assert (0.01 <= ef.weights[::2]).all() and (ef.weights[::2] <= 0.13).all() - assert (0.02 <= ef.weights[1::2]).all() and (ef.weights[1::2] <= 0.11).all() - np.testing.assert_almost_equal(ef.weights.sum(), 1) + assert ef + np.testing.assert_allclose(ef._lower_bounds, np.array([0.01] * ef.n_assets)) + np.testing.assert_allclose(ef._upper_bounds, np.array([0.13] * ef.n_assets)) + lb = np.array([0.01, 0.02] * 10) + ub = np.array([0.07, 0.2] * 10) + assert EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(lb, ub) + ) bounds = ((0.01, 0.13), (0.02, 0.11)) * 10 assert EfficientFrontier( *setup_efficient_frontier(data_only=True), weight_bounds=bounds ) -def test_bounds_errors(): - with pytest.raises(ValueError): - EfficientFrontier( - *setup_efficient_frontier(data_only=True), weight_bounds=(0.06, 1) - ) +def test_bound_failure(): + # Ensure optimisation fails when lower bound is too high or upper bound is too low + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(0.06, 0.13) + ) + with pytest.raises(exceptions.OptimizationError): + ef.min_volatility() + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(0, 0.04) + ) + with pytest.raises(exceptions.OptimizationError): + ef.min_volatility() + + +def test_bounds_errors(): assert EfficientFrontier( *setup_efficient_frontier(data_only=True), weight_bounds=(0, 1) ) - with pytest.raises(ValueError): + with pytest.raises(TypeError): EfficientFrontier( *setup_efficient_frontier(data_only=True), weight_bounds=(0.06, 1, 3) ) - with pytest.raises(ValueError): + with pytest.raises(TypeError): + # Not enough bounds bounds = [(0.01, 0.13), (0.02, 0.11)] * 5 EfficientFrontier( *setup_efficient_frontier(data_only=True), weight_bounds=bounds @@ -75,7 +118,7 @@ def test_bounds_errors(): def test_clean_weights(): ef = setup_efficient_frontier() - ef.max_sharpe() + ef.min_volatility() number_tiny_weights = sum(ef.weights < 1e-4) cleaned = ef.clean_weights(cutoff=1e-4, rounding=5) cleaned_weights = cleaned.values() @@ -91,7 +134,7 @@ def test_clean_weights_short(): ef = EfficientFrontier( *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) ) - ef.max_sharpe() + ef.min_volatility() # In practice we would never use such a high cutoff number_tiny_weights = sum(np.abs(ef.weights) < 0.05) cleaned = ef.clean_weights(cutoff=0.05) @@ -104,7 +147,7 @@ def test_clean_weights_error(): ef = setup_efficient_frontier() with pytest.raises(AttributeError): ef.clean_weights() - ef.max_sharpe() + ef.min_volatility() with pytest.raises(ValueError): ef.clean_weights(rounding=1.3) with pytest.raises(ValueError): @@ -114,7 +157,7 @@ def test_clean_weights_error(): def test_clean_weights_no_rounding(): ef = setup_efficient_frontier() - ef.max_sharpe() + ef.min_volatility() # ensure the call does not fail # in previous commits, this call would raise a ValueError cleaned = ef.clean_weights(rounding=None, cutoff=0) @@ -136,7 +179,7 @@ def test_efficient_frontier_init_errors(): def test_set_weights(): ef = setup_efficient_frontier() - w1 = ef.max_sharpe() + w1 = ef.min_volatility() test_weights = ef.weights ef.min_volatility() ef.set_weights(w1) @@ -145,7 +188,7 @@ def test_set_weights(): def test_save_weights_to_file(): ef = setup_efficient_frontier() - ef.max_sharpe() + ef.min_volatility() ef.save_weights_to_file("tests/test.txt") with open("tests/test.txt", "r") as f: file = f.read() diff --git a/tests/test_custom_objectives.py b/tests/test_custom_objectives.py index d5dc9083..6862e03d 100644 --- a/tests/test_custom_objectives.py +++ b/tests/test_custom_objectives.py @@ -1,45 +1,116 @@ import numpy as np -from tests.utilities_for_tests import setup_efficient_frontier +import cvxpy as cp +import pytest +from pypfopt import EfficientFrontier from pypfopt import objective_functions +from pypfopt import exceptions +from tests.utilities_for_tests import setup_efficient_frontier -def test_custom_objective_equal_weights(): +def test_custom_convex_equal_weights(): ef = setup_efficient_frontier() - def new_objective(weights): - return (weights ** 2).sum() + def new_objective(w): + return cp.sum(w ** 2) - ef.custom_objective(new_objective) + ef.convex_objective(new_objective) np.testing.assert_allclose(ef.weights, np.array([1 / 20] * 20)) -def test_custom_objective_min_var(): +def test_custom_convex_min_var(): ef = setup_efficient_frontier() ef.min_volatility() built_in = ef.weights # With custom objective ef = setup_efficient_frontier() - ef.custom_objective(objective_functions.volatility, ef.cov_matrix, 0) + ef.convex_objective( + objective_functions.portfolio_variance, cov_matrix=ef.cov_matrix + ) custom = ef.weights np.testing.assert_allclose(built_in, custom, atol=1e-7) -def test_custom_objective_sharpe_L2(): - ef = setup_efficient_frontier() - ef.gamma = 2 - ef.max_sharpe() +def test_custom_convex_objective_market_neutral_efficient_risk(): + target_risk = 0.19 + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) + ) + ef.efficient_risk(target_risk, market_neutral=True) built_in = ef.weights - # With custom objective - ef = setup_efficient_frontier() - ef.custom_objective(objective_functions.negative_sharpe, - ef.expected_returns, ef.cov_matrix, 2) + # Recreate the market-neutral efficient_risk optimiser using this API + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) + ) + ef.add_constraint(lambda x: cp.sum(x) == 0) + ef.add_constraint(lambda x: cp.quad_form(x, ef.cov_matrix) <= target_risk ** 2) + ef.convex_objective(lambda x: -x @ ef.expected_returns, weights_sum_to_one=False) custom = ef.weights np.testing.assert_allclose(built_in, custom, atol=1e-7) -def test_custom_logarithmic_barrier(): +def test_convex_sharpe_raises_error(): + # With custom objective + with pytest.raises(exceptions.OptimizationError): + ef = setup_efficient_frontier() + ef.convex_objective( + objective_functions.sharpe_ratio, + expected_returns=ef.expected_returns, + cov_matrix=ef.cov_matrix, + ) + + +def test_custom_convex_logarithmic_barrier(): + # 60 Years of Portfolio Optimisation, Kolm et al (2014) + ef = setup_efficient_frontier() + + def logarithmic_barrier(w, cov_matrix, k=0.1): + log_sum = cp.sum(cp.log(w)) + var = cp.quad_form(w, cov_matrix) + return var - k * log_sum + + w = ef.convex_objective(logarithmic_barrier, cov_matrix=ef.cov_matrix) + assert isinstance(w, dict) + assert set(w.keys()) == set(ef.tickers) + np.testing.assert_almost_equal(ef.weights.sum(), 1) + + np.testing.assert_allclose( + ef.portfolio_performance(), + (0.23978400459553223, 0.21100848889958182, 1.041588448605623), + ) + + +def test_custom_convex_deviation_risk_parity_error(): + # 60 Years of Portfolio Optimisation, Kolm et al (2014) + ef = setup_efficient_frontier() + + def deviation_risk_parity(w, cov_matrix): + n = cov_matrix.shape[0] + rp = (w * (cov_matrix @ w)) / cp.quad_form(w, cov_matrix) + return cp.sum_squares(rp - 1 / n) + + with pytest.raises(exceptions.OptimizationError): + ef.convex_objective(deviation_risk_parity, cov_matrix=ef.cov_matrix) + + +def test_custom_nonconvex_min_var(): + ef = setup_efficient_frontier() + ef.min_volatility() + original_vol = ef.portfolio_performance()[1] + + # With custom objective + ef = setup_efficient_frontier() + ef.nonconvex_objective( + objective_functions.portfolio_variance, objective_args=ef.cov_matrix + ) + custom_vol = ef.portfolio_performance()[1] + # Scipy should be close but not as good for this simple objective + np.testing.assert_almost_equal(custom_vol, original_vol, decimal=5) + assert original_vol < custom_vol + + +def test_custom_nonconvex_logarithmic_barrier(): # 60 Years of Portfolio Optimisation, Kolm et al (2014) ef = setup_efficient_frontier() @@ -48,43 +119,85 @@ def logarithmic_barrier(weights, cov_matrix, k=0.1): portfolio_volatility = np.dot(weights.T, np.dot(cov_matrix, weights)) return portfolio_volatility - k * log_sum - w = ef.custom_objective(logarithmic_barrier, ef.cov_matrix, 0.1) + w = ef.nonconvex_objective(logarithmic_barrier, objective_args=(ef.cov_matrix, 0.2)) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) -def test_custom_deviation_risk_parity(): - # 60 Years of Portfolio Optimisation, Kolm et al (2014) +def test_custom_nonconvex_deviation_risk_parity_1(): + # 60 Years of Portfolio Optimisation, Kolm et al (2014) - first definition ef = setup_efficient_frontier() def deviation_risk_parity(w, cov_matrix): - diff = w * np.dot(cov_matrix, w) - \ - (w * np.dot(cov_matrix, w)).reshape(-1, 1) + diff = w * np.dot(cov_matrix, w) - (w * np.dot(cov_matrix, w)).reshape(-1, 1) return (diff ** 2).sum().sum() - w = ef.custom_objective(deviation_risk_parity, ef.cov_matrix) + w = ef.nonconvex_objective(deviation_risk_parity, ef.cov_matrix) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) -def test_custom_utility_objective(): +def test_custom_nonconvex_deviation_risk_parity_2(): + # 60 Years of Portfolio Optimisation, Kolm et al (2014) - second definition + ef = setup_efficient_frontier() + + def deviation_risk_parity(w, cov_matrix): + n = cov_matrix.shape[0] + rp = (w * (cov_matrix @ w)) / cp.quad_form(w, cov_matrix) + return cp.sum_squares(rp - 1 / n).value + + w = ef.nonconvex_objective(deviation_risk_parity, ef.cov_matrix) + assert isinstance(w, dict) + assert set(w.keys()) == set(ef.tickers) + np.testing.assert_almost_equal(ef.weights.sum(), 1) + + +def test_custom_nonconvex_utility_objective(): ef = setup_efficient_frontier() def utility_obj(weights, mu, cov_matrix, k=1): return -weights.dot(mu) + k * np.dot(weights.T, np.dot(cov_matrix, weights)) - w = ef.custom_objective(utility_obj, ef.expected_returns, ef.cov_matrix, 1) + w = ef.nonconvex_objective( + utility_obj, objective_args=(ef.expected_returns, ef.cov_matrix, 1) + ) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) vol1 = ef.portfolio_performance()[1] # If we increase k, volatility should decrease - ef.custom_objective(utility_obj, ef.expected_returns, ef.cov_matrix, 2) + w = ef.nonconvex_objective( + utility_obj, objective_args=(ef.expected_returns, ef.cov_matrix, 3) + ) vol2 = ef.portfolio_performance()[1] assert vol2 < vol1 + + +def test_custom_nonconvex_objective_market_neutral_efficient_risk(): + # Recreate the market-neutral efficient_risk optimiser using this API + target_risk = 0.19 + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) + ) + + weight_constr = {"type": "eq", "fun": lambda w: np.sum(w)} + risk_constr = { + "type": "eq", + "fun": lambda w: target_risk ** 2 - np.dot(w.T, np.dot(ef.cov_matrix, w)), + } + constraints = [weight_constr, risk_constr] + + ef.nonconvex_objective( + lambda w, mu: -w.T.dot(mu), + objective_args=(ef.expected_returns), + weights_sum_to_one=False, + constraints=constraints, + ) + np.testing.assert_allclose( + ef.portfolio_performance(), + (0.2309497754562942, target_risk, 1.1102600451243954), + atol=1e-6, + ) diff --git a/tests/test_discrete_allocation.py b/tests/test_discrete_allocation.py index 1f00e791..60214b66 100644 --- a/tests/test_discrete_allocation.py +++ b/tests/test_discrete_allocation.py @@ -40,7 +40,7 @@ def test_greedy_portfolio_allocation(): da = DiscreteAllocation(w, latest_prices) allocation, leftover = da.greedy_portfolio() - assert allocation == { + assert { "MA": 14, "FB": 12, "PFE": 51, @@ -49,7 +49,6 @@ def test_greedy_portfolio_allocation(): "BBY": 9, "SBUX": 6, "GOOG": 1, - "AMD": 1, } total = 0 @@ -68,7 +67,7 @@ def test_greedy_allocation_rmse_error(): latest_prices = get_latest_prices(df) da = DiscreteAllocation(w, latest_prices) da.greedy_portfolio() - np.testing.assert_almost_equal(da._allocation_rmse_error(), 0.0257368) + np.testing.assert_almost_equal(da._allocation_rmse_error(), 0.025762032436733803) def test_greedy_portfolio_allocation_short(): @@ -124,7 +123,7 @@ def test_greedy_allocation_rmse_error_short(): latest_prices = get_latest_prices(df) da = DiscreteAllocation(w, latest_prices) da.greedy_portfolio() - np.testing.assert_almost_equal(da._allocation_rmse_error(), 0.03306318) + np.testing.assert_almost_equal(da._allocation_rmse_error(), 0.033070015016740284) def test_greedy_portfolio_allocation_short_different_params(): @@ -154,13 +153,13 @@ def test_greedy_portfolio_allocation_short_different_params(): "XOM": 11, "BAC": -271, "GM": -133, - "GE": -355, - "SHLD": -923, - "AMD": -284, - "JPM": -6, - "T": -13, - "UAA": -7, - "RRC": -2, + "GE": -356, + "SHLD": -922, + "AMD": -285, + "JPM": -5, + "T": -14, + "UAA": -8, + "RRC": -3, } long_total = 0 short_total = 0 @@ -184,14 +183,14 @@ def test_lp_portfolio_allocation(): allocation, leftover = da.lp_portfolio() assert da.allocation == { - "AAPL": 5.0, - "FB": 11.0, - "BABA": 5.0, - "AMZN": 1.0, - "BBY": 7.0, - "MA": 14.0, - "PFE": 50.0, - "SBUX": 5.0, + "AAPL": 5, + "FB": 11, + "BABA": 5, + "AMZN": 1, + "BBY": 7, + "MA": 14, + "PFE": 50, + "SBUX": 5, } total = 0 for ticker, num in allocation.items(): @@ -209,7 +208,7 @@ def test_lp_allocation_rmse_error(): latest_prices = get_latest_prices(df) da = DiscreteAllocation(w, latest_prices) da.lp_portfolio() - np.testing.assert_almost_equal(da._allocation_rmse_error(verbose=False), 0.0170634) + np.testing.assert_almost_equal(da._allocation_rmse_error(verbose=False), 0.017070218149194846) def test_lp_portfolio_allocation_short(): @@ -224,24 +223,24 @@ def test_lp_portfolio_allocation_short(): allocation, leftover = da.lp_portfolio() assert da.allocation == { - "GOOG": 1.0, - "AAPL": 5.0, - "FB": 8.0, - "BABA": 5.0, - "WMT": 2.0, - "XOM": 2.0, - "BBY": 9.0, - "MA": 16.0, - "PFE": 46.0, - "SBUX": 9.0, - "GE": -43.0, - "AMD": -34.0, - "BAC": -32.0, - "GM": -16.0, - "T": -1.0, - "UAA": -1.0, - "SHLD": -110.0, - "JPM": -1.0, + "GOOG": 1, + "AAPL": 5, + "FB": 8, + "BABA": 5, + "WMT": 2, + "XOM": 2, + "BBY": 9, + "MA": 16, + "PFE": 46, + "SBUX": 9, + "GE": -43, + "AMD": -34, + "BAC": -32, + "GM": -16, + "T": -1, + "UAA": -1, + "SHLD": -110, + "JPM": -1, } long_total = 0 short_total = 0 @@ -265,7 +264,7 @@ def test_lp_allocation_rmse_error_short(): latest_prices = get_latest_prices(df) da = DiscreteAllocation(w, latest_prices) da.lp_portfolio() - np.testing.assert_almost_equal(da._allocation_rmse_error(), 0.02699558) + np.testing.assert_almost_equal(da._allocation_rmse_error(), 0.027018566693989568) def test_lp_portfolio_allocation_different_params(): @@ -282,17 +281,17 @@ def test_lp_portfolio_allocation_different_params(): allocation, leftover = da.lp_portfolio() assert da.allocation == { - "GOOG": 1.0, - "AAPL": 43.0, - "FB": 95.0, - "BABA": 44.0, - "AMZN": 4.0, - "AMD": 1.0, - "SHLD": 3.0, - "BBY": 69.0, - "MA": 114.0, - "PFE": 412.0, - "SBUX": 51.0, + "GOOG": 1, + "AAPL": 43, + "FB": 95, + "BABA": 44, + "AMZN": 4, + "AMD": 1, + "SHLD": 3, + "BBY": 69, + "MA": 114, + "PFE": 412, + "SBUX": 51, } total = 0 for ticker, num in allocation.items(): diff --git a/tests/test_efficient_frontier.py b/tests/test_efficient_frontier.py old mode 100755 new mode 100644 index c94efcc9..c676df1f --- a/tests/test_efficient_frontier.py +++ b/tests/test_efficient_frontier.py @@ -1,10 +1,15 @@ import warnings import numpy as np import pandas as pd +import cvxpy as cp import pytest -from pypfopt.efficient_frontier import EfficientFrontier -from tests.utilities_for_tests import get_data, setup_efficient_frontier +import scipy.optimize as sco + +from pypfopt import EfficientFrontier from pypfopt import risk_models +from pypfopt import objective_functions +from pypfopt import exceptions +from tests.utilities_for_tests import get_data, setup_efficient_frontier def test_data_source(): @@ -25,19 +30,215 @@ def test_returns_dataframe(): assert not ((returns_df > 1) & returns_df.notnull()).any().any() +def test_efficient_frontier_inheritance(): + ef = setup_efficient_frontier() + assert ef.clean_weights + assert ef.n_assets + assert ef.tickers + assert isinstance(ef._constraints, list) + assert isinstance(ef._lower_bounds, np.ndarray) + assert isinstance(ef._upper_bounds, np.ndarray) + + def test_portfolio_performance(): ef = setup_efficient_frontier() with pytest.raises(ValueError): ef.portfolio_performance() - ef.max_sharpe() - assert ef.portfolio_performance() + ef.min_volatility() + perf = ef.portfolio_performance() + assert isinstance(perf, tuple) + assert len(perf) == 3 + assert isinstance(perf[0], float) -def test_efficient_frontier_inheritance(): +def test_min_volatility(): ef = setup_efficient_frontier() - assert ef.clean_weights - assert isinstance(ef.initial_guess, np.ndarray) - assert isinstance(ef.constraints, list) + w = ef.min_volatility() + assert isinstance(w, dict) + assert set(w.keys()) == set(ef.tickers) + np.testing.assert_almost_equal(ef.weights.sum(), 1) + assert all([i >= 0 for i in w.values()]) + + np.testing.assert_allclose( + ef.portfolio_performance(), + (0.17931232481259154, 0.15915084514118694, 1.00101463282373), + ) + + +def test_min_volatility_tx_costs(): + # Baseline + ef = setup_efficient_frontier() + ef.min_volatility() + w1 = ef.weights + + # Pretend we were initally equal weight + ef = setup_efficient_frontier() + prev_w = np.array([1 / ef.n_assets] * ef.n_assets) + ef.add_objective(objective_functions.transaction_cost, w_prev=prev_w) + ef.min_volatility() + w2 = ef.weights + + # TX cost should pull closer to prev portfolio + assert np.abs(prev_w - w2).sum() < np.abs(prev_w - w1).sum() + + +def test_min_volatility_short(): + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(None, None) + ) + w = ef.min_volatility() + assert isinstance(w, dict) + assert set(w.keys()) == set(ef.tickers) + np.testing.assert_almost_equal(ef.weights.sum(), 1) + np.testing.assert_allclose( + ef.portfolio_performance(), + (0.1721356467349655, 0.1555915367269669, 0.9777887019776287), + ) + + # Shorting should reduce volatility + volatility = ef.portfolio_performance()[1] + ef_long_only = setup_efficient_frontier() + ef_long_only.min_volatility() + long_only_volatility = ef_long_only.portfolio_performance()[1] + assert volatility < long_only_volatility + + +def test_min_volatility_L2_reg(): + ef = setup_efficient_frontier() + ef.add_objective(objective_functions.L2_reg, gamma=5) + weights = ef.min_volatility() + assert isinstance(weights, dict) + assert set(weights.keys()) == set(ef.tickers) + np.testing.assert_almost_equal(ef.weights.sum(), 1) + assert all([i >= 0 for i in weights.values()]) + + ef2 = setup_efficient_frontier() + ef2.min_volatility() + + # L2_reg should pull close to equal weight + equal_weight = np.full((ef.n_assets,), 1 / ef.n_assets) + assert ( + np.abs(equal_weight - ef.weights).sum() + < np.abs(equal_weight - ef2.weights).sum() + ) + + np.testing.assert_allclose( + ef.portfolio_performance(), + (0.2382083649754719, 0.20795460936504614, 1.049307662098637), + ) + + +def test_min_volatility_L2_reg_many_values(): + ef = setup_efficient_frontier() + ef.min_volatility() + # Count the number of weights more 1% + initial_number = sum(ef.weights > 0.01) + for _ in range(10): + ef.add_objective(objective_functions.L2_reg, gamma=0.05) + ef.min_volatility() + np.testing.assert_almost_equal(ef.weights.sum(), 1) + new_number = sum(ef.weights > 0.01) + # Higher gamma should reduce the number of small weights + assert new_number >= initial_number + initial_number = new_number + + +def test_min_volatility_L2_reg_limit_case(): + ef = setup_efficient_frontier() + ef.add_objective(objective_functions.L2_reg, gamma=1e10) + ef.min_volatility() + equal_weights = np.array([1 / ef.n_assets] * ef.n_assets) + np.testing.assert_array_almost_equal(ef.weights, equal_weights) + + +def test_min_volatility_L2_reg_increases_vol(): + # L2 reg should reduce the number of small weights + # but increase in-sample volatility. + ef_no_reg = setup_efficient_frontier() + ef_no_reg.min_volatility() + vol_no_reg = ef_no_reg.portfolio_performance()[1] + ef = setup_efficient_frontier() + ef.add_objective(objective_functions.L2_reg, gamma=2) + ef.min_volatility() + vol = ef.portfolio_performance()[1] + assert vol > vol_no_reg + + +def test_min_volatility_tx_costs_L2_reg(): + ef = setup_efficient_frontier() + prev_w = np.array([1 / ef.n_assets] * ef.n_assets) + ef.add_objective(objective_functions.transaction_cost, w_prev=prev_w) + ef.add_objective(objective_functions.L2_reg) + ef.min_volatility() + + np.testing.assert_allclose( + ef.portfolio_performance(), + (0.2316565265271545, 0.1959773703677164, 1.0800049318450338), + ) + + +def test_min_volatility_cvxpy_vs_scipy(): + # cvxpy + ef = setup_efficient_frontier() + ef.min_volatility() + w1 = ef.weights + + # scipy + args = (ef.cov_matrix,) + initial_guess = np.array([1 / ef.n_assets] * ef.n_assets) + result = sco.minimize( + objective_functions.portfolio_variance, + x0=initial_guess, + args=args, + method="SLSQP", + bounds=[(0, 1)] * 20, + constraints=[{"type": "eq", "fun": lambda x: np.sum(x) - 1}], + ) + w2 = result["x"] + + cvxpy_var = objective_functions.portfolio_variance(w1, ef.cov_matrix) + scipy_var = objective_functions.portfolio_variance(w2, ef.cov_matrix) + assert cvxpy_var <= scipy_var + + +def test_min_volatility_vs_max_sharpe(): + # Test based on issue #75 + expected_returns_daily = pd.Series( + [0.043622, 0.120588, 0.072331, 0.056586], index=["AGG", "SPY", "GLD", "HYG"] + ) + covariance_matrix = pd.DataFrame( + [ + [0.000859, -0.000941, 0.001494, -0.000062], + [-0.000941, 0.022400, -0.002184, 0.005747], + [0.001494, -0.002184, 0.011518, -0.000129], + [-0.000062, 0.005747, -0.000129, 0.002287], + ], + index=["AGG", "SPY", "GLD", "HYG"], + columns=["AGG", "SPY", "GLD", "HYG"], + ) + + ef = EfficientFrontier(expected_returns_daily, covariance_matrix) + ef.min_volatility() + vol_min_vol = ef.portfolio_performance(risk_free_rate=0.00)[1] + + ef.max_sharpe(risk_free_rate=0.00) + vol_max_sharpe = ef.portfolio_performance(risk_free_rate=0.00)[1] + + assert vol_min_vol < vol_max_sharpe + + +def test_min_volatility_nonconvex_objective(): + ef = setup_efficient_frontier() + ef.add_objective(lambda x: cp.sum((x + 1) / (x + 2) ** 2)) + with pytest.raises(exceptions.OptimizationError): + ef.min_volatility() + + +def test_min_volatility_nonlinear_constraint(): + ef = setup_efficient_frontier() + ef.add_constraint(lambda x: (x + 1) / (x + 2) ** 2 <= 0.5) + with pytest.raises(exceptions.OptimizationError): + ef.min_volatility() def test_max_sharpe_long_only(): @@ -45,14 +246,31 @@ def test_max_sharpe_long_only(): w = ef.max_sharpe() assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) assert all([i >= 0 for i in w.values()]) np.testing.assert_allclose( ef.portfolio_performance(), - (0.3303554227420522, 0.21671629569400466, 1.4320816150358278), + (0.33035037367760506, 0.21671276571944567, 1.4320816434015786), + ) + + +def test_max_sharpe_long_weight_bounds(): + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(0.03, 0.13) + ) + ef.max_sharpe() + np.testing.assert_almost_equal(ef.weights.sum(), 1) + assert ef.weights.min() >= 0.03 + assert ef.weights.max() <= 0.13 + + bounds = [(0.01, 0.13), (0.02, 0.11)] * 10 + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=bounds ) + ef.max_sharpe() + assert (0.01 <= ef.weights[::2]).all() and (ef.weights[::2] <= 0.13).all() + assert (0.02 <= ef.weights[1::2]).all() and (ef.weights[1::2] <= 0.11).all() def test_max_sharpe_short(): @@ -62,11 +280,10 @@ def test_max_sharpe_short(): w = ef.max_sharpe() assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) np.testing.assert_allclose( ef.portfolio_performance(), - (0.4072375737868628, 0.24823079606119094, 1.5599900573634125) + (0.4072439477276246, 0.24823487545231313, 1.5599900981762558), ) sharpe = ef.portfolio_performance()[2] @@ -77,29 +294,31 @@ def test_max_sharpe_short(): assert sharpe > long_only_sharpe -def test_weight_bounds_minus_one_to_one(): - ef = EfficientFrontier( - *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) - ) - assert ef.max_sharpe() - assert ef.min_volatility() - assert ef.efficient_return(0.05) - assert ef.efficient_risk(0.20) - - def test_max_sharpe_L2_reg(): ef = setup_efficient_frontier() - ef.gamma = 1 - w = ef.max_sharpe() - assert isinstance(w, dict) - assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) - np.testing.assert_almost_equal(ef.weights.sum(), 1) - assert all([i >= 0 for i in w.values()]) + ef.add_objective(objective_functions.L2_reg, gamma=5) + with warnings.catch_warnings(record=True) as w: + weights = ef.max_sharpe() + assert len(w) == 1 + + assert isinstance(weights, dict) + assert set(weights.keys()) == set(ef.tickers) + np.testing.assert_almost_equal(ef.weights.sum(), 1) + assert all([i >= 0 for i in weights.values()]) np.testing.assert_allclose( ef.portfolio_performance(), - (0.3062919877378972, 0.20291366982652356, 1.4109053765705188), + (0.2936875354933478, 0.22783545277575057, 1.2012508683744123), + ) + + ef2 = setup_efficient_frontier() + ef2.max_sharpe() + + # L2_reg should pull close to equal weight + equal_weight = np.full((ef.n_assets,), 1 / ef.n_assets) + assert ( + np.abs(equal_weight - ef.weights).sum() + < np.abs(equal_weight - ef2.weights).sum() ) @@ -108,8 +327,9 @@ def test_max_sharpe_L2_reg_many_values(): ef.max_sharpe() # Count the number of weights more 1% initial_number = sum(ef.weights > 0.01) - for a in np.arange(0.5, 5, 0.5): - ef.gamma = a + for _ in range(10): + print(initial_number) + ef.add_objective(objective_functions.L2_reg, gamma=0.05) ef.max_sharpe() np.testing.assert_almost_equal(ef.weights.sum(), 1) new_number = sum(ef.weights > 0.01) @@ -118,12 +338,21 @@ def test_max_sharpe_L2_reg_many_values(): initial_number = new_number -def test_max_sharpe_L2_reg_limit_case(): +def test_max_sharpe_L2_reg_different_gamma(): ef = setup_efficient_frontier() - ef.gamma = 1e10 + ef.add_objective(objective_functions.L2_reg, gamma=1) ef.max_sharpe() - equal_weights = np.array([1 / ef.n_assets] * ef.n_assets) - np.testing.assert_array_almost_equal(ef.weights, equal_weights) + + ef2 = setup_efficient_frontier() + ef2.add_objective(objective_functions.L2_reg, gamma=0.01) + ef2.max_sharpe() + + # Higher gamma should pull close to equal weight + equal_weight = np.array([1 / ef.n_assets] * ef.n_assets) + assert ( + np.abs(equal_weight - ef.weights).sum() + < np.abs(equal_weight - ef2.weights).sum() + ) def test_max_sharpe_L2_reg_reduces_sharpe(): @@ -132,10 +361,9 @@ def test_max_sharpe_L2_reg_reduces_sharpe(): ef_no_reg.max_sharpe() sharpe_no_reg = ef_no_reg.portfolio_performance()[2] ef = setup_efficient_frontier() - ef.gamma = 1 + ef.add_objective(objective_functions.L2_reg, gamma=2) ef.max_sharpe() sharpe = ef.portfolio_performance()[2] - assert sharpe < sharpe_no_reg @@ -147,15 +375,14 @@ def test_max_sharpe_L2_reg_with_shorts(): ef = EfficientFrontier( *setup_efficient_frontier(data_only=True), weight_bounds=(None, None) ) - ef.gamma = 1 + ef.add_objective(objective_functions.L2_reg) w = ef.max_sharpe() assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) np.testing.assert_allclose( ef.portfolio_performance(), - (0.32360478341793864, 0.20241509658051923, 1.499911758296975), + (0.3076093180094401, 0.22415982749409985, 1.2830546901496447), ) new_number = sum(ef.weights > 0.01) assert new_number >= initial_number @@ -174,117 +401,91 @@ def test_max_sharpe_risk_free_rate(): assert new_sharpe >= initial_sharpe -def test_max_sharpe_input_errors(): - with pytest.raises(ValueError): - ef = EfficientFrontier( - *setup_efficient_frontier(data_only=True), gamma="2" - ) - - with warnings.catch_warnings(record=True) as w: - ef = EfficientFrontier( - *setup_efficient_frontier(data_only=True), gamma=-1) - assert len(w) == 1 - assert issubclass(w[0].category, UserWarning) - assert ( - str(w[0].message) - == "in most cases, gamma should be positive" - ) - - with pytest.raises(ValueError): - ef.max_sharpe(risk_free_rate="0.2") - - -def test_max_unconstrained_utility(): +def test_max_quadratic_utility(): ef = setup_efficient_frontier() - w = ef.max_unconstrained_utility(2) + w = ef.max_quadratic_utility(risk_aversion=2) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) + np.testing.assert_almost_equal(ef.weights.sum(), 1) + np.testing.assert_allclose( ef.portfolio_performance(), - (1.3507326549906276, 0.8218067458322021, 1.6192768698230409) + (0.40064324249527605, 0.2917825266124642, 1.3045443362029479), ) ret1, var1, _ = ef.portfolio_performance() # increasing risk_aversion should lower both vol and return - ef.max_unconstrained_utility(10) + ef.max_quadratic_utility(10) ret2, var2, _ = ef.portfolio_performance() assert ret2 < ret1 and var2 < var1 -def test_max_unconstrained_utility_error(): - ef = setup_efficient_frontier() - with pytest.raises(ValueError): - ef.max_unconstrained_utility(0) - with pytest.raises(ValueError): - ef.max_unconstrained_utility(-1) - - -def test_min_volatility(): - ef = setup_efficient_frontier() - w = ef.min_volatility() - assert isinstance(w, dict) - assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) +def test_max_quadratic_utility_with_shorts(): + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) + ) + ef.max_quadratic_utility() np.testing.assert_almost_equal(ef.weights.sum(), 1) - assert all([i >= 0 for i in w.values()]) + np.testing.assert_allclose( ef.portfolio_performance(), - (0.1791557243114251, 0.15915426422116669, 1.0000091740567905), + (1.3318330413711252, 1.0198436183533854, 1.2863080356272452), ) -def test_min_volatility_short(): +def test_max_quadratic_utility_market_neutral(): ef = EfficientFrontier( - *setup_efficient_frontier(data_only=True), weight_bounds=(None, None) + *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) ) - w = ef.min_volatility() - assert isinstance(w, dict) - assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) - np.testing.assert_almost_equal(ef.weights.sum(), 1) + ef.max_quadratic_utility(market_neutral=True) + np.testing.assert_almost_equal(ef.weights.sum(), 0) np.testing.assert_allclose( ef.portfolio_performance(), - (0.1719799152621441, 0.1555954785460613, 0.9767630568850568), + (1.13434841843883, 0.9896404148973286, 1.1260134506071473), ) - # Shorting should reduce volatility - volatility = ef.portfolio_performance()[1] - ef_long_only = setup_efficient_frontier() - ef_long_only.min_volatility() - long_only_volatility = ef_long_only.portfolio_performance()[1] - assert volatility < long_only_volatility +def test_max_quadratic_utility_limit(): + # in limit of large risk_aversion, this should approach min variance. + ef = setup_efficient_frontier() + ef.max_quadratic_utility(risk_aversion=1e10) -def test_min_volatility_L2_reg(): + ef2 = setup_efficient_frontier() + ef2.min_volatility() + np.testing.assert_array_almost_equal(ef.weights, ef2.weights) + + +def test_max_quadratic_utility_L2_reg(): ef = setup_efficient_frontier() - ef.gamma = 1 - w = ef.min_volatility() - assert isinstance(w, dict) - assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) - np.testing.assert_almost_equal(ef.weights.sum(), 1) - assert all([i >= 0 for i in w.values()]) + ef.add_objective(objective_functions.L2_reg, gamma=5) + weights = ef.max_quadratic_utility() + assert isinstance(weights, dict) + assert set(weights.keys()) == set(ef.tickers) + np.testing.assert_almost_equal(ef.weights.sum(), 1) + assert all([i >= 0 for i in weights.values()]) np.testing.assert_allclose( ef.portfolio_performance(), - (0.23136193240984504, 0.1955259140191799, 1.0809919159314694), + (0.2602803268728476, 0.21603540587515674, 1.112226608872166), ) + ef2 = setup_efficient_frontier() + ef2.max_quadratic_utility() -def test_min_volatility_L2_reg_many_values(): + # L2_reg should pull close to equal weight + equal_weight = np.full((ef.n_assets,), 1 / ef.n_assets) + assert ( + np.abs(equal_weight - ef.weights).sum() + < np.abs(equal_weight - ef2.weights).sum() + ) + + +def test_max_quadratic_utility_error(): ef = setup_efficient_frontier() - ef.min_volatility() - # Count the number of weights more 1% - initial_number = sum(ef.weights > 0.01) - for a in np.arange(0.5, 5, 0.5): - ef.gamma = a - ef.min_volatility() - np.testing.assert_almost_equal(ef.weights.sum(), 1) - new_number = sum(ef.weights > 0.01) - # Higher gamma should reduce the number of small weights - assert new_number >= initial_number - initial_number = new_number + with pytest.raises(ValueError): + ef.max_quadratic_utility(0) + with pytest.raises(ValueError): + ef.max_quadratic_utility(-1) def test_efficient_risk(): @@ -292,11 +493,12 @@ def test_efficient_risk(): w = ef.efficient_risk(0.19) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) assert all([i >= 0 for i in w.values()]) np.testing.assert_allclose( - ef.portfolio_performance(), (0.2857747021087114, 0.19, 1.3988133092245933), atol=1e-6 + ef.portfolio_performance(), + (0.28577452556155075, 0.19, 1.3988132892376837), + atol=1e-6, ) @@ -304,28 +506,34 @@ def test_efficient_risk_error(): ef = setup_efficient_frontier() ef.min_volatility() min_possible_vol = ef.portfolio_performance()[1] - with pytest.raises(ValueError): + + ef = setup_efficient_frontier() + assert ef.efficient_risk(min_possible_vol + 0.01) + + ef = setup_efficient_frontier() + with pytest.raises(exceptions.OptimizationError): # This volatility is too low ef.efficient_risk(min_possible_vol - 0.01) def test_efficient_risk_many_values(): - ef = setup_efficient_frontier() - for target_risk in np.arange(0.16, 0.21, 0.30): + for target_risk in np.array([0.16, 0.21, 0.30]): + ef = setup_efficient_frontier() ef.efficient_risk(target_risk) np.testing.assert_almost_equal(ef.weights.sum(), 1) volatility = ef.portfolio_performance()[1] - assert abs(target_risk - volatility) < 0.05 + print(volatility) + assert abs(target_risk - volatility) < 1e-5 def test_efficient_risk_short(): ef = EfficientFrontier( - *setup_efficient_frontier(data_only=True), weight_bounds=(None, None) + *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) ) w = ef.efficient_risk(0.19) + assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) np.testing.assert_allclose( ef.portfolio_performance(), @@ -335,7 +543,7 @@ def test_efficient_risk_short(): sharpe = ef.portfolio_performance()[2] ef_long_only = setup_efficient_frontier() - ef_long_only.efficient_return(0.25) + ef_long_only.efficient_risk(0.19) long_only_sharpe = ef_long_only.portfolio_performance()[2] assert sharpe > long_only_sharpe @@ -343,59 +551,71 @@ def test_efficient_risk_short(): def test_efficient_risk_L2_reg(): ef = setup_efficient_frontier() - ef.gamma = 1 - w = ef.efficient_risk(0.19) - assert isinstance(w, dict) - assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) - np.testing.assert_almost_equal(ef.weights.sum(), 1) - assert all([i >= 0 for i in w.values()]) + ef.add_objective(objective_functions.L2_reg, gamma=5) + weights = ef.efficient_risk(0.19) + assert isinstance(weights, dict) + assert set(weights.keys()) == set(ef.tickers) + np.testing.assert_almost_equal(ef.weights.sum(), 1) + assert all([i >= 0 for i in weights.values()]) np.testing.assert_allclose( ef.portfolio_performance(), - (0.28437776398043807, 0.19, 1.3914587310224322), + (0.24087463760460398, 0.19, 1.162498090632486), atol=1e-6, ) + ef2 = setup_efficient_frontier() + ef2.efficient_risk(0.19) -def test_efficient_risk_L2_reg_many_values(): - ef = setup_efficient_frontier() - ef.efficient_risk(0.19) - # Count the number of weights more 1% - initial_number = sum(ef.weights > 0.01) - for a in np.arange(0.5, 5, 0.5): - ef.gamma = a - ef.efficient_risk(0.2) - np.testing.assert_almost_equal(ef.weights.sum(), 1) - new_number = sum(ef.weights > 0.01) - # Higher gamma should reduce the number of small weights - assert new_number >= initial_number - initial_number = new_number + # L2_reg should pull close to equal weight + equal_weight = np.full((ef.n_assets,), 1 / ef.n_assets) + assert ( + np.abs(equal_weight - ef.weights).sum() + < np.abs(equal_weight - ef2.weights).sum() + ) def test_efficient_risk_market_neutral(): ef = EfficientFrontier( *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) ) - w = ef.efficient_risk(0.19, market_neutral=True) + w = ef.efficient_risk(0.21, market_neutral=True) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 0) assert (ef.weights < 1).all() and (ef.weights > -1).all() np.testing.assert_allclose( ef.portfolio_performance(), - (0.2309497469633197, 0.19, 1.1102605909328953), - atol=1e-6 + (0.2552600197428133, 0.21, 1.1202858085349783), + atol=1e-6, ) sharpe = ef.portfolio_performance()[2] ef_long_only = setup_efficient_frontier() - ef_long_only.efficient_return(0.25) + ef_long_only.efficient_risk(0.21) long_only_sharpe = ef_long_only.portfolio_performance()[2] assert long_only_sharpe > sharpe +def test_efficient_risk_market_neutral_L2_reg(): + ef = EfficientFrontier( + *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) + ) + ef.add_objective(objective_functions.L2_reg) + + w = ef.efficient_risk(0.19, market_neutral=True) + assert isinstance(w, dict) + assert set(w.keys()) == set(ef.tickers) + np.testing.assert_almost_equal(ef.weights.sum(), 0) + assert (ef.weights < 1).all() and (ef.weights > -1).all() + + np.testing.assert_allclose( + ef.portfolio_performance(), + (0.10755645826336145, 0.11079556786108302, 0.7902523535340413), + atol=1e-6, + ) + + def test_efficient_risk_market_neutral_warning(): ef = setup_efficient_frontier() with warnings.catch_warnings(record=True) as w: @@ -413,19 +633,23 @@ def test_efficient_return(): w = ef.efficient_return(0.25) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) assert all([i >= 0 for i in w.values()]) np.testing.assert_allclose( - ef.portfolio_performance(), (0.25, 0.1738877891235972, 1.3226920714748545), atol=1e-6 + ef.portfolio_performance(), + (0.25, 0.1738852429895079, 1.3227114391408021), + atol=1e-6, ) def test_efficient_return_error(): ef = setup_efficient_frontier() max_ret = ef.expected_returns.max() + with pytest.raises(ValueError): - # This volatility is too low + ef.efficient_return(-0.1) + with pytest.raises(ValueError): + # This return is too high ef.efficient_return(max_ret + 0.01) @@ -446,10 +670,9 @@ def test_efficient_return_short(): w = ef.efficient_return(0.25) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) np.testing.assert_allclose( - ef.portfolio_performance(), (0.25, 0.1682647442258144, 1.3668935881968987) + ef.portfolio_performance(), (0.25, 0.16826225873038014, 1.3669137793315087) ) sharpe = ef.portfolio_performance()[2] @@ -462,34 +685,17 @@ def test_efficient_return_short(): def test_efficient_return_L2_reg(): ef = setup_efficient_frontier() - ef.gamma = 1 + ef.add_objective(objective_functions.L2_reg, gamma=1) w = ef.efficient_return(0.25) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) assert all([i >= 0 for i in w.values()]) np.testing.assert_allclose( - ef.portfolio_performance(), (0.25, 0.20032972845476912, 1.1481071819692497) + ef.portfolio_performance(), (0.25, 0.20033592447690426, 1.1480716731187948) ) -def test_efficient_return_L2_reg_many_values(): - ef = setup_efficient_frontier() - ef.efficient_return(0.25) - # Count the number of weights more 1% - initial_number = sum(ef.weights > 0.01) - for a in np.arange(0.5, 5, 0.5): - ef.gamma = a - ef.efficient_return(0.20) - np.testing.assert_almost_equal(ef.weights.sum(), 1) - assert all([i >= 0 for i in ef.weights]) - new_number = sum(ef.weights > 0.01) - # Higher gamma should reduce the number of small weights - assert new_number >= initial_number - initial_number = new_number - - def test_efficient_return_market_neutral(): ef = EfficientFrontier( *setup_efficient_frontier(data_only=True), weight_bounds=(-1, 1) @@ -497,12 +703,10 @@ def test_efficient_return_market_neutral(): w = ef.efficient_return(0.25, market_neutral=True) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 0) assert (ef.weights < 1).all() and (ef.weights > -1).all() np.testing.assert_almost_equal( - ef.portfolio_performance(), - (0.25, 0.20567621957479246, 1.1182624830289896) + ef.portfolio_performance(), (0.25, 0.20567263154580923, 1.1182819914898223) ) sharpe = ef.portfolio_performance()[2] ef_long_only = setup_efficient_frontier() @@ -512,6 +716,7 @@ def test_efficient_return_market_neutral(): def test_efficient_return_market_neutral_warning(): + # This fails ef = setup_efficient_frontier() with warnings.catch_warnings(record=True) as w: ef.efficient_return(0.25, market_neutral=True) @@ -530,12 +735,11 @@ def test_max_sharpe_semicovariance(): w = ef.max_sharpe() assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) assert all([i >= 0 for i in w.values()]) np.testing.assert_allclose( ef.portfolio_performance(), - (0.2972237371625498, 0.06443267303123411, 4.302533545801584) + (0.2972184894480104, 0.06443145011260347, 4.302533762060766), ) @@ -548,44 +752,46 @@ def test_max_sharpe_short_semicovariance(): w = ef.max_sharpe() assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) np.testing.assert_allclose( ef.portfolio_performance(), - (0.3564654865246848, 0.07202031837368413, 4.671813373260894) + (0.3564305116656491, 0.07201282488003401, 4.671813836300796), ) -def test_min_volatilty_semicovariance_L2_reg(): +def test_min_volatilty_shrunk_L2_reg(): df = get_data() ef = setup_efficient_frontier() - ef.gamma = 1 - ef.cov_matrix = risk_models.semicovariance(df, benchmark=0) + ef.add_objective(objective_functions.L2_reg) + + ef.cov_matrix = risk_models.CovarianceShrinkage(df).ledoit_wolf( + shrinkage_target="constant_correlation" + ) w = ef.min_volatility() + assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) assert all([i >= 0 for i in w.values()]) np.testing.assert_allclose( ef.portfolio_performance(), - (0.23803779483710888, 0.0962263031034166, 2.265885603053655) + (0.23127405291296832, 0.19563921371709164, 1.079916694096337), ) -def test_efficient_return_semicovariance(): +def test_efficient_return_shrunk(): df = get_data() ef = setup_efficient_frontier() - ef.cov_matrix = risk_models.semicovariance(df, benchmark=0) - w = ef.efficient_return(0.12) + ef.cov_matrix = risk_models.CovarianceShrinkage(df).ledoit_wolf( + shrinkage_target="single_factor" + ) + w = ef.efficient_return(0.22) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) assert all([i >= 0 for i in w.values()]) np.testing.assert_allclose( - ef.portfolio_performance(), - (0.11999999997948813, 0.06948386215256849, 1.4391830977949114) + ef.portfolio_performance(), (0.22, 0.0849639369932322, 2.353939884117318) ) @@ -596,29 +802,27 @@ def test_max_sharpe_exp_cov(): w = ef.max_sharpe() assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) assert all([i >= 0 for i in w.values()]) np.testing.assert_allclose( ef.portfolio_performance(), - (0.3678835305574766, 0.17534146043561463, 1.9840346355802103) + (0.3678817256187322, 0.1753405505478982, 1.9840346373481956), ) def test_min_volatility_exp_cov_L2_reg(): df = get_data() ef = setup_efficient_frontier() - ef.gamma = 1 + ef.add_objective(objective_functions.L2_reg) ef.cov_matrix = risk_models.exp_cov(df) w = ef.min_volatility() assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 1) assert all([i >= 0 for i in w.values()]) np.testing.assert_allclose( ef.portfolio_performance(), - (0.24340406492258035, 0.17835396894670616, 1.2525881326999546) + (0.2434082300792007, 0.17835412793427002, 1.2526103694192867), ) @@ -631,11 +835,10 @@ def test_efficient_risk_exp_cov_market_neutral(): w = ef.efficient_risk(0.19, market_neutral=True) assert isinstance(w, dict) assert set(w.keys()) == set(ef.tickers) - assert set(w.keys()) == set(ef.expected_returns.index) np.testing.assert_almost_equal(ef.weights.sum(), 0) assert (ef.weights < 1).all() and (ef.weights > -1).all() np.testing.assert_allclose( ef.portfolio_performance(), - (0.39089308906686077, 0.19, 1.9520670176494717), - atol=1e-6 + (0.3908928033782067, 0.18999999995323363, 1.9520673866815672), + atol=1e-6, ) diff --git a/tests/test_hrp.py b/tests/test_hrp.py index 54858844..2096a998 100644 --- a/tests/test_hrp.py +++ b/tests/test_hrp.py @@ -1,6 +1,6 @@ import numpy as np import pytest -from pypfopt.hierarchical_risk_parity import HRPOpt +from pypfopt import HRPOpt from tests.utilities_for_tests import get_data @@ -8,7 +8,7 @@ def test_hrp_portfolio(): df = get_data() returns = df.pct_change().dropna(how="all") hrp = HRPOpt(returns) - w = hrp.hrp_portfolio() + w = hrp.optimize() assert isinstance(w, dict) assert set(w.keys()) == set(df.columns) np.testing.assert_almost_equal(sum(w.values()), 1) @@ -21,7 +21,7 @@ def test_portfolio_performance(): hrp = HRPOpt(returns) with pytest.raises(ValueError): hrp.portfolio_performance() - hrp.hrp_portfolio() + hrp.optimize() assert hrp.portfolio_performance() diff --git a/tests/test_objective_functions.py b/tests/test_objective_functions.py index a18822a9..f89bf6bc 100644 --- a/tests/test_objective_functions.py +++ b/tests/test_objective_functions.py @@ -6,22 +6,37 @@ from tests.utilities_for_tests import get_data -def test_negative_mean_return_dummy(): +def test_volatility_dummy(): + w = np.array([0.4, 0.4, 0.2]) + data = np.diag([0.5, 0.8, 0.9]) + test_var = objective_functions.portfolio_variance(w, data) + np.testing.assert_almost_equal(test_var, 0.244) + + +def test_volatility(): + df = get_data() + S = sample_cov(df) + w = np.array([1 / df.shape[1]] * df.shape[1]) + var = objective_functions.portfolio_variance(w, S) + np.testing.assert_almost_equal(var, 0.04498224489292057) + + +def test_portfolio_return_dummy(): w = np.array([0.3, 0.1, 0.2, 0.25, 0.15]) e_rets = pd.Series([0.19, 0.08, 0.09, 0.23, 0.17]) - negative_mu = objective_functions.negative_mean_return(w, e_rets) - assert isinstance(negative_mu, float) - assert negative_mu < 0 - np.testing.assert_almost_equal(negative_mu, -w.dot(e_rets)) - np.testing.assert_almost_equal(negative_mu, -(w * e_rets).sum()) + mu = objective_functions.portfolio_return(w, e_rets, negative=False) + assert isinstance(mu, float) + assert mu > 0 + np.testing.assert_almost_equal(mu, w.dot(e_rets)) + np.testing.assert_almost_equal(mu, (w * e_rets).sum()) -def test_negative_mean_return_real(): +def test_portfolio_return_real(): df = get_data() e_rets = mean_historical_return(df) w = np.array([1 / len(e_rets)] * len(e_rets)) - negative_mu = objective_functions.negative_mean_return(w, e_rets) + negative_mu = objective_functions.portfolio_return(w, e_rets) assert isinstance(negative_mu, float) assert negative_mu < 0 assert negative_mu == -w.dot(e_rets) @@ -29,68 +44,49 @@ def test_negative_mean_return_real(): np.testing.assert_almost_equal(-e_rets.sum() / len(e_rets), negative_mu) -def test_negative_sharpe(): +def test_sharpe_ratio(): df = get_data() e_rets = mean_historical_return(df) S = sample_cov(df) w = np.array([1 / len(e_rets)] * len(e_rets)) - sharpe = objective_functions.negative_sharpe(w, e_rets, S) + sharpe = objective_functions.sharpe_ratio(w, e_rets, S) assert isinstance(sharpe, float) assert sharpe < 0 sigma = np.sqrt(np.dot(w, np.dot(S, w.T))) - negative_mu = objective_functions.negative_mean_return(w, e_rets) + negative_mu = objective_functions.portfolio_return(w, e_rets) np.testing.assert_almost_equal(sharpe * sigma - 0.02, negative_mu) # Risk free rate increasing should lead to negative Sharpe increasing. - assert sharpe < objective_functions.negative_sharpe( - w, e_rets, S, risk_free_rate=0.1 - ) + assert sharpe < objective_functions.sharpe_ratio(w, e_rets, S, risk_free_rate=0.1) + +def test_L2_reg_dummy(): + gamma = 2 + w = np.array([0.1, 0.2, 0.3, 0.4]) + L2_reg = objective_functions.L2_reg(w, gamma=gamma) + np.testing.assert_almost_equal(L2_reg, gamma * np.sum(w * w)) -def test_negative_quadratic_utility(): + +def test_quadratic_utility(): df = get_data() e_rets = mean_historical_return(df) S = sample_cov(df) w = np.array([1 / len(e_rets)] * len(e_rets)) - utility = objective_functions.negative_quadratic_utility( - w, e_rets, S, risk_aversion=3 - ) + utility = objective_functions.quadratic_utility(w, e_rets, S, risk_aversion=3) assert isinstance(utility, float) assert utility < 0 - mu = -objective_functions.negative_mean_return(w, e_rets) - variance = np.dot(w, np.dot(S, w.T)) + mu = objective_functions.portfolio_return(w, e_rets, negative=False) + variance = objective_functions.portfolio_variance(w, S) np.testing.assert_almost_equal(-utility + 3 / 2 * variance, mu) -def test_volatility_dummy(): - w = np.array([0.4, 0.4, 0.2]) - data = np.diag([0.5, 0.8, 0.9]) - test_var = objective_functions.volatility(w, data) - np.testing.assert_almost_equal(test_var, 0.244) - +def test_transaction_costs(): + old_w = np.array([0.1, 0.2, 0.3]) + new_w = np.array([-0.3, 0.1, 0.2]) -def test_volatility(): - df = get_data() - S = sample_cov(df) - w = np.array([1 / df.shape[1]] * df.shape[1]) - var = objective_functions.volatility(w, S) - np.testing.assert_almost_equal(var, 0.04498224489292057) - - -def test_cvar(): - df = get_data() - returns = df.pct_change().dropna(how="all") - w = np.array([1 / df.shape[1]] * df.shape[1]) - cvar0 = objective_functions.negative_cvar(w, returns, s=5000, random_state=0) - assert cvar0 > 0 - cvar1 = objective_functions.negative_cvar( - w, returns, s=5000, beta=0.98, random_state=0 - ) - assert cvar1 > 0 - - # Nondeterministic - cvar2 = objective_functions.negative_cvar(w, returns, s=5000, random_state=1) - assert not cvar0 == cvar2 + k = 0.1 + tx_cost = k * np.abs(old_w - new_w).sum() + assert tx_cost == objective_functions.transaction_cost(new_w, old_w, k=k) diff --git a/tests/test_risk_models.py b/tests/test_risk_models.py index 2c99f6e6..3220bda3 100644 --- a/tests/test_risk_models.py +++ b/tests/test_risk_models.py @@ -34,6 +34,7 @@ def test_sample_cov_real_data(): assert S.index.equals(df.columns) assert S.index.equals(S.columns) assert S.notnull().all().all() + assert risk_models._is_positive_semidefinite(S) def test_sample_cov_type_warning(): @@ -67,6 +68,7 @@ def test_semicovariance(): assert S.index.equals(df.columns) assert S.index.equals(S.columns) assert S.notnull().all().all() + assert risk_models._is_positive_semidefinite(S) S2 = risk_models.semicovariance(df, frequency=2) pd.testing.assert_frame_equal(S / 126, S2) @@ -90,6 +92,7 @@ def test_exp_cov_matrix(): assert S.index.equals(df.columns) assert S.index.equals(S.columns) assert S.notnull().all().all() + assert risk_models._is_positive_semidefinite(S) S2 = risk_models.exp_cov(df, frequency=2) pd.testing.assert_frame_equal(S / 126, S2) @@ -112,6 +115,10 @@ def test_min_cov_det(): assert S.index.equals(df.columns) assert S.index.equals(S.columns) assert S.notnull().all().all() + # Min cov det is NOT positive semidefinite for this example. + # Warning has been added to docs. + # assert risk_models._is_positive_semidefinite(S) + S2 = risk_models.min_cov_determinant(df, frequency=2, random_state=8) pd.testing.assert_frame_equal(S / 126, S2) @@ -146,6 +153,7 @@ def test_shrunk_covariance(): assert list(shrunk_cov.index) == list(df.columns) assert list(shrunk_cov.columns) == list(df.columns) assert not shrunk_cov.isnull().any().any() + assert risk_models._is_positive_semidefinite(shrunk_cov) def test_shrunk_covariance_extreme_delta(): @@ -180,6 +188,7 @@ def test_ledoit_wolf_default(): assert list(shrunk_cov.index) == list(df.columns) assert list(shrunk_cov.columns) == list(df.columns) assert not shrunk_cov.isnull().any().any() + assert risk_models._is_positive_semidefinite(shrunk_cov) def test_ledoit_wolf_single_index(): @@ -191,6 +200,7 @@ def test_ledoit_wolf_single_index(): assert list(shrunk_cov.index) == list(df.columns) assert list(shrunk_cov.columns) == list(df.columns) assert not shrunk_cov.isnull().any().any() + assert risk_models._is_positive_semidefinite(shrunk_cov) def test_ledoit_wolf_constant_correlation(): @@ -202,6 +212,7 @@ def test_ledoit_wolf_constant_correlation(): assert list(shrunk_cov.index) == list(df.columns) assert list(shrunk_cov.columns) == list(df.columns) assert not shrunk_cov.isnull().any().any() + assert risk_models._is_positive_semidefinite(shrunk_cov) def test_ledoit_wolf_raises_not_implemented(): @@ -220,3 +231,4 @@ def test_oracle_approximating(): assert list(shrunk_cov.index) == list(df.columns) assert list(shrunk_cov.columns) == list(df.columns) assert not shrunk_cov.isnull().any().any() + assert risk_models._is_positive_semidefinite(shrunk_cov) diff --git a/tests/test_value_at_risk.py b/tests/test_value_at_risk.py deleted file mode 100644 index 34f927d4..00000000 --- a/tests/test_value_at_risk.py +++ /dev/null @@ -1,53 +0,0 @@ -import numpy as np -import pytest -from pypfopt.value_at_risk import CVAROpt -from tests.utilities_for_tests import get_data - - -def test_init_cvar(): - df = get_data() - returns = df.pct_change().dropna(how="all") - vr = CVAROpt(returns) - assert list(vr.tickers) == list(df.columns) - - # Inheritance - assert vr.bounds == ((0, 1),) * len(df.columns) - assert vr.clean_weights - assert isinstance(vr.initial_guess, np.ndarray) - assert isinstance(vr.constraints, list) - - -def test_init_cvar_errors(): - df = get_data() - returns = df.pct_change().dropna(how="all") - with pytest.raises(ValueError): - vr = CVAROpt(returns, weight_bounds=(0.5, 1)) - with pytest.raises(AttributeError): - vr = CVAROpt(returns) - vr.clean_weights() - with pytest.raises(TypeError): - vr = CVAROpt(returns.values) - returns_list = df.values.tolist() - with pytest.raises(TypeError): - vr = CVAROpt(returns_list) - - -def test_cvar_weights(): - df = get_data() - returns = df.pct_change().dropna(how="all") - vr = CVAROpt(returns) - w = vr.min_cvar(s=100, random_state=0) - assert isinstance(w, dict) - assert set(w.keys()) == set(df.columns) - assert set(w.keys()) == set(vr.tickers) - np.testing.assert_almost_equal(vr.weights.sum(), 1) - - -def test_cvar_bounds(): - # TODO - pass - - -def test_cvar_beta(): - # TODO - pass