We consider the equilibrium drawdown distribution for a biased random walk — in the context of a repeated investment game, the drawdown at a given time is how much has been lost relative to the maximum capital held up to that time. We show that in the tail, this is exponential. Further, when mean drift is small, this has an exponent that is universal in form, depending only on the mean and standard deviation of the step distribution. We give simulation examples in python consistent with the results.
We give a short introduction to neural networks and the backpropagation algorithm for training neural networks. Our overview is brief because we assume familiarity with partial derivatives, the chain rule, and matrix multiplication.
We also hope this post will be a quick reference for those already familiar with the notation used by Andrew Ng in his course on “Neural Networks and Deep Learning”, the first in the deeplearning.ai series on Coursera. That course provides but doesn’t derive the vectorized form of the backpropagation equations, so we hope to fill in that small gap while using the same notation.
Here, we highlight one of the most important benefits of tax protected accounts (eg Traditional and Roth IRAs and 401ks). Specifically, we review the fact that not having to pay taxes on any investment growth that occurs while the money is held in the account results in compounding / exponential growth with a larger exponent than would be obtained in a traditional account.
We illustrate the application of two linear compression algorithms in python: Principal component analysis (PCA) and least-squares feature selection. Both can be used to compress a passed array, and they both work by stripping out redundant columns from the array. The two differ in that PCA operates in a particular rotated frame, while the feature selection solution operates directly on the original columns. As we illustrate below, PCA always gives a stronger compression. However, the feature selection solution is often comparably strong, and its output has the benefit of being relatively easy to interpret — a virtue that is important for many applications.
This is a tutorial post relating to our python feature selection package,
linselect. The package allows one to easily identify minimal, informative feature subsets within a given data set.
Here, we demonstrate
linselect‘s basic API by exploring the relationship between the daily percentage lifts of 50 tech stocks over one trading year. We will be interested in identifying minimal stock subsets that can be used to predict the lifts of the others.
This is a demonstration walkthrough, with commentary and interpretation throughout. See the package docs folder for docstrings that succinctly detail the API.
- Load the data and examine some stock traces
- FwdSelect, RevSelect; supervised, single target
- FwdSelect, RevSelect; supervised, multiple targets
- FwdSelect, RevSelect; unsupervised
The data and a Jupyter notebook containing the code for this demo are available on our github, here.
linselect package can be found on our github, here.
It has been quite awhile since I have posted, largely because soon after I started my job at Square I had a child! I hope to have some newer blog post soon. But along those lines I want to share a blog post I did with a coworker (Juan Hernandez) for Square that gives a taste of some of the cool data science work we have been up to. This post covers work we did to create a framework for making models interpretable.