Machine Recognition in Finance: Features and Indicators
By Richard Hoppe, ITRAC.com
Recent advancements in machine learning have demonstrated that the market can be timed with enough accuracy to significantly improve returns relative to the “buy and hold” strategy. We will demonstrate a way to accomplish this by opening up proprietary “black box” software and systems. This software (see ITRAC.com) has been used by large international banks and financial institutions to direct trading. We present and discuss the main concepts, approaches and algorithms below.
Beating the market is hard to do. Most money managers fail to produce consistent, above average returns. Managers might exceed a benchmark’s returns for some limited period of time, but eventually luck runs out. This article describes a method that consistently improves performance in the market, over extended periods, even during difficult and volatile times. This has been made possible with recent advances in deep neural networks and learning algorithms.
The motivation for disclosing this approach is in part due to frustration at the lack of rigor and discipline of some contemporary forecasting approaches. For example, looking at the stars and determining what the market may do tomorrow seems very wrong to us.
Then how do you determine how to make above “normal” returns? You can do your own research, write your own algorithms, verify your work, and if you have enough time, energy and luck, perhaps discover an edge. Or you could follow some of the many commercially available systems long enough to determine if they really do have an edge. Unfortunately, it takes a long time to distinguish luck from skill in this business. I’m sure you’ve heard the saying “everyone’s a genius in a bull market.” Unless you’ve monitored performance through several difficult market periods (e.g., 1987 and 2008), you don’t really know much about the expected performance in the next difficult market. Your best option is to look under the hood of a system which appears to be successful and attempt to understand its methodology. If the methods and techniques are something you understand and agree with, and they appear mathematically sound and statistically correct, then perhaps you might weigh that advice more heavily in your investment decisions.
Contemporary technical analysis involves the human interpretation of patterns and charts. The effectiveness of technical analysis is a matter of much controversy, mostly because it relies on subjective interpretation. A myriad of price patterns exists, such as triangles, continuations, reversals and doji hammers.
A quick search of Wiki yields additional examples like:
● Double top and double bottoms
● Flag and pennants
● Head and shoulders
● Island reversals
● Price channels
● Triple top and triple bottoms
● Wedge patterns
As an example, here is a graphic of the “flagpole” and “pennant” pattern:
The problem with this traditional approach has been the difficulty of specifying the chart patterns in a manner that permits objective testing. None of the patterns shown above is pinned to either a time or a price axis. They consist of two-dimensional patterns with no constraints on time or price. These patterns can occur over two minutes, three days, or four weeks. It’s like seeing things in the clouds.
Assuming one does subjectively “recognize” one of these patterns: how often, how much, and when would it result in a profitable trade? Very rarely do we get this kind of detailed information. Technical analysis should be approached in a methodical and disciplined manner. It should be restricted to objective methods that can be simulated on historical data. In addition, the performance indicated by testing should be correctly evaluated.
Let’s look at a better methodology that uses less subjectivity. IntelliTrade, Inc. has a near-term (next day) S&P 500 Index forecaster called ITRAC. This forecast gives either a “long” or “flat” position signal each evening. This information is then used to modulate position sizes to both increase returns and reduce drawdowns. Position changes occur on average less than once a week, so it is not necessary to trade frequently.
The information is used to enhance performance relative to a “buy and hold” market strategy. It can be applied to the S&P 500 index via futures, ETFs, or baskets of equities acting as a proxy to the index. Users can fully modulate their holdings based on the forecast, or can scale in and out of positions as lightly or aggressively as they choose. ITRAC uses computer programs that learn statistically relevant information from past price patterns to make inferences about future price changes in the following way.
Each day that the S&P 500 index is traded, the new closing price is appended to the end of the data stream. The system is run on the updated data, and an output is produced. The output is a number from 0 to 10, which represents the likelihood of the S&P 500 index being higher on its next close. A 0 output is taken as least likely to be higher, while a 10 is taken as most likely to close higher on the next traded day. Typically, if that output is above a user-set threshold, then a long position is established on the S&P 500 index with an after-market trade on an ETF or a futures contract. We won’t discuss the mechanics of the trading, or money management issues here. Even though money management and trading mechanics are important subjects, a good forecast seems much more elusive and valuable these days.
The following shows how ITRAC establishes a principled classification of patterns, in order to objectively specify, reproduce and observe the statistical outcomes. ITRAC uses techniques recently referred to as “deep learning.” Deep learning is part of a broader family of machine learning methods based on learning representations of data. An observation (e.g., an image) can be represented in many ways, such as a vector of values per pixel, or in a more abstract way as regions of a particular shape. Some representations are better than others at simplifying the learning task. One of the promises of deep learning is replacing handcrafted features with efficient algorithms for feature learning.
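As a rough illustration of the thresholding step just described, here is a minimal Python sketch. The function name `position_signal` and the threshold values are hypothetical choices made for this example; the article does not specify how ITRAC users pick their thresholds.

```python
def position_signal(score: float, threshold: float = 5.0) -> str:
    """Turn a 0-10 forecast output into a 'long' or 'flat' position."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must lie between 0 and 10")
    # Go long only when the output is strictly above the user-set threshold.
    return "long" if score > threshold else "flat"


# Four hypothetical evening outputs, evaluated with a threshold of 6.
scores = [2.1, 7.4, 5.0, 9.3]
signals = [position_signal(s, threshold=6.0) for s in scores]
print(signals)
```

A higher threshold means fewer, more selective long positions; a lower threshold keeps the user in the market more often.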
Let’s begin by defining a feature called a price “trace.” A price trace is a 2-dimensional plot of closing price against time, the horizontal axis being time, in days, and the vertical axis being the percent change of price, relative to the last price in the series. A trace can be any length of time, for instance, the last 60 days. A typical 60 day “price trace” could look like this:
Note that this trace could have occurred anytime in the past, or it could be current. Think of it as any 60 day segment of the price. Note also that any number of these segments can be drawn on the graph, all plotted relative to their own last price in the series:
These traces can be thought of as “patterns” or “features.” Each S&P 500 closing price has its own history or “trace” associated with it. It also has its own future price(s) associated with it. The task is to correlate the past “trace” with its future price, in order to make a forecast.
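To make the definition concrete, here is a small sketch of how a trace and its associated next-day outcome might be computed with NumPy. The function names `price_trace` and `next_day_label` and the five sample prices are invented for illustration; only the definition itself (percent change relative to the last close in the window) comes from the text.

```python
import numpy as np


def price_trace(closes, end, depth=60):
    """Percent change of the `depth` closes ending at index `end`,
    each measured relative to the last close in that window."""
    start = end - depth + 1
    if start < 0 or end >= len(closes):
        raise ValueError("not enough history for this trace")
    window = np.asarray(closes[start : end + 1], dtype=float)
    return 100.0 * (window / window[-1] - 1.0)


def next_day_label(closes, end):
    """+1 if the close after `end` is higher than the close at `end`, else -1."""
    return 1 if closes[end + 1] > closes[end] else -1


# Made-up closing prices, used only to show the mechanics.
closes = [100.0, 102.0, 101.0, 104.0, 103.0]
trace = price_trace(closes, end=3, depth=4)
label = next_day_label(closes, end=3)
print(trace, label)
```

Because every trace is measured against its own final close, the last value is always zero, which is why all traces converge on a single point at the far right of the plot.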
Let’s call the surface that these traces are drawn on a “MAP”. It is simply a matrix, grid, Cartesian coordinates or x-y plot. The important thing to note is that all of the traces focus into a single point at the far right, because all of the traces are plotted relative to their own closing price at the far right.
The next image shows thousands of “traces” from the S&P 500 time series, all drawn on a “map”, focused at the far right. If the price of each particular trace on the next day was higher than the reference closing price, it was colored green. If the next day's close was lower, it was colored red.
This clearly shows that most of the traces that resulted in a higher close on the next day (concentrated in the predominantly green areas) approached today’s closing price along a path from the lower left. At first glance this supports the notion of price momentum. Note that much of the red area (indicating a future price lower than today’s close) exists in the upper left portion of the map’s trace area. This indicates that prices which had been falling over that time scale were likely to be followed by a lower close.
In other words, by simply projecting today’s “trace” onto the map surface, and noting whether it was located in a more red or a more green area as it approached today, one could infer the likely future price with some as yet unknown accuracy. The advantage of this pattern recognition approach is that it is completely objective, reproducible, and testable. It is completely defined mathematically, with no subjective interpretations of time scale or price scale. Hopefully the general concept of using a trace as a representation for pattern recognition and binary classification tasks has been demonstrated.
Let’s further refine the concept. Suppose that instead of coloring each cell (pixel) red (a lower future price outcome) or green (a higher future price outcome), we average a +1 into each cell along the trace’s path if it resulted in a higher close, and average a -1 into each cell whose trace resulted in a lower price in the future. This would form a 3-dimensional map surface. The higher “ridges” would represent areas along trace paths where past prices were more likely to result in a higher future price, while the valleys (the red areas) would represent likely lower future prices. By laying a new trace on the map, and summing all the cells that the new trace crosses, we can get a number that is correlated with the likely outcome or classification of that trace: win or loss. The maps encode discrete win/loss frequency distributions for various time periods.
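The averaging and summing steps described above can be sketched as follows. This is an illustrative reading of the text, not ITRAC's actual code: the grid size, the ±10% vertical range, the clipping of out-of-range values, and the names `trace_rows`, `build_map` and `score_trace` are all assumptions made for the example.

```python
import numpy as np


def trace_rows(trace, n_rows, y_max):
    """Map percent changes in [-y_max, +y_max] to integer row indices."""
    clipped = np.clip(trace, -y_max, y_max)
    return np.rint((clipped + y_max) / (2 * y_max) * (n_rows - 1)).astype(int)


def build_map(closes, depth=20, n_rows=41, y_max=10.0):
    """Average +1/-1 next-day outcomes into every cell each trace crosses."""
    closes = np.asarray(closes, dtype=float)
    total = np.zeros((n_rows, depth))
    count = np.zeros((n_rows, depth))
    cols = np.arange(depth)
    for end in range(depth - 1, len(closes) - 1):
        window = closes[end - depth + 1 : end + 1]
        trace = 100.0 * (window / window[-1] - 1.0)
        label = 1.0 if closes[end + 1] > closes[end] else -1.0
        rows = trace_rows(trace, n_rows, y_max)
        total[rows, cols] += label
        count[rows, cols] += 1
    # Cells that no trace ever crossed stay at 0 (no evidence either way).
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)


def score_trace(surface, trace, y_max=10.0):
    """Sum the map cells that a new trace crosses."""
    n_rows, depth = surface.shape
    rows = trace_rows(np.asarray(trace, dtype=float), n_rows, y_max)
    return float(surface[rows, np.arange(depth)].sum())


# Sanity check on an artificial series that rises 1% every day:
# every outcome is a "win", so every visited cell averages to +1.
rising = 100.0 * 1.01 ** np.arange(200)
surface = build_map(rising)
window = rising[-20:]
today = 100.0 * (window / window[-1] - 1.0)
score = score_trace(surface, today)
print(score)
```

On real price data the cell averages fall somewhere between -1 and +1, and the summed score is only a statistical tilt toward win or loss, not a guarantee.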
What we are really doing here is a form of machine learning. By training the map (a perceptron grid) with exemplars of what happened in the past, we are training the map to represent in a statistical fashion the likelihood of a positive or negative outcome. This is part of a statistical binary classifier algorithm which forms the basis of the deep learning systems that are used to make short term S&P 500 forecasts by ITRAC.
Next, let’s look at a map where a sine wave was used to form the traces instead of price data. Note the high and low areas of the map, and imagine how laying any new trace on the surface would yield its expected short term outcome as described above:
In the above case, a sine wave is completely deterministic, and is easy to forecast with these methods. ITRAC uses various deterministic time series to calibrate and validate its forecasting algorithms.
Next, let’s look at a map generated from a pseudo-random time series. As is visible, there are no significant ridges or valleys in the map surface, mostly just noise. This indicates highly random input data, with very little opportunity to forecast reliably.
It should be fairly clear, after studying the above three maps, that S&P 500 index prices are neither completely deterministic, nor are they completely random. They exist somewhere between those two “states”. They are predictable, but not perfectly so. Modern computer tools such as machine learning, data visualization, deep neural nets, and statistical data manipulation and mining libraries allow us to look at data in new ways. These tools open up new opportunities to those who have access to them and the skills to use them.
Consider that a map as described above can be thought of as a “retina” onto which price traces can be projected. In this sense, each map is like an artificial eye, which can be trained to “see” market opportunity using statistical pattern recognition via machine learning algorithms. Consider also that there are many ways to project the index price history traces onto the artificial retinas. The trace depth (number of history days) can be altered. The y-axis scaling can be altered. Both of these have big impacts on the binary classification output of the map’s surface. There are many other time series transforms which can be applied to the data stream before it is projected onto the retina, including non-linear Fourier, Laplace and Wavelet transforms. Each of these data transformations allows the “eye” to specialize in seeing different patterns that may be presented to it. It is then “trained” as discussed above to classify the image with the smallest error rate in binary classification, i.e., win/loss.
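The contrast between the sine-wave map and the random map can be put into a single number. The sketch below builds both maps and compares a count-weighted mean absolute cell value, which we call “relief” here. That metric, the 63-day sine period, the 1% daily volatility of the random walk, and the grid parameters are all assumptions chosen for the example, not values taken from ITRAC.

```python
import numpy as np


def outcome_totals(closes, depth=20, n_rows=41, y_max=10.0):
    """Per-cell sums of +1/-1 next-day outcomes, and per-cell visit counts."""
    closes = np.asarray(closes, dtype=float)
    total = np.zeros((n_rows, depth))
    count = np.zeros((n_rows, depth))
    cols = np.arange(depth)
    for end in range(depth - 1, len(closes) - 1):
        window = closes[end - depth + 1 : end + 1]
        trace = np.clip(100.0 * (window / window[-1] - 1.0), -y_max, y_max)
        rows = np.rint((trace + y_max) / (2 * y_max) * (n_rows - 1)).astype(int)
        label = 1.0 if closes[end + 1] > closes[end] else -1.0
        total[rows, cols] += label
        count[rows, cols] += 1
    return total, count


def relief(closes, **kwargs):
    """Count-weighted mean absolute cell average.
    0 means a perfectly flat map, 1 means every visited cell is unanimous."""
    total, count = outcome_totals(closes, **kwargs)
    return float(np.abs(total).sum() / count.sum())


t = np.arange(2000)
sine = 100.0 + 10.0 * np.sin(2.0 * np.pi * t / 63.0)

rng = np.random.default_rng(0)
random_walk = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=2000)))

sine_relief = relief(sine)
random_relief = relief(random_walk)
print(f"sine: {sine_relief:.3f}  random walk: {random_relief:.3f}")
```

The deterministic series should produce pronounced ridges and valleys (a high relief), while the random walk should leave only the residual noise expected from finite sample sizes.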
Below is a surface simultaneously trained with 16 different maps, or retinas. Each map uses a different scale, focal point, trace depth or other transformation to improve overall classification accuracy. Importantly, those map building parameters are optimized using Genetic Algorithms. So what we have are many machine trained artificial “eyes” whose characteristics have been evolved through genetic time to optimize their ability to classify market time series data. These perceptron fields or “retinas” form the lowest level of IntelliTrade’s deep learning neural network.
It is easy to see the concentration of red and green areas on the 16 maps above, clearly indicating the non-random nature of index price changes. If the price changes were sequentially independent and random, the maps would be much flatter, without concentrations of green or red areas, and much more evenly distributed, as shown in the random series map earlier in this article.
Because these map surfaces are projected onto 2-dimensional grids, virtually any pattern can be represented on them, be it flag, pennant, head and shoulders, or whatever. If a particular pattern was more likely to result in a higher closing price in the future, it can be “seen” by the map, and importantly, quantified. Now we can know about the likelihood of ANY chart pattern producing a desired outcome. The maps subsume all of the specific patterns that conventional Technical Analysis uses, because of their general representational ability. The “maps” are powerful and profitable time series classification tools.
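For readers unfamiliar with Genetic Algorithms, the following is a generic, minimal sketch of the idea applied to two map building parameters (trace depth and y-axis scale). The article does not describe ITRAC's operators, so the elitism, uniform crossover and Gaussian mutation used here are assumptions, and `toy_fitness` is a purely hypothetical stand-in for a real measure of classification accuracy.

```python
import random


def evolve(fitness, bounds, pop_size=30, generations=40,
           mutation_scale=0.1, seed=0):
    """Tiny genetic algorithm: keep the best fifth, breed the rest from them."""
    rng = random.Random(seed)

    def random_genome():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def crossover(a, b):
        # Uniform crossover: each gene comes from either parent.
        return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

    def mutate(genome):
        # Gaussian mutation, clamped to the allowed range of each gene.
        return [min(hi, max(lo, g + rng.gauss(0.0, mutation_scale * (hi - lo))))
                for g, (lo, hi) in zip(genome, bounds)]

    population = [random_genome() for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        elite = ranked[: pop_size // 5]
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    return max(population, key=fitness), history


def toy_fitness(genome):
    """Hypothetical fitness that peaks at depth = 60 days, y scale = 5%."""
    depth, y_max = genome
    return -((depth - 60.0) / 100.0) ** 2 - ((y_max - 5.0) / 10.0) ** 2


best, history = evolve(toy_fitness, bounds=[(10.0, 250.0), (1.0, 20.0)])
print(best, history[-1])
```

Because the elite survive unchanged into the next generation, the best fitness can never get worse from one generation to the next. In a real system the trace depth would be rounded to a whole number of days and the fitness would come from out-of-sample classification results.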
Now that we have a general understanding of maps, and how they act as highly effective binary classifiers, we can move on to the concept of an “artificial trader” or “AT”.
IntelliTrade uses what are called “artificial traders”. Each AT has many “eyes” as described above, which allow the AT to recognize statistically profitable patterns. In addition to evolving the parameters or variables that control the eye traits, each AT has trading and risk control parameters that are simultaneously co-evolved using Genetic Algorithms (GA). These trading control parameters include things like: thresholds for establishing new positions, number of lots per trade, number of lots allowed on at one time, maximum drawdowns permitted, and other trading control parameters. The GA target fitness metric is a variant of n-fold cross validated [risk adjusted] returns.
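The exact fitness metric is not disclosed beyond “a variant of n-fold cross validated [risk adjusted] returns”, but the general shape of such a metric can be sketched. In the example below the Sharpe-style ratio, the fold layout, and the toy `train` and `evaluate` functions are all assumptions for illustration.

```python
import numpy as np


def sharpe(returns):
    """Mean return divided by its standard deviation (per period, not annualized)."""
    r = np.asarray(returns, dtype=float)
    sd = r.std(ddof=1) if r.size > 1 else 0.0
    return float(r.mean() / sd) if sd > 0 else 0.0


def cross_validated_fitness(data, train, evaluate, n_folds=5):
    """Average held-out risk-adjusted return over n contiguous folds."""
    data = np.asarray(data, dtype=float)
    folds = np.array_split(np.arange(len(data)), n_folds)
    scores = []
    for k, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        model = train(data[train_idx])             # fit on the other folds
        held_out = evaluate(model, data[test_idx])  # trade the unseen fold
        scores.append(sharpe(held_out))
    return float(np.mean(scores))


# Toy "trader": stay long if the training returns were positive on average.
def train(returns):
    return 1.0 if returns.mean() > 0 else 0.0


def evaluate(position, returns):
    return position * returns


# Artificial daily returns alternating +2% and -1%.
daily = np.tile([0.02, -0.01], 500)
fitness = cross_validated_fitness(daily, train, evaluate)
print(round(fitness, 4))
```

Scoring each candidate only on data it was not fitted to is what discourages the Genetic Algorithm from evolving traders that merely memorize the past.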
Each accepted trader’s performance is then monitored in real time for many months (walk forward validation testing). Only after an AT’s edge can be verified with statistical significance will it be allowed to enter the “trading pool” with other verified ATs. Each AT in the trading pool is permitted to vote on whether to establish a long position in the S&P 500 or not. The trading pool consists of hundreds of genetically evolved artificial traders. Their votes are weighted, summed, and scaled. This is the top layer of the deep learning neural net which has been described to this point. If the final system output (between 0 and 10) is above a threshold, a new long position may be established.
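The weighting, summing and scaling of votes can be illustrated with a few lines of Python. How ITRAC actually assigns weights is not described, so the weights below are invented and the function name `pool_output` is hypothetical.

```python
import numpy as np


def pool_output(votes, weights):
    """Weighted share of 'long' votes, scaled to the 0-10 output range."""
    votes = np.asarray(votes, dtype=float)      # 1 = vote long, 0 = vote flat
    weights = np.asarray(weights, dtype=float)
    if votes.shape != weights.shape:
        raise ValueError("one weight per vote is required")
    if weights.sum() <= 0:
        raise ValueError("weights must sum to a positive number")
    return 10.0 * float(np.dot(votes, weights) / weights.sum())


# Four artificial traders; the first is trusted twice as much as the others.
output = pool_output(votes=[1, 0, 1, 1], weights=[2.0, 1.0, 1.0, 1.0])
print(output)
```

An output of 0 means no weighted support for a long position, and 10 means every trader in the pool voted long.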
The following is a representation of ITRAC’s performance relative to the “buy and hold” strategy over the last 10 years:
Unlike the “buy and hold” equity strategy which exposes you to EVERY market turndown, ITRAC is only long the market about 20% of the time, on average. It patiently sits out, waiting for the opportunity to take a quick one-day profit. It strikes quickly and then retreats to the sidelines waiting for the next opportunity. It has an uncanny ability to “see” those opportunities with its genetically evolved pattern recognition “eyes” and trading parameters as described above. It’s not always correct… that’s a pipe dream. But it is very, very good.
This has been an introductory look at the systems which have been developed over several decades. Please visit ITRAC.com for more detailed performance specifications and additional information on the system described above. In quick summary, it delivers about a 2:1 win/loss ratio advantage using just one day holds, significantly improving returns and Sharpe Ratio relative to the “buy and hold” strategy. It’s like playing a roulette wheel with twice as many red as black colors. (You’d probably want to bet on the color red.) Unfortunately, that roulette wheel doesn’t really exist.
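The arithmetic behind the roulette analogy is short enough to check directly. The sketch below assumes that the 2:1 figure is a ratio of winning to losing trades and that wins and losses are of equal size, which the article does not state.

```python
from fractions import Fraction


def win_probability(win_loss_ratio):
    """Convert a win/loss ratio such as 2 (meaning 2:1) into a win probability."""
    ratio = Fraction(win_loss_ratio)
    return ratio / (ratio + 1)


def expected_gain(p_win, gain=Fraction(1), loss=Fraction(1)):
    """Average result per bet for a given win probability and payoff sizes."""
    return p_win * gain - (1 - p_win) * loss


p_red = win_probability(2)    # a wheel with twice as many red as black
edge = expected_gain(p_red)   # even-money bet on red
print(p_red, edge)
```

Under those assumptions a 2:1 ratio means winning two bets out of three, for an average gain of one third of the stake per bet. If losses were larger than wins, the edge would shrink accordingly.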