As a data scientist, one of the best things about working with DataRobot customers is the sheer variety of highly interesting questions that come up. Recently, a prospective customer asked me how I reconcile the fact that DataRobot has multiple very successful investment banks using DataRobot to enhance the P&L of their trading businesses with my comments that machine learning models aren’t always great at predicting financial asset prices. Peek into our conversation to learn when machine learning does (and doesn’t) work well in financial markets use cases.
Why is machine learning able to perform well in high-frequency trading applications, but so bad at predicting asset prices longer-term?
While there have been some successes in the industry using machine learning for price prediction, they have been few and far between. As a rule of thumb, the shorter the prediction time horizon, the better the odds of success.
Generally speaking, the market making use cases that DataRobot (and other machine learning approaches) excel at share one or more of the following characteristics:
For forward price prediction: a very short prediction horizon (typically within the next one to 10 seconds), the availability of good order book data, and an acknowledgment that even a model that’s 55%–60% accurate is useful, because it’s ultimately a percentage game.
For price discovery (e.g., establishing an appropriate price for illiquid securities, predicting where liquidity will be located, and determining appropriate hedge ratios) as well as more generally: the existence of good historical trade data on the assets to be priced (e.g., TRACE, Asian bond market reporting, ECNs’ trade history) as well as a clear set of more liquid assets which can be used as predictors (e.g., more liquid credits, bond futures, swaps markets, etc.).
For counterparty behavior prediction: some form of structured data which contains not only won trades but also unsuccessful requests/responses.
Across applications: an information edge, for instance from commanding a large share of the flow in that asset class, or from having customer behavior data that can be used.
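To make the "percentage game" point concrete, here is a minimal sketch of why a 55% hit rate can be worth having when the prediction horizon is short and the number of trades is large. Everything in it is an illustrative assumption rather than anything from the conversation: the function names `expected_edge` and `prob_profitable` are hypothetical, each trade is assumed to pay +1 or -1, trades are assumed independent, and the normal approximation ignores real-world effects such as changing market regimes.

```python
import math


def expected_edge(hit_rate: float, win: float = 1.0, loss: float = 1.0,
                  cost: float = 0.0) -> float:
    """Expected P&L per trade: win with probability hit_rate, otherwise lose."""
    return hit_rate * win - (1.0 - hit_rate) * loss - cost


def prob_profitable(hit_rate: float, n_trades: int) -> float:
    """Normal approximation to P(total P&L > 0) over n_trades independent
    bets that each pay +1 or -1 (an illustrative assumption)."""
    if not 0.0 < hit_rate < 1.0 or n_trades <= 0:
        raise ValueError("need 0 < hit_rate < 1 and n_trades > 0")
    # A +/-1 bet has mean 2p - 1 and variance 4p(1 - p).
    mean = n_trades * (2.0 * hit_rate - 1.0)
    std = math.sqrt(n_trades * 4.0 * hit_rate * (1.0 - hit_rate))
    return 0.5 * (1.0 + math.erf(mean / (std * math.sqrt(2.0))))


if __name__ == "__main__":
    # The same 55% model looks very different over 10 trades and 10,000.
    for n in (10, 100, 1000, 10000):
        print(n, round(prob_profitable(0.55, n), 3))
```

Under these assumptions the per-trade edge is small, but it compounds with the number of independent bets, which is one way to read the rule of thumb that shorter horizons (and therefore more trades) improve the odds. Note that transaction costs can erase the edge entirely: `expected_edge(0.55, cost=0.2)` is negative.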
Areas where any form of machine learning will struggle are typically characterized by one or more of these aspects:
Rapidly changing regimes, behaviors and drivers: a key reason why longer-term predictions are so hard. We very often find that the key model drivers change very regularly in most financial markets, with a variable that’s a useful indicator for one week or month having little information content in the next. Even in successful applications, models are re-trained and re-deployed very regularly (typically at least weekly).
Infrequent data: a classic example here is monthly or less frequent data. In such cases, the behavior being modeled typically changes so often that by the time enough training data for machine learning has accrued (24 months or more), the market is in a different regime. For what it’s worth, a few of our customers have indeed had some success at, for instance, stock selection using predictions on a one-month horizon, but they’re (understandably) not telling us how they’re doing it.
Sparse data: where there’s insufficient data available to get a good picture of the market in aggregate, such as certain OTC markets where there aren’t any good ECNs.
A lack of predictors: in general, data on past behavior of the variable being predicted (e.g., prices) isn’t enough. You also need data describing the drivers of that variable (e.g., order books, flows, expectations, positioning). Past performance is not indicative of future results…
Limited history of similar regimes: because machine learning models are all about recognizing patterns in historical data, new markets or assets can be very difficult for ML models. This is known in academia as the “cold start problem.” There are various ways to deal with it, but none of them are perfect.
Not actually being a machine learning problem: Value-at-Risk modeling is the classic example here. VaR isn’t a prediction of anything; it’s a statistical summation of simulation results. That said, predicting the outcome of a simulation is an ML problem, and there are some good ML applications in pricing complex, path-dependent derivatives.
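The last point, that VaR is a statistical summary of simulation results rather than a prediction, can be sketched in a few lines. This is a toy illustration under stated assumptions, not a production risk methodology: `simulate_pnl` and `value_at_risk` are hypothetical names, the simulated P&L is simply drawn from a normal distribution, and VaR is taken as the loss at the empirical (1 - confidence) quantile of the sample.

```python
import math
import random
from typing import List, Sequence


def simulate_pnl(n_paths: int, vol: float = 1.0, seed: int = 0) -> List[float]:
    """Toy simulation: one normally distributed P&L per path (assumption)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, vol) for _ in range(n_paths)]


def value_at_risk(pnl: Sequence[float], confidence: float = 0.95) -> float:
    """Loss at the (1 - confidence) empirical quantile of the P&L sample."""
    if not pnl or not 0.0 < confidence < 1.0:
        raise ValueError("need a non-empty sample and 0 < confidence < 1")
    ordered = sorted(pnl)  # worst outcomes first
    index = min(len(ordered) - 1,
                math.floor((1.0 - confidence) * len(ordered)))
    return -ordered[index]  # report the loss as a positive number


if __name__ == "__main__":
    pnl = simulate_pnl(10_000, vol=1.0, seed=42)
    print("95% VaR:", round(value_at_risk(pnl, 0.95), 3))
```

Nothing here is learned from data: the number is just a quantile of whatever the simulation produced. The machine learning opportunity mentioned above lies elsewhere, for example in approximating the output of an expensive simulation.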
Finally, and aside from the above, a critical success factor in any machine learning use case which shouldn’t be underestimated is the involvement of capable and motivated people (typically quants and sometimes data scientists) who understand the data (and how to manipulate it), business processes, and value levers. Success is usually driven by such people carrying out many iterative experiments on the problem at hand, which is ultimately where our platform comes in. As discussed, we massively accelerate that process of experimentation. There’s a lot that can be automated in machine learning, but domain knowledge can’t be.
To summarize: it’s fair to say that the likelihood of success in trading use cases is positively correlated with the frequency of the trading (or at least negatively correlated with the holding period/horizon), with a few exceptions to prove the rule. It’s also worth bearing in mind that machine learning is often better at second-order use cases such as predicting the drivers of markets (for instance, event risk and, to some extent, volumes) than at first-order price predictions, subject to the above caveats.
About the author
Managing Director, Financial Markets Data Science
Peter leads DataRobot’s financial markets data science practice and works closely with fintech, banking, and asset management clients on numerous high-ROI use cases for the DataRobot AI Platform. Prior to joining DataRobot, he gained twenty-five years’ experience in senior quantitative research, portfolio management, trading, risk management and data science roles at investment banks and asset managers including Morgan Stanley, Warburg Pincus, Goldman Sachs, Credit Suisse, Lansdowne Partners and Invesco, as well as spending several years as a partner at a start-up global equities hedge fund. Peter has an M.Sc. in Data Science from City, University of London, an MBA from Cranfield University School of Management, and a B.Sc. in Accounting and Financial Analysis from the University of Warwick. His paper, “Hunting High and Low: Visualising Shifting Correlations in Financial Markets”, was published in the July 2018 issue of Computer Graphics Forum.