Training

Use raw order book data rather than derived OHLC + volume data. For training and prediction, prepare and use the data as follows:

  • Split the data into time-series windows of a fixed size (the window size is a parameter to tune).
  • Cluster the windows into K clusters (K is a parameter to tune). The assumption is that the clusters will correspond to natural trends, such as a sharp drop or rise in price.
  • For each cluster, train a regression model to predict the price and a classifier to predict the price change.
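
The three steps above can be sketched roughly as follows. This is only an illustration under several assumptions that are not part of the notes: each order book snapshot is reduced to a single mid-price (real order book features such as depth per level would simply widen each window), the series is a synthetic random walk, "price change" is read as the up/down direction of the next price, and a plain least-squares regression plus a nearest-class-mean classifier stand in for whatever models are eventually chosen. All function names (make_windows, kmeans, fit_cluster_models, predict) are hypothetical.

```python
import numpy as np

def make_windows(series, size):
    """Split a 1-D series into overlapping windows, each paired with the next value."""
    windows = np.lib.stride_tricks.sliding_window_view(series[:-1], size)
    targets = series[size:]  # targets[i] is the value right after windows[i]
    return windows, targets

def assign(points, centers):
    """Index of the nearest center (Euclidean distance) for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=50, seed=0):
    """Minimal Lloyd's k-means; returns the cluster centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(points, centers)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def fit_cluster_models(windows, shapes, targets, labels, k):
    """Fit one regressor and one direction classifier per non-empty cluster."""
    up = targets > windows[:, -1]  # assumed meaning of "price change"
    models = {}
    for j in range(k):
        mask = labels == j
        if not mask.any():
            continue
        # Regression: least squares on the raw window plus an intercept term.
        X = np.column_stack([windows[mask], np.ones(mask.sum())])
        coef, *_ = np.linalg.lstsq(X, targets[mask], rcond=None)
        # Classifier: mean window shape of each direction seen in this cluster.
        class_means = {c: shapes[mask & (up == c)].mean(axis=0)
                       for c in (False, True) if (mask & (up == c)).any()}
        models[j] = (coef, class_means)
    return models

def predict(window, centers, models):
    """Route a window to its cluster, then apply that cluster's two models."""
    shape = window - window[0]
    j = int(assign(shape[None, :], centers)[0])
    coef, class_means = models[j]  # raises KeyError if cluster j had no training data
    price = float(np.append(window, 1.0) @ coef)
    goes_up = min(class_means, key=lambda c: np.linalg.norm(shape - class_means[c]))
    return j, price, bool(goes_up)

# Synthetic mid-price series standing in for real order book snapshots.
rng = np.random.default_rng(42)
mid = 100 + np.cumsum(rng.normal(0, 0.1, 500))

WINDOW_SIZE = 10  # parameter to tune
K = 4             # parameter to tune

windows, targets = make_windows(mid, WINDOW_SIZE)
shapes = windows - windows[:, :1]  # cluster on shape (trend), not price level
centers = kmeans(shapes, K)
labels = assign(shapes, centers)
models = fit_cluster_models(windows, shapes, targets, labels, K)

cluster, price, goes_up = predict(windows[-1], centers, models)
print(f"cluster={cluster} predicted_price={price:.3f} up={goes_up}")
```

Subtracting the first value of each window before clustering is a design choice made here so that clusters group windows by trend shape rather than by absolute price level. For a real evaluation the models would have to be fitted on earlier data and tested on later data, which this sketch does not do.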