
ML as a Trader. Phase One - my PoC for Beginners


Hey guys! Here is my next read.

Like many other developers, I often look for useful information on the Internet. And like many other developers, I don’t like it when there’s too much text and too little code, especially when I expect the opposite. So I will try to add text only where I want you to pay attention. Well, maybe also where I have extra thoughts. Let’s get started.


ML for Trading: How to Use Machine Learning to Make Better Trading Decisions


So, I’m a software developer who is really interested in trading. I won’t write anything about my trading experience per se – I just like stock trading and how it works. I also like ML (so-called AI). Naturally, I started to wonder: why can’t you build an ML algorithm that helps to earn* money in trading? So I did some research on the domain and on the ML algorithms that can be applied to it.

As we know, each task calls for its own algorithm, and trading is no exception. So here comes the MAIN and very important NOTE for you guys to understand before you continue reading or even start coding:

  • My PoC work is not for long-term investors, and not even for week-long trading.
  • My PoC work is for trading on a shorter-than-INTRADAY time range; I would even say it’s for per-minute trading (so-called ‘scalping’). Beginner traders sometimes start with it, and some stay and keep working in this manner (regime).
  • My PoC work is not for trading over periods longer than 1 hour.
  • It can be used as one of the tech tools to understand the current situation and the trend of its movement.

WHY?

Because:

  • The data I use for the ML algorithm prediction contains only the following FEATURES: date/time, open, and volume. There’s no external data like Elon Musk tweets, the weather forecast in NY, or other info for fundamental analysis.

  • The information the PoC prediction gives is helpful, but I can hardly believe that we can put the technical analysis train on the track of long-term trading. There are a lot of examples where stocks/cryptocurrencies collapsed sharply in one day just because of some external reasons. And our ML cannot include those reasons as data for now.

  • You have time to react within the next 15 minutes if you see the stock’s trend changing and decide to go short or long.

  • Long story short – fewer external factors can influence the trend over the next 15 minutes than over a day or a week.

THE TARGET

Our target is to create/use/remake and train an ML algorithm which sees* a specific Stock’s behavior over the last 2 hours and gives us a prediction of where the trend will go over the next 15 minutes. Spoiler alert: I did try to make one algorithm that would work for any stock we feed in, and it didn’t work; it’s better to focus on one defined Stock and get good accuracy for it. You can also experiment with 30 or 60 minutes. However, the algorithm can’t predict exact prices for each minute of the next half hour or hour. For that, we would need more features in the dataset, a different ML algorithm, and more hardware resources. Then we might reach good accuracy; I personally haven’t reached it yet. Besides, to write a trading bot or even trade manually, it’s enough to know* the trend and have some confidence in it.

In this article, Phase One, we will use historical data (Phase 2, 3 and the rest will be about stress and validation tests using the trading platform and its API).

THE CODE:


I used Google Colab for the PoC, connected to the Google Drive where the dataset is stored.

[Image: data]

We’ve divided the whole dataset into a training set (until 2018-02-06) and a test set with the data after 2018-02-06 (data our algorithm has never seen*). With this test set, we can compare what the LSTM predicts with what the real stock price is.
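The date-based split described above can be sketched as follows. This is a minimal sketch, not the notebook's code: the row layout `(timestamp, open, volume)`, the helper name `split_by_date`, the toy values, and the choice to put rows dated exactly 2018-02-06 into the test set are all assumptions.

```python
from datetime import datetime

# Cutoff date taken from the article; everything else here is illustrative.
CUTOFF = datetime(2018, 2, 6)


def split_by_date(rows, cutoff=CUTOFF):
    """Split rows chronologically: before the cutoff -> train, the rest -> test.

    Each row is assumed to be a (timestamp, open, volume) tuple.
    """
    train = [row for row in rows if row[0] < cutoff]
    test = [row for row in rows if row[0] >= cutoff]
    return train, test


# Toy per-minute rows (made-up numbers, not real market data).
rows = [
    (datetime(2018, 2, 5, 15, 59), 71.20, 1200),
    (datetime(2018, 2, 5, 16, 0), 71.25, 900),
    (datetime(2018, 2, 6, 9, 30), 70.80, 3100),
    (datetime(2018, 2, 6, 9, 31), 70.95, 2500),
]

train, test = split_by_date(rows)
print(len(train), len(test))  # 2 2
```

Splitting by time rather than shuffling keeps every test row strictly later than every training row, which is what makes the test data "future" from the model's point of view.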

Get training data from gDrive:

[Image: training data]

Let’s see what we have:

[Image: results]

Configuration params and very simple helpers to understand how we generate a dataset of the right dimension:

[Images: configuration params; dataset configuration; refining dataset configuration; defining dataset presentation]
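To make the "right dimension" concrete, here is a minimal sketch of sliding-window dataset generation. It assumes 1-minute bars, so 2 hours of history becomes 120 steps and the 15-minute prediction becomes 15 steps; the names `LOOKBACK`, `HORIZON`, and `make_windows` and the toy series are hypothetical, not taken from the original notebook.

```python
LOOKBACK = 120  # steps of history per sample (2 hours, assuming 1-minute bars)
HORIZON = 15    # steps to predict (15 minutes, assuming 1-minute bars)


def make_windows(features, target, lookback=LOOKBACK, horizon=HORIZON):
    """Build (X, y) samples with a sliding window.

    features: list of per-step feature rows, e.g. [open, volume]
    target:   list of per-step target values, e.g. the Open price
    X has shape (n_samples, lookback, n_features); y has shape (n_samples, horizon).
    """
    X, y = [], []
    last_start = len(features) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback
        X.append(features[start:end])          # the past `lookback` steps
        y.append(target[end:end + horizon])    # the following `horizon` steps
    return X, y


# Toy series: 200 steps with two features (made-up numbers).
features = [[100.0 + i * 0.01, 1000 + i] for i in range(200)]
target = [row[0] for row in features]

X, y = make_windows(features, target)
print(len(X), len(X[0]), len(X[0][0]), len(y[0]))  # 66 120 2 15
```

The three-dimensional layout (samples, time steps, features) is the input shape recurrent layers usually expect; changing `HORIZON` to 30 or 60 is the experiment mentioned earlier.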

We choose just one stock from many (the others are cut off) so that our NN learns the psychology of that one Stock: its behavior when a big shark comes in, when the crowd is in a good or bad mood, when panic or euphoria begins, and other factors which we as humans cannot notice or understand in the midst of a big chunk of data, but the algorithm can.

We don’t have many features but having price (in our case Open) and volume (the amount of trading for the current timeframe) can be a good starting point.

[Image: adding price and volume]

Here we can compare Open price and Volume for the chosen timeframe, to have a better understanding of how it’s traded.

[Images: open price and volume comparison; managing timeframes in code; A-stock Open Price chart; Volume line graph]

Define scaler (you can experiment with different scalers):

[Image: define scaler]
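The article does not say which scaler the notebook uses, so the following is only a sketch of the idea with a hand-rolled min-max scaler. The class name `MinMaxScaler1D` and the toy prices are hypothetical; in practice a library scaler such as scikit-learn's `MinMaxScaler` would be the usual choice.

```python
class MinMaxScaler1D:
    """Illustrative min-max scaler for a single feature column."""

    def fit(self, values):
        # Learn the range from the training data only.
        self.lo = min(values)
        self.hi = max(values)
        return self

    def transform(self, values):
        span = self.hi - self.lo
        if span == 0:
            return [0.0 for _ in values]  # constant column: map everything to 0
        return [(v - self.lo) / span for v in values]

    def inverse_transform(self, scaled):
        span = self.hi - self.lo
        return [s * span + self.lo for s in scaled]


# Toy Open prices (made-up numbers).
train_open = [70.0, 71.0, 72.0, 74.0]
test_open = [73.0, 75.0]

scaler = MinMaxScaler1D().fit(train_open)
train_scaled = scaler.transform(train_open)
test_scaled = scaler.transform(test_open)
print(train_scaled)  # [0.0, 0.25, 0.5, 1.0]
print(test_scaled)   # [0.75, 1.25]
```

Fitting on the training range and reusing it for the test data avoids leaking future information into training; the price of that is that test values can fall outside [0, 1], as 1.25 does here.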

For the training set, we use scaled X and unscaled Y:

[Images: training set; training]

Why did we choose LSTM? Long story short – this NN algorithm works well with sequential data such as time series. You can also experiment with the LSTM architecture to reach better results:

[Image: LSTM]

We also set up early stopping: if prediction accuracy and loss stop improving, training stops at the current epoch.

[Image: early stopping]
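The early-stopping rule can be sketched without any framework. The function name `early_stopping_epoch`, the `patience` and `min_delta` parameters, and the loss values below are assumptions for illustration; they mirror the spirit of the parameters of the same name on Keras's `EarlyStopping` callback, but the article does not show the actual settings.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the loss has failed to beat the best value seen so far
    by more than `min_delta` for `patience` consecutive epochs.
    """
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


# Made-up validation losses, one per epoch.
losses = [0.90, 0.70, 0.65, 0.66, 0.65, 0.67, 0.64]
stop = early_stopping_epoch(losses, patience=3)
print(stop)  # 6
```

With `patience=1` the rule matches the stricter reading of "stop on the current epoch"; a larger patience tolerates a few noisy epochs before giving up.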

Start training:

[Images: training; model training visualization]

Let’s test it on data from a period the NN has never seen before, which is the future from its point of view*.

Import Test data:

[Image: test data import]

Generate test set:

[Image: test set]

Generate dataset with scaled X:

[Image: dataset with scaled X]

Get predicted data:

[Image: predicted data]

Get the right format for the plot:

[Image: right format for the plot]

We have predictions for different timeframes; let’s look at a few of them:

[Image: timeframes predictions]

RED - Real trend line movement

GREEN - predicted

Result 1:

[Images: result 1; stock price prediction]

Result 2 - trendline down

[Images: trendline down; stock price]

Result 3 - trendline slightly up

[Images: a trendline a bit up; result 3]

Again, we don’t predict the exact price in this time frame. We predict where the trendline moves, which helps us to understand whether to go Long or Short.
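One possible way to turn a predicted 15-step sequence into a Long/Short hint is to fit a straight line through it and look at the sign of the slope. This is a sketch under assumptions, not the article's method: the helper names `trend_slope` and `signal`, the `threshold` parameter, and the predicted values are all hypothetical.

```python
def trend_slope(prices):
    """Least-squares slope of prices against their index (price units per step)."""
    n = len(prices)
    if n < 2:
        raise ValueError("need at least two points to fit a trend line")
    mean_x = (n - 1) / 2
    mean_y = sum(prices) / n
    cov = sum((i - mean_x) * (p - mean_y) for i, p in enumerate(prices))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


def signal(prices, threshold=0.0):
    """Map the fitted slope to 'long', 'short', or 'flat'."""
    slope = trend_slope(prices)
    if slope > threshold:
        return "long"
    if slope < -threshold:
        return "short"
    return "flat"


# Made-up predicted prices for the next few minutes.
predicted = [71.0, 71.1, 71.3, 71.2, 71.5]
print(signal(predicted))  # long
```

A non-zero `threshold` would filter out nearly flat trend lines, which may be useful when the expected move is smaller than trading costs.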
