Meet Pam

Project files repo

Pamela (you can call her Pam) recently started Forex trading.  She is a complete beginner, but she is extremely hard-working, so I believe she can be profitable soon.  Unlike most forex traders, Pam is not human.  PAM stands for Price Action Machine, an artificial neural network that currently learns from candlestick patterns on currency pairs.  She has a simple neural network structure:

[Figure: Pamela is an ANN]

Let n be the number of independent (feature) variables and j the number of nodes in the hidden layer (symbols assumed here from the arguments of the training call further down).  The connections from input to hidden and from hidden to output are two matrices of weights.  We can call them the Input Weights, a matrix connecting the input and hidden layers, and the Output Weights, a matrix connecting the hidden and output layers.  The hidden layer takes the inputs and the Input Weights and calculates its output with the sigmoid transfer function g(z) = 1 / (1 + e^(-z)).

Finally, the outputs of the hidden nodes are combined through the Output Weights into a single value and passed through the sigmoid again.  The result is Pam's confidence that the forex pair will go up: 1 = absolutely certain it will, 0 = absolutely certain it will not.  There are a lot more calculations to this NN than shown here, but it is a simple feedforward network trained by backpropagation that can fit complex non-linear relationships.  Much of this can be learned on Coursera for free here.
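To make the forward pass concrete, here is a minimal sketch in Python (not the Octave used for Pam herself). The weight layout, with a bias term stored first in every row, is my assumption; the post doesn't spell out how Pam's matrices are arranged.

```python
import math

def sigmoid(z):
    """Logistic transfer function, squashing any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for very negative z
    return e / (1.0 + e)

def forward(x, input_weights, output_weights):
    """One feedforward pass: features -> hidden activations -> confidence.

    x              : list of n feature values
    input_weights  : j rows, each [bias, w_1, ..., w_n]  (assumed layout)
    output_weights : [bias, v_1, ..., v_j]               (assumed layout)
    """
    hidden = [
        sigmoid(row[0] + sum(w * xi for w, xi in zip(row[1:], x)))
        for row in input_weights
    ]
    z_out = output_weights[0] + sum(
        v * h for v, h in zip(output_weights[1:], hidden)
    )
    return sigmoid(z_out)

# Two features, three hidden nodes, all weights zero: every sigmoid sees 0
confidence = forward([0.4, -1.2], [[0.0, 0.0, 0.0]] * 3, [0.0, 0.0, 0.0, 0.0])
print(confidence)  # 0.5, i.e. no opinion either way
```

An untrained network with zero weights sits exactly on the fence at 0.5; training is what pushes the output toward 0 or 1.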

Pam's Learning Subject

Technical analysts use the term price action for the patterns of movement a price chart is currently showing for a specific security.  Popular ways to chart it include Japanese candlesticks, bar charts, Point & Figure, etc.  For some traders price action is the hallmark of technical analysis; they describe it as the only indicator able to show market sentiment in real time.  I wanted to know if a machine can pick up price action trading as well as a human trader can.


Let each x_i be one feature in the feature vector.  We can define 7 features based on traditional Japanese candlestick criteria:

- Length of the Upper wick of the candle

- Length of the Lower wick of the candle

- Degree of dominance of the wicks: the difference between the upper and lower wick lengths

- Body length of the candle

- Total length of the candle, including wicks

- Previous period return as a binary: 1 = positive return, 0 = negative return

- Previous period Body length

Many more features could be added, but I wanted Pam to take it slow and add more learning topics as she progresses.  Our answer set is a binary vector Y, where 1 = positive return in the next period and 0 = negative return in the next period.
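For illustration, the seven features and the label could be computed from open/high/low/close candles roughly like this. It is a Python sketch under my own assumptions, not Pam's actual code: a "positive return" is taken to mean the candle closed above its open, and the wick dominance is upper minus lower.

```python
def candle_features(candle, prev_candle):
    """Seven features for one candle; candles are (open, high, low, close)."""
    o, h, l, c = candle
    po, _, _, pc = prev_candle
    upper_wick = h - max(o, c)
    lower_wick = min(o, c) - l
    return [
        upper_wick,
        lower_wick,
        upper_wick - lower_wick,  # wick dominance (assumed sign convention)
        abs(c - o),               # body length
        h - l,                    # total length including wicks
        1 if pc > po else 0,      # previous period return (assumed: close > open)
        abs(pc - po),             # previous period body length
    ]

def build_dataset(candles):
    """Feature rows X and next-period labels Y from a chronological candle list."""
    X, Y = [], []
    # The first candle has no predecessor and the last has no next period
    for t in range(1, len(candles) - 1):
        X.append(candle_features(candles[t], candles[t - 1]))
        next_open, _, _, next_close = candles[t + 1]
        Y.append(1 if next_close > next_open else 0)
    return X, Y

# Made-up candles, only to show the shapes
candles = [(100, 104, 99, 103), (103, 106, 101, 102),
           (102, 107, 100, 106), (106, 108, 104, 105)]
X, Y = build_dataset(candles)
print(X)  # [[3, 1, 2, 1, 5, 1, 3], [1, 2, -1, 4, 7, 0, 1]]
print(Y)  # [1, 0]
```

Four candles give only two usable examples, because each row needs both a previous candle for its features and a next candle for its label.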

Training & Diagnosing Pam

Pam was fed 3439 training examples for the currency pair USDJPY from the Questrade MetaTrader platform.  The training examples span June 19, 2012 to August 11, 2014.  Each example's candle/feature vector covers 4 hours.  Pam learned from a regularized logistic regression cost function. fmincg is a custom function found here.

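% Supply the gradient to the optimizer and cap it at 300 iterations
options = optimset('GradObj','on', 'MaxIter', 300);
% Cost as a function of the weight vector p only; everything else is fixed
costFunction = @(p) cost(X, Y, n, j, p, lambda);
% Start from theta; t_theta holds the trained weights, J the cost history
[t_theta, J] = fmincg(costFunction, theta, options);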
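The body of cost isn't shown in this post, so here is a hedged Python sketch of what a regularized logistic cost typically looks like: cross-entropy over the examples plus an L2 penalty on the weights. The function name and the flat weight list are illustrative choices of mine, not the signature of Pam's cost function.

```python
import math

def regularized_cost(predictions, labels, weights, lam):
    """Cross-entropy cost plus an L2 penalty on the weights.

    predictions : model outputs in (0, 1), one per example
    labels      : 0/1 targets
    weights     : flat list of the non-bias weights to penalize (assumed layout)
    lam         : regularization strength lambda
    """
    m = len(labels)
    eps = 1e-12  # keeps log() away from exactly 0
    data_term = 0.0
    for h, y in zip(predictions, labels):
        h = min(max(h, eps), 1.0 - eps)
        data_term -= y * math.log(h) + (1 - y) * math.log(1.0 - h)
    penalty = lam / (2.0 * m) * sum(w * w for w in weights)
    return data_term / m + penalty

# A model that always answers 0.5 scores ln(2) regardless of the labels
baseline = regularized_cost([0.5] * 4, [1, 0, 1, 1], weights=[], lam=1.0)
print(round(baseline, 4))  # 0.6931
```

That ln(2) baseline is a handy yardstick: under this kind of cost, a model only shows it has learned something by scoring below roughly 0.6931.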

Pam's first training run returned an error (MSE) of 0.6925, really crappy.  I figured I might need to teach her some more complex topics like polynomials, so I generated code to add an additional 28 features.  After that Pam was retrained and had an MSE of approx. 0.6908.  I suspected this model was overfitted, so I got Pam to test her training out of sample.  She randomly selected 20% of the training examples as her validation set and trained on the rest.

Here's a snippet of the code

cv_size = ceil(size(X,1)*0.2);
Xval = zeros(cv_size, size(X,2));
Yval = zeros(cv_size, size(Y,2));
for i=1:cv_size
  rn = randi(size(X,1)); % Random row index among the remaining examples
  Xval(i,:) = X(rn,:);
  Yval(i,:) = Y(rn,:);
  % Delete validation example from training set
  X(rn, :) = [];
  Y(rn, :) = [];
end
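Going back to the 28 extra polynomial features: the generating code isn't shown, so I can only sketch one construction that happens to produce exactly 28 terms from 7 features, namely every degree-2 product (7 squares plus 21 pairwise cross terms). In Python, assuming that is what was intended:

```python
from itertools import combinations_with_replacement

def degree2_terms(features):
    """All products x_a * x_b with a <= b: squares plus pairwise cross terms."""
    return [a * b for a, b in combinations_with_replacement(features, 2)]

row = [1, 2, 3, 4, 5, 6, 7]          # stand-in values for the 7 candle features
expanded = row + degree2_terms(row)  # original features followed by new ones
print(len(degree2_terms(row)), len(expanded))  # 28 35
```

Each training example would then carry 35 inputs instead of 7, which is a lot more room for the network to fit noise.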

Since I believe the model has been overfitted, I can raise Pam's regularization parameter lambda to penalize large weights.  As lambda increases, training error increases, but for an overfitted model validation error should decrease.  I wanted to find the optimal lambda from the resulting curve, so I looped lambda from 1 to 50 and evaluated both the training and validation sets at each value:
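The loop itself is simple enough to sketch. To keep the example self-contained I've swapped in things Pam doesn't use: a plain logistic regression instead of her network, batch gradient descent instead of fmincg, and synthetic data instead of USDJPY candles. Only the shape of the sweep is the point here.

```python
import math
import random

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, x):
    # w[0] is the bias; the remaining weights pair up with the features
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))

def log_loss(w, X, Y):
    """Unregularized cross-entropy, used for reporting errors."""
    total = 0.0
    for x, y in zip(X, Y):
        h = min(max(predict(w, x), 1e-12), 1.0 - 1e-12)
        total -= y * math.log(h) + (1 - y) * math.log(1.0 - h)
    return total / len(Y)

def train(X, Y, lam, iters=100, lr=0.3):
    """Batch gradient descent on the L2-regularized cost (bias not penalized)."""
    m, n = len(X), len(X[0])
    w = [0.0] * (n + 1)
    for _ in range(iters):
        grad = [0.0] * (n + 1)
        for x, y in zip(X, Y):
            err = predict(w, x) - y
            grad[0] += err
            for k in range(n):
                grad[k + 1] += err * x[k]
        w = [w[0] - lr * grad[0] / m] + [
            w[k] - lr * (grad[k] + lam * w[k]) / m for k in range(1, n + 1)
        ]
    return w

def sweep(Xtr, Ytr, Xval, Yval, lambdas):
    """Train once per lambda; record (lambda, training error, validation error)."""
    curve = []
    for lam in lambdas:
        w = train(Xtr, Ytr, lam)
        curve.append((lam, log_loss(w, Xtr, Ytr), log_loss(w, Xval, Yval)))
    return curve

# Synthetic stand-in data: the label depends noisily on the first feature
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(100)]
Y = [1 if x[0] + 0.5 * random.gauss(0, 1) > 0 else 0 for x in X]
curve = sweep(X[:80], Y[:80], X[80:], Y[80:], range(1, 51))
best_lam, _, best_val = min(curve, key=lambda row: row[2])
print(best_lam, round(best_val, 4))
```

Note that the errors are reported without the penalty term, so the training and validation numbers stay comparable across different lambdas.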

[Figure: From overfit to perfect to underfit]

The green line is the validation set error and the blue line is the training error.  The lines show that at first, while training error is low, Pam failed to generalize and did badly on the validation set.  It reached a point of optimality and then diverged again as the larger lambdas forced Pam to underfit the data.


Re-applying this concept, I added the number of hidden nodes to the lambda mix in case the two were correlated, and reran the training, this time recording only the validation cost.  The surface plot shows that sample optimality is reached at
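A two-parameter search like this boils down to evaluating the validation cost on a grid and taking the minimum. Here is a minimal Python sketch; validation_cost is a hypothetical callback that would train Pam with the given settings and return her validation cost, and toy_cost is a made-up bowl-shaped stand-in so the example runs, not a result from Pam.

```python
from itertools import product

def grid_search(validation_cost, lambdas, node_counts):
    """Evaluate every (lambda, nodes) pair; return the surface and its minimum.

    validation_cost is supplied by the caller (hypothetical here): it should
    train a model with the given settings and return its validation-set cost.
    """
    surface = {
        (lam, nodes): validation_cost(lam, nodes)
        for lam, nodes in product(lambdas, node_counts)
    }
    best = min(surface, key=surface.get)
    return surface, best

# Stand-in cost with a known minimum, only to exercise the search
def toy_cost(lam, nodes):
    return (lam - 7) ** 2 + (nodes - 4) ** 2

surface, best = grid_search(toy_cost, range(1, 51), range(1, 11))
print(best)  # (7, 4)
```

The surface dictionary holds exactly the values a surface plot would draw, one height per (lambda, nodes) pair.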


In the next post, I will be putting in more features suggested by a friend who knows more about price action.  I will also let Pam make a few test trades to see how she does on pure price action trading!  Furthermore, I may teach Pam more polynomial features to see if that improves the fit. Stay tuned! 🙂
