Backpropagation

 

The Fall of Artificial Neural Networks: XOR gates

In the 1969 book ‘Perceptrons: an introduction to computational geometry’, Marvin Minsky and Seymour Papert demonstrated that single-layer Artificial Neural Networks could not even implement an XOR (‘exclusive or’) logical function. This was a big disappointment. In the history of Artificial Neural Networks, this is seen as a significant contributor to the ‘AI winter’ of reduced interest (and hence also of reduced funding) in them.

The Rise of Artificial Neural Networks: Back-Propagation

The backpropagation algorithm effectively solved the exclusive-or problem in that:

  • Implementing XOR requires one or more hidden layers in the network (between the inputs and the output layer).
  • The backpropagation algorithm enabled multi-layer networks to be trained.

This contributed to a resurgence of interest in Artificial Neural Networks. Backpropagation was invented independently a number of times, most notably by Paul Werbos (1974), Rumelhart, Hinton and Williams (1986) and Yann LeCun (1987).

Watch Victor Lavrenko’s YouTube video for more technical details on the XOR problem…

The Backpropagation Code

The purpose of this post is to provide example code for the backpropagation algorithm and to demonstrate that it can solve the XOR problem.

As noted elsewhere:

  • The code here is unapologetically ‘unpythonic’.
  • If you do not have Python installed, you can open the online interpreter at https://repl.it/languages/python3 in a new window and use that. All code fragments are combined at the end of this piece into a single listing that can be copy-pasted into the interpreter.

As well as being unpythonic, the code here differs from typical implementations in that it can handle more than 2 layers. It can be configured for any fully-connected feed-forward network with any number of layers greater than 1 and any number of neurons in each layer.

Some Housekeeping

Firstly, let’s sort out some housekeeping. Here are 2 functions so that:

  • We can pause the run to see things before they disappear off the top of the screen. (We can stop if we type ‘n’)
  • We can control how much information gets printed out by varying a ‘verbosity’ variable value.
def prompted_pause(s):
    import sys
    ok = input(s)
    if ok=="n" or ok=="N":
        print("Stopping here")
        sys.exit()

verbosity = 1

def print_info(v,s, end="DEFAULT"):
    if verbosity >= v:
        if end == "DEFAULT":
            print(s) # With newline
        elif end == "":
            print(s, end="") # Without newline
        else:
            print(s, end=end)

Where there is a call to print_info(3, “Blah”), the 3 means that the message “Blah” will only get printed out if the verbosity level is 3 or more. Across the whole program below, the verbosity levels are used such that:

  • If verbosity is set to 1, only minimal progress information is printed.
  • If verbosity is set to 2, more detail is printed.
  • If verbosity is set to 3, everything is printed.
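
For example (just to illustrate the mechanism; these lines are not part of the final listing):

verbosity = 2
print_info(1, "Printed, because verbosity (2) >= 1")
print_info(2, "Printed, because verbosity (2) >= 2")
print_info(3, "Not printed, because verbosity (2) < 3")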

The Application

The neural network will be trained to behave like a ‘full adder’ circuit. This is a common building block in digital electronic circuits. It adds up three 1-bit numbers to produce a 2-bit output number (range 0…3). The ‘CI’ and ‘CO’ signals are the carry-in and carry-out respectively. As an example application, by chaining 32 of these circuits together (connecting the CO output of one full adder to the CI input of another) we get a circuit that adds two 32-bit numbers together.

http://cs.smith.edu/dftwiki/index.php/Xilinx_ISE_Four-Bit_Adder_in_Verilog

Full Adder circuit
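
To illustrate that chaining (purely as an aside, using an ordinary Python function for the full adder rather than the neural network we are about to build), a ripple-carry adder is just a loop that passes each carry-out on as the next carry-in:

def full_adder(a, b, ci): # Truth-table behaviour, for illustration only
    total = a + b + ci
    return total % 2, total // 2 # (S, CO)

def ripple_add(x_bits, y_bits): # Bit lists, least-significant bit first
    carry, out = 0, []
    for x, y in zip(x_bits, y_bits):
        s, carry = full_adder(x, y, carry)
        out.append(s)
    return out + [carry]

print(ripple_add([1, 0, 1, 1], [1, 1, 0, 0])) # 13 + 3 = 16 -> [0, 0, 0, 0, 1]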

The full adder has been chosen because:

  • It contains at least one XOR gate (it has 2), to demonstrate that a multilayer network can learn this non-linearly-separable behaviour, and
  • It has more than one output (it has 2), to provide Python code that is more generalised.

This is not a good example of what a neural network would really be used for. Here, there are only 8 possible combinations of inputs, and any 3-input, 2-output (combinational) function can be defined with just 16 bits of information.

The training set is the same as the test set and just defines the truth table of a full adder. After training, the network will be tested against all 8 possible input combinations. A more appropriate application would be one where the number of possible input combinations is much greater than the number of vectors the network can be trained and tested against.

# Full Adder example:
Training_Set = [
    # A  B  CI  CO  S
    [[0, 0, 0], [0, 0]],
    [[0, 0, 1], [0, 1]],
    [[0, 1, 0], [0, 1]],
    [[0, 1, 1], [1, 0]],
    [[1, 0, 0], [0, 1]],
    [[1, 0, 1], [1, 0]],
    [[1, 1, 0], [1, 0]],
    [[1, 1, 1], [1, 1]]
]
# Bit assignments...
SUM   = 1
CARRY = 0

For example, there are 2 bits set to 1 in the input [0, 1, 1] so the sum is 2 which is binary ‘10’, so the output vector is [1, 0].
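
As a quick sanity check of this encoding (illustrative only; it is not used later), each output pair is just the 2-bit binary count of the 1s in the input:

for inputs, (co, s) in Training_Set:
    assert co * 2 + s == sum(inputs) # [CO, S] is the binary count of 1s in [A, B, CI]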

A Neuron

We now define the code for a single neuron. For each neuron, we need to have:

  • A list of weights, one for each neuron input (from the layer below).
  • A bias – this behaves in the same way as a weight except the input is a constant ‘1’.
  • A gradient ∂E/∂z. This is used for training the network.

The FeedForward_Neuron function calculates a neuron’s output y from its inputs x, its weights w and its bias b. A sum-of-products is formed:

z = Σ wᵢ·xᵢ + b

and the output y is derived from that using the Sigmoid function:

y = σ(z)

The Sigmoid function provides the non-linearity that allows the network to learn non-linear relationships such as the ‘XOR’ function (compare this with the simple, linear network in ‘Fish, Chips, Ketchup’).

import math

class Neuron:
    def __init__(self, bias):
        self.B = bias
        self.W = []
        self.dEdz = 0.0

""" The logistic function """
def Sigmoid(z):
    return 1 / (1 + math.exp(-z))

""" Generate neuron output from inputs"""
def FeedForward_Neuron(inputs, bias, weights):
    z = bias
    for i in range(len(inputs)):
        z += inputs[i] * weights[i]
    return Sigmoid(z)
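
As a quick illustration with arbitrary numbers (in the real network the weights are filled in when the layers are built, below):

n = Neuron(bias = 0.5)
n.W = [0.3, -0.2]
print(FeedForward_Neuron([1, 0], n.B, n.W)) # Sigmoid(0.5 + 1*0.3 + 0*-0.2) = Sigmoid(0.8), approx. 0.69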

We start with the list of weights being empty; we will fill these in as we build up the network of neurons.

As in previous posts, the code is not Pythonic. Here, there are no ‘def’ functions (‘methods’) defined within any class. All functions are outside, which means all the information they use must be passed in as parameters. This is to make it clear what the dependencies are. Python examples of back-propagation available elsewhere on the interweb will use classes properly and use vector operations where I have used for loops.

A Layer of Neurons

A neuronal layer is then just an array of neurons. The biases and weights of the neurons all get initialized to random values before any training is done.

Updating a neuronal layer is just updating each neuron in turn.

import random

class NeuronLayer:
    def __init__(self, num_neurons, num_inputs):
        self.Neuron = [] # Build up a list of neurons
        for n in range(0, num_neurons):
            print_info(3,"  Neuron[%d]" % (n))
            # Add a neuron to the layer, with a random bias
            self.Neuron.append(Neuron(random.random()))
            print_info(3,"    Bias = %.3f" % self.Neuron[n].B)
            # Give it random weights
            for i in range(0, num_inputs):
                self.Neuron[n].W.append(random.random()) # Initialized randomly
                print_info(3,"    Weight[%d] = %.3f" % (i, self.Neuron[n].W[i]))

def FeedForward_Layer(inputs, layer):
    outputs = []
    for neuron in layer.Neuron:
        neuron.X = inputs
        y = FeedForward_Neuron(neuron.X, neuron.B, neuron.W)
        neuron.Y = y
        outputs.append(y)
    return outputs
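
For example (illustrative only; the actual network is built in the next section), a layer of 2 neurons with 3 inputs returns a list of 2 outputs:

layer = NeuronLayer(num_neurons = 2, num_inputs = 3)
print(FeedForward_Layer([0, 1, 1], layer)) # e.g. [0.81, 0.74] - the values depend on the random weights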

The Neural Network

A complete multilayer network can then be created, with:

  • a particular number of inputs and outputs,
  • a particular number of layers
  • a particular number of neurons in each layer.

And we can use this network by feeding it input signals which propagate up through the layers to then return the outputs.

The application calls for 3 inputs and 2 outputs. Typically, the number of layers is 2 but you can configure for more than this (for so-called ‘deep’ networks). Here, we configure the network as follows:

num_inputs   = 3
num_outputs  = 2
num_neurons_in_layer = [4, num_outputs] # num. neurons in each layer from inputs up to output
# The num. neurons in the top (output) layer is the same as the num. output ports
output_layer = len(num_neurons_in_layer)-1 # Layer number

The num_neurons_in_layer variable defines the number of layers as well as the number of neurons in each layer. You can experiment with the number of neurons.
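
For example, a deeper (‘deep’) network with two hidden layers could be configured like this instead (illustrative; the rest of the code is unchanged):

num_neurons_in_layer = [5, 4, num_outputs] # Two hidden layers (5 and 4 neurons) plus the output layer
output_layer = len(num_neurons_in_layer)-1 # Would now be layer 2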

To actually create the network, we use:

Net = []
for L in range(len(num_neurons_in_layer)):
    if L==0: # Input layer
        i = num_inputs
    else:
        i = num_neurons_in_layer[L-1] # (Fully connected to lower layer)
    print_info(1, "Create layer %d with %d neurons and %d inputs" % (L, num_neurons_in_layer[L], i))
    Net.append(NeuronLayer(num_neurons = num_neurons_in_layer[L], num_inputs = i))

Feed-Forward

For actual usage, we just apply the inputs then update each layer in turn from the input layer forward to the output layer.

def FeedForward_Net(inputs, Net):
    for L in range(len(Net)): # Up through all layers
        print_info(3, "  Feed-Forward layer Net[%d]" % L)
        if L==0:
            y = FeedForward_Layer(inputs, Net[L])
        else:
            y = FeedForward_Layer(y, Net[L])
    return y
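
For example (illustrative; the outputs of the untrained network depend on the random initial weights):

y = FeedForward_Net([1, 0, 1], Net)
print(y) # Two values in [CARRY, SUM] order, e.g. [0.88, 0.86] before training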

Testing

In the trivial example here, we test the network by applying all input combinations, as defined in the truth table training set.

def Test_Network(Net, Training_Set):
    print("Test Network:")
    for i in range(8):
        Training_Input, Training_Output = Training_Set[i]
        print("  %d+%d+%d" % (Training_Input[0], Training_Input[1], Training_Input[2]), end="")
        result = FeedForward_Net(Training_Input, Net)
        rounded_result = [round(result[0]), round(result[1])]
        print(" = %d%d"       % (rounded_result[CARRY], rounded_result[SUM]), end="")
        print(" (%.3f, %.3f)" % (        result[CARRY],         result[SUM]), end="")
        if rounded_result == Training_Output:
            print(" correct")
        else:
            print(" bad")

Not surprisingly, the untrained network does not behave as desired; any outputs that are ‘correct’ at this stage are correct only by chance.

Test_Network(Net, Training_Set)

Training with Back-Propagation

Now we want to train the network to behave according to the application (in this case, to behave like a full adder circuit). We train using the ‘back-propagation’ algorithm. This involves:

  1. Applying the inputs for a particular training vector and propagating these forward to produce the outputs.
  2. Seeing how the outputs differ from what you want them to be (what the training set outputs say they should be). The mismatch is called the ‘error’ E.
  3. For each neuron in the output layer, working out how the error E changes as the neuron’s signal z changes (remember: z = Σ wᵢ·xᵢ + b and y = σ(z)). This is ∂E/∂z, the ‘partial derivative’ of the error with respect to z.
  4. For each layer, working from the output layer back towards the input layer, repeating this for each neuron. Calculating a hidden neuron’s ∂E/∂z requires the weights and ∂E/∂z values of the neurons in the layer above. (We are propagating the error derivatives back through the layers.)
  5. Updating the weights of each neuron using the partial derivative of the error with respect to that weight, ∂E/∂w (derived from the ∂E/∂z values already calculated). Each weight is adjusted by a small fraction of this gradient, determined by the ‘learning rate’ ε, so that a weight w becomes w − ε·∂E/∂w. We do the same with the biases, using ∂E/∂b.

We perform the above operations for one training item at a time (below, items are picked from the training set at random). Over many iterations, the weights converge on values that produce the desired behaviour (hopefully); this is called ‘gradient descent’.
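
As a toy illustration of gradient descent itself (nothing to do with the network code), repeatedly stepping a single parameter against its error gradient drives the error towards a minimum:

w = 5.0 # Arbitrary starting value
for i in range(20):
    dEdw = 2 * (w - 3) # Gradient of the error E = (w - 3)^2
    w -= 0.1 * dEdw # Step a small fraction against the gradient
print(w) # Approaches 3.0, the value that minimises E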

As an example, consider how to modify the weight w₁₂ that connects a neuron n₁ in layer 1 to a neuron n₂ in layer 2 of a 3-layer network. The error derivatives ∂E/∂z₃ in the higher layer, 3, have already been calculated.

The error derivative for n₂ is calculated using the ‘chain rule’, multiplying the derivatives of everything along the signal path (and, if n₂ feeds more than one neuron in layer 3, summing over those neurons):

∂E/∂z₂ = ∂E/∂z₃ · ∂z₃/∂y₂ · ∂y₂/∂z₂

This result is used for both:

  1. Continuing to propagate back to lower layers (in this case, just layer 1), and
  2. Calculating the derivative for the weight adjustment:

∂E/∂w₁₂ = ∂E/∂z₂ · ∂z₂/∂w₁₂
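
Plugging in some made-up numbers (purely to show the arithmetic; none of these values come from the program):

dEdz3 = 0.1 # Assumed error derivative of the layer-3 neuron
w23   = 0.4 # Weight from n2 to the layer-3 neuron, so dz3/dy2 = w23
y2    = 0.7 # Output of n2, so dy2/dz2 = y2*(1 - y2) = 0.21
dEdz2 = dEdz3 * w23 * (y2 * (1 - y2)) # = 0.0084
x1    = 0.9 # Output of n1, so dz2/dw12 = x1
dEdw12 = dEdz2 * x1 # = 0.00756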

For more details, I recommend Matt Mazur’s excellent worked-out example and Pythonic Python code.

def calc_dEdy(target, output):
    return -(target - output)

# Derivative of the sigmoid function:
def dydz(y):
    return y * (1 - y)

def calc_dEdz(target, output):
    # target and output are scalars here (one call per output neuron)
    return calc_dEdy(target, output) * dydz(output)

def dzdw(x):
    return x # z=sum(w[i].x[i]) therefore dz/dw[i]=x[i]

LEARNING_RATE = 0.5 # The learning rate, often denoted by a Greek letter such as epsilon (ε)

# Uses 'online' learning, i.e. updating the weights after each training case
def Train_Net(Net, training_inputs, training_outputs):
    # 0. Feed-forward to generate outputs
    print_info(2,"  Feed-forward")
    FeedForward_Net(training_inputs, Net)

    for L in reversed(range(len(Net))): # Back through all layers
        print_info(2,"  Back-prop layer Net[%d]" % (L))
        if L == output_layer: # Output layer
            # 1. Back-propagation: Set Output layer neuron dEdz
            for o in range(len(Net[L].Neuron)): # For each output layer neuron
                print_info(3,"    Back-prop Net[%d].Neuron[%d]" % (L, o))
                print_info(3,"    %d" % (training_outputs[o]))
                print_info(3,"    calc_dEdz(%.3f, %.3f)" % (training_outputs[o], Net[L].Neuron[o].Y))
                Net[L].Neuron[o].dEdz = calc_dEdz(training_outputs[o], Net[L].Neuron[o].Y)
        else:
            # 2. Back-propagation: Set hidden layer neuron dE/dz:
            #    dE/dy = Sum over layer above of (dE/dz * w), then dE/dz = dE/dy * dy/dz
            for h in range(len(Net[L].Neuron)):
                print_info(3,"    Back-prop Net[%d].Neuron[%d]" % (L, h))
                dEdy = 0
                for output_neuron in range(len(Net[L+1].Neuron)):
                    dEdy += Net[L+1].Neuron[output_neuron].dEdz * Net[L+1].Neuron[output_neuron].W[h]
                Net[L].Neuron[h].dEdz = dEdy * dydz(Net[L].Neuron[h].Y)
    # 3. Update every neuron's biases and weights (all layers): dE/dw = dE/dz * dz/dw
    for L in range(len(Net)): # Up through all layers
        print_info(2,"  Update weights in layer Net[%d]" % (L))
        for n in range(len(Net[L].Neuron)):
            dEdb = Net[L].Neuron[n].dEdz * 1.0
            # dE/db = dE/dz * dz/db; dz/db=1 (the bias is like a weight with a constant input of 1)
            Net[L].Neuron[n].B -= LEARNING_RATE * dEdb # db = epsilon * dE/db
            for w in range(len(Net[L].Neuron[n].W)):
                dEdw = Net[L].Neuron[n].dEdz * dzdw(Net[L].Neuron[n].X[w])
                Net[L].Neuron[n].W[w] -= LEARNING_RATE * dEdw # dw = epsilon * dE/dw

We train the network until it is ‘good enough’. For that, we need a measure of how well (or how badly) the network is performing whilst we are training. That measure is provided by the Total_Error function below. In this simple example, there are only 8 possible combinations of inputs so, at each training round, a training vector is randomly selected from the 8.

""" For reporting progress (to see if it working, or how well it is learning)"""
def Total_Error(Net, training_sets):
    Etotal = 0
    """ Use the first 8 training vectors as the validation set """
    num_validation_vectors = 8 # There are only 8 vectors in the Full-Adder example
    for t in range(num_validation_vectors):
        training_inputs, training_outputs = training_sets[t]
        FeedForward_Net(training_inputs, Net)
        Etotal += 0.5*(training_outputs[0] - Net[output_layer].Neuron[0].Y)**2
        Etotal += 0.5*(training_outputs[1] - Net[output_layer].Neuron[1].Y)**2
    return Etotal

Etotal = 0.0
for i in range(0, 100000):
    Training_Input, Training_Output = random.choice(Training_Set)
    if i%100==99:
        print_info(1,"Training iteration %d" % i, end="")
        print_info(3,"  %d+%d+%d" % (Training_Input[0], Training_Input[1], Training_Input[2]), end="")
        print_info(3,"  =  %d%d" % (Training_Output[CARRY], Training_Output[SUM]), end="")
        print_info(1,"")
    Train_Net(Net, Training_Input, Training_Output)
    if i%100==99:
        Etotal = Total_Error(Net, Training_Set)
        print_info(1,"  Validation E = %.3f" % Etotal)
        if Etotal < 0.02:
            break

Testing the Trained Network

Then we test the network again to see how well it has been trained.

Test_Network(Net, Training_Set)

With the error threshold for stopping training fixed at 0.02, you can experiment with changing the size and depth of the network and see how many training iterations it takes to reach that threshold.

An example output is given below – the beginning and end at least…

#########################################
Create Neural Net
#########################################
Create layer 0 with 4 neurons and 3 inputs
Create layer 1 with 2 neurons and 4 inputs
#########################################
Testing
#########################################
Continue?y
Test Network:
  0+0+0 = 11 (0.834, 0.829) bad
  0+0+1 = 11 (0.864, 0.851) bad
  0+1+0 = 11 (0.877, 0.874) bad
  0+1+1 = 11 (0.893, 0.886) bad
  1+0+0 = 11 (0.857, 0.847) bad
  1+0+1 = 11 (0.880, 0.864) bad
  1+1+0 = 11 (0.890, 0.884) bad
  1+1+1 = 11 (0.901, 0.893) correct
#########################################
Training
#########################################
Continue?y
Training iteration 99
  Validation E = 1.972
Training iteration 199
  Validation E = 1.881
Training iteration 299
  Validation E = 2.123
…
Training iteration 10099
  Validation E = 0.020
Training iteration 10199
  Validation E = 0.020
Test Network:
  0+0+0 = 00 (0.014, 0.080) correct
  0+0+1 = 01 (0.029, 0.942) correct
  0+1+0 = 01 (0.028, 0.946) correct
  0+1+1 = 10 (0.972, 0.065) correct
  1+0+0 = 01 (0.029, 0.936) correct
  1+0+1 = 10 (0.980, 0.062) correct
  1+1+0 = 10 (0.976, 0.061) correct
  1+1+1 = 11 (0.997, 0.922) correct

All together

Piecing all this code together so we have a single file to run…

print("#########################################")
print("Reporting/control")
print("#########################################")

def prompted_pause(s):
    import sys
    ok = input(s)
    if ok=="n" or ok=="N":
        print("Stopping here")
        sys.exit()

verbosity = 1

def print_info(v,s, end="DEFAULT"):
    if verbosity >= v:
        if end == "DEFAULT":
            print(s) # With newline
        elif end == "":
            print(s, end="") # Without newline
        else:
            print(s, end=end)

"""
   #########################################
   Application: Full adder
   #########################################
"""

# Full Adder example:
Training_Set = [
    # A  B  CI  CO  S
    [[0, 0, 0], [0, 0]],
    [[0, 0, 1], [0, 1]],
    [[0, 1, 0], [0, 1]],
    [[0, 1, 1], [1, 0]],
    [[1, 0, 0], [0, 1]],
    [[1, 0, 1], [1, 0]],
    [[1, 1, 0], [1, 0]],
    [[1, 1, 1], [1, 1]]
]
# Bit assignments...
SUM   = 1
CARRY = 0

print("#########################################")
print("Create Neural Net")
print("#########################################")

import math

class Neuron:
    def __init__(self, bias):
        self.B = bias
        self.W = []
        self.dEdz = 0.0

""" The logistic function """
def Sigmoid(z):
    return 1 / (1 + math.exp(-z))

""" Generate neuron output from inputs"""
def FeedForward_Neuron(inputs, bias, weights):
    z = bias
    for i in range(len(inputs)):
        z += inputs[i] * weights[i]
    return Sigmoid(z)

import random

class NeuronLayer:
    def __init__(self, num_neurons, num_inputs):
        self.Neuron = [] # Build up a list of neurons
        for n in range(0, num_neurons):
            print_info(3,"  Neuron[%d]" % (n))
            # Add a neuron to the layer, with a random bias
            self.Neuron.append(Neuron(random.random()))
            print_info(3,"    Bias = %.3f" % self.Neuron[n].B)
            # Give it random weights
            for i in range(0, num_inputs):
                self.Neuron[n].W.append(random.random()) # Initialized randomly
                print_info(3,"    Weight[%d] = %.3f" % (i, self.Neuron[n].W[i]))

def FeedForward_Layer(inputs, layer):
    outputs = []
    for neuron in layer.Neuron:
        neuron.X = inputs
        y = FeedForward_Neuron(neuron.X, neuron.B, neuron.W)
        neuron.Y = y
        outputs.append(y)
    return outputs

"""
A complete multilayer network can then be created.
"""

# Configuration...
num_inputs   = 3
num_outputs  = 2
num_neurons_in_layer = [4, num_outputs] # num. neurons in each layer from inputs up to output
# The num. neurons in the top (output) layer is the same as the num. output ports
output_layer = len(num_neurons_in_layer)-1 # Layer number

Net = []
for L in range(len(num_neurons_in_layer)):
    if L==0: # Input layer
        i = num_inputs
    else:
        i = num_neurons_in_layer[L-1] # (Fully connected to lower layer)
    print_info(1, "Create layer %d with %d neurons and %d inputs" % (L, num_neurons_in_layer[L], i))
    Net.append(NeuronLayer(num_neurons = num_neurons_in_layer[L], num_inputs = i))

def FeedForward_Net(inputs, Net):
    for L in range(len(Net)): # Up through all layers
        print_info(3, "  Feed-Forward layer Net[%d]" % L)
        if L==0:
            y = FeedForward_Layer(inputs, Net[L])
        else:
            y = FeedForward_Layer(y, Net[L])
    return y

print("#########################################")
print("Testing")
print("#########################################")

prompted_pause("Continue?")
def Test_Network(Net, Training_Set):
    print("Test Network:")
    for i in range(8):
        Training_Input, Training_Output = Training_Set[i]
        print("  %d+%d+%d" % (Training_Input[0], Training_Input[1], Training_Input[2]), end="")
        result = FeedForward_Net(Training_Input, Net)
        rounded_result = [round(result[0]), round(result[1])]
        print(" = %d%d"       % (rounded_result[CARRY], rounded_result[SUM]), end="")
        print(" (%.3f, %.3f)" % (        result[CARRY],         result[SUM]), end="")
        if rounded_result == Training_Output:
            print(" correct")
        else:
            print(" bad")

Test_Network(Net, Training_Set)

print("#########################################")
print("Training")
print("#########################################")

prompted_pause("Continue?")

def calc_dEdy(target, output):
    return -(target - output)

# Derivative of the sigmoid function:
def dydz(y):
    return y * (1 - y)

def calc_dEdz(target, output):
    # target and output are scalars here (one call per output neuron)
    return calc_dEdy(target, output) * dydz(output)

def dzdw(x):
    return x # z=sum(w[i].x[i]) therefore dz/dw[i]=x[i]

LEARNING_RATE = 0.5 # The learning rate, often denoted by a Greek letter such as epsilon (ε)

# Uses 'online' learning, i.e. updating the weights after each training case
def Train_Net(Net, training_inputs, training_outputs):
    # 0. Feed-forward to generate outputs
    print_info(2,"  Feed-forward")
    FeedForward_Net(training_inputs, Net)

    for L in reversed(range(len(Net))): # Back through all layers
        print_info(2,"  Back-prop layer Net[%d]" % (L))
        if L == output_layer: # Output layer
            # 1. Back-propagation: Set Output layer neuron dEdz
            for o in range(len(Net[L].Neuron)): # For each output layer neuron
                print_info(3,"    Back-prop Net[%d].Neuron[%d]" % (L, o))
                print_info(3,"    %d" % (training_outputs[o]))
                print_info(3,"    calc_dEdz(%.3f, %.3f)" % (training_outputs[o], Net[L].Neuron[o].Y))
                Net[L].Neuron[o].dEdz = calc_dEdz(training_outputs[o], Net[L].Neuron[o].Y)
        else:
            # 2. Back-propagation: Set hidden layer neuron dE/dz:
            #    dE/dy = Sum over layer above of (dE/dz * w), then dE/dz = dE/dy * dy/dz
            for h in range(len(Net[L].Neuron)):
                print_info(3,"    Back-prop Net[%d].Neuron[%d]" % (L, h))
                dEdy = 0
                for output_neuron in range(len(Net[L+1].Neuron)):
                    dEdy += Net[L+1].Neuron[output_neuron].dEdz * Net[L+1].Neuron[output_neuron].W[h]
                Net[L].Neuron[h].dEdz = dEdy * dydz(Net[L].Neuron[h].Y)
    # 3. Update every neuron's biases and weights (all layers): dE/dw = dE/dz * dz/dw
    for L in range(len(Net)): # Up through all layers
        print_info(2,"  Update weights in layer Net[%d]" % (L))
        for n in range(len(Net[L].Neuron)):
            dEdb = Net[L].Neuron[n].dEdz * 1.0
            # dE/db = dE/dz * dz/db; dz/db=1 (the bias is like a weight with a constant input of 1)
            Net[L].Neuron[n].B -= LEARNING_RATE * dEdb # db = epsilon * dE/db
            for w in range(len(Net[L].Neuron[n].W)):
                dEdw = Net[L].Neuron[n].dEdz * dzdw(Net[L].Neuron[n].X[w])
                Net[L].Neuron[n].W[w] -= LEARNING_RATE * dEdw # dw = epsilon * dE/dw

""" For reporting progress (to see if it working, or how well it is learning)"""
def Total_Error(Net, training_sets):
    Etotal = 0
    """ Use the first 8 training vectors as the validation set """
    num_validation_vectors = 8 # There are only 8 vectors in the Full-Adder example
    for t in range(num_validation_vectors):
        training_inputs, training_outputs = training_sets[t]
        FeedForward_Net(training_inputs, Net)
        Etotal += 0.5*(training_outputs[0] - Net[output_layer].Neuron[0].Y)**2
        Etotal += 0.5*(training_outputs[1] - Net[output_layer].Neuron[1].Y)**2
    return Etotal

Etotal = 0.0
for i in range(0, 100000):
    Training_Input, Training_Output = random.choice(Training_Set)
    if i%100==99:
        print_info(1,"Training iteration %d" % i, end="")
        print_info(3,"  %d+%d+%d" % (Training_Input[0], Training_Input[1], Training_Input[2]), end="")
        print_info(3,"  =  %d%d" % (Training_Output[CARRY], Training_Output[SUM]), end="")
        print_info(1,"")
    Train_Net(Net, Training_Input, Training_Output)
    if i%100==99:
        Etotal = Total_Error(Net, Training_Set)
        print_info(1,"  Validation E = %.3f" % Etotal)
        if Etotal < 0.02:
            break

"""
See how it behaves now, after training.
"""
Test_Network(Net, Training_Set)