I wanted to look at the distribution of the weights inside a neural net to get a sense of what was going on during training. I ran the code twice, training on a different sample text each time. The first sample text was the simple sentence "thebrownfoxjumpsoverthefence." repeated many times. The second was the short story The Last Question by Isaac Asimov. Both text samples and the code are included in a .zip file at the end of this post.
The "fox" sample text was trained with 3 hidden layers with 128 nodes each. The code converged relatively quickly, see the output after 0 epochs:
and after 20 epochs:
The network is able to converge upon the correct answer with few errors.
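For reference, here is a minimal sketch of the kind of network I mean. My actual code is in the .zip linked below; this sketch assumes a plain feedforward next-character predictor in PyTorch, and the 8-character context window and ReLU activations are illustrative assumptions (only the three 128-node hidden layers come from the setup above):

    import torch.nn as nn

    # Sketch only: a feedforward next-character predictor with 3 hidden
    # layers of 128 nodes each. The input is a one-hot encoding of the
    # previous CONTEXT characters; CONTEXT=8 and ReLU are assumptions.
    VOCAB = sorted(set("thebrownfoxjumpsoverthefence."))
    CONTEXT = 8

    model = nn.Sequential(
        nn.Linear(len(VOCAB) * CONTEXT, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, len(VOCAB)),  # logits over the next character
    )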
I also plotted a histogram of the weights below. At first, the weights are clustered together, presumably near their initial values. After several epochs, they become roughly normally distributed about 0.
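The plotting itself is just a histogram over every weight matrix flattened into one array. A sketch along these lines, reusing the hypothetical model object from the sketch above:

    import numpy as np
    import matplotlib.pyplot as plt

    # Gather every weight matrix (skipping biases), flatten, and histogram.
    weights = np.concatenate([p.detach().numpy().ravel()
                              for name, p in model.named_parameters()
                              if "weight" in name])
    plt.hist(weights, bins=100)
    plt.xlabel("weight value")
    plt.ylabel("count")
    plt.title("Distribution of network weights")
    plt.show()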
The more complicated sample text was run through a network with 3 hidden layers of 512 nodes each. See the output at epoch 0:
and epoch 20:
The network is clearly learning the formatting of the document (frequent line breaks), and the text is starting to look more like English, even though it doesn't quite become intelligible after only 20 epochs. Despite not reaching convergence, the graphs of the weights look quite similar to how they did for the fox text.
The histogram is taller because the larger network has far more weights, but the distribution is already approximately normal despite the network not having converged yet. My guess is that the weights tend toward a normal distribution early on, while individual weights still need to be shuffled into the right places in the network before it produces the correct answer.
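If you wanted to check that impression numerically rather than by eye, you could fit a Gaussian to the flattened weights and run a normality test. The plots above don't do this; it's a hypothetical extra step using scipy and the weights array from the histogram sketch:

    from scipy import stats

    # Fit a normal distribution to the flattened weights, then apply the
    # D'Agostino-Pearson normality test. A small p-value means the weights
    # are unlikely to have come from a normal distribution.
    mu, sigma = weights.mean(), weights.std()
    stat, p_value = stats.normaltest(weights)
    print(f"mean={mu:.4f}, std={sigma:.4f}, normality p-value={p_value:.3g}")

With this many weights the test is very sensitive to small deviations, so it's more of a sanity check than proof.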
-------------
The code and training files