I used lstm_text_generation.py as a baseline and modified it to compare stateful=True and stateful=False. In order to make the code run with stateful=True, I needed to make some changes.
First, I needed to give the first layer a batch_input_shape argument instead of simply input_shape, because a stateful layer has to know the batch size up front. The same batch size also has to be passed to model.fit, like this:

model.fit(X, y, batch_size=batch_size, nb_epoch=1, callbacks=[history])

Another requirement of stateful=True is that the number of samples passed as input must be evenly divisible by the batch size. This was problematic since it limited which batch sizes I could use, but I tried a batch size of 518, which is a factor of the number of samples in each iteration, 15022 (518 × 29 = 15022).
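Since the layer definitions aren't shown above, here is a rough sketch of what those changes look like. It assumes the two-LSTM(512) architecture and Keras 1.x API of the lstm_text_generation.py baseline; the names maxlen, chars, X, y and history are taken from that script, and the data here is only a placeholder.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, LSTM
from keras.callbacks import History

maxlen = 40                                # sequence length, as in the baseline script
chars = [chr(c) for c in range(32, 127)]   # placeholder character set
n_samples = 15022                          # samples per iteration in this experiment
batch_size = 518                           # must evenly divide n_samples (518 * 29 = 15022)

model = Sequential()
# stateful=True requires batch_input_shape (which includes the batch size)
# instead of the usual input_shape
model.add(LSTM(512, return_sequences=True, stateful=True,
               batch_input_shape=(batch_size, maxlen, len(chars))))
model.add(Dropout(0.2))
model.add(LSTM(512, return_sequences=False, stateful=True))
model.add(Dropout(0.2))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

# Placeholder arrays standing in for the baseline's one-hot encoded text data
X = np.zeros((n_samples, maxlen, len(chars)), dtype=bool)
y = np.zeros((n_samples, len(chars)), dtype=bool)

# The same batch size is passed to fit()
history = History()
model.fit(X, y, batch_size=batch_size, nb_epoch=1, callbacks=[history])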
This solution didn't quite work, because I got another error:
ValueError: non-broadcastable output operand with shape (1,512) doesn't match the broadcast shape (518,512)

Here 512 refers to the number of nodes in each hidden layer of the neural net, and the error appears to be a matrix multiplication mismatch. When I changed the number of nodes in each layer from the original 512 to 518 to match the batch size, I received a similar error:
ValueError: non-broadcastable output operand with shape (1,518) doesn't match the broadcast shape (518,518)

I could not figure out a proper solution to this problem, so I settled on a batch size of 1 so that the matrix multiplication would work. This dramatically increased the computation time, so in order to run the test in a matter of hours instead of days I reduced the number of nodes in the network from 512 to 256 and ran only 10 iterations rather than the original 60. A rough sketch of this scaled-down setup is below, followed by my results.
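For reference, here is roughly what that scaled-down setup looks like, reusing the names and imports from the sketch above; the layer structure is again assumed from the baseline script rather than copied from my exact code.

batch_size = 1   # batch size of 1 to sidestep the shape mismatch above

model = Sequential()
model.add(LSTM(256, return_sequences=True, stateful=True,
               batch_input_shape=(batch_size, maxlen, len(chars))))
model.add(Dropout(0.2))
model.add(LSTM(256, return_sequences=False, stateful=True))
model.add(Dropout(0.2))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

# 10 iterations instead of the original 60
for iteration in range(10):
    model.fit(X, y, batch_size=batch_size, nb_epoch=1, callbacks=[history])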
Notice that for both graphs the loss function is increasing. For stateful=True, the loss increases monotonically, whereas with stateful=False there are some downward moves, but the overall trend is still upward. This is the opposite of what I would expect, since the loss should decrease as the network learns. It's possible that more iterations were needed, or that the smaller network kept it from learning, but I'm not yet sure why the loss is increasing.