This is fantastic. Thanks for taking the time to write this (and the whole series). One tiny thing that I found a bit confusing and that could perhaps do with a little more clarification:
Output will be the last state of every layer in the network as an LSTMStateTuple stored in current_state, as well as a tensor states_series with the shape [batch_size, truncated_backprop_length, state_size] containing the hidden state of the last layer across all time-steps.
Could possibly be expanded as:
Output will be the **internal state (both cell state and hidden state)** of every layer in the network **for the final time-step** as a **tuple (for each layer) of** LSTMStateTuple stored in current_state, as well as a tensor states_series with the shape [batch_size, truncated_backprop_length, state_size] containing the **output** of the last layer **for each time-step**.
(The bold bits are not for emphasis, they’re just to indicate which bits I changed).
This doesn’t contradict what you say; it just avoids some ambiguity (at least it wasn’t clear to me just from reading it).
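To make the distinction concrete, here is a minimal sketch against the TF 1.x API. I'm assuming tf.nn.rnn_cell.LSTMCell / MultiRNNCell with tf.nn.dynamic_rnn, and my own placeholder choices of num_layers=2 and an input width of 1, which may differ from the post's exact setup:

```python
import tensorflow as tf  # TF 1.x assumed, matching the era of this series

# Sizes from the post; num_layers=2 and the input width of 1 are my own
# illustrative choices, not necessarily what the post uses.
batch_size = 3
truncated_backprop_length = 3
state_size = 3
num_layers = 2

inputs = tf.placeholder(tf.float32,
                        [batch_size, truncated_backprop_length, 1])

cells = [tf.nn.rnn_cell.LSTMCell(state_size) for _ in range(num_layers)]
multi_cell = tf.nn.rnn_cell.MultiRNNCell(cells)

# states_series: the hidden output of the LAST layer at EVERY time-step.
# current_state: a tuple with one LSTMStateTuple(c, h) PER LAYER, holding
#                that layer's state at the FINAL time-step only.
states_series, current_state = tf.nn.dynamic_rnn(multi_cell, inputs,
                                                 dtype=tf.float32)

print(states_series.shape)        # (3, 3, 3) = [batch_size, truncated_backprop_length, state_size]
print(len(current_state))         # 2 = num_layers
print(current_state[-1].c.shape)  # (3, 3) = cell state of the top layer at the final step
print(current_state[-1].h.shape)  # (3, 3) = hidden state of the top layer at the final step
# Note: states_series[:, -1, :] carries the same values as current_state[-1].h
```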
Finally, with batch_size=3, state_size=3 and truncated_backprop_length=3 it’s a bit tricky to read the diagrams, since so many dimensions are of size 3! If, say, batch_size were 4 and state_size were 5, it would be immediately obvious which dimension is which.