# Explanation of min-char-rnn.py

In this post, we explain the lossFun function in Karpathy's min-char-rnn implementation. The source code can be found here: https://gist.github.com/karpathy/d4dee566867f8291f086. The lossFun function has two parts:
1. the forward pass
2. backpropagation
The weight updates for Wxh and Why are fairly straightforward. The update for Whh, on the other hand, is more involved, partly because this weight is shared across all time steps of the hidden layer. In the implementation (https://gist.github.com/karpathy/d4dee566867f8291f086#file-min-char-rnn-py-L48-L58), we can clearly see a form of gradient accumulation. In the section below, we explain how to derive the expressions for dh and dhnext. For illustration purposes, we assume there is only one neuron in the hidden layer, so each weight matrix reduces to a scalar.
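To make the gradient accumulation concrete, here is a minimal sketch of the scalar case described above. It is not Karpathy's actual code: it uses a squared-error loss instead of his softmax cross-entropy, and all the names (`forward`, `backward`, the toy inputs) are our own. What it does preserve is the backprop structure of lossFun: dh combines the local output gradient with dhnext carried back from the next time step, and dWhh accumulates a contribution at every t. A finite-difference check at the end confirms the accumulated dWhh matches the numerical gradient.

```python
import math

# Hypothetical scalar RNN (one hidden "neuron"), illustrating how the
# gradient on the shared weight Whh accumulates across time steps and
# how dh / dhnext are chained during backprop. This is a sketch with a
# squared-error loss, not min-char-rnn's softmax loss.

def forward(xs, targets, Wxh, Whh, Why, bh, by, hprev):
    hs, ys = {-1: hprev}, {}
    loss = 0.0
    for t in range(len(xs)):
        hs[t] = math.tanh(Wxh * xs[t] + Whh * hs[t - 1] + bh)
        ys[t] = Why * hs[t] + by
        loss += 0.5 * (ys[t] - targets[t]) ** 2
    return loss, hs, ys

def backward(xs, targets, Wxh, Whh, Why, bh, by, hs, ys):
    dWxh = dWhh = dWhy = dbh = dby = 0.0
    dhnext = 0.0  # gradient flowing back from the future hidden state
    for t in reversed(range(len(xs))):
        dy = ys[t] - targets[t]
        dWhy += dy * hs[t]
        dby += dy
        dh = dy * Why + dhnext          # local gradient + carry from t+1
        dhraw = (1 - hs[t] ** 2) * dh   # backprop through tanh
        dbh += dhraw
        dWxh += dhraw * xs[t]
        dWhh += dhraw * hs[t - 1]       # Whh gradient accumulates over t
        dhnext = dhraw * Whh            # pass gradient back to step t-1
    return dWxh, dWhh, dWhy, dbh, dby

# Finite-difference check on the accumulated dWhh:
xs, targets = [0.5, -0.3, 0.8], [0.1, 0.4, -0.2]
params = dict(Wxh=0.2, Whh=0.4, Why=-0.3, bh=0.05, by=0.0)
loss, hs, ys = forward(xs, targets, hprev=0.0, **params)
grads = backward(xs, targets, hs=hs, ys=ys, **params)
eps = 1e-6
lp, _, _ = forward(xs, targets, hprev=0.0, **{**params, "Whh": params["Whh"] + eps})
lm, _, _ = forward(xs, targets, hprev=0.0, **{**params, "Whh": params["Whh"] - eps})
numeric = (lp - lm) / (2 * eps)
assert abs(grads[1] - numeric) < 1e-6
```

Note that `dhnext` is exactly the term that makes the Whh update non-local: the error at time t influences the gradient at every earlier step through the chain `dhraw * Whh`.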

----- END -----
