TensorFlow - Loss Increases to NaN
I am going through Udacity's Deep Learning course. The interesting thing I am observing is that, on the same dataset, my 1-layer neural network works perfectly fine, but as soon as I add a hidden layer (with ReLU), the loss keeps increasing until it becomes NaN.
Solution 1:
This happens because the ReLU activation is unbounded: combined with a learning rate that is too high, the activations and gradients can grow without limit (exploding gradients), which eventually drives the loss to NaN. Therefore you need to reduce the learning rate accordingly (in your case, starter_learning_rate). You can also try a different activation function.
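As a rough sketch, here is what lowering the starting learning rate might look like. This assumes the TensorFlow 1.x graph-style API used in the Udacity notebooks; the layer sizes, the decay schedule, and the name starter_learning_rate are illustrative, not taken from your code.

```python
# Minimal sketch (TensorFlow 1.x graph API assumed): a 2-layer network with a
# ReLU hidden layer, trained with a smaller starting learning rate.
import tensorflow as tf

image_size = 28
num_labels = 10
hidden_units = 1024
batch_size = 128

graph = tf.Graph()
with graph.as_default():
    tf_train_dataset = tf.placeholder(
        tf.float32, shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))

    # ReLU hidden layer followed by a linear output layer. A small stddev for
    # the initial weights also helps keep the early activations in range.
    w1 = tf.Variable(tf.truncated_normal(
        [image_size * image_size, hidden_units], stddev=0.1))
    b1 = tf.Variable(tf.zeros([hidden_units]))
    w2 = tf.Variable(tf.truncated_normal([hidden_units, num_labels], stddev=0.1))
    b2 = tf.Variable(tf.zeros([num_labels]))

    hidden = tf.nn.relu(tf.matmul(tf_train_dataset, w1) + b1)
    logits = tf.matmul(hidden, w2) + b2
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(
            labels=tf_train_labels, logits=logits))

    # A much smaller starting learning rate (e.g. 0.01 instead of 0.5) keeps
    # the unbounded ReLU activations, and hence the gradients, from exploding.
    global_step = tf.Variable(0, trainable=False)
    starter_learning_rate = 0.01  # illustrative value, tune for your data
    learning_rate = tf.train.exponential_decay(
        starter_learning_rate, global_step, 1000, 0.96, staircase=True)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(
        loss, global_step=global_step)
```

If lowering the learning rate alone is not enough, swapping tf.nn.relu for a bounded activation such as tf.nn.tanh is one way to test whether the unbounded activations are the cause.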
A similar problem is discussed in "In simple multi-layer FFNN only ReLU activation function doesn't converge"; follow the answer there and you will see the same pattern.
Hope this helps.