What are logits, softmax and softmax_cross_entropy_with_logits?
Logits simply means that the function operates on the unscaled output of earlier layers, and that the relative scale used to understand the units is linear. In particular, the inputs may not sum to 1, and the values are not probabilities.
tf.nn.softmax produces only the result of applying the softmax function to an input tensor. The softmax “squishes” the inputs so that sum(output) = 1; it is a way of normalizing. The shape of the output of a softmax is the same as the shape of the input; it just normalizes the values. The outputs of softmax can be interpreted as probabilities.
    import numpy as np
    import tensorflow as tf  # TensorFlow 1.x API
    s = tf.Session()
    a = tf.constant(np.array([[.1, .3, .5, .9]]))
    print(s.run(tf.nn.softmax(a)))
    # [[ 0.16838508 0.205666 0.25120102 0.37474789]]
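To make the normalization concrete, here is a minimal pure-Python sketch of the softmax formula applied to the same four values. The helper name `softmax` is only illustrative, and this is not how TensorFlow implements the op; it simply mirrors the math.

```python
import math

def softmax(values):
    """Exponentiate each value and divide by the sum of the exponentials."""
    # Subtracting the max leaves the result unchanged but avoids overflow.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.1, 0.3, 0.5, 0.9]
probs = softmax(logits)

print(probs)       # same number of entries as the input
print(sum(probs))  # approximately 1.0
```

The printed values should agree with the TensorFlow output above up to floating-point rounding.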
In contrast, tf.nn.softmax_cross_entropy_with_logits computes the cross entropy of the result after applying the softmax function (but it does it all together in a more numerically careful way). It is similar to the result of:
    sm = tf.nn.softmax(x)
    ce = cross_entropy(sm)
Cross entropy is a summary metric: it sums across the elements. The output of tf.nn.softmax_cross_entropy_with_logits on a shape [2,5] tensor is of shape [2], one loss value per row (the first dimension is treated as the batch).
If you want to optimize a model by minimizing the cross entropy, and you are applying softmax after your last layer, you should use tf.nn.softmax_cross_entropy_with_logits instead of doing it yourself, because it covers numerically unstable corner cases in the mathematically correct way. Otherwise, you will end up hacking it by adding little epsilons here and there.
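The numerically unstable corner cases can be illustrated without TensorFlow. The sketch below is an illustration under stated assumptions, not TensorFlow's actual implementation: `naive_cross_entropy` and `stable_cross_entropy` are hypothetical helper names, the extreme scores are made up, and the stable version uses the standard log-sum-exp trick.

```python
import math

def naive_cross_entropy(logits, one_hot):
    """Softmax followed by log, computed as two separate steps."""
    exps = [math.exp(z) for z in logits]  # overflows for large z
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def stable_cross_entropy(logits, one_hot):
    """Same quantity via log-sum-exp, never forming the probabilities."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # log(softmax(z_i)) == z_i - log_sum_exp
    return -sum(t * (z - log_sum_exp) for t, z in zip(one_hot, logits))

logits = [1000.0, 0.0, -1000.0]  # extreme scores chosen to trigger the problem
one_hot = [0.0, 1.0, 0.0]

try:
    naive_loss = naive_cross_entropy(logits, one_hot)
except (OverflowError, ValueError) as err:
    naive_loss = None
    print("naive version failed:", err)

stable_loss = stable_cross_entropy(logits, one_hot)
print("stable version:", stable_loss)
```

For moderate scores the two functions agree; they only diverge when the exponentials overflow or underflow, which is exactly the situation the fused op is designed to handle.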
Suppose you have two tensors, where y_hat contains computed scores for each class (for example, from y = W*x + b) and y_true contains one-hot encoded true labels.
    y_hat = ...   # Predicted label, e.g. y = tf.matmul(X, W) + b
    y_true = ...  # True label, one-hot encoded
If you interpret the scores in y_hat as unnormalized log probabilities, then they are logits.
Additionally, the total cross-entropy loss computed in this manner:
    y_hat_softmax = tf.nn.softmax(y_hat)
    # Sum over the class axis to get one loss per example, then average over the batch
    total_loss = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_hat_softmax), axis=1))
is essentially equivalent to the total cross-entropy loss computed with the function softmax_cross_entropy_with_logits():
    total_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_hat))
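The equivalence of the two computations can be checked by hand on a small batch. The following pure-Python sketch uses made-up scores and labels (two examples, three classes); `softmax`, `log_softmax`, `manual_loss`, and `fused_loss` are illustrative names rather than TensorFlow functions.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(z - m) for z in row]
    total = sum(exps)
    return [e / total for e in exps]

def log_softmax(row):
    m = max(row)
    lse = m + math.log(sum(math.exp(z - m) for z in row))
    return [z - lse for z in row]

# Hypothetical batch: 2 examples, 3 classes.
y_hat = [[2.0, 1.0, 0.1],
         [0.5, 2.5, 0.3]]
y_true = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]

# Route 1: softmax, then log, then sum over the classes of each example.
per_example_manual = [
    -sum(t * math.log(p) for t, p in zip(t_row, softmax(z_row)))
    for z_row, t_row in zip(y_hat, y_true)
]

# Route 2: cross entropy computed directly from the logits.
per_example_fused = [
    -sum(t * lp for t, lp in zip(t_row, log_softmax(z_row)))
    for z_row, t_row in zip(y_hat, y_true)
]

# Average the per-example losses over the batch.
manual_loss = sum(per_example_manual) / len(per_example_manual)
fused_loss = sum(per_example_fused) / len(per_example_fused)

print(per_example_manual)       # one loss value per example
print(manual_loss, fused_loss)  # the two totals agree up to rounding
```

Each route produces one loss per example before averaging, which is why the sum has to run over the class dimension only.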
tf.nn.softmax computes the forward propagation through a softmax layer. You use it during evaluation of the model, when you compute the probabilities that the model outputs.
tf.nn.softmax_cross_entropy_with_logits computes the cost for a softmax layer. It is only used during training.
Logits are the unnormalized log probabilities output by the model (the values output before the softmax normalization is applied to them).
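One way to see why logits are called unnormalized log probabilities: take the log of any probability vector, add an arbitrary constant, and softmax maps the resulting scores back to the same probabilities. A short sketch, using an illustrative `softmax` helper and made-up numbers:

```python
import math

def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = [0.7, 0.2, 0.1]                   # made-up class probabilities
log_probs = [math.log(p) for p in probs]  # normalized log probabilities
logits = [lp + 5.0 for lp in log_probs]   # shifted by an arbitrary constant

recovered = softmax(logits)
print(recovered)    # close to probs, up to floating-point rounding
print(sum(logits))  # not 1: the logits themselves are not probabilities
```

The constant 5.0 is arbitrary; any shift gives the same recovered probabilities, which is the sense in which the scores are "unnormalized".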