# What are logits, softmax and softmax_cross_entropy_with_logits?

Asked on November 15, 2018 in

Logits simply means that the function operates on the unscaled output of earlier layers, and that the relative scale for understanding the units is linear. In particular, the inputs need not sum to 1, and the values are not probabilities.

tf.nn.softmax produces just the result of applying the softmax function to an input tensor. The softmax “squishes” the inputs so that the outputs sum to 1; it’s a way of normalizing. The shape of the output of a softmax is the same as the shape of the input: it just normalizes the values. The output of softmax can be interpreted as probabilities.

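```python
import numpy as np
import tensorflow as tf  # TF 1.x API, as in the rest of this answer
a = tf.constant(np.array([[.1, .3, .5, .9]]))
with tf.Session() as s:  # the original snippet assumed an existing session `s`
    print(s.run(tf.nn.softmax(a)))
# [[ 0.16838508 0.205666 0.25120102 0.37474789]]
```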
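For intuition, here is a minimal NumPy sketch of what the softmax computes. The helper name `softmax` and the max-subtraction step are my own illustration, not TensorFlow's actual implementation; it should reproduce the numbers printed above.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

a = np.array([[0.1, 0.3, 0.5, 0.9]])   # unscaled scores; they sum to 1.8, not 1
probs = softmax(a)

print(probs)               # same shape as the input
print(probs.sum(axis=-1))  # each row sums to 1
```

Subtracting the row maximum does not change the result, because softmax is invariant to adding a constant to every input in a row.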

In contrast, tf.nn.softmax_cross_entropy_with_logits computes the cross entropy of the result after applying the softmax function (but it does it all together in a more numerically careful way). It’s similar to the result of:

```python
sm = tf.nn.softmax(x)
ce = cross_entropy(sm)  # pseudocode: cross entropy of sm against the true labels
```

The cross entropy is a summary metric: it sums across the elements. The output of tf.nn.softmax_cross_entropy_with_logits on a shape [2,5] tensor is of shape [2], one loss value per example (the first dimension is treated as the batch).
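To make the summing concrete, here is a rough NumPy sketch of a per-example softmax cross entropy. The function name `softmax_cross_entropy_np` and the example values are made up for illustration; this is not TensorFlow's code. With a [2,5] input, the sum over the class axis leaves one loss value per row of the batch.

```python
import numpy as np

def softmax_cross_entropy_np(logits, labels):
    """Cross entropy between labels and softmax(logits), one value per row."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    # log-softmax computed directly, without forming the probabilities first
    log_probs = shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))
    return -np.sum(labels * log_probs, axis=-1)  # sums over the class axis

logits = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
                   [0.0, 0.0, 0.0, 0.0, 0.0]])   # batch of 2, 5 classes
labels = np.array([[0.0, 0.0, 0.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0, 0.0, 0.0]])   # one-hot true classes

losses = softmax_cross_entropy_np(logits, labels)
print(losses.shape)  # (2,)
print(losses)
```

The second row has all-equal logits, so its loss is log(5), the cross entropy of a uniform guess over five classes.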

If you want to do optimization to minimize the cross entropy, and you’re softmaxing after your last layer, you should use tf.nn.softmax_cross_entropy_with_logits instead of doing it yourself, because it covers numerically unstable corner cases in the mathematically right way. Otherwise, you’ll end up hacking it by adding little epsilons here and there.
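As a rough illustration of such a corner case, the NumPy sketch below compares a naive "softmax, then log" computation with a log-sum-exp formulation on one extreme logit. The values are invented, and the stable version is one standard technique, not necessarily what TensorFlow does internally.

```python
import numpy as np

logits = np.array([[1000.0, 0.0]])  # one very large score
labels = np.array([[0.0, 1.0]])     # the true class is the unlikely one

# Naive: softmax first, then log. exp(1000) overflows to inf.
with np.errstate(over="ignore", invalid="ignore", divide="ignore"):
    exps = np.exp(logits)
    naive_probs = exps / exps.sum(axis=-1, keepdims=True)   # contains nan and 0
    naive_loss = -np.sum(labels * np.log(naive_probs), axis=-1)

# Stable: compute log-softmax directly with the log-sum-exp trick.
shifted = logits - logits.max(axis=-1, keepdims=True)
log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
stable_loss = -np.sum(labels * log_probs, axis=-1)

print(naive_loss)   # not a finite number
print(stable_loss)  # finite, about 1000
```

Adding an epsilon inside the log would not rescue the naive version here, since the overflow has already happened in the softmax step.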

Suppose you have two tensors, where y_hat contains computed scores for each class (for example, from y = W*x + b) and y_true contains one-hot encoded true labels.

```python
y_hat = ...   # Predicted scores, e.g. y_hat = tf.matmul(X, W) + b
y_true = ...  # True labels, one-hot encoded
```

If you interpret the scores in y_hat as unnormalized log probabilities, then they are logits.
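One way to see what "unnormalized log probabilities" means: take a probability vector, take its log, and shift it by any constant. The result no longer looks like probabilities, yet softmax recovers the original distribution. The numbers below are arbitrary and only for illustration.

```python
import numpy as np

p = np.array([0.7, 0.2, 0.1])   # an arbitrary probability distribution
logits = np.log(p) + 3.0        # log probabilities plus an arbitrary offset

# Softmax undoes both the log and the offset.
exps = np.exp(logits - logits.max())
recovered = exps / exps.sum()

print(logits.sum())  # not 1: logits are not probabilities
print(recovered)     # matches p up to floating-point error
```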

Additionally, the total cross-entropy loss computed in this manner:

```python
y_hat_softmax = tf.nn.softmax(y_hat)
total_loss = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_hat_softmax), [1]))
```

is essentially equivalent to the total cross-entropy loss computed with the function softmax_cross_entropy_with_logits():

```python
# TF 1.x requires keyword arguments here
total_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_hat))
```
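The equivalence can be checked outside TensorFlow with a small NumPy sketch. The random logits and labels are hypothetical stand-ins for y_hat and y_true, and the "fused" branch is my own log-sum-exp version rather than the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
y_hat = rng.normal(size=(4, 3))    # hypothetical logits: 4 examples, 3 classes
y_true = np.eye(3)[[0, 2, 1, 2]]   # hypothetical one-hot labels

# Two-step version: softmax, then cross entropy.
exps = np.exp(y_hat)
y_hat_softmax = exps / exps.sum(axis=1, keepdims=True)
loss_two_step = np.mean(-np.sum(y_true * np.log(y_hat_softmax), axis=1))

# Fused version: log-softmax computed in one go.
shifted = y_hat - y_hat.max(axis=1, keepdims=True)
log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
loss_fused = np.mean(-np.sum(y_true * log_softmax, axis=1))

print(np.isclose(loss_two_step, loss_fused))  # True
```

For moderate logits like these the two agree to floating-point precision; they only diverge when the scores are extreme enough to overflow or underflow.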