What’s the difference between sparse_softmax_cross_entropy_with_logits and softmax_cross_entropy_with_logits?
- softmax_cross_entropy_with_logits: the classes are mutually exclusive, but their probabilities need not be. All that is required is that each row of labels is a valid probability distribution. If it is not, the computed gradient will be incorrect.
- sparse_softmax_cross_entropy_with_logits: the probability of a given label is considered exclusive. That is, soft classes are not allowed, and the labels vector must provide a single specific index for the true class for each row of logits.
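
The difference in label format can be illustrated with a small sketch in plain Python. This is not TensorFlow's implementation; the helper names `log_softmax`, `dense_cross_entropy`, and `sparse_cross_entropy` and the sample values are hypothetical, chosen only to show that integer class indices and one-hot rows give the same loss, while soft label rows are only expressible in the dense form.

```python
import math

def log_softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def dense_cross_entropy(labels, logits):
    # Hypothetical helper: each row of labels is a probability distribution
    # over classes (one-hot or soft), as in the dense variant.
    return [
        -sum(p * lp for p, lp in zip(row_labels, log_softmax(row_logits)))
        for row_labels, row_logits in zip(labels, logits)
    ]

def sparse_cross_entropy(labels, logits):
    # Hypothetical helper: each label is a single integer class index,
    # as in the sparse variant.
    return [
        -log_softmax(row_logits)[index]
        for index, row_logits in zip(labels, logits)
    ]

# Illustrative values only.
logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
sparse_labels = [0, 1]                                  # one index per row
one_hot_labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]     # same classes, dense
soft_labels = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]        # only valid in dense form

sparse_loss = sparse_cross_entropy(sparse_labels, logits)
dense_loss = dense_cross_entropy(one_hot_labels, logits)
soft_loss = dense_cross_entropy(soft_labels, logits)

print(sparse_loss)
print(dense_loss)
print(soft_loss)
```

When the labels are hard class assignments, the index form is the more compact choice because it avoids building the one-hot matrix. The dense form is needed only when a row of labels spreads probability over several classes.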