Hey Valborsf, thanks for checking out this post. You’re right, this is how Keras computes Binary Cross-Entropy Loss when plain probability inputs are given and from_logits=False. Things change a bit when the output comes from a Sigmoid layer: even with from_logits=False, the Binary Cross-Entropy function reaches back to the logits that were fed into the Sigmoid node and computes the loss from those logits instead.
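If you want a quick sanity check from user code, here is a minimal sketch (the tensors are made up for illustration). Feeding raw logits with from_logits=True and feeding the matching Sigmoid outputs with from_logits=False should give essentially the same loss:

import tensorflow as tf

y_true = tf.constant([[0.], [1.], [1.]])
logits = tf.constant([[-2.0], [1.5], [0.3]])
probs = tf.sigmoid(logits)

# Path 1: hand Keras the raw logits.
bce_logits = tf.keras.losses.BinaryCrossentropy(from_logits=True)
# Path 2: hand Keras the Sigmoid outputs.
bce_probs = tf.keras.losses.BinaryCrossentropy(from_logits=False)

print(bce_logits(y_true, logits).numpy())
print(bce_probs(y_true, probs).numpy())  # agrees up to epsilon clipping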
Check out the following lines from the TF 2.1 source code (keras.backend.binary_crossentropy):
def binary_crossentropy(target, output, from_logits=False):
  """Binary crossentropy between an output tensor and a target tensor.

  Arguments:
      target: A tensor with the same shape as `output`.
      output: A tensor.
      from_logits: Whether `output` is expected to be a logits tensor.
          By default, we consider that `output`
          encodes a probability distribution.

  Returns:
      A tensor.
  """
  if not from_logits:
    if (isinstance(output, (ops.EagerTensor, variables_module.Variable)) or
        output.op.type != 'Sigmoid'):
      epsilon_ = _constant_to_tensor(epsilon(), output.dtype.base_dtype)
      output = clip_ops.clip_by_value(output, epsilon_, 1. - epsilon_)

      # Compute cross entropy from probabilities.
      bce = target * math_ops.log(output + epsilon())
      bce += (1 - target) * math_ops.log(1 - output + epsilon())
      return -bce
    else:
      # When sigmoid activation function is used for output operation, we
      # use logits from the sigmoid function directly to compute loss in order
      # to prevent collapsing zero when training.
      assert len(output.op.inputs) == 1
      output = output.op.inputs[0]
  return nn.sigmoid_cross_entropy_with_logits(labels=target, logits=output)
The “else” branch is the segment that inspired this blog post. When outputs come from a Sigmoid layer, Keras converts them back to logits by grabbing the input of the Sigmoid node!
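The “prevent collapsing zero” comment in that branch is about numerical stability. Here is a small sketch of why the logit path matters, using a deliberately saturated prediction (values made up):

import tensorflow as tf

y_true = tf.constant([1.0])
logit = tf.constant([-20.0])  # a confidently wrong prediction
prob = tf.sigmoid(logit)      # ~2e-9, deep in the saturated tail

# Probability path (roughly what the clipped formula above does):
# log() saturates once prob falls below epsilon, so the loss is capped.
naive = -tf.math.log(prob + 1e-7)

# Logit path: rearranged algebraically so no log of a tiny number occurs.
stable = tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true, logits=logit)

print(naive.numpy())   # capped near -log(epsilon), about 16.1
print(stable.numpy())  # about 20.0, the true cross-entropy

Computing from clipped probabilities both distorts the loss value and flattens the gradient, which is exactly what the logit path avoids.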
The Keras docs are still a bit misleading on this point. You can confirm the behavior yourself if you are willing to step through it with a debugger.
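Short of a full debugger session, you can also peek at the mechanism the backend relies on. Inside a traced graph a tensor remembers the op that produced it, which is exactly what the output.op.type check looks at; in eager mode the result is an EagerTensor with no producing op, which is why the isinstance guard exists. A sketch, assuming TF 2.x:

import tensorflow as tf

@tf.function
def trace_demo(logits):
  probs = tf.sigmoid(logits)
  # During tracing, probs is a symbolic tensor with a producer op.
  print(probs.op.type)       # 'Sigmoid'
  print(probs.op.inputs[0])  # the logits tensor feeding the Sigmoid node
  return probs

trace_demo(tf.constant([0.3, -1.2]))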
I hope this answer helps.
PS: The results of your computation differ slightly because of NumPy. Even if you declare all your arrays as float32, a step such as the division by ‘m’ or a ‘log’/‘exp’ call can end up being carried out in float64, for example when one operand is a default-dtype array, and from that point on the whole computation is promoted to float64 and no longer matches TensorFlow’s float32 arithmetic bit for bit.
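One way such a promotion sneaks in is an array created without an explicit dtype. A sketch for auditing this (arrays made up): print the dtype after each step, since a single float64 operand promotes everything downstream.

import numpy as np

y = np.array([0., 1., 1.], dtype=np.float32)
p32 = np.array([0.1, 0.8, 0.6], dtype=np.float32)
p64 = np.array([0.1, 0.8, 0.6])  # no dtype given: NumPy defaults to float64

bce32 = -(y * np.log(p32) + (1 - y) * np.log(1 - p32))
bce64 = -(y * np.log(p64) + (1 - y) * np.log(1 - p64))

print(bce32.dtype)  # float32
print(bce64.dtype)  # float64: one float64 operand promoted the expression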