import std.math : fabs;

// median (L1) loss: minimize absolute differences
auto loss_grad = delegate float(float[] nn_out, ref Obs o, ref float[] grads)
{
    auto pred = nn_out[0]; // this is the prediction of the net
                           // after forward-prop

    // gradient of |pred - label| with respect to pred
    if(pred > o.label)
        grads[0] = 1.0f;
    else
        grads[0] = -1.0f;

    return fabs(pred - o.label); // return loss value so it's monitored
                                 // during training
};

net.learn(data, loss_grad, ...);
The library ships with a set of pre-implemented loss functions, as well as a callback-based way to specify a custom loss.
The pre-implemented losses are: logistic, square and multinomial.
For these losses, if a .weight attribute is found in the row, it will be used to weight the loss during MLE.
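As an illustration, a training row for these built-in losses can be a plain struct. The sketch below follows the conventions described in this section (a label, an optional .weight, and an input attribute whose name starts with feature); the struct name and exact field types are illustrative, and the commented learn call selecting a loss by name is an assumption about the API.

import vectorflow;

// illustrative training row for the pre-implemented losses
struct Obs
{
    float label;        // target consumed by the built-in losses
    float weight;       // optional: weights this row's loss during MLE
    SparseF[] features; // input read during forward-prop
}

// hypothetical call selecting a built-in loss by name:
// net.learn(data, "logistic", ...);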
To specify a custom loss function, one has to implement a gradient callback of the form S delegate(R net_out, ref T ex, ref V[] grad), which is expected to populate grad with the gradient of the loss on datapoint ex with respect to the output of the net, net_out.
S is void or numeric (float, double, int...). If numeric, the callback is expected to return the loss value on training sample ex for monitoring purposes.
R is float[] or NeuralNet. If float[], the net is expected to have a single leaf and the callback receives the predictions of that leaf after forward-prop. If NeuralNet, the callback receives a reference to the net after forward-prop, which is useful when the loss function depends on the values of multiple layers.
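Here is a sketch of the NeuralNet flavor: a squared loss that reads the prediction from the net object itself. The way the leaf output is accessed below (net.leaves[0].out) is an assumption about the NeuralNet API, and Obs is the same illustrative row type as above.

// sketch: custom loss receiving the whole net after forward-prop.
// net.leaves[0].out is an assumed accessor for the output of the
// first (here, only) leaf; adapt it to the actual NeuralNet API.
auto net_loss_grad = delegate float(NeuralNet net, ref Obs o, ref float[] grads)
{
    auto pred = net.leaves[0].out[0]; // assumed accessor
    auto err = pred - o.label;
    grads[0] = 2.0f * err;            // gradient of (pred - label)^2
    return err * err;                 // loss value, for monitoring
};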
T is the templatized row type. At a minimum, this row needs an attribute whose name starts with feature so that forward-prop can be performed.
V is float[] or SparseF[]. If float[], the backpropagation will be run densely. If SparseF[], the last layer will be backpropagated sparsely, which is more efficient when the gradient is sparse and the output dimension is large.
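For example, when the output layer is wide but each row only supervises a single coordinate, the sparse variant emits a single gradient entry. This is a sketch: the row type is illustrative, and it assumes SparseF pairs an output index with a value.

import vectorflow;

// illustrative row supervising one coordinate of a wide output layer
struct WideObs
{
    uint index;         // which output coordinate this row labels
    float target;       // target value for that coordinate
    SparseF[] features; // input read during forward-prop
}

// sketch: squared loss on a single output coordinate; only one SparseF
// entry is emitted, so the last layer is backpropagated sparsely
auto sparse_loss_grad = delegate float(float[] nn_out, ref WideObs o, ref SparseF[] grads)
{
    auto err = nn_out[o.index] - o.target;
    grads.length = 1;
    grads[0] = SparseF(o.index, 2.0f * err); // assumes SparseF(index, value);
                                             // gradient of (pred - target)^2
    return err * err;                        // loss value, for monitoring
};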