
The library ships with pre-implemented loss functions and also accepts a callback that specifies a custom loss.

The pre-implemented losses are: logistic, square, multinomial.

For these losses, if an attribute .weight is found in the row, it will be used to weight the loss during MLE.
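As a rough illustration of the weighting rule above, the sketch below shows one way a per-row weight could be picked up at compile time and applied to a square loss. The types Obs and WeightedObs and the helpers rowWeight and weightedSquareLoss are hypothetical names made up for this example; they are not part of the library, and the library's own detection logic and loss scaling may differ.

```d
// Hypothetical row types, for illustration only (not library types).
struct Obs
{
    float label;
    float[] features;
}

struct WeightedObs
{
    float label;
    float[] features;
    float weight; // per-row weight applied to the loss
}

// Returns the row's .weight attribute if the type has one, 1 otherwise.
// Assumption: the check is a compile-time member lookup.
float rowWeight(T)(ref T row)
{
    static if (__traits(hasMember, T, "weight"))
        return row.weight;
    else
        return 1.0f;
}

// Weighted square loss on a single row: w * (pred - label)^2.
float weightedSquareLoss(T)(float pred, ref T row)
{
    immutable diff = pred - row.label;
    return rowWeight(row) * diff * diff;
}

unittest
{
    auto plain = Obs(1.0f, [0.5f]);
    auto heavy = WeightedObs(1.0f, [0.5f], 3.0f);
    // Same prediction and label: the weighted row contributes 3x the loss.
    assert(weightedSquareLoss(2.0f, heavy) == 3.0f * weightedSquareLoss(2.0f, plain));
}
```

The block has no main function; it can be exercised with dmd -main -unittest -run on the file.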

To specify a custom loss function, implement a gradient callback of the form S delegate(R net_out, ref T ex, ref V[] grad). The callback is expected to populate grad with the gradient of the loss on datapoint ex with respect to the output of the net, net_out.

S is void or numeric (float, double, int...). If numeric, the callback is expected to return the loss value on training sample ex for monitoring purposes.

R is float[] or NeuralNet. If float[], the net is expected to have a single leaf and the callback receives the predictions of that leaf after forward-prop. If NeuralNet, the callback receives a reference to the net after forward-prop. This is useful when the loss function depends on the values of multiple layers.

T is the templated row type. The row needs at least one attribute whose name starts with feature so that forward-prop can be performed.

V is float or SparseF, so that grad is a float[] or a SparseF[]. If float, the backpropagation will be run densely. If SparseF, the last layer will be sparsely backpropagated, which is more efficient when the gradient is sparse and the output dimension is large.
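To make the callback shape concrete, here is a minimal sketch of the simplest combination described above: a callback that returns void (so no loss value is monitored), receives the leaf predictions as a float[], and fills a dense float[] gradient. The row type Obs, the alias GradCallback and the factory makeSquareGrad are hypothetical names used only for this example, and the loss shown is a plain square loss chosen for illustration.

```d
// Hypothetical row type, for illustration only (not a library type).
struct Obs
{
    float label;
    float[] features; // attribute name starts with "feature"
}

// Callback shape with a void return, float[] predictions and a dense gradient.
alias GradCallback = void delegate(float[] net_out, ref Obs ex, ref float[] grad);

// Builds a callback for the loss 0.5 * (pred - label)^2.
GradCallback makeSquareGrad()
{
    return delegate void(float[] net_out, ref Obs ex, ref float[] grad)
    {
        // Derivative of 0.5 * (pred - label)^2 with respect to pred.
        grad[0] = net_out[0] - ex.label;
    };
}

unittest
{
    auto cb = makeSquareGrad();
    auto ex = Obs(1.0f, [0.2f, 0.4f]);
    auto grad = new float[1];
    cb([3.0f], ex, grad); // prediction 3, label 1
    assert(grad[0] == 2.0f);
}
```

Returning a numeric type instead of void would only add a return statement with the loss value; the way the gradient is written into grad stays the same.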





// median (L1) loss: minimize absolute differences
import std.math : fabs;

auto loss_grad = delegate float(float[] nn_out, ref Obs o, ref float[] grads)
{
    auto pred = nn_out[0]; // prediction of the net after forward-prop
    if(pred > o.label) // gradient of |pred - label| with respect to pred
        grads[0] = 1.0f;
    else
        grads[0] = -1.0f;

    return fabs(pred - o.label); // return the loss value so it is monitored during training
};
net.learn(data, loss_grad, ...);
