Adds L1 regularization to a Linear layer.
The penalty is applied through a proximal operator (soft-thresholding of the weights) during SGD.
// basic L1 regularization: loss = 0.03 * | W |
auto l1 = Linear(5).prior(L1Prior(0.03));

// same, but centered around a non-zero matrix: loss = 0.03 * | W - W_p |
auto l2 = Linear(5).prior(L1Prior(0.03, W_p));
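To make the proximal step concrete, here is a minimal standalone sketch in plain Python. It is not the library's implementation: the function names `soft_threshold` and `proximal_sgd_step` are hypothetical, the weight matrix is flattened to a list, and scaling the threshold by the learning rate is the usual proximal-gradient convention, assumed here rather than taken from the source.

```python
def soft_threshold(w, threshold, center=0.0):
    """Proximal operator of threshold * |w - center| for a single weight.

    Moves w towards center by at most `threshold`, and snaps it exactly
    onto center when it is already closer than that.
    """
    d = w - center
    if d > threshold:
        return w - threshold
    if d < -threshold:
        return w + threshold
    return center


def proximal_sgd_step(weights, grads, lr, lam, centers=None):
    """One gradient step on the data loss, followed by the L1 proximal step.

    weights, grads: flat lists of floats (hypothetical layout).
    lam: regularization strength, e.g. the 0.03 passed to the prior.
    centers: optional per-weight centers, playing the role of W_p.
    """
    if centers is None:
        centers = [0.0] * len(weights)
    updated = []
    for w, g, c in zip(weights, grads, centers):
        w = w - lr * g  # plain SGD step on the unregularized loss
        # Assumption: the threshold is lr * lam, as in proximal gradient descent.
        updated.append(soft_threshold(w, lr * lam, c))
    return updated
```

Under these assumptions, weights that end up within `lr * lam` of their center are set exactly to the center, which is what produces sparsity in `W` (or in `W - W_p` for the centered variant).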