Implementation of different stochastic optimizers.
The default strategy for parallelizing over cores is Hogwild!, a lock-free scheme in which workers update the shared parameters without synchronization, so race conditions will occur. As a result, training a network is non-deterministic as soon as more than one core is involved. Hogwild! works as long as the data access pattern is sparse enough, that is, as long as concurrent updates rarely touch the same parameters. If the model has only a few, densely updated parameters and too many cores are used, the optimization can fail.
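
To make the sparsity requirement concrete, here is a minimal sketch of a Hogwild!-style update loop in plain Python. It is not this library's implementation or API: the names hogwild_sgd, make_sparse_problem and mean_squared_error, the least-squares objective, and all default values are assumptions chosen for illustration. Note also that CPython threads do not run Python bytecode truly in parallel, so the sketch shows the unsynchronized access pattern rather than any real speed-up.

```python
import random
import threading


def make_sparse_problem(n_samples=200, n_features=50, nnz=3, seed=0):
    """Build a toy least-squares problem where each row touches only `nnz` features."""
    rng = random.Random(seed)
    w_true = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
    rows = []
    for _ in range(n_samples):
        cols = rng.sample(range(n_features), nnz)
        vals = [rng.uniform(-1.0, 1.0) for _ in cols]
        target = sum(v * w_true[c] for c, v in zip(cols, vals))
        rows.append((cols, vals, target))
    return rows


def mean_squared_error(w, rows):
    """Average squared prediction error of weights `w` over all rows."""
    total = 0.0
    for cols, vals, target in rows:
        pred = sum(v * w[c] for c, v in zip(cols, vals))
        total += (pred - target) ** 2
    return total / len(rows)


def hogwild_sgd(rows, n_features, n_workers=4, steps_per_worker=2000, lr=0.1, seed=1):
    """Run SGD from several threads on one shared weight list, without any lock."""
    w = [0.0] * n_features  # shared by all workers, deliberately unprotected

    def worker(worker_id):
        rng = random.Random(seed + worker_id)
        for _ in range(steps_per_worker):
            cols, vals, target = rng.choice(rows)
            err = sum(v * w[c] for c, v in zip(cols, vals)) - target
            for c, v in zip(cols, vals):
                # Unsynchronized read-modify-write: another worker may change
                # w[c] between the read and the write, and that update is lost.
                w[c] -= lr * err * v

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w
```

Calling hogwild_sgd(make_sparse_problem(), 50) and comparing mean_squared_error before and after training should show the error dropping, although the exact weights may differ from run to run because the thread interleaving is not controlled. Each sample here touches 3 of 50 parameters, so two workers rarely write to the same entry. Shrinking n_features toward nnz makes every update dense, which is the regime in which lost updates become frequent.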