Abstract: Improving the generalization performance of deep neural networks (DNNs) trained by minibatch stochastic gradient descent (SGD) has raised lots of concerns from deep learning practitioners.
Some results have been hidden because they may be inaccessible to you
Show inaccessible results