Dauphin, Y. et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. In Proc. Advances in Neural Information Processing Systems 27 2933–2941 (2014).
Choromanska, A., Henaff, M., Mathieu, M., Arous, G. B. & LeCun, Y. The loss surface of multilayer networks.
K. He, X. Zhang, S. Ren, J. Sun, Delving deep into rectifiers: surpassing human-level performance on ImageNet classification, 2015, arXiv:1502.01852.
Simply stacking more layers to increase network depth makes test accuracy worse (the degradation problem): R. K. Srivastava, K. Greff, and J. Schmidhuber. Highway networks. arXiv:1505.00387, 2015; Deep Residual Learning for Image Recognition, arXiv:1512.03385v1.