https://en.wikipedia.org/wiki/Duality_(optimization) Duality (optimization)
https://arxiv.org/pdf/1311.6091.pdf A Primal-Dual Method for Training Recurrent Neural Networks Constrained by the Echo-State Property
https://arxiv.org/abs/1708.00523 Gradient Descent using Duality Structures