Antonio Orvieto

I am a principal investigator (PI) at the ELLIS Institute Tübingen and an independent group leader at the MPI for Intelligent Systems. I am also faculty for the CLS, ELLIS, and IMPRS-IS PhD programs.

My goal is to improve the efficiency and accessibility of deep learning technologies in science and engineering by pioneering new architectures and training techniques grounded in theoretical knowledge. My work encompasses two main areas: understanding the intricacies of large-scale optimization dynamics, and designing innovative architectures and powerful optimizers capable of handling complex data. Central to my studies is exploring new techniques for decoding patterns in complex sequential data, with implications spanning biology, neuroscience, natural language processing, and music generation.

I recently received my PhD from the Data Analytics Lab at ETH Zürich under the supervision of Prof. Dr. Thomas Hofmann and Dr. Aurelien Lucchi. Prior to this, I obtained my master's degree in Robotics, Systems and Control from ETH. During my PhD, I interned at DeepMind London, Meta (FAIR) Seattle, MILA, and Inria Paris. During my master's and PhD, I was involved in several computational systems biology projects at ETH, such as SignalX. I also regularly help with rare-disease research through bioinformatic analysis of genome sequence data from EEC syndrome patients.

For more details, you can check my curriculum vitae.

In my free time, I travel and read philosophy books. My favorite authors are Kierkegaard, Lévi-Strauss, Meister Eckhart and Nietzsche. I also play a few instruments: I studied cello in Venice/Klagenfurt for more than 10 years, and right now I am learning the oboe. I occasionally bring out my transverse flute and acoustic bass.

▸ Recent Talks (on SSMs!)

April 2024: The mathematics of data streams workshop, US
April 2024: University of Michigan, US
March 2024: AWS AI Fundamental Research Reading Group, US
March 2024: AstraZeneca Centre for AI, UK
Jan 2024: EPFL, CH
Dec 2023: Oxford Department of Statistics, UK
Nov 2023: Google DeepMind, UK
Nov 2023: Inria Paris, FR
Nov 2023: Tübingen AI Center, DE

▸ Preprints

On the low-shot transferability of [V]-Mamba, 2024
D. Misra, J. Gala, A. Orvieto

Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning, 2024.
A. Meterez, L. Noci, T. Hofmann, A. Orvieto

Theoretical Foundations of Deep Selective State-Space Models, 2024.
N. Muca Cirone, A. Orvieto, B. Walker, C. Salvi, T. Lyons

Recurrent Distance-Encoding Neural Networks for Graph Representation Learning, 2023.
Y. Ding, A. Orvieto, B. He, T. Hofmann

On the Universality of Linear Recurrences Followed by Nonlinear Projections, 2023.
A. Orvieto, S. De, C. Gulcehre, R. Pascanu, S. L. Smith

An Accelerated Lyapunov Function for Polyak’s Heavy-Ball on Convex Quadratics, 2023
A. Orvieto

▸ Publications

SDEs for Minimax Optimization, AISTATS 2024
E. Monzio Compagnoni, A. Orvieto, H. Kersting, F. Proske, A. Lucchi

Resurrecting Recurrent Neural Networks for Long Sequences, ICML 2023 (Oral)
A. Orvieto, S. L. Smith, A. Gu, A. Fernando, C. Gulcehre, R. Pascanu, S. De

An SDE for Modeling SAM: Theory and Insights, ICML 2023
E. Monzio Compagnoni, L. Biggio, A. Orvieto, H. Kersting, F. N. Proske, A. Lucchi

Mean First Exit Times of Ornstein-Uhlenbeck Processes in High Dimensions,
Journal of Physics A: Mathematical and Theoretical, 2023
H. Kersting, A. Orvieto, F. Proske, A. Lucchi

Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning, CVPR, 2023
S. Kim, L. Noci, A. Orvieto, T. Hofmann

Explicit Regularization in Overparametrized Models via Noise Injection, AISTATS, 2023
A. Orvieto*, A. Raj*, H. Kersting*, F. Bach

On the Effectiveness of Randomized Signatures as Reservoir for Learning Rough Dynamics, IJCNN, 2023.
E. Monzio Compagnoni, A. Scampicchio, L. Biggio, A. Orvieto, T. Hofmann, J. Teichmann

On the Theoretical Properties of Noise Correlation in SGD, NeurIPS, 2022
H. Kersting, A. Orvieto, F. Bach, F. Proske, A. Lucchi

Signal Propagation in Transformers: Theoretical Perspectives
and the Role of Rank Collapse, NeurIPS, 2022
L. Noci*, S. Anagnostidis*, L. Biggio*, A. Orvieto*, S. Pal Singh*, A. Lucchi

Dynamics of SGD with Stochastic Polyak Stepsizes:
Truly Adaptive Variants and Convergence to Exact Solution, NeurIPS, 2022.
A. Orvieto, S. Lacoste-Julien, N. Loizou

Analysis of Pharmacological Modulation of Senescence in Human Epithelial Stem Cells
Journal of Cellular and Molecular Medicine, 2022.
V. Barbaro, A. Orvieto, et al.

Anticorrelated Noise Injection for Improved Generalization, ICML, 2022.
A. Orvieto*, H. Kersting*, F. Proske, F. Bach, A. Lucchi

Faster Single-loop Algorithms for Minimax Opt. without Strong Concavity, AISTATS, 2022.
J. Yang, A. Orvieto, A. Lucchi, N. He

Vanishing Curvature in Randomly Initialized Deep ReLU Networks, AISTATS, 2022.
A. Orvieto*, J. Kohler*, D. Pavllo, T. Hofmann, A. Lucchi

On the Second-order Convergence of Random Search Methods, NeurIPS, 2021.
A. Lucchi*, A. Orvieto*, A. Solomou*

Rethinking the Variational Interpretation of Nesterov’s Method, NeurIPS, 2021.
P. Zhang*, A. Orvieto*, H. Daneshmand

Learning explanations that are hard-to-vary, ICLR, 2021.
G. Parascandolo*, A. Neitz*, A. Orvieto, L. Gresele, B. Schölkopf

Revisiting the Role of Symplectic Numerical Integration on Acceleration and Stability in Convex Optimization, AISTATS, 2021.
P. Zhang, A. Orvieto, H. Daneshmand, R. Smith, T. Hofmann

Momentum Improves Optimization on Riemannian Manifolds, AISTATS, 2021.
F. Alimisis, A. Orvieto, G. Becigneul, A. Lucchi

An Accelerated DFO Algorithm for Finite-sum Convex Functions, ICML, 2020.
C. Yuwen, A. Orvieto, A. Lucchi

Continuous-time Acceleration in Riemannian Optimization, AISTATS, 2020.
F. Alimisis, A. Orvieto, G. Becigneul, A. Lucchi

Shadowing Properties of Optimization Algorithms, NeurIPS, 2019.
A. Orvieto, A. Lucchi

Continuous-time Models for Stochastic Optimization Algorithms, NeurIPS, 2019.
A. Orvieto, A. Lucchi

The Role of Memory in Stochastic Optimization, UAI, 2019.
A. Orvieto, J. Kohler, A. Lucchi

▸ Workshops

Escaping Random Teacher Initialization Enhances Signal Propagation and Representations, NeurIPS 3ML Workshop, 2023.
F. Sarnthein, S. Pal Singh, A. Orvieto, T. Hofmann

On the Advantage of Lion Compared to signSGD with Momentum, ICML High-dimensional Learning Dynamics Workshop, 2023.
A. Noiato, A. Orvieto

A New Adaptive Method for Minimizing Non-negative Losses, ICML High-dimensional Learning Dynamics Workshop, 2023.
A. Orvieto, L. Xiao

Batch-size Selection by Stochastic Optimal Control, NeurIPS HITY Workshop, 2022.
J. Zhao, A. Lucchi, F. N. Proske, A. Orvieto, H. Kersting

Achieving a Better Stability-Plasticity Tradeoff in Continual Learning, NeurIPS MetaLearn Workshop, 2022.
S. Kim, L. Noci, A. Orvieto, T. Hofmann

Should you follow the gradient flow? ICML Continuous-time Perspectives workshop, 2022.
X. Li, A. Orvieto

Enhancing Unit-Tests for Invariance Discovery, ICML Spurious Correlations workshop, 2022.
P. De Bartolomeis, A. Orvieto, G. Parascandolo

Empirics on the expressiveness of Randomized Signature, NeurIPS DLDE workshop, 2021.
E. Monzio Compagnoni, L. Biggio, A. Orvieto

Two-Level K-FAC Preconditioning for Deep Learning, NeurIPS OPT workshop, 2020.
N. Tselepidis, J. Kohler, A. Orvieto

▸ Patents

Setting Method For Threaded Connection by means of Impact Wrench,
Inventors: M. Alberding, D. Bralla, A. Orvieto
Current Assignee: Hilti AG
European Patent Office, 2019, Publication number: 3501740

“Verum, sine mendacio certum et verissimum, quod est inferius, est sicut quod est superius, et quod est superius, est sicut quod est inferius: ad perpetranda miracula rei unius.”

Hermes Trismegistus, Emerald Table

“He studied the leaves of the tiny plant; how daintily, with what strange intelligence they were arranged around the stem. Virgil’s verses were beautiful, and he loved them; still, there was more than one verse in Virgil that was not half as clear and intelligent, beautiful and meaningful as the spiraled order of those tiny leaves climbing the stem. What pleasure, what ecstasy, what a delightful, noble, meaningful task it would be for a man to be able to create just one such flower! But no man was able to do that—no hero, no emperor, no pope or saint!”

Narcissus and Goldmund, Hermann Hesse

“While the heart beats, bruise it–it is your only opportunity; while the eye can still turn towards you with moist, timid entreaty, freeze it with an icy unanswering gaze; while the ear, that delicate messenger to the inmost sanctuary of the soul, can still take in the tones of kindness, put it off with hard civility, or sneering compliment, or envious affectation of indifference; while the creative brain can still throb with the sense of injustice, with the yearning for brotherly recognition–make haste–oppress it with your ill-considered judgements, your trivial comparisons, your careless misrepresentations. The heart will by and by be still–‘ubi saeva indignatio ulterius cor lacerare nequit’; the eye will cease to entreat; the ear will be deaf; the brain will have ceased from all wants as well as from all work. Then your charitable speeches may find vent; then you may remember and pity the toil and the struggle and the failure; then you may give due honour to the work achieved; then you may find extenuation for errors, and may consent to bury them.”

George Eliot, The Lifted Veil

“And so proceed, and do not be afraid, without considering whether this is right, lest you take false steps. For if a painter, having to give the first stroke of his pen, were to consider all the others, he would conclude nothing. If someone were to go to a city and consider how to take the first step, he would conclude nothing. Therefore man must follow the first inspiration and go forward; then one goes where one must, and that is all right.”

Meister Eckhart, Gott hat die Armen (sorry for the bad translation)