Mgr. Petr Šimánek

Publications

Investigation into the Training Dynamics of Learned Optimizers

Authors
Sobotka, J.; Šimánek, P.; Vašata, D.
Year
2024
Published in
Proceedings of the 16th International Conference on Agents and Artificial Intelligence. Setúbal: Science and Technology Publications, Lda, 2024. p. 135-146. vol. 3. ISSN 2184-433X. ISBN 978-989-758-680-4.
Type
Proceedings paper
Abstract
Optimization is an integral part of modern deep learning. Recently, the concept of learned optimizers has emerged as a way to accelerate this optimization process by replacing traditional, hand-crafted algorithms with meta-learned functions. Despite the initial promising results of these methods, issues with stability and generalization still remain, limiting their practical use. Moreover, their inner workings and behavior under different conditions are not yet fully understood, making it difficult to come up with improvements. For this reason, our work examines their optimization trajectories from the perspective of network architecture symmetries and parameter update distributions. Furthermore, by contrasting the learned optimizers with their manually designed counterparts, we identify several key insights that demonstrate how each approach can benefit from the strengths of the other.

Investigation into Training Dynamics of Learned Optimizers (Student Abstract)

Authors
Sobotka, J.; Šimánek, P.
Year
2024
Published in
Proceedings of the 38th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2024. p. 23657-23658. vol. 38. ISSN 2374-3468. ISBN 978-1-57735-887-9.
Type
Proceedings paper
Abstract
Modern machine learning heavily relies on optimization, and as deep learning models grow more complex and data-hungry, the search for efficient learning becomes crucial. Learned optimizers disrupt traditional handcrafted methods such as SGD and Adam by learning the optimization strategy itself, potentially speeding up training. However, the learned optimizers' dynamics are still not well understood. To remedy this, our work explores their optimization trajectories from the perspective of network architecture symmetries and proposed parameter update distributions.

Overcoming Long Inference Time of Nearest Neighbors Analysis in Regression and Uncertainty Prediction

Year
2024
Published in
SN Computer Science. 2024, 5(5), ISSN 2662-995X.
Type
Article
Abstract
The intuitive approach of comparing like with like forms the basis of the so-called nearest neighbor analysis, which is central to many machine learning algorithms. Nearest neighbor analysis is easy to interpret, analyze, and reason about. It is widely used in advanced techniques such as uncertainty estimation in regression models, as well as the renowned k-nearest neighbor-based algorithms. Nevertheless, its high inference time complexity, which is dataset size dependent even in the case of its faster approximated version, restricts its applications and can considerably inflate the application cost. In this paper, we address the problem of high inference time complexity. By using gradient-boosted regression trees as a predictor of the labels obtained from nearest neighbor analysis, we demonstrate a significant increase in inference speed, improving by several orders of magnitude. We validate the effectiveness of our approach on a real-world European Car Pricing Dataset with approximately rows for both residual cost and price uncertainty prediction. Moreover, we assess our method’s performance on the most commonly used tabular benchmark datasets to demonstrate its scalability. The code is available in a GitHub repository: https://github.com/koutefra/uncertainty_experiments.

Unlocking Nature’s Design through Neural Cellular Automata

Year
2024
Published in
ALIFE 2024: Proceedings of the 2024 Artificial Life Conference. Cambridge: The MIT Press, 2024. p. 122-124.
Type
Proceedings paper
Abstract
This study presents Dynamics Identification via Neural Cellular Automata (DINCA), an enhancement of Neural Cellular Automata (NCA) for modeling reaction-diffusion systems. The main advantage of DINCA is its ability to estimate the parameters of the reaction-diffusion equations that govern the examined system, using minimal data. We demonstrate the method’s application potential by showing its ability to model leopard pattern formation, by learning on only three images, while revealing the governing reaction-diffusion equations. This positions NCA-based methodologies as a viable tool for inferring partial differential equations.

Weather4cast at NeurIPS 2022: Super-Resolution Rain Movie Prediction under Spatio-temporal Shifts

Authors
Gruca, A.; Serva, F.; Lliso, L.; Pihrt, J.; Raevskiy, R.; Šimánek, P.
Year
2023
Published in
Proceedings of the NeurIPS 2022 Competitions Track. Proceedings of Machine Learning Research, 2023. p. 292-312. Proceedings of Machine Learning Research. vol. 220. ISSN 2640-3498.
Type
Proceedings paper
Abstract
Weather4cast again advanced modern algorithms in AI and machine learning through a highly topical interdisciplinary competition challenge: The prediction of hi-res rain radar movies from multi-band satellite sensors, requiring data fusion, multi-channel video frame prediction, and super-resolution. Accurate predictions of rain events are becoming ever more critical, with climate change increasing the frequency of unexpected rainfall. The resulting models will have a particular impact where costly weather radar is not available. We here present highlights and insights emerging from the thirty teams participating from over a dozen countries. To extract relevant patterns, models were challenged by spatio-temporal shifts. Geometric data augmentation and test-time ensemble models with a suitable smoother loss helped this transfer learning. Even though, in ablation, static information like geographical location and elevation was not linked to performance, the general success of models incorporating physics in this competition suggests that approaches combining machine learning with application domain knowledge seem a promising avenue for future research. Weather4cast will continue to explore the powerful benchmark reference data set introduced here, advancing competition tasks to quantitative predictions, and exploring the effects of metric choice on model performance and qualitative prediction properties.

Learning to Optimize with Dynamic Mode Decomposition

Year
2022
Published in
2022 International Joint Conference on Neural Networks (IJCNN). Vienna: IEEE Industrial Electronic Society, 2022. p. 1-8. ISSN 2161-4407. ISBN 978-1-7281-8671-9.
Type
Proceedings paper
Abstract
Designing faster optimization algorithms is of ever-growing interest. In recent years, learning-to-learn methods that learn how to optimize have demonstrated very encouraging results. Current approaches usually do not effectively include the dynamics of the optimization process during training. They either omit it entirely or only implicitly assume the dynamics of an isolated parameter. In this paper, we show how to utilize the dynamic mode decomposition method for extracting informative features about optimization dynamics. By employing those features, we show that our learned optimizer generalizes much better to unseen optimization problems in short. The improved generalization is illustrated on multiple tasks where training the optimizer on one neural network generalizes to different architectures and distinct datasets.

Spatiotemporal Prediction of Vehicle Movement Using Artificial Neural Networks

Year
2022
Published in
Proceedings of 2022 IEEE Intelligent Vehicles Symposium (IV). Piscataway: IEEE, 2022. p. 734-739. ISSN 1931-0587. ISBN 978-1-6654-8821-1.
Type
Proceedings paper
Abstract
Prediction of the movement of all traffic participants is a very important task in autonomous driving. Well-predicted behavior of other cars and actors is crucial for safety. A sequence of artificially rasterized bird’s-eye-view frames is used as input to neural networks, which are trained to predict the future behavior of the participants. The Lyft Motion Prediction for Autonomous Vehicles dataset is explored and adapted for this task. We developed and applied a novel approach where the prediction problem is viewed as a problem of spatiotemporal prediction and we use methods based on convolutional recurrent neural networks.