Ing. Miroslav Čepek, Ph.D.

Combining Local and Global Weather Data to Improve Forecast Accuracy for Agriculture

Authors

Koutenský, F.; Pihrt, J.; Čepek, M.; Rybář, V.; Šimánek, P.; Kepka, M.; Jedlička, K.; Charvát, K.

Year

2024

Published

Communication Papers of the 19th Conference on Computer Science and Intelligence Systems. Institute of Electrical and Electronics Engineers Inc., 2024. p. 77-82. Annals of Computer Science and Intelligence Systems. vol. 41. ISSN 2300-5963. ISBN 978-83-973291-0-2.

Type

Proceedings paper

DOI

10.15439/2024F5990

Departments

Department of Applied Mathematics

Annotation

Accurate local weather forecasting is vital for farmers to optimize crop yields and manage resources effectively, but existing forecasts often lack the precision required locally. This study explores the potential of combining data from local weather stations with global forecasts and reanalysis data to improve the accuracy of local weather predictions. We propose integrating the HadISD data set, which contains data from 27 stations in the Czech Republic, with the Global Forecast System predictions and ERA5-Land reanalysis data. Our goal is to improve 24-hour weather forecasts using Multilayer Perceptrons, CatBoost, and Long Short-Term Memory neural networks. The findings demonstrate that combining local weather station data with global forecasts and incorporating ERA5-Land reanalysis data improves the accuracy of weather predictions in specific locations. Notably, using deep learning to estimate ERA5-Land data and including this estimation in the final model reduced the forecasting error by 59%. This advancement holds promise in optimizing agricultural practices and mitigating weather-related risks in the region.

Machine Learning Based Tool for Automated Sperm Cell Tracking and Sperm Bundle Detection

Authors

Hořenín, J.; Magdanz, V.; Khalil, I.S.M.; Klingner, A.; Kovalenko, A.; Čepek, M.

Year

2024

Published

Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. Cham: Springer, 2024. p. 19-32. Lecture Notes in Computer Science. vol. 14950. ISSN 2945-9133. ISBN 978-3-031-70380-5.

Type

Proceedings paper

DOI

10.1007/978-3-031-70381-2_2

Departments

Department of Applied Mathematics

Annotation

This study introduces a novel machine learning-based methodology for automated detection and tracking of sperm cells within microscopic video recordings, aiming to elucidate the dynamics and motion patterns of individual sperm cells as well as sperm cell bundles. First, the method identifies sperm cells across successive frames within a video sequence, facilitating the reconstruction of each cell's trajectory over time. Subsequently, we introduce a classification algorithm that distinguishes between solitary sperm cells, clusters of adjacent cells, and cohesive sperm cell bundles, addressing a gap in existing methodologies. Finally, we employ three conventional metrics for velocity assessment, Straight Line Velocity (VSL), Average Path Velocity (VAP), and Curvilinear Velocity (VCL), to quantify the movement speed of both individual sperm cells and bundles. The approach represents a significant advancement in the automated analysis of sperm motility and aggregation phenomena, providing a robust tool for researchers to study sperm behavior with enhanced accuracy and efficiency. The integration of machine learning techniques in sperm cell detection and tracking offers promising insights into reproductive biology and fertility studies.

Overcoming Long Inference Time of Nearest Neighbors Analysis in Regression and Uncertainty Prediction

Authors

Koutenský, F.; Šimánek, P.; Čepek, M.; Kovalenko, A.

Year

2024

Published

SN Computer Science. 2024, 5(5), ISSN 2662-995X.

Type

Article

DOI

10.1007/s42979-024-02670-2

Departments

Department of Applied Mathematics

Annotation

The intuitive approach of comparing like with like forms the basis of so-called nearest neighbor analysis, which is central to many machine learning algorithms. Nearest neighbor analysis is easy to interpret, analyze, and reason about. It is widely used in advanced techniques such as uncertainty estimation in regression models, as well as in the renowned k-nearest neighbor-based algorithms. Nevertheless, its high inference time complexity, which depends on dataset size even in its faster approximate version, restricts its applications and can considerably inflate the application cost. In this paper, we address the problem of high inference time complexity. By using gradient-boosted regression trees as a predictor of the labels obtained from nearest neighbor analysis, we demonstrate an increase in inference speed of several orders of magnitude. We validate the effectiveness of our approach on a real-world European Car Pricing Dataset with approximately rows for both residual cost and price uncertainty prediction. Moreover, we assess our method’s performance on the most commonly used tabular benchmark datasets to demonstrate its scalability. The code is available in a GitHub repository: https://github.com/koutefra/uncertainty_experiments.

Adapting the Size of Artificial Neural Networks Using Dynamic Auto-Sizing

Authors

Cahlík, V.; Kordík, P.; Čepek, M.

Year

2022

Published

IEEE 17th International Conference on Computer Science and Information Technologies. Dortmund: IEEE, 2022. p. 592-596. ISBN 979-8-3503-3431-9.

Type

Proceedings paper

DOI

10.1109/CSIT56902.2022.10000471

Departments

Department of Applied Mathematics

Annotation

We introduce dynamic auto-sizing, a novel approach to training artificial neural networks which allows the models to automatically adapt their size to the problem domain. The size of the models can be further controlled during the learning process by modifying the applied strength of regularization. The ability of dynamic auto-sizing models to expand or shrink their hidden layers is achieved by periodically growing and pruning entire units such as neurons or filters. For this purpose, we introduce weighted L1 regularization, a novel regularization method for inducing structured sparsity. Besides analyzing the behavior of dynamic auto-sizing, we evaluate predictive performance of models trained using the method and show that such models can provide a predictive advantage over traditional approaches.

Meta-learning approach to neural network optimization

Authors

Kordík, P.; Koutník, J.; Drchal, J.; Kovářík, O.; Čepek, M.; Šnorek, M.

Year

2010

Published

Neural Networks. 2010, 23(4), 568-582. ISSN 0893-6080.

Type

Article

DOI

10.1016/j.neunet.2010.02.003

Departments

Department of Theoretical Computer Science

Annotation

Optimizing neural network topology, weights and neuron transfer functions for a given data set and problem is not an easy task. In this article, we focus primarily on building an optimal feed-forward neural network classifier for i.i.d. data sets. We apply meta-learning principles to the optimization of neural network structure and function. We show that diversity promotion, ensembling, self-organization and induction are beneficial for the problem. We combine several different neuron types trained by various optimization algorithms to build a supervised feed-forward neural network called Group of Adaptive Models Evolution (GAME). The approach was tested on a large number of benchmark data sets. The experiments show that combining different optimization algorithms in the network is the best choice when performance is averaged over several real-world problems.

The Effect of Modelling Method to the Inductive Preprocessing Algorithm

Authors

Čepek, M.; Kordík, P.; Šnorek, M.

Year

2010

Published

Proceedings of 3rd International Conference on Inductive Modelling 2010. Kiev: Ukr. INTEI, 2010. p. 131-138.

Type

Proceedings paper

Departments

Department of Theoretical Computer Science

Annotation

Data preprocessing is a very important part of the knowledge discovery process. Data mining systems contain tens of preprocessing methods (for example, methods for missing data imputation, data reduction, discretization, and data enrichment), and it is usually not clear which methods to use. Selecting preprocessing methods appropriate for a particular dataset requires considerable experience and a lot of experimentation. In this paper, we test the influence of the modelling method, which is the cornerstone of the Inductive Preprocessing Algorithm. The modelling method is used to evaluate the evolved sequence of preprocessing methods. We compare four modelling methods with respect to the final achieved accuracy. The tested modelling methods are Polynomial model, Decision Tree, SVM and Logistic Function Classifier. To test our automatic preprocessing, we utilize several real-world datasets available from the UCI Machine Learning Repository.

Testing of Inductive Preprocessing Algorithm

Authors

Čepek, M.; Kordík, P.; Šnorek, M.

Year

2009

Published

Proceedings of the 3rd International Workshop on Inductive Modelling 2009. Kiev: Ukr. INTEI, 2009. p. 13-18.

Type

Proceedings paper

Departments

Department of Theoretical Computer Science

Annotation

Data preprocessing is a very important part of the knowledge discovery process. Data mining systems contain tens of preprocessing methods (for example, methods for missing data imputation, data reduction, discretization, and data enrichment), and it is usually not clear which methods to use. Selecting preprocessing methods appropriate for a particular dataset requires considerable experience and a lot of experimentation. In this paper, we test our extension of the inductive approach to data preprocessing. We developed an inductive preprocessing method that uses a genetic algorithm to compose, from scratch, a sequence of preprocessing methods that fits the data and allows a successful model to be created. To test our automatic preprocessing, we utilize several real-world datasets available from the UCI Machine Learning Repository.
