Ing. Miroslav Čepek, Ph.D.

Publications

Overcoming Long Inference Time of Nearest Neighbors Analysis in Regression and Uncertainty Prediction

Year
2024
Published
SN Computer Science. 2024, 5(5), ISSN 2662-995X.
Type
Article
Annotation
The intuitive approach of comparing like with like forms the basis of so-called nearest neighbor analysis, which is central to many machine learning algorithms. Nearest neighbor analysis is easy to interpret, analyze, and reason about. It is widely used in advanced techniques such as uncertainty estimation in regression models, as well as in the renowned k-nearest neighbor-based algorithms. Nevertheless, its high inference time complexity, which depends on dataset size even in its faster approximate version, restricts its applications and can considerably inflate application cost. In this paper, we address the problem of high inference time complexity. By using gradient-boosted regression trees as a predictor of the labels obtained from nearest neighbor analysis, we demonstrate an increase in inference speed of several orders of magnitude. We validate the effectiveness of our approach on a real-world European Car Pricing Dataset with approximately rows for both residual cost and price uncertainty prediction. Moreover, we assess our method’s performance on the most commonly used tabular benchmark datasets to demonstrate its scalability. The code is available in the GitHub repository: https://github.com/koutefra/uncertainty_experiments.

Adapting the Size of Artificial Neural Networks Using Dynamic Auto-Sizing

Authors
Cahlík, V.; Kordík, P.; Čepek, M.
Year
2022
Published
IEEE 17th International Conference on Computer Science and Information Technologies. Dortmund: IEEE, 2022. p. 592-596. ISBN 979-8-3503-3431-9.
Type
Proceedings paper
Annotation
We introduce dynamic auto-sizing, a novel approach to training artificial neural networks that allows the models to automatically adapt their size to the problem domain. The size of the models can be further controlled during the learning process by modifying the applied strength of regularization. The ability of dynamic auto-sizing models to expand or shrink their hidden layers is achieved by periodically growing and pruning entire units such as neurons or filters. For this purpose, we introduce weighted L1 regularization, a novel regularization method for inducing structured sparsity. Besides analyzing the behavior of dynamic auto-sizing, we evaluate the predictive performance of models trained using the method and show that such models can provide a predictive advantage over traditional approaches.

Meta-learning approach to neural network optimization

Authors
Kordík, P.; Koutník, J.; Drchal, J.; Kovářík, O.; Čepek, M.; Šnorek, M.
Year
2010
Published
Neural Networks. 2010, 23(4), 568-582. ISSN 0893-6080.
Type
Article
Annotation
Optimization of neural network topology, weights and neuron transfer functions for a given data set and problem is not an easy task. In this article, we focus primarily on building an optimal feed-forward neural network classifier for i.i.d. data sets. We apply meta-learning principles to the optimization of neural network structure and function. We show that diversity promotion, ensembling, self-organization and induction are beneficial for the problem. We combine several different neuron types trained by various optimization algorithms to build a supervised feed-forward neural network called Group of Adaptive Models Evolution (GAME). The approach was tested on a large number of benchmark data sets. The experiments show that the combination of different optimization algorithms in the network is the best choice when the performance is averaged over several real-world problems.

The Effect of Modelling Method to the Inductive Preprocessing Algorithm

Authors
Čepek, M.; Kordík, P.; Šnorek, M.
Year
2010
Published
Proceedings of 3rd International Conference on Inductive Modelling 2010. Kiev: Ukr. INTEI, 2010. pp. 131-138.
Type
Proceedings paper
Annotation
Data preprocessing is a very important part of the knowledge discovery process. Data mining systems contain tens of preprocessing methods (for example, methods for missing data imputation, data reduction, discretization, data enrichment, etc.), and usually it is not clear which methods to use. Selecting the preprocessing methods appropriate for a particular dataset requires considerable experience and a lot of experimentation. In this paper, we test the influence of the modelling method, which is the cornerstone of the Inductive Preprocessing Algorithm. The modelling method is used to evaluate the evolved sequence of preprocessing methods. We compare four modelling methods with respect to the final achieved accuracy. The tested modelling methods are Polynomial model, Decision Tree, SVM and Logistic Function Classifier. To test our automatic preprocessing, we utilize several real-world datasets available from the UCI Machine Learning Repository.

Testing of Inductive Preprocessing Algorithm

Authors
Čepek, M.; Kordík, P.; Šnorek, M.
Year
2009
Published
Proceedings of the 3rd International Workshop on Inductive Modelling 2009. Kiev: Ukr. INTEI, 2009. pp. 13-18.
Type
Proceedings paper
Annotation
Data preprocessing is a very important part of the knowledge discovery process. Data mining systems contain tens of preprocessing methods (for example, methods for missing data imputation, data reduction, discretization, data enrichment, etc.), and usually it is not clear which methods to use. Selecting the preprocessing methods appropriate for a particular dataset requires considerable experience and a lot of experimentation. In this paper, we test our extension of the inductive approach to data preprocessing. We developed an inductive preprocessing method that uses a genetic algorithm to compose, from scratch, a sequence of preprocessing methods that fits the data and allows a successful model to be created. To test our automatic preprocessing, we utilize several real-world datasets available from the UCI Machine Learning Repository.