Bachelor theses
Data cleaning with probabilistic programming
Author
Tomáš Jungman
Year
2022
Type
Bachelor thesis
Supervisor
Mgr. Vojtěch Rybář
Reviewers
Ing. Daniel Vašata, Ph.D.
Department
Summary
This thesis covers research in the field of cleaning and filling in datasets and focuses on a specific approach based on probabilistic programming. The practical part works with PClean, a probabilistic programming language implemented in Julia. The principles on which it operates are explained and the components required to write a PClean program are laid out. Subsequently, PClean is used to write a program for filling in and correcting values in a dataset of car records (price, power, fuel, etc.).
Once this dataset is corrected, regression is used to estimate the price, and the quality of the result is compared with results obtained from uncorrected data whose missing values were filled in either with a standard value for each column or on the basis of expert knowledge. The model trained on data cleaned with PClean does not reach the quality of the model based on expert knowledge. However, PClean does offer a fast way to fill in missing categorical values with a quality exceeding that of the trivial fill-in mechanisms commonly used today.
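The summary does not say which trivial fill-in mechanism served as the baseline. A minimal sketch, assuming it is most-frequent-value (mode) imputation of a categorical column, might look as follows. The function name `fill_with_mode`, the column names, and the car records are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def fill_with_mode(rows, column):
    """Return copies of `rows` with missing (None) values in `column`
    replaced by the most frequent observed value of that column."""
    observed = [row[column] for row in rows if row[column] is not None]
    if not observed:
        # Nothing to learn the mode from; return unchanged copies.
        return [dict(row) for row in rows]
    mode = Counter(observed).most_common(1)[0][0]
    return [
        {**row, column: mode if row[column] is None else row[column]}
        for row in rows
    ]


# Hypothetical car records; None marks a missing value.
cars = [
    {"fuel": "diesel", "power_kw": 81},
    {"fuel": None, "power_kw": 110},
    {"fuel": "petrol", "power_kw": 55},
    {"fuel": "diesel", "power_kw": None},
]

filled = fill_with_mode(cars, "fuel")
```

Such a baseline ignores every other column of the record, which is the kind of limitation a probabilistic model of the whole row is meant to address.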
Flow modelling around airfoil with graph neural networks
Author
David Horský
Year
2022
Type
Bachelor thesis
Supervisor
Mgr. Vojtěch Rybář
Reviewers
Ing. Daniel Vašata, Ph.D.
Department
Summary
In this thesis we reviewed uses of machine learning in computational fluid
dynamics. We then implemented a state-of-the-art graph neural network to
simulate the flow around an airfoil in 2D. We trained the model at lower speeds
and angles of attack and then extrapolated to higher ones. The resulting model
extrapolates with only a small error and remains stable on long
rollouts.
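A rollout here means applying a one-step simulator repeatedly, feeding each prediction back in as the next input, so that errors can accumulate over time. The loop can be sketched as below. The names `rollout` and `toy_step` are hypothetical, and `toy_step` is a simple damped update standing in for a learned model, not the graph neural network used in the thesis.

```python
def rollout(step_fn, initial_state, num_steps):
    """Apply `step_fn` autoregressively and return all visited states,
    including the initial one."""
    states = [initial_state]
    for _ in range(num_steps):
        # The model's own prediction becomes the next input.
        states.append(step_fn(states[-1]))
    return states


def toy_step(state):
    """Hypothetical stand-in for a learned one-step simulator:
    every component of the state is damped by 10 percent."""
    return [0.9 * value for value in state]


trajectory = rollout(toy_step, [1.0, -2.0], num_steps=50)
```

A model is considered stable on long rollouts when the states produced this way stay bounded and physically plausible even after many steps.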
Machine Learning Explainability Methods
Author
Danila Makulov
Year
2024
Type
Bachelor thesis
Supervisor
Mgr. Vojtěch Rybář
Reviewers
Ing. Magda Friedjungová, Ph.D.
Department
Summary
Machine learning is increasingly used in many sensitive applications where it is essential to understand why the models behave as they do. This rapid growth has heightened the demand for Explainable Machine Learning and new explanation methods. These methods, however, are not guaranteed to yield consistent outputs.
This work gives a concise overview of the current state of Explainable Machine Learning and its methods, focusing primarily on local explanation methods (e.g. SHAP and LIME), global plotting methods for tabular data, and methods specific to neural network models. We show examples of inconsistent explanations produced by SHAP and LIME, illustrate and explain how some methods are affected by correlation, and show practical examples of using neural network methods to analyze a model and find its biases. Finally, we give recommendations for dealing with inconsistent outputs, based on the research we conducted and on our own experiments.
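One source of inconsistency in sampling-based local explanations is the random perturbation step itself. The sketch below is a simplified LIME-style surrogate, not the LIME library and not the experiments from the thesis: it fits an unweighted least-squares linear model to random perturbations around a point and returns the two slopes as feature attributions. The function `black_box` and all parameter values are hypothetical.

```python
import random


def black_box(x1, x2):
    """Hypothetical nonlinear model standing in for a trained predictor."""
    return x1 * x2 + x1 ** 2


def local_linear_explanation(point, seed, n_samples=30, scale=1.0):
    """Fit y ~ w1*x1 + w2*x2 + b on Gaussian perturbations around `point`
    and return the slopes (w1, w2) as local feature attributions."""
    rng = random.Random(seed)
    xs = [
        (point[0] + rng.gauss(0, scale), point[1] + rng.gauss(0, scale))
        for _ in range(n_samples)
    ]
    ys = [black_box(a, b) for a, b in xs]

    # Center the data so the intercept drops out of the normal equations.
    m1 = sum(a for a, _ in xs) / n_samples
    m2 = sum(b for _, b in xs) / n_samples
    my = sum(ys) / n_samples
    s11 = sum((a - m1) ** 2 for a, _ in xs)
    s22 = sum((b - m2) ** 2 for _, b in xs)
    s12 = sum((a - m1) * (b - m2) for a, b in xs)
    c1 = sum((a - m1) * (y - my) for (a, _), y in zip(xs, ys))
    c2 = sum((b - m2) * (y - my) for (_, b), y in zip(xs, ys))

    # Solve the 2x2 system in closed form.
    det = s11 * s22 - s12 ** 2
    w1 = (c1 * s22 - c2 * s12) / det
    w2 = (c2 * s11 - c1 * s12) / det
    return w1, w2


# Same model, same point, different random seeds.
explanation_a = local_linear_explanation((1.0, 1.0), seed=0)
explanation_b = local_linear_explanation((1.0, 1.0), seed=1)
```

The two explanations describe the same prediction yet differ numerically. Fixing the seed or increasing `n_samples` reduces the spread in this toy setting.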
Explainability Methods for Vision Transformers
Author
Miroslav Jára
Year
2024
Type
Bachelor thesis
Supervisor
Mgr. Vojtěch Rybář
Reviewers
Ing. Daniel Vašata, Ph.D.
Department
Summary
This thesis provides a detailed examination of the Vision Transformer (ViT) architecture and the explainability methods available for this architecture. The theoretical section thoroughly analyses the structure and operational principles of ViT and the attention mechanism. Additionally, the theoretical part reviews the current state of explainability methods for ViT, exploring the fundamentals of their functioning, their classifications, and the metrics used to compare their properties. The practical section focuses on evaluating three explainability methods: Beyond Intuition, GradientExplainer, and ViT-Shapley. These methods were applied to PyTorch ViT models that were pre-trained on the ImageNet21K dataset and subsequently fine-tuned on training data from two datasets. The experiments examined the characteristics of these methods, such as faithfulness, robustness, and complexity, using several metrics.
Comparing interpretable models with post-hoc explainable black box models
Author
Mikuláš Kočí
Year
2024
Type
Bachelor thesis
Supervisor
Mgr. Vojtěch Rybář
Reviewers
Ing. Ivo Petr, Ph.D.
Department
Summary
The main objective of this thesis is to provide further evidence that interpretable models can achieve performance similar to, if not better than, that of black-box models. In our experiments, there was not a single dataset on which the black-box model performed significantly better than its interpretable counterparts; the largest difference we observed in terms of the F1 score was 0.02. This advantage is not large enough to outweigh the benefits of an inherently interpretable model. This is especially true for high-stakes decisions, where we would otherwise be forced to rely on an explainability method, which could fail to reveal bias in the black-box model, potentially leading to poor performance in the real world.
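For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below assumes binary classification with a single positive class (the summary does not state which F1 variant was used) and compares two sets of predictions. The labels and predictions are made up for illustration and are not results from the thesis.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 score: harmonic mean of precision and recall
    for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up ground truth and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
interpretable_pred = [1, 1, 0, 0, 0, 1, 0, 1]
black_box_pred = [1, 1, 1, 0, 1, 1, 0, 1]

f1_interpretable = f1_score(y_true, interpretable_pred)
f1_black_box = f1_score(y_true, black_box_pred)
f1_gap = f1_black_box - f1_interpretable
```

Whether a gap of this size matters depends on the application and on the variance across data splits, which a single comparison like this does not capture.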