doc. Ing. Filip Křikava, Ph.D.

Theses

Dissertation theses

Fast and robust data-analysis pipelines

Level
Topic of dissertation thesis
Topic description

Data analysis is typically performed by composing a series of discrete tools and libraries into a data analysis pipeline. These pipelines are at the core of data-driven science, which has been central to most disciplines and today sees an explosion in the use of computational methods and available data. As the number of tools and the size of data keep growing, we face problems with the scalability of the pipelines and the trustworthiness of their results.

The goal of this work is to research ways to make data analysis pipelines scalable (accommodate growing data and computational needs) and trustworthy (facilitate auditing of the analysis result). The research will go along two axes. The first will focus on extending the R programming language with transparent horizontal and vertical scaling. The second will study a combination of static and dynamic program analysis techniques to gain insight into the nature and severity of programming errors in the code of data-analysis pipelines, and propose algorithms for their detection and possible automated repair.

Bachelor theses

Interactively controlled PC games using smart phones

Author
Marek Foltýn
Year
2016
Type
Bachelor thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
Ing. Vojtěch Jirkovský
Summary
The main purpose of this thesis is to create an interactive PC game control system using smartphones in order to enhance the game experience. The thesis contains an analysis of the different ways a PC game can be controlled, an overview of interactive mobile technologies, and the implementation of the communication system, which is demonstrated in a simple game.

Web application PWiL - System for localization of patients

Author
Vít Medřický
Year
2016
Type
Bachelor thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
Ing. Jan Bradáč
Summary
This thesis focuses on the process of creating a web system that makes it possible to locate patients in nursing homes. The purpose is to make it easier for patients to call for help when in need. The thesis begins with an analysis of the system's requirements and use cases, based on which the user interface is designed and the wireframes are constructed. The system is implemented using the Laravel framework.

Dynamic test generation for R packages

Author
Filippo Ghibellini
Year
2017
Type
Bachelor thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Summary
Statistical computing is gaining popularity with the increasing demand in related fields such as machine learning, big data, and others. R is the main player in terms of programming languages, but its unique design has made it difficult to share advancements from standard languages. One consequence is the lack of more advanced testing tools. In this thesis we present a tool that records executions of an R program and generates unit tests asserting the reproducibility of the observed behaviour.

Conjugata - conjugation training app - frontend

Author
Matěj Sedlák
Year
2018
Type
Bachelor thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
prof. Dr. Ing. Petr Kroha, CSc.
Summary
This bachelor's thesis focuses on user interface guidelines and on the development of an application for learning and practising verb conjugation in foreign languages. This version is dedicated to Spanish. The application provides two testing modes (easy and challenging) and three answer input options: selecting from five verb forms, combining letters to create the verb, and typing the verb on the keyboard. Furthermore, it offers user performance statistics and builds unique tests for individual users. The application suits users with various levels of Spanish knowledge, since they can choose which verb categories they want to learn and practise. This thesis was written in collaboration with Petr Polívka (a FIT student), whose thesis covers the application's backend.

Conjugata - conjugation training app - backend

Author
Petr Polívka
Year
2018
Type
Bachelor thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
prof. Dr. Ing. Petr Kroha, CSc.
Summary
This bachelor's thesis focuses on the analysis, design, and subsequent backend implementation of a mobile application for the Android operating system. The application, called Conjugata, helps users learn verb conjugation in foreign languages, specifically in Spanish. It also adapts to the answers the end user chooses. It offers selected tenses and categories used in the Spanish language. Furthermore, it provides an educational system that walks the user through the rules of Spanish verb conjugation. We created a helpful tool for all students of Spanish who want to practise verb conjugation, as well as for beginners who are new to verb conjugation and want to learn it. The application was made in cooperation with ČVUT student Matěj Sedlák, who worked on the application's frontend. The attachment contains the finished application in APK format for mobile phones with the Android operating system.

ArtilEcho - a strategy game for blind people

Author
Tomáš Jozífek
Year
2019
Type
Bachelor thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
doc. Ing. Mgr. Petr Klán, CSc.
Summary
Video games are an extremely popular form of entertainment, but cannot be enjoyed by people with severe visual impairment since they use computer graphics as the primary source of interaction. Audio games, on the other hand, rely on speech synthesis and sounds, making it possible for even fully blind people to play computer games. This thesis focuses on the creation of a turn-based multiplayer artillery-type audio game similar to ShellShock Live. It also provides a general overview of audio game design.

Pattern matching in C11

Author
Jan Jindráček
Year
2019
Type
Bachelor thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
Ing. Jan Trávníček, Ph.D.
Summary
Pattern matching is a general mechanism found in many programming languages for checking a value against a certain pattern. A matching pattern can also be used to deconstruct a value into its constituent parts. This work will study the possibility of adding such a feature to the C programming language. It will analyze pattern matching in different programming languages, propose a syntax and semantics suitable for the C programming language, and implement a prototype including a suitable test suite and proper documentation.

An actor model implementation for the OCaml programming language

Author
Narek Vardanjan
Year
2020
Type
Bachelor thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
Ing. Petr Špaček, Ph.D.
Summary
The actor model is an abstraction for concurrent programming that uses actors, independent units of computation. These units are spawned and communicate with each other via messages, changing their state in response. Messages are processed serially, which guarantees the synchronization needed for a state change. As a result, there is no need for synchronization primitives such as locks. The work describes the core parts of an actor model library implementation. It begins with a brief introduction to the classic concurrency model and its drawbacks. Then it introduces the actor model and its most influential flavors. Aside from the core constructs of spawning, sending messages, and state changes, the library implements additional functionality for monitoring and linking actors and for name resolution. The library is written in Objective Caml, leveraging its language features.

Haskell Dynamic Tracing

Author
Ondřej Kvapil
Year
2021
Type
Bachelor thesis
Supervisor
doc. Ing. Filip Křikava, Ph.D.
Reviewers
Vitaly Bragilevsky, MSc.
Summary
Haskell is one of the most well-known instances of a programming language that uses non-strict semantics. On the one hand, this brings the convenience of infinite data structures, user-defined control flow, and the possibility to avoid unnecessary computation. On the other hand, these benefits are hampered by the runtime overhead and the hard-to-predict behaviour of call-by-need. This raises the question: Is laziness worth it? To answer this question, we need to understand how laziness is used in the wild. To this end, we develop a dynamic analysis tool that traces the evaluation of function parameters. It is implemented as a compiler plugin for the Glasgow Haskell Compiler.

Analyze Scala Code Using Graph Database

Author
Otakar Vinklář
Year
2020
Type
Bachelor thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
Ing. Michal Valenta, Ph.D.
Summary
For a programming language to become as convenient as possible and to evolve, it is necessary to understand how the language is used by programmers. In the paper [KŘIKAVA, Filip; MILLER, Heather; VITEK, Jan. Scala implicits are everywhere: a large-scale study of the use of Scala implicits in the wild. Proceedings of the ACM on Programming Languages. 2019, vol. 3, no. OOPSLA, pp. 1-28.], Křikava et al. conduct a large-scale study of implicits, a feature of the Scala programming language. The analysis part of the underlying solution proves to be cumbersome, as the relational data model is not suitable for highly connected data. This thesis builds on the underlying implementation with the aim of improving the analysis part in terms of its flexibility to support the creation of new queries. It addresses these challenges with a graph database, which supports storing data with a large number of relationships. In particular, the Neo4j graph database with its Cypher query language is chosen for this purpose. This thesis shows that a graph database with its query language offers a high-level interface for static code analysis.

An overview of gradual typing approaches in dynamic programming languages

Author
Rostislav Blaha
Year
2023
Type
Bachelor thesis
Supervisor
doc. Ing. Filip Křikava, Ph.D.
Reviewers
Pierre Donat-Bouillud, Ph.D.
Summary
Gradual typing is a feature that allows programming languages to combine dynamic and static typing within the same codebase, enabling the incremental addition of type annotations as the code evolves. This study investigates existing approaches to gradual typing in Python, Ruby, and PHP, with the goal of identifying techniques that could be applied to the R programming language. The thesis synthesizes information from multiple sources such as documentation, academic articles, blog posts, formal proposals, and forum discussions. It follows up with suggestions on implementation, syntax, semantics, tool support, and adoption strategies for gradual typing in R.

Simple Object Machine implementation in functional programming language

Author
Filip Říha
Year
2023
Type
Bachelor thesis
Supervisor
doc. Ing. Filip Křikava, Ph.D.
Reviewers
Ing. Jan Liam Verter
Summary
This thesis provides an implementation of a Smalltalk programming language dialect called Simple Object Machine (SOM) in Haskell, a purely functional language. It explores the syntax and semantics of a SOM program and analyses existing implementations. Then it provides the design and implementation details of the virtual machine, which is based on bytecode instructions and a bytecode interpreter. The parts of the VM are explored individually: the lexer, parser, compiler, runtime environment, and garbage collector.

Master theses

Shere - notes and document management application

Author
Marek Foltýn
Year
2018
Type
Master thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
Ing. Jan Ječmen
Summary
The main purpose of this thesis is to create a note-taking application in order to improve the note-taking experience. The thesis contains an analysis of existing note-taking software, a discussion of its shortcomings, and the implementation. Its design is based on the analysis and addresses the main shortcomings.

Localization system PROWiLOS v1.0

Author
Vít Medřický
Year
2019
Type
Master thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
Ing. Jan Bradáč
Summary
This thesis follows up on a prototype of the web system PWiL created in the author's bachelor thesis. PWiL is a system for patient localization in nursing homes that allows patients to call for help effectively when necessary. The prototype from the bachelor thesis has been developed into a functional system that is now suitable for deployment in a trial run. Missing functionality of the web application user interface is added, and communication interfaces are implemented and tested. The application is implemented in PHP using the Laravel framework.

Grammatica - an app for practicing grammar

Author
Alexander Bublik
Year
2019
Type
Master thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
prof. Dr. Ing. Petr Kroha, CSc.
Summary
This thesis describes the process of creating a mobile application for Android OS for studying foreign languages, together with a server side with a database that communicates with the client through a REST API. The application supports handwriting input and simulates the experience of doing exercises in a textbook as accurately as possible. The thesis contains an analysis of possible solutions and competing applications, the design of the client and server parts, and the implementation and testing of the application.

SWM - Simple Window Manager

Author
Jan Bína
Year
2020
Type
Master thesis
Supervisor
Ing. Filip Křikava, Ph.D.
Reviewers
doc. Ing. Jan Janoušek, Ph.D.
Summary
This thesis deals with the design and implementation of a stacking window manager for the X Window System. A window manager is the core component of any modern graphical desktop - it is responsible for the placement and appearance of application windows on the screen. While there is a plethora of window managers, especially on the X Window System, most of them are either heavyweight window managers that are part of a desktop environment or lightweight tiling managers. In this thesis, we try to fill the gap by developing a lightweight stackable window manager that complies with the freedesktop standards such as ICCCM and EWMH, and that follows the UNIX philosophy of doing one thing and doing it well. The focus has been on simplicity, code readability, testability, and making it easy to use and extend.

Desktop agnostic power manager for Linux

Author
Róbert Selvek
Year
2022
Type
Master thesis
Supervisor
doc. Ing. Filip Křikava, Ph.D.
Reviewers
Ing. Jiří Kašpar
Summary
A power manager is a service that reacts to user activity and triggers various power-saving actions. For example, it dims the screen, locks the user session or transfers the system into a sleep state. Most of the existing power managers are tightly integrated into complete desktop environments such as GNOME or KDE and thus unusable as a standalone service for custom-made desktops (a popular trend among Linux power-users). The very few standalone ones lack the features one would expect from power management known from a modern desktop. This thesis describes the design and implementation of a novel, standalone, desktop agnostic power manager. It supports all the usual power-saving actions, is easy to configure, and can be extended with new behavior.

Ahead-of-time compiler for the microC language

Author
Václav Král
Year
2022
Type
Master thesis
Supervisor
doc. Ing. Filip Křikava, Ph.D.
Reviewers
Ing. Jiří Kašpar
Summary
The aim of this thesis is to implement an ahead-of-time optimizing compiler for microC, a language used in the NI-APR (Selected Methods for Program Analysis) course at FIT CTU for teaching program analyses. The compiler should primarily serve as educational material for the NI-APR course, demonstrating the application and usefulness of selected static analyses during compilation and optimization. In this thesis, the reader will become familiar not only with the architecture of compilers, but also with the static analyses supported by the compiler. The thesis then discusses the design and, most importantly, the implementation. The optimization capabilities of the implementation are demonstrated on several examples. Possible future improvements are proposed at the end of the thesis. The result of the thesis is a working optimizing microC compiler.

Automated data analysis pipelines

Author
Michael Vrána
Year
2023
Type
Master thesis
Supervisor
doc. Ing. Filip Křikava, Ph.D.
Reviewers
Pierre Donat-Bouillud, Ph.D.
Summary
Data analysis pipelines describe data analysis as a sequence of interdependent steps. These pipelines enable reproducibility and effective execution of the analysis. This thesis describes the design and implementation of an R package called Pipelinr, a domain-specific language and a runtime for data analysis pipelines. The designed DSL allows the user to describe the pipeline as a set of interdependent stages. Furthermore, it allows the user to use various composable dynamic branching patterns to break down a stage into a set of tasks, which can be executed in parallel using GNU Parallel. The runtime also provides the user with metadata about the pipeline's execution, which can also be used as input to the pipeline itself.

Out of process byte-code compiler for the R programming language

Author
Adam Plodek
Year
2024
Type
Master thesis
Supervisor
doc. Ing. Filip Křikava, Ph.D.
Reviewers
Sebastián Krynski, MSc.
Summary
R is a dynamic programming language used mainly in statistics and data visualization. Its unique set of features and extensive ecosystem of packages enable statisticians to write software without the need to be software engineers. The GNU R implementation of an interpreter for the R programming language is considered the primary implementation. To speed up the execution of R programs, a bytecode interpreter was implemented alongside the standard AST interpreter. To compile the AST representation of a program into its bytecode representation, a compiler for GNU R bytecode was introduced. This thesis explores one possible improvement to this compilation process, namely out-of-process compilation. This approach allows compilers to be implemented in different languages and could unlock more possibilities for sharing the compiled code between clients. Moreover, the compiler process can be located outside the machine on which the R interpreter is running, which can be used to move the compilation overhead to a more powerful machine. I describe the process of creating an experimental implementation of such a solution in the Rust programming language, which can serve as a baseline for future work. To achieve this, a custom representation of R values and a serialization of those values were created. These were then used to implement the compiler and a server, which communicates with a package that can be used by the interpreter. Finally, I evaluate the current state of the implementation of my compilation server. The evaluation is split into two parts: correctness and performance. Both criteria are compared against the current implementation embedded in the GNU R interpreter. The evaluation showed that the compilation process could be sped up 20 times in the best-case scenario, when only the compilation itself is counted. When the loading of the data is included, the speedup was 3 times compared to the GNU R implementation.

Profiler for the R programming language

Author
Karolina Hrnčiříková
Year
2024
Type
Master thesis
Supervisor
doc. Ing. Filip Křikava, Ph.D.
Reviewers
Mgr. Tomáš Petříček, Ph.D.
Summary
The R language excels in data exploration and analysis but often faces challenges regarding execution speed and efficiency. R is dynamically typed, has automatic memory collection, and, most importantly, one of its main implementations, GNU R, interprets the AST in combination with just-in-time compilation into bytecode. All these factors contribute to R being a comparatively slow language. To improve performance, users are forced to rewrite performance-sensitive code in C, C++, or Fortran through packages. However, finding out which code segments are slow because they are executed in the R interpreter is not easy, because the current profiling methods do not distinguish between native and R execution. In this thesis, we propose a profiler that can distinguish between R and native execution. Inspired by Scalene, a profiler for Python, we implement a proof-of-concept profiler in GNU R 4.3.3. We evaluate the profiler in comparison to Rprof, the most used R profiler.

Using malware detection techniques for dependency detection of R programs

Author
Petr Adámek
Year
2024
Type
Master thesis
Supervisor
doc. Ing. Filip Křikava, Ph.D.
Reviewers
Pierre Donat-Bouillud, Ph.D.
Summary
R is a programming language commonly used in data science. This is possible due to the large number of publicly accessible packages. To reduce the number of misbehaving packages, some checks are made. These include automatically running code examples but still leave a lot of room for manual checking. A tool that would automatically gather the dependencies of any given R program could significantly reduce the manual overhead required. It would also find use in the context of creating an environment in which a program's result could be consistently reproduced. This thesis imitates some of the work done in tools used to analyse or sandbox potentially harmful programs. Using the system call interposition mechanisms in the Linux kernel, I have created a tool which can track the dependencies of a given program. These dependencies can then be used to generate a report for auditing or to create an environment in which the program execution can be reproduced. Even though the tool cannot be directly used for any security-critical purposes due to the mechanism used and its associated race conditions, the tool is useful in reproducing shell scripts, executions of R programs, and others. The thesis also mentions many of the variables and potential attack vectors a complete solution would have to consider, which are often glossed over. Moreover, other potential ways in which a tracing mechanism such as this could be implemented are explored.