doc. Ing. Filip Křikava, Ph.D.

filip.krikava@fit.cvut.cz
TH:A-1254

Theses

Dissertation theses

Fast and robust data-analysis pipelines

Supervisor

doc. Ing. Filip Křikava, Ph.D.

Level

Topic of dissertation thesis

Topic description

Data analysis is typically performed by composing a series of discrete tools and libraries into a data analysis pipeline. These pipelines are at the core of data-driven science, which has become central to most disciplines and is now seeing an explosion in the use of computational methods and available data. As the number of tools and the size of data keep growing, we face problems with the scalability of the pipelines and the trustworthiness of their results.

The goal of this work is to research ways to make data analysis pipelines scalable (accommodate growing data and computational needs) and trustworthy (facilitate auditing of the analysis result). The research will go along two axes. The first will focus on extending the R programming language with transparent horizontal and vertical scaling. The second will study a combination of static and dynamic program analysis techniques to gain insight into the nature and severity of programming errors in the code of data-analysis pipelines, and propose algorithms for their detection and possible automated repair.

Bachelor theses

Interactively controlled PC games using smart phones

Author

Marek Foltýn

Year

2016

Type

Bachelor thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

Ing. Vojtěch Jirkovský

Department

Department of Software Engineering

Summary

The main purpose of this thesis is to create an interactive PC game control system using smartphones in order to enhance the game experience. The thesis contains an analysis of the different ways a PC game can be controlled, an overview of interactive mobile technologies, and the implementation of the communication system, which is demonstrated in a simple game.

Thesis on DSpace

Web application PWiL - System for localization of patients

Author

Vít Medřický

Year

2016

Type

Bachelor thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

Ing. Jan Bradáč

Department

Department of Software Engineering

Summary

This thesis focuses on the process of creating a web system that makes it possible to locate patients in nursing homes. The purpose is to make it easier for patients to call for help when needed. The thesis begins with an analysis of the system's requirements and use cases, based on which the user interface is designed and the wireframes are constructed. The system is implemented using the Laravel framework.

Thesis on DSpace

Dynamic test generation for R packages

Author

Filippo Ghibellini

Year

2017

Type

Bachelor thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Department

Department of Theoretical Computer Science

Summary

Statistical computing is gaining popularity with the increasing demand in related fields such as machine learning and big data. R is the main player among programming languages in this area, but its unique design has made it difficult to adopt advancements from mainstream languages. One consequence is the lack of more advanced testing tools. In this thesis we present a tool that records executions of an R program and generates unit tests asserting the reproducibility of the observed behaviour.

Thesis on DSpace

Conjugata - conjugation training app - frontend

Author

Matěj Sedlák

Year

2018

Type

Bachelor thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

prof. Dr. Ing. Petr Kroha, CSc.

Department

Department of Software Engineering

Summary

This bachelor's thesis focuses on user interface guidelines and application development for learning and practising verb conjugation in foreign languages. This version is dedicated to Spanish. The application provides two testing modes (easy and challenging) and three answer input options: selecting from five verb forms, combining letters to create the verb, and typing the verb on the keyboard. Furthermore, it offers user performance statistics and builds unique tests for individual users. The application suits users with various levels of Spanish because they can choose which verb categories they want to learn and practise. This thesis was written in collaboration with the thesis of Petr Polívka (a FIT student), who is developing the application's backend.

Thesis on DSpace

Conjugata - conjugation training app - backend

Author

Petr Polívka

Year

2018

Type

Bachelor thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

prof. Dr. Ing. Petr Kroha, CSc.

Department

Department of Software Engineering

Summary

This bachelor's thesis focuses on the analysis, design, and subsequent backend implementation of a mobile application for the Android operating system. The application, called Conjugata, helps with learning verb conjugation in foreign languages, specifically in Spanish. It also adapts to the answers the end user chooses. It offers selected tenses and categories used in Spanish. Furthermore, it provides an educational system that walks the user through the rules of Spanish verb conjugation. We created a helpful tool for all students of Spanish who want to practise verb conjugation, and also for beginners who are new to verb conjugation and want to learn it. The application was made in cooperation with ČVUT student Matěj Sedlák, who worked on the application's frontend. The attachment contains the finished application in APK format for mobile phones with the Android operating system.

Thesis on DSpace

ArtilEcho - a strategy game for blind people

Author

Tomáš Jozífek

Year

2019

Type

Bachelor thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

doc. Ing. Mgr. Petr Klán, CSc.

Department

Department of Software Engineering

Summary

Video games are an extremely popular form of entertainment, but cannot be enjoyed by people with severe visual impairment since they use computer graphics as the primary source of interaction. Audio games, on the other hand, rely on speech synthesis and sounds, making it possible for even fully blind people to play computer games. This thesis focuses on the creation of a turn-based multiplayer artillery-type audio game similar to ShellShock Live. It also provides a general overview of audio game design.

Thesis on DSpace

Pattern matching in C11

Author

Jan Jindráček

Year

2019

Type

Bachelor thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

Ing. Jan Trávníček, Ph.D.

Department

Department of Software Engineering

Summary

Pattern matching is a general mechanism found in many programming languages for checking a value against a certain pattern. A matching pattern can also be used to deconstruct a value into its constituent parts. This work studies the possibility of adding such a feature to the C programming language. It analyzes pattern matching in different programming languages and proposes a syntax and semantics suitable for the C programming language. It also implements a prototype, including a suitable test suite and proper documentation.

Thesis on DSpace

An actor model implementation for the OCaml programming language

Author

Narek Vardanjan

Year

2020

Type

Bachelor thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

Ing. Petr Špaček, Ph.D.

Department

Department of Software Engineering

Summary

The actor model is an abstraction for concurrent programming that uses actors, independent units of computation. These units are spawned so that they can communicate with each other via messages and change their state accordingly. Messages are processed serially, which guarantees the synchronization needed for a state change. Thanks to that, there is no need for synchronization primitives such as locks. The work describes the core parts of an actor model library implementation. It starts with a brief introduction to the classic concurrency model and its drawbacks. Then it introduces the actor model and its most influential flavors. Aside from the core constructs of spawning, sending messages, and state changes, the library implements additional functionality for monitoring and linking actors and for name resolution. The library is written in Objective Caml, leveraging its language features.

Thesis on DSpace

Haskell Dynamic Tracing

Author

Ondřej Kvapil

Year

2021

Type

Bachelor thesis

Supervisor

doc. Ing. Filip Křikava, Ph.D.

Reviewers

Vitaly Bragilevsky, MSc.

Department

Department of Theoretical Computer Science

Summary

Haskell is one of the most well-known instances of a programming language that uses non-strict semantics. On the one hand, this brings the convenience of infinite data structures, user-defined control flow, and the possibility to avoid unnecessary computation. On the other hand, these benefits are hampered by the runtime overhead and the hard-to-predict behaviour of call-by-need. This raises the question: Is laziness worth it? To answer this question, we need to understand how laziness is used in the wild. To this end, we develop a dynamic analysis tool that traces the evaluation of function parameters. It is implemented as a compiler plugin for the Glasgow Haskell Compiler.

Thesis on DSpace

Analyze Scala Code Using Graph Database

Author

Otakar Vinklář

Year

2020

Type

Bachelor thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

Ing. Michal Valenta, Ph.D.

Department

Department of Theoretical Computer Science

Summary

For a programming language to become as convenient as possible and to evolve, it is necessary to understand how programmers use the language. In the paper [KŘIKAVA, Filip; MILLER, Heather; VITEK, Jan. Scala implicits are everywhere: a large-scale study of the use of Scala implicits in the wild. Proceedings of the ACM on Programming Languages. 2019, vol. 3, no. OOPSLA, pp. 1-28.], Křikava et al. conduct a large-scale study of implicits, a feature of the Scala programming language. The analysis part of the underlying solution proves to be cumbersome, as the relational data model is not suitable for highly connected data. This thesis builds on the underlying implementation with the aim of improving the analysis part in terms of flexibility to support the creation of new queries. It addresses these challenges with a graph database, which supports storing data with a large number of relationships. In particular, the Neo4j graph database with its Cypher query language is chosen for this purpose. This thesis shows that a graph database with its query language offers a high-level interface for static code analysis.

Thesis on DSpace

An overview of gradual typing approaches in dynamic programming languages

Author

Rostislav Blaha

Year

2023

Type

Bachelor thesis

Supervisor

doc. Ing. Filip Křikava, Ph.D.

Reviewers

Pierre Donat-Bouillud, Ph.D.

Department

Department of Software Engineering

Summary

Gradual typing is a feature that allows programming languages to combine dynamic and static typing within the same codebase, enabling the incremental addition of type annotations as the code evolves. This study investigates existing approaches to gradual typing in Python, Ruby, and PHP, with the goal of identifying techniques that could be applied to the R programming language. The thesis synthesizes information from multiple sources such as documentation, academic articles, blog posts, formal proposals, and forum discussions. It follows up with suggestions on implementation, syntax, semantics, tool support, and adoption strategies for gradual typing in R.

Thesis on DSpace

Simple Object Machine implementation in functional programming language

Author

Filip Říha

Year

2023

Type

Bachelor thesis

Supervisor

doc. Ing. Filip Křikava, Ph.D.

Reviewers

Ing. Jan Liam Verter

Department

Department of Theoretical Computer Science

Summary

This thesis provides an implementation of a Smalltalk programming language dialect called Simple Object Machine (SOM) in Haskell, a purely functional language. It explores the syntax and semantics of a SOM program and analyses existing implementations. Then it provides the design and implementation details of the virtual machine, which is based on bytecode instructions and a bytecode interpreter. The individual parts of the VM are explored in turn: the lexer, parser, compiler, runtime environment, and garbage collector.

Thesis on DSpace

Master theses

Shere - notes and document management application

Author

Marek Foltýn

Year

2018

Type

Master thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

Ing. Jan Ječmen

Department

Department of Software Engineering

Summary

The main purpose of this thesis is to create a note-taking application in order to improve the note-taking experience. The thesis contains an analysis of existing note-taking software, a discussion of its shortcomings, and the implementation. The application's design is based on the analysis and addresses the main shortcomings.

Thesis on DSpace

Localization system PROWiLOS v1.0

Author

Vít Medřický

Year

2019

Type

Master thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

Ing. Jan Bradáč

Department

Department of Software Engineering

Summary

This thesis builds on a prototype of the web system PWiL created in the author's bachelor thesis. PWiL is a system for patient localization in nursing homes that allows patients to call for help effectively when necessary. The prototype from the bachelor thesis has been turned into a functional system that is now suitable for deployment in a trial run. Missing functionality of the web application user interface is added, and communication interfaces are implemented and tested. The application is implemented in PHP using the Laravel framework.

Thesis on DSpace

Grammatica - an app for practicing grammar

Author

Alexander Bublik

Year

2019

Type

Master thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

prof. Dr. Ing. Petr Kroha, CSc.

Department

Department of Software Engineering

Summary

This thesis describes the process of creating a mobile application for Android OS for studying foreign languages, together with a server side with a database that communicates with the client through a REST API. The application supports handwriting input and simulates the experience of doing exercises in a textbook as accurately as possible. The thesis contains an analysis of possible solutions and competing applications, the design of the client and server parts, and the implementation and testing of the application.

Thesis on DSpace

SWM - Simple Window Manager

Author

Jan Bína

Year

2020

Type

Master thesis

Supervisor

Ing. Filip Křikava, Ph.D.

Reviewers

doc. Ing. Jan Janoušek, Ph.D.

Department

Department of Theoretical Computer Science

Summary

This thesis deals with the design and implementation of a stacking window manager for the X Window System. A window manager is the core component of any modern graphical desktop - it is responsible for the placement and appearance of application windows on the screen. While there is a plethora of window managers, especially on the X Window System, most of them are either heavyweight window managers that are part of a desktop environment or lightweight tiling managers. In this thesis, we try to fill the gap by developing a lightweight stacking window manager that complies with the freedesktop standards such as ICCCM and EWMH, and that follows the UNIX philosophy of doing one thing and doing it well. The focus has been on simplicity, code readability, testability, and making it easy to use and extend.

Thesis on DSpace

Desktop agnostic power manager for Linux

Author

Róbert Selvek

Year

2022

Type

Master thesis

Supervisor

doc. Ing. Filip Křikava, Ph.D.

Reviewers

Ing. Jiří Kašpar

Department

Department of Theoretical Computer Science

Summary

A power manager is a service that reacts to user activity and triggers various power-saving actions. For example, it dims the screen, locks the user session or transfers the system into a sleep state. Most of the existing power managers are tightly integrated into complete desktop environments such as GNOME or KDE and thus unusable as a standalone service for custom-made desktops (a popular trend among Linux power-users). The very few standalone ones lack the features one would expect from power management known from a modern desktop. This thesis describes the design and implementation of a novel, standalone, desktop agnostic power manager. It supports all the usual power-saving actions, is easy to configure, and can be extended with new behavior.

Thesis on DSpace

Ahead-of-time compiler for the microC language

Author

Václav Král

Year

2022

Type

Master thesis

Supervisor

doc. Ing. Filip Křikava, Ph.D.

Reviewers

Ing. Jiří Kašpar

Department

Department of Theoretical Computer Science

Summary

The aim of this thesis is to implement an ahead-of-time optimizing compiler for microC, the language used in the NI-APR (Selected Methods for Program Analysis) course at FIT CTU for teaching program analyses. The compiler should primarily serve as educational material for the NI-APR course, demonstrating the application and usefulness of selected static analyses during compilation and optimization. In this thesis, the reader becomes familiar not only with the architecture of compilers, but also with the static analyses supported by the compiler. The thesis then discusses the design and, most importantly, the implementation. The optimization capabilities of the implementation are demonstrated on several examples. Possible future improvements are proposed at the end of the thesis. The result of the thesis is a working optimizing microC compiler.

Thesis on DSpace

Automated data analysis pipelines

Author

Michael Vrána

Year

2023

Type

Master thesis

Supervisor

doc. Ing. Filip Křikava, Ph.D.

Reviewers

Pierre Donat-Bouillud, Ph.D.

Department

Department of Theoretical Computer Science

Summary

Data analysis pipelines describe data analysis as a sequence of interdependent steps. These pipelines enable reproducibility and effective execution of the analysis. This thesis describes the design and implementation of an R package called Pipelinr, a domain-specific language and a runtime for data analysis pipelines. The designed DSL allows the user to describe the pipeline as a set of interdependent stages. Furthermore, it allows the user to use various composable dynamic branching patterns to break down a stage into a set of tasks, which can be executed in parallel using GNU Parallel. The runtime also provides the user with metadata about the pipeline's execution, which can also be used as input to the pipeline itself.

Thesis on DSpace

Out of process byte-code compiler for the R programming language

Author

Adam Plodek

Year

2024

Type

Master thesis

Supervisor

doc. Ing. Filip Křikava, Ph.D.

Reviewers

Sebastián Krynski, MSc.

Department

Department of Theoretical Computer Science

Summary

R is a dynamic programming language used mainly in statistics and data visualization. Its unique set of features and extensive ecosystem of packages enable statisticians to write software without needing to be software engineers. The GNU R interpreter is considered the primary implementation of the R programming language. To speed up the execution of R programs, a bytecode interpreter was implemented alongside the standard AST interpreter. To compile the AST representation of a program into its bytecode representation, a compiler for GNU R bytecode was introduced. This thesis explores one possible improvement to this compilation process, namely out-of-process compilation. This approach allows compilers to be implemented in different languages and could unlock more possibilities for sharing compiled code between clients. Moreover, the compiler process can be located on a different machine from the one running the R interpreter, which can be used to move the compilation overhead to a more powerful machine. I describe the process of creating an experimental implementation of such a solution in the Rust programming language, which can serve as a baseline for future work. To achieve this, a custom representation of R values and a serialization of those values were created. These were then used to implement the compiler and a server, which communicates with a package that can be used by the interpreter. Finally, I evaluate the current state of the implementation of my compilation server. The evaluation is split into two parts: correctness and performance. Both criteria are compared against the current implementation embedded in the GNU R interpreter. The evaluation showed that the compilation process could be sped up 20 times in the best-case scenario, when only the compilation itself is counted. When the loading of the data is included, the speed-up was 3 times compared to the GNU R implementation.

Thesis on DSpace

Profiler for the R programming language

Author

Karolina Hrnčiříková

Year

2024

Type

Master thesis

Supervisor

doc. Ing. Filip Křikava, Ph.D.

Reviewers

Mgr. Tomáš Petříček, Ph.D.

Department

Department of Theoretical Computer Science

Summary

The R language excels in data exploration and analysis but often faces challenges regarding execution speed and efficiency. R is dynamically typed, has automatic memory collection, and, most importantly, one of its main implementations, GNU R, interprets AST in combination with just-in-time compilation into bytecode. All these factors contribute to R being a comparatively slow language. To improve performance, users are forced to rewrite performance-sensitive code in C, C++, or Fortran through packages. However, finding out which code segments are slow because they are executed in the R interpreter is not easy because the current profiling methods do not distinguish between native and R execution. In this thesis, we propose a profiler that can distinguish between R and native execution. Inspired by Scalene, a profiler for Python, we implement a proof-of-concept profiler into GNU R 4.3.3. We evaluate the profiler in comparison to Rprof, the most used R profiler.

Thesis on DSpace

Enriched contextual dispatch for Ř

Author

Michal Štěpánek

Year

2025

Type

Master thesis

Supervisor

doc. Ing. Filip Křikava, Ph.D.

Reviewers

Ing. Petr Máj, Ph.D.

Department

Department of Theoretical Computer Science

Summary

The R programming language primarily focuses on statistical computing and data visualization while minimizing requirements on users' proficiency in computer science. R is a dynamically typed, functional, object-oriented, and interpreted language with lazy evaluation. GNU-R, the most widespread implementation, interprets AST or bytecode generated via JIT compilation, which can result in inefficient program execution because these representations are generic with respect to R. Ř is an alternative JIT compiler for R, which extends GNU-R with native code compilation to achieve better performance. The Ř compiler performs specialization and speculation to optimize the code. Specialization is performed via contextual dispatch, and speculative optimization is performed based on the collected information (type feedback). Ř uses one type feedback object per R closure. This thesis extends the Ř compiler with the ability to collect and store speculative information relative to the calling context. The thesis designs, implements, and discusses strategies for using the extension, namely feedback merging and feedback filling, and assesses their performance on benchmarks. The thesis concludes that while the performance of the developed strategies is comparable, feedback merging, or a similar mechanism, should be used. In contrast, the methods defined for feedback filling offer limited advantages and are unnecessary. The thesis also finds programs benefitting from feedback splitting and therefore concludes that the idea should be developed further.

Thesis on DSpace

Using malware detection techniques for dependency detection of R programs

Author

Petr Adámek

Year

2024

Type

Master thesis

Supervisor

doc. Ing. Filip Křikava, Ph.D.

Reviewers

Pierre Donat-Bouillud, Ph.D.

Department

Department of Information Security

Summary

R is a programming language commonly used in data science. This is possible due to the large number of publicly accessible packages. To reduce the number of misbehaving packages, some checks are made. These include automatically running code examples but still leave a lot of room for manual checking. A tool that would automatically gather the dependencies of any given R program could significantly reduce the manual overhead required. It would also find use in the context of creating an environment in which a program's result could be consistently reproduced. This thesis imitates some of the work done in tools used to analyse or sandbox potentially harmful programs. Using the system call interposition mechanisms in the Linux kernel, I have created a tool which can track dependencies of a given program. These dependencies can then be used to generate a report for auditing or create an environment in which the program execution can be reproduced. Even though the tool cannot be directly used for any security-critical purposes due to the mechanism used and its associated race conditions, the tool is useful in reproducing shell scripts, executions of R programs, and others. The thesis also mentions many of the variables and potential attack vectors a complete solution would have to consider, which are often glossed over. Moreover, other potential ways in which a tracing mechanism such as this could be implemented are explored.

Thesis on DSpace