Automatic Poetic Metre Detection for Czech Verse
Authors
Klesnilová, K.; Klouda, K.; Friedjungová, M.; Plecháč, P.
Year
2024
Published in
Studia Metrica et Poetica. 2024, 11(1), 44-61. ISSN 2346-6901.
Type
Article
Department
Abstract
Metrical analysis of verse, which consists of analysing a poem and deciding which metre it is written in, is an essential and challenging task in versification research. Thanks to existing corpora, we can take advantage of data-driven approaches, which can be better suited to the specific versification problems at hand than rule-based systems.
This work analyses Czech accentual-syllabic verse and automatic metre assignment using the large, annotated Corpus of Czech Verse. We define the problem as a sequence tagging task and approach it using a machine learning model with many different input data configurations. For comparison, we reimplement the existing data-driven system KVĚTA.
Our results demonstrate that the bidirectional LSTM-CRF sequence tagging model, enhanced with syllable embeddings, significantly outperforms the existing KVĚTA system, with predictions achieving 99.61% syllable accuracy, 98.86% line accuracy, and 90.40% poem accuracy. The model also achieved competitive results with token embeddings. One of the most interesting findings is that the best results are obtained by inputting sequences representing whole poems instead of individual poem lines.
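The three reported figures can be read as accuracies at three levels of granularity. The following is a minimal sketch of how such metrics might be computed, not the authors' evaluation code. It assumes that a line counts as correct only if every syllable tag matches and a poem only if every line is correct; the tags "S" and "W" and the function name `accuracies` are hypothetical.

```python
from typing import List, Tuple

# Assumed data layout: poem -> lines -> one tag per syllable.
Poem = List[List[str]]


def accuracies(gold: List[Poem], pred: List[Poem]) -> Tuple[float, float, float]:
    """Return (syllable, line, poem) accuracy for equally shaped inputs."""
    syl_ok = syl_total = line_ok = line_total = poem_ok = 0
    for g_poem, p_poem in zip(gold, pred):
        poem_correct = True
        for g_line, p_line in zip(g_poem, p_poem):
            matches = sum(g == p for g, p in zip(g_line, p_line))
            syl_ok += matches
            syl_total += len(g_line)
            # Assumption: a line is correct only if all its syllables are.
            line_correct = matches == len(g_line)
            line_ok += line_correct
            line_total += 1
            poem_correct = poem_correct and line_correct
        # Assumption: a poem is correct only if all its lines are.
        poem_ok += poem_correct
    return syl_ok / syl_total, line_ok / line_total, poem_ok / len(gold)


# Toy data: two poems, one wrong syllable in the second line of the first poem.
gold = [[["S", "W", "S", "W"], ["S", "W", "S", "W"]], [["W", "S", "W", "S"]]]
pred = [[["S", "W", "S", "W"], ["S", "W", "W", "W"]], [["W", "S", "W", "S"]]]
syllable_acc, line_acc, poem_acc = accuracies(gold, pred)
```

Under these assumptions a single wrong syllable invalidates its whole line and its whole poem, which would explain why poem accuracy is the lowest of the three figures.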
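Motivické a tematické klastry v básnických textech české poezie 19. a počátku 20. století [Motivic and Thematic Clusters in Poetic Texts of Czech Poetry of the 19th and Early 20th Centuries]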
Authors
Kořínková, L.; Nováková, T.; Kosák, M.; Flaišman, J.; Klouda, K.
Year
2024
Published in
Česká literatura. 2024, 72(2), 204-217. ISSN 0009-0468.
Type
Article
Department
Abstract
The aim of this article is to analyse the possibilities and, above all, the results of machine processing of motivic and thematic clusters in 19th-century poetry, carried out on the corpus of poems in the full-text database of the Česká elektronická knihovna (Czech Electronic Library).
The number of primitive words of unbounded exponent in the language of an HD0L-system is finite
Authors
Year
2024
Published in
Journal of Combinatorial Theory, Series A. 2024, 206 ISSN 0097-3165.
Type
Article
Department
Abstract
Let H be an HD0L-system. We show that there are only finitely many primitive words v with the property that v^k, for all integers k, is an element of the factorial language of H. In particular, this result applies to the set of all factors of a morphic word. We provide a formalized proof in the proof assistant Isabelle/HOL as part of the Combinatorics on Words Formalized project.
Characterization of circular D0L-systems
Authors
Year
2019
Published in
Theoretical Computer Science. 2019, 790, 131-137. ISSN 0304-3975.
Type
Article
Department
Abstract
We give a characterization of circularity of a D0L-system. The characterizing condition is simple to verify and yields an efficient algorithm. To derive it, we prove that every non-circular D0L-system contains arbitrarily long repetitions. This result was already published in 1993 by Mignosi and Séébold; however, their proof is only a sketch. We give a complete proof that, in addition, is valid for a slightly relaxed definition of circularity, called weak circularity.
Fixed Points of Sturmian Morphisms and Their Derivated Words
Authors
Klouda, K.; Medková, K.; Pelantová, E.; Starosta, Š.
Year
2018
Published in
Theoretical Computer Science. 2018, 743, 23-37. ISSN 0304-3975.
Type
Article
Department
Abstract
Any infinite uniformly recurrent word u can be written as a concatenation of a finite number of return words to a chosen prefix w of u. The ordering of the return words to w in this concatenation is coded by the derivated word d_u(w). In 1998, Durand proved that a fixed point u of a primitive morphism has only finitely many derivated words d_u(w) and that each derivated word d_u(w) is fixed by a primitive morphism as well. In our article we focus on Sturmian words fixed by a primitive morphism. We provide an algorithm which, for a given Sturmian morphism ψ, lists the morphisms fixing the derivated words of the Sturmian word u = ψ(u). We provide a sharp upper bound on the length of the list.
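The notions of return word and derivated word can be illustrated on a finite prefix. The sketch below is not the algorithm from the article; it only cuts a prefix of the Fibonacci word (the fixed point of 0 → 01, 1 → 0, chosen here as an assumed example of a Sturmian word) at the occurrences of a prefix w and codes the resulting return words by order of first appearance. The function names are hypothetical.

```python
def fixed_point_prefix(morphism, start, length):
    """Iterate a morphism prolongable on `start` and return a prefix of its fixed point."""
    word = start
    while len(word) < length:
        word = "".join(morphism[c] for c in word)
    return word[:length]


def derivated_word(u, w):
    """Return (return words to w, their coding) for a finite prefix u.

    Return words are the factors between consecutive occurrences of w;
    each is coded by the order in which it first appears.
    """
    occurrences = [i for i in range(len(u) - len(w) + 1) if u.startswith(w, i)]
    return_words = []
    codes = []
    for a, b in zip(occurrences, occurrences[1:]):
        r = u[a:b]
        if r not in return_words:
            return_words.append(r)
        codes.append(return_words.index(r))
    return return_words, codes


fib = fixed_point_prefix({"0": "01", "1": "0"}, "0", 200)
rw, d = derivated_word(fib, "0")
# The return words to the prefix "0" are "01" and "0", and the coded
# sequence reproduces the Fibonacci word itself.
```

For this particular word and prefix the derivated word coincides with the original word, a small instance of the fact that derivated words of such fixed points are again fixed by a morphism.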
Synchronizing delay for binary uniform morphisms
Authors
Klouda, K.; Medková, K.
Year
2016
Published in
Theoretical Computer Science. 2016, 615, 12-22. ISSN 0304-3975.
Type
Article
Department
Abstract
Circular D0L-systems are those with finite synchronizing delay. We introduce a tool called the graph of overhangs, which can be used to find the minimal value of the synchronizing delay of a given D0L-system. By studying graphs of overhangs, we give a general upper bound on the minimal value of the synchronizing delay of a circular D0L-system with a binary uniform morphism.
An algorithm for enumerating all infinite repetitions in a D0L-system
Authors
Year
2015
Published in
Journal of Discrete Algorithms. 2015, 33, 130-138. ISSN 1570-8667.
Type
Article
Department
Abstract
We describe a simple algorithm that finds all primitive words v such that v^k is a factor of the language of a given D0L-system for all k. It follows that the number of such words is finite. This polynomial-time algorithm can also be used to decide whether a D0L-system is repetitive.
An Exact Polynomial Time Algorithm for Computing the Least Trimmed Squares Estimate
Authors
Year
2015
Published in
Computational Statistics and Data Analysis. 2015, 84, 27-40. ISSN 0167-9473.
Type
Article
Department
Abstract
An exact algorithm for computing the estimates of regression coefficients given by the least trimmed squares method is presented. The algorithm works under very weak assumptions and has polynomial complexity. Simulations show that in the case of two or three explanatory variables, the presented algorithm is often faster than the exact algorithms based on a branch-and-bound strategy whose complexity is not known. The idea behind the algorithm is based on a theoretical analysis of the respective objective function, which is also given.
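To make the objective concrete: the least trimmed squares estimate minimizes the sum of the h smallest squared residuals. The sketch below is a naive exhaustive baseline, not the polynomial-time algorithm presented in the article. It fits ordinary least squares on every h-subset and keeps the best fit, which is exact but exponential in general. It assumes a single explanatory variable with distinct x values; the function names and the toy data are hypothetical.

```python
from itertools import combinations


def ols_line(points):
    """Ordinary least squares fit y = intercept + slope * x; returns (intercept, slope, rss)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)  # assumed non-zero
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in points)
    return intercept, slope, rss


def lts_brute_force(points, h):
    """Exact LTS by exhaustive search over all h-subsets (exponential cost)."""
    best = None
    for subset in combinations(points, h):
        fit = ols_line(subset)
        if best is None or fit[2] < best[2]:
            best = fit
    return best


# Five points on y = 2x + 1 plus two gross outliers (made-up data).
data = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9), (5, 40), (6, -20)]
intercept, slope, trimmed_rss = lts_brute_force(data, h=5)
```

On this toy data the search discards both outliers and recovers the line through the five clean points. The number of subsets grows combinatorially with the sample size, which is what motivates exact algorithms with better complexity.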
Factor and Palindromic Complexity of Thue-Morse's Avatars
Authors
Klouda, K.; Frougny, Ch.
Year
2013
Published in
Acta Polytechnica. 2013, 53(6), 868-871. ISSN 1210-2709.
Type
Article
Department
Abstract
Two infinite words that are connected with some significant univoque numbers are studied. It is shown that their factor and palindromic complexities almost coincide with the factor and palindromic complexities of the famous Thue-Morse word. Keywords: factor complexity, palindromic complexity, univoque numbers, Thue-Morse word.
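As a point of reference, the two complexity functions can be estimated empirically for the Thue-Morse word itself (not for the two words studied in the article). The sketch below counts distinct factors and distinct palindromic factors of a given length in a finite prefix. It assumes the prefix is long enough to contain every factor of the lengths considered; the function names are hypothetical.

```python
def thue_morse_prefix(n):
    """First n letters of the Thue-Morse word: parity of the number of 1 bits of the index."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))


def factors(word, n):
    """Set of distinct factors (contiguous subwords) of length n occurring in word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def factor_complexity(word, n):
    """Number of distinct factors of length n in the given prefix."""
    return len(factors(word, n))


def palindromic_complexity(word, n):
    """Number of distinct palindromic factors of length n in the given prefix."""
    return sum(1 for f in factors(word, n) if f == f[::-1])


t = thue_morse_prefix(4096)
factor_counts = [factor_complexity(t, n) for n in range(1, 6)]
palindrome_counts = [palindromic_complexity(t, n) for n in range(1, 5)]
```

Counting on a prefix only gives a lower bound on the true complexity in general; for short lengths and a prefix of this size the counts agree with the known values for the Thue-Morse word.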
Bispecial factors in circular non-pushy D0L languages
Authors
Year
2012
Published in
Theoretical Computer Science. 2012, 445, 63-74. ISSN 0304-3975.
Type
Article
Department
Abstract
We study bispecial factors in fixed points of morphisms. In particular, we propose a simple method of finding all bispecial words of non-pushy circular D0L-systems. This method can be formulated as an algorithm. Moreover, we prove that non-pushy circular D0L-systems are exactly those with finite critical exponents.
Rational base number systems for p-adic numbers
Authors
Frougny, Ch.; Klouda, K.
Year
2012
Published in
RAIRO - Theoretical Informatics and Applications. 2012, 46(01), 87-106. ISSN 0988-3754.
Type
Article
Department
Abstract
This paper deals with rational base number systems for p-adic numbers. We mainly focus on the system proposed by Akiyama et al. in 2008, but we also show that this system is in some sense isomorphic to some other rational base number systems by means of finite transducers. We identify the numbers with finite and eventually periodic representations and we also determine the number of representations of a given p-adic number.
Critical Exponent of Infinite Words Coding Beta-integers Associated with Non-simple Parry Numbers
Authors
Balková, L.; Pelantová, E.; Klouda, K.
Year
2011
Published in
Integers: Electronic Journal of Combinatorial Number Theory. 2011, 11b, 1-25. ISSN 1553-1732.
Type
Article
Department
Abstract
In this paper, we study the critical exponent of infinite words u coding beta-integers for beta being a non-simple Parry number. In other words, we investigate the maximal consecutive repetitions of factors that occur in the infinite word in question. We also calculate the ultimate critical exponent, which expresses how long repetitions occur in the infinite word u when factors of length growing ad infinitum are considered. The basic ingredients of our method are the description of all bispecial factors of u and the notion of return words. This method can be applied to any fixed point of any primitive substitution.