"You have to read 50 different RFCs that contradict each other": An Interview Study on the Experiences of Implementing Cryptographic Standards
Authors
Huaman, N.; Suray, J.; Klemmer, J.; Fourné, M.; Amft, S.; Trummová, I.; Acar, Y.; Fahl, S.
Year
2024
Published in
33rd USENIX Security Symposium. The USENIX Association, 2024. p. 7249-7266. ISBN 978-1-939133-44-1.
Type
Proceedings paper
Department
Abstract
Implementing cryptographic standards is a critical process for the cryptographic ecosystem. Cryptographic standards aim to support developers and engineers in implementing cryptographic primitives and protocols. However, past security incidents suggest that implementing cryptographic standards can be challenging and might jeopardize software and hardware security. We need to understand and mitigate the pain points of those implementing cryptographic standards to support them better.
To shed light on the challenges and obstacles of implementing cryptographic standards, we conducted 20 semi-structured interviews with experienced cryptographers and cryptographic software engineers. We identify common practices when implementing standards, including the criticality of reference and third-party implementations, test vectors to verify implementations, and the open standard community as central support for questions and reviews of implementations.
Based on our findings, we recommend transparent standardization processes, strong (ideally formal) verification, improved support for comparing implementations, and covering updates and error handling in the standardization process.
A comparison of adversarial malware generators
Authors
Louthánová, P.; Kozák, M.; Jureček, M.; Stamp, M.; Di Troia, F.
Year
2024
Published in
Journal of Computer Virology and Hacking Techniques. 2024, 20(4), 623-639. ISSN 2263-8733.
Type
Article
Department
Abstract
Machine learning has proven to be a valuable tool for automated malware detection, but machine learning systems have also been shown to be subject to adversarial attacks. This paper summarizes and compares related work on generating adversarial malware samples, specifically malicious Windows Portable Executable files. In contrast with previous research, we not only compare generators of adversarial malware examples theoretically, but we also provide an experimental comparison and evaluation for practical usability. We use gradient-based, evolutionary-based, and reinforcement-based approaches to create adversarial samples, which we test against selected antivirus products. The results show that applying optimized modifications to previously detected malware can lead to incorrect classification of the file as benign. Moreover, generated malicious samples can be effectively employed against detection models other than those used to produce them, and combinations of methods can construct new instances that avoid detection. Based on our findings, the Gym-malware generator, which uses reinforcement learning, has the greatest practical potential. This generator has the fastest average sample production time of 5.73 s and the highest average evasion rate of 44.11%. Using the Gym-malware generator in combination with itself further improved the evasion rate to 58.35%. However, other tested methods scored significantly lower in our experiments than reported in the original publications, highlighting the importance of a standardized evaluation environment.
A Comparison of Logic Extraction Methods in Hardware-Translated Neural Networks
Authors
Year
2024
Published in
Proceedings of the 27th International Symposium on Design and Diagnostics of Electronic Circuits & Systems. Piscataway: IEEE, 2024. p. 86-91. ISBN 979-8-3503-5934-3.
Type
Proceedings paper
Department
Abstract
Small quantized neural networks with strong requirements on throughput and latency can be translated into combinational logic circuits and synthesized by logic design tools. To capture the function of the network (or a part of it) as a logic function, two approaches have been taken. The first one observes the inputs and outputs while the network predicts a training set, and uses them directly as the specification. The response to activation values that have not occurred in the training set remains unspecified. The other approach uses a complete set of activation values at the input of the examined part. Our study aims to quantify the inaccuracy of the first method, the influence of the logic minimization used on accuracy, and the impact on the final synthesized circuit. We also document the quantitative changes in quantized networks.
A numerical range approach to Birkhoff–James orthogonality with applications
Authors
Martín, M.; Merí, J.; Quero, A.; Roy, S.; Sain, D.
Year
2024
Published in
Banach Journal of Mathematical Analysis. 2024, 18(2), ISSN 2662-2033.
Type
Article
Department
Abstract
The main aim of this paper is to provide characterizations of Birkhoff–James orthogonality (BJ-orthogonality in short) in a number of families of Banach spaces in terms of the elements of significant subsets of the unit ball of their dual spaces, which makes the characterizations more applicable. The tool to do so is a fine study of the abstract numerical range and its relation with the BJ-orthogonality. Among other results, we provide a characterization of BJ-orthogonality for spaces of vector-valued bounded functions in terms of the domain set and the dual of the target space, which is applied to get results for spaces of vector-valued continuous functions, uniform algebras, Lipschitz maps, injective tensor products, bounded linear operators with respect to the operator norm and to the numerical radius, multilinear maps, and polynomials. Next, we study possible extensions of the well-known Bhatia–Šemrl theorem on BJ-orthogonality of matrices, showing results in spaces of vector-valued continuous functions, compact linear operators on reflexive spaces, and finite Blaschke products. Finally, we find applications of our results to the study of spear vectors and spear operators. We show that no smooth point of a Banach space can be BJ-orthogonal to a spear vector of Z. As a consequence, if X is a Banach space containing strongly exposed points and Y is a smooth Banach space with dimension at least two, then there are no spear operators from X to Y. Particularizing this result to the identity operator, we show that a smooth Banach space containing strongly exposed points has numerical index strictly smaller than one. These latter results partially solve some open problems.
Ab initio translationally invariant nucleon-nucleus optical potentials
Authors
Burrows, M.; Launey, K.D.; Mercenne, A.; Baker, R.B.; Sargsyan, G.H.; Dytrych, T.; Langr, D.
Year
2024
Published in
Physical Review C. 2024, 109, ISSN 2469-9985.
Type
Article
Department
Abstract
We combine the ab initio symmetry-adapted no-core shell model (SA-NCSM) with the single-particle Green's function approach to construct optical potentials rooted in first principles. Specifically, we show that total cross sections and phase shifts for neutron elastic scattering from a 4He target with projectile energies between 0.5 and 10 MeV closely reproduce the experiment. In addition, we discuss an important new development that resolves a long-standing issue with spurious center-of-mass motion in the Green's function formalism for many-body approaches. The new development opens a path for first-principle predictions of cross sections for elastic scattering of single-nucleon projectiles, nucleon capture, and deuteron breakup reactions, feasible for a broad range of open-shell spherical and deformed nuclei in the SA-NCSM approach.
Accuracy versus precision in boosted top tagging with the ATLAS detector
Authors
Aad, G.; Aakvaag, E.; Abbott, B.; Abdelhameed, S.; Ali, B.; Augsten, K.; Bergmann, B.; Day-Hall, H.; Fiedler, P.; Hubáček, Z.; Mondal, S.; Myška, M.; Novotný, L.; Petousis, V.; Pospíšil, S.; Smolek, K.; Sopczak, A.; Vacek, V.; Vokáč, P.; Zaplatílek, O.
Year
2024
Published in
Journal of Instrumentation. 2024, 19(8), ISSN 1748-0221.
Type
Article
Department
Abstract
The identification of top quark decays where the top quark has a large momentum transverse to the beam axis, known as top tagging, is a crucial component in many measurements of Standard Model processes and searches for beyond the Standard Model physics at the Large Hadron Collider. Machine learning techniques have improved the performance of top tagging algorithms, but the size of the systematic uncertainties for all proposed algorithms has not been systematically studied. This paper presents the performance of several machine learning based top tagging algorithms on a dataset constructed from simulated proton-proton collision events measured with the ATLAS detector at √s = 13 TeV. The systematic uncertainties associated with these algorithms are estimated through an approximate procedure that is not meant to be used in a physics analysis, but is appropriate for the level of precision required for this study. The most performant algorithms are found to have the largest uncertainties, motivating the development of methods to reduce these uncertainties without compromising performance. To enable such efforts in the wider scientific community, the datasets used in this paper are made publicly available.
Action Duration Generalization for Exact Multi-Agent Collective Construction
Authors
Rameš, M.; Surynek, P.
Year
2024
Published in
Proceedings of the 16th International Conference on Agents and Artificial Intelligence. Setúbal: Science and Technology Publications, Lda, 2024. p. 718-725. vol. 3. ISSN 2184-433X. ISBN 978-989-758-680-4.
Type
Proceedings paper
Department
Abstract
This paper addresses exact approaches to the multi-agent collective construction problem, which tasks a group of cooperative agents to build a given structure in a blocksworld under the gravity constraint. We propose a generalization of the existing exact model based on mixed integer linear programming by accommodating varying agent action durations. We refer to the model as a fraction-time model. The introduction of action durations enables one to create a more realistic model for various domains. It provides a significant reduction of plan execution duration at the cost of increased computational time, which rises steeply the closer the model gets to the exact real-world action duration. We also propose a makespan estimation function for the fraction-time model. This can be used to estimate the size of the construction time reduction for cost-benefit analysis. The fraction-time model and the makespan estimation function have been evaluated in a series of experiments using a set of benchmark structures. The results show a significant reduction of plan execution duration for non-constant duration actions due to decreasing synchronization overhead at the end of each action. According to the results, the makespan estimation function provides a reasonably accurate estimate of the makespan.
Adaptive Input Normalization for Quantized Neural Networks
Authors
Year
2024
Published in
Proceedings of the 27th International Symposium on Design and Diagnostics of Electronic Circuits & Systems. Piscataway: IEEE, 2024. p. 130-135. ISBN 979-8-3503-5934-3.
Type
Proceedings paper
Department
Abstract
Neural networks with quantized activation functions cannot adapt the quantization at the input of their first layer. Preprocessing is therefore required to adapt the range of input data to the quantization range. Such preprocessing usually includes an activation-wise linear transformation and is steered by the properties of the training set. We suggest including the linear transform in the training process. We document that it improves accuracy, requires the same resources as standard preprocessing, plays a role in network pruning, and is reasonably stable with respect to initialization.
Analysis of Statistical Distribution Changes of Input Features in Network Traffic Classification Domain
Authors
Jančička, L.; Koumar, J.; Soukup, D.; Čejka, T.
Year
2024
Published in
NOMS 2024-2024 IEEE Network Operations and Management Symposium. Seoul: IEEE CLEO/Pacific Rim, 2024. ISSN 2374-9709. ISBN 979-8-3503-2793-9.
Type
Proceedings paper
Department
Abstract
This study investigates the evolving landscape of network traffic monitoring, which is crucial for maintaining computer network services and security. Traditional methods like Deep Packet Inspection (DPI) face challenges due to increased privacy protection through encryption, prompting a shift towards statistics-based detection using Machine Learning (ML). On the other hand, ML struggles with long-term evaluation due to various distribution changes. This study focuses on the CESNET-TLS-Year22 dataset, derived from one year of TLS network traffic on the CESNET2 backbone. The described research explores the behavior of modern protocols in real-world scenarios and their impact on dataset quality. The main result of our analysis is the identification of the Weekend phenomenon in network traffic classification, which is generally overlooked during ML model training.
Analysis of Statistical Distribution Changes of Input Features in Network Traffic Classification Domain
Authors
Jančička, L.; Koumar, J.; Soukup, D.; Čejka, T.
Year
2024
Published in
Proceedings of the 12th Prague Embedded Systems Workshop. Praha: CTU. Faculty of Information Technology, 2024. ISBN 978-80-01-07303-2.
Type
Proceedings paper
Department
Abstract
This study investigates the evolving landscape of network traffic monitoring, which is crucial for maintaining computer network services and security. Traditional methods like Deep Packet Inspection (DPI) face challenges due to increased privacy protection through encryption, prompting a shift towards statistics-based detection using Machine Learning (ML). On the other hand, ML struggles with long-term evaluation due to various distribution changes. This study focuses on the CESNET-TLS-Year22 dataset, derived from one year of TLS network traffic on the CESNET2 backbone. The described research explores the behavior of modern protocols in real-world scenarios and their impact on dataset quality. The main result of our analysis is the identification of the Weekend phenomenon in network traffic classification, which is generally overlooked during ML model training.
Ancient Egyptian scribes and specific skeletal occupational risk markers (Abusir, Old Kingdom)
Authors
Brukner Havelková, P.; Dulíková, V.; Bejdová, Š.; Vacková, J.; Velemínský, P.; Bárta, M.
Year
2024
Published in
Scientific Reports. 2024, 14(1), 13317-1-13317-19. ISSN 2045-2322.
Type
Article
Department
Abstract
Men with writing proficiency enjoyed a privileged position in ancient Egyptian society in the third millennium BC. Research focusing on these officials of elevated social status ("scribes") usually concentrates on their titles, scribal statues, iconography, etc., but the individuals themselves, and their skeletal remains, have been neglected. The aim of this study is to reveal whether repetitive tasks and maintained postures related to scribal activity can manifest in skeletal changes and identify possible occupational risk factors. A total of 1767 items including entheseal changes, non-metric traits, and degenerative changes were recorded from the human remains of 69 adult males of well-defined social status categories from the necropolis at Abusir (2700-2180 BC). Statistically significant differences between the scribes and the reference group attested a higher incidence of changes in scribes and manifested themselves especially in the occurrence of osteoarthritis of the joints. Our research reveals that remaining in a cross-legged sitting or kneeling position for extended periods, and the repetitive tasks related to writing and the adjusting of the rush pens during scribal activity, caused the extreme overloading of the jaw, neck and shoulder regions.
Approximating subset sum ratio via partition computations
Authors
Alonistiotis, G.; Antonopoulos, A.; Melissinos, N.; Pagourtzis, A.; Petsalakis, S.; Vasilakis, M.
Year
2024
Published in
Acta Informatica. 2024, 61(2), 101-113. ISSN 1432-0525.
Type
Article
Department
Abstract
We present a new FPTAS for the SUBSET SUM RATIO problem, which, given a set of integers, asks for two disjoint subsets such that the ratio of their sums is as close to 1 as possible. Our scheme makes use of exact and approximate algorithms for PARTITION, and clearly showcases the close relationship between the two algorithmic problems. Depending on the relationship between the size of the input set n and the error margin ε, we improve upon the best currently known algorithm of Melissinos and Pagourtzis [COCOON 2018] of complexity O(n^4/ε). In particular, the exponent of n in our proposed scheme may decrease down to 2, depending on the PARTITION algorithm used.
Automatic Miscalibration Diagnosis: Interpreting Probability Integral Transform (PIT) Histograms
Authors
Podsztavek, O.; Jordan, A.I.; Tvrdík, P.; Polsterer, K.L.
Year
2024
Published in
ESANN 2024 proceedings. Louvain la Neuve: Ciaco - i6doc.com, 2024. p. 137-142. ISBN 978-2-87587-090-2.
Type
Proceedings paper
Department
Abstract
Quantifying the predictive uncertainty of a model is essential for risk assessment. We address the proper calibration of the predictive uncertainty in regression tasks by employing the probability integral transform (PIT) histogram to diagnose miscalibration. PIT histograms are often difficult to interpret, and therefore we present an approach to an automatic interpretation of PIT histograms based on an interpreter trained with a synthetic data set. Given a PIT histogram of a model and a data set, the interpreter can estimate the data-generating distribution of the data set with the main purpose of identifying the cause of miscalibration.
Automatic Poetic Metre Detection for Czech Verse
Authors
Klesnilová, K.; Klouda, K.; Friedjungová, M.; Plecháč, P.
Year
2024
Published in
Studia Metrica et Poetica. 2024, 11(1), 44-61. ISSN 2346-6901.
Type
Article
Department
Abstract
Metrical analysis of verse is an essential and challenging task in research on versification; it consists of analysing a poem and deciding which metre it is written in. Thanks to existing corpora, we can take advantage of data-driven approaches, which can be better suited to the specific versification problems at hand than rule-based systems.
This work analyses the Czech accentual-syllabic verse and automatic metre assignment using the vast and annotated Corpus of Czech Verse. We define the problem as a sequence tagging task and approach it using a machine learning model and many different input data configurations. In comparison to this approach, we reimplement the existing data-driven system KVĚTA.
Our results demonstrate that the bidirectional LSTM-CRF sequence tagging model, enhanced with syllable embeddings, significantly outperforms the existing KVĚTA system, with predictions achieving 99.61% syllable accuracy, 98.86% line accuracy, and 90.40% poem accuracy. The model also achieved competitive results with token embeddings. One of the most interesting findings is that the best results are obtained by inputting sequences representing whole poems instead of individual poem lines.
Average-case complexity of a branch-and-bound algorithm for MIN DOMINATING SET
Authors
Denat, T.; Harutyunyan, A.; Melissinos, N.; Paschos, V. T.
Year
2024
Published in
Discrete Applied Mathematics. 2024, 345, 4-8. ISSN 0166-218X.
Type
Article
Department
Abstract
The average-case complexity of a branch-and-bound algorithm for the MIN DOMINATING SET problem in random graphs in the G(n, p) model is studied. We identify phase transitions between subexponential and exponential average-case complexities, depending on the growth of the probability p with respect to the number n of nodes.
Banach spaces with small weakly open subsets of the unit ball and massive sets of Daugavet and Δ-points
Authors
Cobollo, C.; Isert, D.; López-Pérez, G.; Martín, M.; Perreau, Y.; Quero de la Rosa, A.; Quilis, A.; Rodríguez-Vidanes, D.L.; Rueda Zoca, A.
Year
2024
Published in
Revista de la Real Academia de Ciencias Exactas, Físicas y Naturales. Serie A. Matemáticas. 2024, 118(3), ISSN 1578-7303.
Type
Article
Department
Abstract
We prove that there exists an equivalent norm |||·||| on L∞[0,1] with the following properties: (1) the unit ball of (L∞[0,1], |||·|||) contains non-empty relatively weakly open subsets of arbitrarily small diameter; (2) the set of Daugavet points of the unit ball of (L∞[0,1], |||·|||) is weakly dense; (3) the set of ccw Δ-points of the unit ball of (L∞[0,1], |||·|||) is norming. We also show that there are points of the unit ball of (L∞[0,1], |||·|||) which are not Δ-points, meaning that the space (L∞[0,1], |||·|||) fails the diametral local diameter 2 property. Finally, we observe that the space (L∞[0,1], |||·|||) provides both alternative and new examples that illustrate the differences between the various diametral notions for points of the unit ball of Banach spaces.
beeFormer: Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems
Authors
Vančura, V.; Kordík, P.; Straka, M.
Year
2024
Published in
RecSys '24: Proceedings of the 18th ACM Conference on Recommender Systems. New York: ACM, 2024. p. 1102-1107. ISBN 979-8-4007-0505-2.
Type
Proceedings paper
Department
Abstract
Recommender systems often use text-side information to improve their predictions, especially in cold-start or zero-shot recommendation scenarios, where traditional collaborative filtering approaches cannot be used. Many approaches to text-mining side information for recommender systems have been proposed over recent years, with sentence Transformers being the most prominent one. However, these models are trained to predict semantic similarity without utilizing interaction data with hidden patterns specific to recommender systems. In this paper, we propose beeFormer, a framework for training sentence Transformer models with interaction data. We demonstrate that our models trained with beeFormer can transfer knowledge between datasets while outperforming not only semantic similarity sentence Transformers but also traditional collaborative filtering methods. We also show that training on multiple datasets from different domains accumulates knowledge in a single model, unlocking the possibility of training universal, domain-agnostic sentence Transformer models to mine text representations for recommender systems. We release the source code, trained models, and additional details allowing replication of our experiments at https://github.com/recombee/beeformer.
Calibration of a soft secondary vertex tagger using proton-proton collisions at √s = 13 TeV with the ATLAS detector
Authors
Filmer, E.K.; Grant, C.M.; Green, M.J.; Jackson, P.; Ali, B.; Augsten, K.; Bergmann, B.; Day-Hall, H.; Fiedler, P.; Hubáček, Z.; Mondal, S.; Myška, M.; Novotný, L.; Petousis, V.; Pospíšil, S.; Smolek, K.; Sopczak, A.; Vacek, V.; Vokáč, P.; Zaplatílek, O.
Year
2024
Published in
Physical Review D. 2024, 110(3), ISSN 2470-0010.
Type
Article
Department
Abstract
Several processes studied by the ATLAS experiment at the Large Hadron Collider produce low-momentum b-flavored hadrons in the final state. This paper describes the calibration of a dedicated tagging algorithm that identifies b-flavored hadrons outside of hadronic jets by reconstructing the soft secondary vertices originating from their decays. The calibration is based on a proton-proton collision dataset at a center-of-mass energy of 13 TeV corresponding to an integrated luminosity of Formula Presented. Scale factors used to correct the algorithm’s performance in simulated events are extracted for the b-tagging efficiency and the mistag rate of the algorithm using a data sample enriched in Formula Presented events. Several orthogonal measurement regions are defined, binned as a function of the multiplicities of soft secondary vertices and jets containing a b-flavored hadron in the event. The mistag rate scale factors are estimated separately for events with low and high average numbers of interactions per bunch crossing. The results, which are derived from events with low missing transverse momentum, are successfully validated in a phase space characterized by high missing transverse momentum and therefore are applicable to new physics searches carried out in either phase space regime.
CESNET-TLS-Year22: A year-spanning TLS network traffic dataset from backbone lines
Type
Article
Department
Abstract
The modern approach for network traffic classification (TC), which is an important part of operating and securing networks, is to use machine learning (ML) models that are able to learn intricate relationships between traffic characteristics and communicating applications. A crucial prerequisite is having representative datasets. However, datasets collected from real production networks are not being published in sufficient numbers. Thus, this paper presents a novel dataset, CESNET-TLS-Year22, that captures the evolution of TLS traffic in an ISP network over a year. The dataset contains 180 web service labels and standard TC features, such as packet sequences. The unique year-long time span enables comprehensive evaluation of TC models and assessment of their robustness in the face of the ever-changing environment of production networks.
Classification and online clustering of zero-day malware
Authors
Year
2024
Published in
Journal of Computer Virology and Hacking Techniques. 2024, 20(4), 579-592. ISSN 2263-8733.
Type
Article
Department
Abstract
A large amount of new malware is constantly being generated, which must not only be distinguished from benign samples, but also classified into malware families. To this end, it is necessary to investigate how existing malware families develop and to examine emerging families. This paper focuses on the online processing of incoming malicious samples to assign them to existing families or, in the case of samples from new families, to cluster them. We experimented with seven prevalent malware families from the EMBER dataset, four in the training set and three additional new families in the test set. The features were extracted by static analysis of portable executable files for the Windows operating system. Based on the classification score of the multilayer perceptron, we determined which samples would be classified and which would be clustered into new malware families. We classified 97.21% of streaming data with a balanced accuracy of 95.33%. Then, we clustered the remaining data using a self-organizing map, achieving a purity from 47.61% for four clusters to 77.68% for ten clusters. These results indicate that our approach has the potential to be applied to the classification and clustering of zero-day malware into malware families.