Analysis of Statistical Distribution Changes of Input Features in Network Traffic Classification Domain
Authors
Jančička, L.; Koumar, J.; Soukup, D.; Čejka, T.
Year
2024
Published
NOMS 2024-2024 IEEE Network Operations and Management Symposium. Seoul: IEEE CLEO/Pacific Rim, 2024. ISSN 2374-9709. ISBN 979-8-3503-2793-9.
Type
Proceedings paper
Departments
Annotation
This study investigates the evolving landscape of network traffic monitoring, which is crucial for maintaining computer network services and security. Traditional methods like Deep Packet Inspection (DPI) face challenges due to increased privacy protection through encryption, prompting a shift towards statistical-based detection using Machine Learning (ML). However, ML struggles with long-term evaluation due to various distribution changes. This study focuses on the CESNET-TLS-Year22 dataset, derived from one year of TLS network traffic on the CESNET2 backbone. The described research explores the behavior of modern protocols in real-world scenarios and their impact on dataset quality. The main result of our analysis is the identification of the Weekend phenomenon in network traffic classification, which is generally overlooked during ML model training.
Analysis of Statistical Distribution Changes of Input Features in Network Traffic Classification Domain
Authors
Jančička, L.; Koumar, J.; Soukup, D.; Čejka, T.
Year
2024
Published
Proceedings of the 12th Prague Embedded Systems Workshop. Praha: CTU. Faculty of Information Technology, 2024. ISBN 978-80-01-07303-2.
Type
Proceedings paper
Departments
Annotation
This study investigates the evolving landscape of network traffic monitoring, which is crucial for maintaining computer network services and security. Traditional methods like Deep Packet Inspection (DPI) face challenges due to increased privacy protection through encryption, prompting a shift towards statistical-based detection using Machine Learning (ML). However, ML struggles with long-term evaluation due to various distribution changes. This study focuses on the CESNET-TLS-Year22 dataset, derived from one year of TLS network traffic on the CESNET2 backbone. The described research explores the behavior of modern protocols in real-world scenarios and their impact on dataset quality. The main result of our analysis is the identification of the Weekend phenomenon in network traffic classification, which is generally overlooked during ML model training.
MFWDD: Model-based Feature Weight Drift Detection Showcased on TLS and QUIC Traffic
Authors
Year
2024
Published
2024 20th International Conference on Network and Service Management (CNSM). New York: IEEE, 2024. ISSN 2165-963X. ISBN 978-3-903176-66-9.
Type
Proceedings paper
Departments
Annotation
Machine learning (ML) represents an efficient and popular approach for network traffic classification. However, network traffic inspection is a challenging domain, and trained models may degrade soon after deployment. Besides biases present during data capture and model creation, data drifts contribute significantly to ML model degradation. This paper proposes a novel method called Model-based Feature Weight Drift Detection (MFWDD) for concept drift detection. It is part of a public software framework for dataset drift analysis tailored to the domain of network traffic. This work addresses TLS and QUIC service classification problems, presents a variety of experiments analyzing the evolution of the respective distributions, and observes their degradation over time on different ML features. The MFWDD framework guided the retraining of TLS and QUIC service classification models over an extended period; it not only prevented model degradation but also improved the models' performance and consistency over time.
NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification
Authors
Year
2024
Published
Computer Networks. 2024, 240, 1-22. ISSN 1389-1286.
Type
Article
Departments
Annotation
Network traffic monitoring based on IP Flows is a standard monitoring approach that can be deployed to various network infrastructures, even the large ISP networks connecting millions of people. Since flow records traditionally contain only limited information (addresses, transport ports, and amount of exchanged data), they are also commonly extended by additional features that enable network traffic analysis with high accuracy. These flow extensions are, however, often too large or hard to compute, which then allows only offline analysis or limits their deployment only to smaller-sized networks. This paper proposes a novel extended IP flow called NetTiSA (Network Time Series Analysed) flow, based on analysing the time series of packet sizes. By thoroughly testing 25 different network traffic classification tasks, we show the broad applicability and high usability of NetTiSA flow. For practical deployment, we also consider the sizes of flows extended by NetTiSA features and evaluate the performance impacts of their computation in the flow exporter. The novel features proved to be computationally inexpensive and showed excellent discriminatory performance. The trained machine learning classifiers with proposed features mostly outperformed the state-of-the-art methods. NetTiSA finally bridges the gap and brings universal, small-sized, and computationally inexpensive features for traffic classification that can be scaled up to extensive monitoring infrastructures, bringing the machine learning traffic classification even to 100 Gbps backbone lines.
Augmenting Monitoring Infrastructure For Dynamic Software-Defined Networks
Authors
Year
2023
Published
2023 8th International Conference on Smart and Sustainable Technologies (SpliTech). New Jersey: IEEE, 2023. ISBN 978-953-290-128-3.
Type
Proceedings paper
Annotation
Software-Defined Networking (SDN) and virtual environments raise new challenges for network monitoring tools. The dynamic and flexible nature of these network technologies requires adapting the monitoring infrastructure to overcome challenges in the analysis and interpretability of the monitored network traffic. This paper describes a concept of automatic on-demand deployment of monitoring probes and correlation of network data with infrastructure state and configuration in time. Such an approach to monitoring SDN virtual networks is usable in several use cases, such as IoT networks and anomaly detection. It increases visibility into complex and dynamic networks. Additionally, it can help with the creation of well-annotated datasets that are essential for any further research.
Enhancing DeCrypto: Finding Cryptocurrency Miners Based on Periodic Behavior
Authors
Year
2023
Published
2023 19th International Conference on Network and Service Management (CNSM). New York: IEEE, 2023. International Conference on Network and Service Management. vol. 19. ISSN 2165-9605. ISBN 978-3-903176-59-1.
Type
Proceedings paper
Annotation
While the popularity of cryptocurrencies and the value of the whole industry are rising, the number of threat actors who use illegal “coin miner malware” is increasing as well. These threat actors commonly abuse the computational resources of companies, research and educational institutions, or end users. In this paper, we analyzed the long-term periodic behavior of cryptocurrency miners communicating in computer networks. We propose a novel method for cryptominer detection using specially designed periodicity features. The detection algorithm is based on the mathematical detection of periodic flow time series (FTS) and feature mining. Combined with a machine learning technique, the resulting system achieves high precision. Furthermore, our approach enhances DeCrypto, a flow-based cryptominer detection system, to further improve its reliability and feasibility for high-speed networks.
Network Traffic Classification Based on Single Flow Time Series Analysis
Authors
Year
2023
Published
2023 19th International Conference on Network and Service Management (CNSM). New York: IEEE, 2023. International Conference on Network and Service Management. vol. 19. ISSN 2165-9605. ISBN 978-3-903176-59-1.
Type
Proceedings paper
Departments
Annotation
Network traffic monitoring using IP flows is used to handle the current challenge of analyzing encrypted network communication. Nevertheless, aggregating packets into flow records naturally causes information loss. Therefore, this paper proposes a novel flow extension with traffic features based on time series analysis of the Single Flow Time Series, i.e., a time series created from the number of bytes in each packet and its timestamp. We propose 69 universal features based on the statistical analysis of data points, time domain analysis, packet distribution within the flow timespan, time series behavior, and frequency domain analysis. We have demonstrated the usability and universality of the proposed feature vector for various network traffic classification tasks using 15 well-known publicly available datasets. Our evaluation shows that the novel feature vector achieves classification performance similar to or better than related works on both binary and multiclass classification tasks. In more than half of the evaluated tasks, the classification performance increased by up to 5 %.
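As a rough illustration of the Single Flow Time Series idea described in this annotation, the following minimal Python sketch builds a series of (timestamp, packet size) pairs for one flow and derives a handful of statistical and time-domain values from it. The function name, the feature names, and the sample packets are hypothetical and chosen only for illustration; they are not the 69 features proposed in the paper.

```python
import statistics


def single_flow_features(packets):
    """Compute a few illustrative features from one flow.

    packets: list of (timestamp_seconds, size_bytes) pairs sorted by time.
    The feature names below are hypothetical, not the paper's feature set.
    """
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    # Inter-packet gaps describe how packets are spread over the flow timespan.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    duration = times[-1] - times[0]
    return {
        "packets": len(sizes),
        "mean_size": statistics.fmean(sizes),
        "std_size": statistics.pstdev(sizes),
        "mean_gap": statistics.fmean(gaps) if gaps else 0.0,
        "max_gap": max(gaps) if gaps else 0.0,
        "bytes_per_second": sum(sizes) / duration if duration > 0 else 0.0,
    }


# Hypothetical flow: five packets observed within one second.
flow = [(0.00, 517), (0.02, 1460), (0.03, 1460), (0.50, 120), (1.00, 80)]
features = single_flow_features(flow)
print(features)
```

Each value depends only on per-packet sizes and timestamps, so nothing in the sketch requires access to packet payloads.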
Unevenly Spaced Time Series from Network Traffic
Authors
Year
2023
Published
Proceedings of the 7th Network Traffic Measurement and Analysis Conference. Piscataway: IEEE, 2023. ISBN 978-3-903176-58-4.
Type
Proceedings paper
Annotation
Reliable detection of security events is essential for network security. Therefore, a suitable traffic representation and model are required. Contrary to the currently used approaches, this paper presents Unevenly Spaced Time Series (USTS) as a feasible representation of network traffic with several significant benefits for analysis. The article covers several types of USTS. A dataset captured on a real ISP network was created to evaluate the properties of USTS. The dataset contains over 35 million time series. We experimentally showed that USTS is suitable for network traffic analysis and allows automatic processing, e.g., to classify network traffic.
Network traffic classification based on periodic behavior detection
Authors
Year
2022
Published
Proceedings of 2022 18th International Conference on Network and Service Management (CNSM). New York: IEEE, 2022. p. 359-363. ISSN 2165-9605. ISBN 978-3-903176-51-5.
Type
Proceedings paper
Annotation
Even though encryption hides the content of communication from network monitoring and security systems, this paper shows a feasible way to retrieve useful information about the observed traffic. The paper deals with periodic behavioral patterns of communication, which can be detected in time series created from network traffic using the autocorrelation function and the Lomb-Scargle periodogram. The revealed characteristics of the periodic behavior can be further exploited to recognize particular applications. We experimented with a created dataset of 61 classes and trained a machine learning classifier based on XGBoost, which performed the best in our experiments, reaching a 90% F1-score.
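To make the periodicity detection mentioned in this annotation more concrete, here is a minimal Python sketch of the classical Lomb-Scargle periodogram applied to an unevenly sampled series. It is an illustration under stated assumptions, not the authors' implementation: the function names, the candidate period grid, and the synthetic data with a hidden 60-second cycle are all hypothetical.

```python
import math
import random


def lomb_scargle_power(times, values, period):
    """Classical Lomb-Scargle power for one candidate period (same unit as times)."""
    omega = 2.0 * math.pi / period
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    # The offset tau makes the result independent of where the time axis starts.
    tau = math.atan2(
        sum(math.sin(2.0 * omega * t) for t in times),
        sum(math.cos(2.0 * omega * t) for t in times),
    ) / (2.0 * omega)
    cos_t = [math.cos(omega * (t - tau)) for t in times]
    sin_t = [math.sin(omega * (t - tau)) for t in times]
    yc = sum(y * c for y, c in zip(centered, cos_t))
    ys = sum(y * s for y, s in zip(centered, sin_t))
    cc = sum(c * c for c in cos_t)
    ss = sum(s * s for s in sin_t)
    power = 0.0
    if cc > 0.0:
        power += yc * yc / cc
    if ss > 0.0:
        power += ys * ys / ss
    return 0.5 * power


def dominant_period(times, values, candidate_periods):
    """Return the candidate period with the highest periodogram power."""
    return max(candidate_periods, key=lambda p: lomb_scargle_power(times, values, p))


# Hypothetical data: byte counts observed at irregular timestamps over one hour,
# with a hidden 60-second cycle plus Gaussian noise.
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 3600.0) for _ in range(200))
values = [
    1000.0 + 300.0 * math.sin(2.0 * math.pi * t / 60.0) + rng.gauss(0.0, 50.0)
    for t in times
]

best_period = dominant_period(times, values, range(10, 121))
print(best_period)
```

Unlike the autocorrelation function, this periodogram needs no resampling onto a regular time grid, which is why it suits irregularly timed traffic observations. The strongest period and its power could then serve as inputs to a classifier.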