Monitoring and management of large-scale communication networks

Funded by EPSRC, InnovateUK KTP, Moogsoft Ltd

Key collaborator: Dr George Parisis (co-PI)

Graduate students: Mr Giles Winchester (PhD, 2021-), Dr Antoine Messager (PhD, 2015-2019)

The management of communication networks involves a number of challenges: 1) the data are high-dimensional, 2) the underlying systems that generate the data change continuously, and 3) the data generated by IT systems are redundant and extremely noisy (Messager et al., 2018). Because these three properties make traditional machine learning and data science techniques very difficult to apply, network operators instead develop ad hoc techniques. The aim of this project is to combine network science and machine learning approaches (including Reinforcement Learning) to improve the capacity of network management operators to respond to failures rapidly and efficiently. The three main axes of research will be: (a) the development of a dynamic functional connectivity inference framework, specifically adapted to the constraints of IT network management, that provides a greater understanding of zones of influence for both hardware and software quantities (for work to date, see Messager et al., 2019); (b) network controllability via manipulation of the higher-order structure of the network using local information only; and (c) the development of novel mechanisms to identify and predict network incidents and service outages in the presence of change in large-scale network deployments that may involve elastic, distributed and migratable services, distributed network controllers and virtual network functions.

Publications to date:

Messager, A., Georgiou, N., & Berthouze, L. (2019). A new method for the robust characterisation of pairwise statistical dependency between point processes. https://arxiv.org/abs/1904.04813v1

Messager, A., Parisis, G., Kiss, I. Z., Harper, R., Tee, P., & Berthouze, L. (2019). Inferring Functional Connectivity From Time-Series of Events in Large Scale Network Deployments. IEEE Transactions on Network and Service Management, 16(3), 857–870. https://doi.org/10.1109/TNSM.2019.2932896

Messager, A., Parisis, G., Harper, R., Tee, P., Kiss, I. Z., & Berthouze, L. (2018). Network events in a large commercial network: What can we learn? NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium, 1–6. https://doi.org/10.1109/NOMS.2018.8406289

Efficient and scalable computation of metrics for logical and physical IT infrastructure networks

Key collaborators: Dr George Parisis (co-PI) and Dr Shahriar Etemadi-Tajbakhsh (KTP fellow)

Partnership objective: To develop a mathematical framework, together with a computational implementation that can run on commodity hardware, for the efficient and scalable calculation of the metrics associated with the logical and physical graphs inherent in IT infrastructures.