Evaluating INT, JTI, and sFlow @ AmLight
Renata Frez, Dr. Julio Ibarra, and Jeronimo Bezerra participated in the Internet2 Technology Exchange (TechEx), a premier technical event in the global research and education community. The goal is to convene technologists, scientists, engineers, architects, analysts, students, and anyone passionate about technology, innovation, and challenging tasks. TechEx provides a forum for debate and discussion that enables timely conversations critical to ensuring that we, as a community of technology leaders committed to supporting research and collaboration, are aligned and moving forward together.
Renata Frez presented “Evaluating INT, JTI, and sFlow @ AmLight” at the Advanced Networking session.
Abstract: Like most research and education networks (RENs), AmLight has diverse science drivers and workflows. While some science drivers operate on best-effort, some require steady jitter and delay, and some have strict SLAs for packet loss and service outages. For international, long-haul, highly distributed RENs such as AmLight, the critical component for delivering the quality of service users require is an efficient network monitoring framework. Monitoring the network state and events must be done accurately and as close to real time as possible. Legacy technologies such as SNMP and flow sampling have long been the main tools network operators use to monitor the network state. These technologies have well-known limitations, such as the lack of real-time support.
In 2014, when AmLight became an SDN infrastructure with OpenFlow, OpenFlow counters were added to the existing SNMP/sFlow monitoring combo. However, OpenFlow counters suffered from the same inconsistencies as SNMP: each vendor had its own level of accuracy and minimum polling interval. Multiple vendors were evaluated regarding the quality of the counters provided via OpenFlow, and results were presented at the 2016 Traffic Monitoring and Analysis (TMA) workshop and the 2016 Internet2 Technology Exchange. In the end, AmLight learned that SNMP + sFlow + OpenFlow counters were not sufficient to achieve its monitoring goals.
Since 2018, AmLight has focused on monitoring the network by leveraging advancements in the field of programmable data planes, more specifically, In-band Network Telemetry (INT). As a solution for leveraging INT was not commercially available, AmLight built its own telemetry exporting and gathering framework in partnership with NoviFlow and Barefoot Networks. The development of this telemetry framework was the scope of work of the NSF AmLight-INT project. The academic community praised the outcomes of the AmLight-INT project, and the solution developed has been presented multiple times since 2018, including at the 2019 Internet2 Technology Exchange. In production at AmLight, INT gives real-time per-packet visibility with zero impact on the devices’ network functions, since all telemetry data is gathered and exported directly from the switch’s ASIC.
Although INT provides visibility of 100% of the packets flowing through programmable switches, its goals are focused on real-time granular network state monitoring. Leveraging INT for long-term reports and top talkers can lead to an expensive monitoring environment due to the amount of data generated. Moreover, AmLight is not entirely composed of programmable switches, as it leverages legacy Top of Rack (ToR) switches and Juniper MX routers. As a result, monitoring legacy devices is still mandatory for AmLight. To understand how close to real time its legacy devices can be accurately monitored, AmLight pushed the limits of its legacy monitoring tools. Juniper expanded its MX family with a network streaming telemetry solution called Juniper Telemetry Interface (JTI), which provides telemetry summaries every two seconds. JTI has been in use at AmLight since 2021. JTI, for instance, aims to replace SNMP with a closer-to-real-time solution to gather interface counters.
In 2022, AmLight performed multiple simulations and field evaluations to compare how each monitoring technology reported specific events, from microbursts to DDoS attacks. AmLight also experienced various events (microbursts and DDoS attacks) that helped shape the solution to its needs. AmLight compared INT, SNMP, sFlow, and JTI by creating paths that included all of these technologies, and used highly precise traffic generators to create network events. Results, methodology, and lessons learned will be presented to the audience, as well as how AmLight combined its monitoring technologies to guarantee science drivers’ expectations are met.
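To illustrate why the reporting interval matters for events such as microbursts, here is a minimal Python sketch. It is not AmLight's methodology or data. The traffic trace, the 50 ms burst, the 30-second SNMP-style polling interval, and the 1 ms window used as a stand-in for INT's per-packet visibility are all assumptions made for illustration. Only the two-second interval comes from the JTI description above.

```python
def peak_utilization(per_ms_percent, window_ms):
    """Return the highest average utilization (in percent) that a monitor
    reporting once every `window_ms` milliseconds would observe."""
    peaks = []
    for start in range(0, len(per_ms_percent), window_ms):
        window = per_ms_percent[start:start + window_ms]
        peaks.append(sum(window) / len(window))
    return max(peaks)


# Hypothetical trace: 60 s of traffic at 10% link utilization,
# with a single 50 ms microburst at 100% starting at t = 30 s.
trace = [10] * 60_000
for ms in range(30_000, 30_050):
    trace[ms] = 100

# Reporting windows (all assumed except the 2 s JTI interval).
windows = [
    ("1 ms window (stand-in for per-packet INT)", 1),
    ("2 s window (JTI-like)", 2_000),
    ("30 s window (SNMP-like, assumed)", 30_000),
]

for label, window_ms in windows:
    print(f"{label}: peak {peak_utilization(trace, window_ms):.2f}%")
```

Under these assumptions, the burst is fully visible only in the finest window. Averaged over two seconds it shows up as a peak of about 12%, and over thirty seconds it is nearly indistinguishable from the 10% baseline.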
More details about the event can be found here: https://internet2.edu/2022-technology-exchange/2022-program/advanced-networking-sessions/