It doesn’t matter if you’re driving a Ferrari Roma Spider or a Ford Focus, if there’s a queue to get into the shopping mall then you’re not getting there any faster than the car in front of you.
The same is true of broadband. You could have a gigabit connection or one that tops out at 40Mbits/sec, but if the internet traffic queues are building, you’re still likely to suffer lag when you’re on a video call, playing games or streaming. For hardware such as the forthcoming Apple Vision Pro, high latency could ruin the experience.
Speed isn’t everything, as we’ve discussed many times before in SamKnows Spotlight. That’s why the industry is now focused on driving down latency (more specifically, “working latency” – the latency you can expect under normal usage of your connection). Broadband providers, the big tech firms, app developers and performance monitoring firms – including SamKnows – are all working in harmony to eradicate the bottlenecks that cause lag and stutter.
Customers on gigabit-grade connections would be forgiven for believing their connection shouldn’t suffer from any lag or stutter. But even the fattest download pipes can still suffer from latency, as SamKnows’ lead integrations engineer Ben Janoff explains.
“Latency can creep in from a lot of different places,” said Ben. “Not only is there latency due to the technology itself, but there's also lots of queuing-induced latency; this is where the load on your connection, all the other devices you have in your house using the internet, are filling up all the queues that exist on the line. So, all your traffic then has to spend time waiting in a queue.”
In fact, those on the fastest connections can suffer the most. “The more speed you have, the deeper the queues have to be,” said Ben. “And the deeper the queues have to be, the higher the latency. That has a knock-on effect for applications such as video conferencing and game streaming.”
That effect was seen clearly at the start of the Covid pandemic, according to Comcast vice president Jason Livingood. “If you think about video conferencing early on in the pandemic, when people shifted to working from home, we had a lot of people upgrading to our 1 or 1.2-gig tier and while they certainly had plenty of capacity… they still had poor Zoom or Teams or WebEx quality,” he said. “And that was simply down to an issue of lag, which generally arises out of buffering that's occurring some place in the network, where a buffer is too big and it's queuing up packets instead of sending them right away.”
“That can be okay for big downloads,” added Jason, “but when it’s real-time traffic, it’s a problem.”
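The buffering problem Jason describes is easy to put into numbers: the worst-case queuing delay a full buffer adds is simply its size divided by the link rate. A minimal sketch (the buffer sizes here are illustrative, not figures from any particular network):

```python
def queuing_delay_ms(buffer_bytes: int, link_rate_mbps: float) -> float:
    """Worst-case time for a full buffer to drain onto the link, in milliseconds."""
    return buffer_bytes * 8 * 1000 / (link_rate_mbps * 1_000_000)

# An oversized 1 MB buffer can add up to 200 ms of queuing delay on a
# 40 Mbit/s link, but only 8 ms on a gigabit link -- yet buffers are
# typically sized up along with the link speed, so a faster tier does
# not automatically mean a shorter wait in the queue.
print(queuing_delay_ms(1_000_000, 40))    # 200.0
print(queuing_delay_ms(1_000_000, 1000))  # 8.0
```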
Sorting the traffic
To stop those internet traffic queues from forming on its network, Comcast is implementing a new technology developed by the cable industry called Low Latency DOCSIS (LLD). This is based on the IETF Low Latency, Low Loss, Scalable Throughput (L4S) standard, as well as the upcoming Non-Queue-Building (NQB) Per-Hop Behaviour standard, which should help to process latency-sensitive traffic more quickly.
Instead of putting all internet traffic in the same queue, LLD creates two queues at points where bottlenecks have traditionally occurred in the network: one queue for latency-sensitive traffic and one for the rest. With LLD, the cable modem termination system (CMTS) will have two queues in the downstream direction, while the customer’s cable modem/router itself will have two queues in the upstream direction. There are also some existing queues in the Wi-Fi network that are effectively repurposed.
What’s the difference between how the queues operate? “The easy way to summarise the low-latency queue is that it's a very shallow queue and so it's not going to build up a big buffer of packets,” said Jason. “It's going to forward those packets as soon as possible.”
App developers will have to mark which queue they want their data to be processed through. But won’t those developers be tempted to put everything in the low-latency queue to get their traffic through the system as fast as possible? “That's a good indicator of the first misunderstanding that a lot of developers and other folks have about this,” said Jason.
“It's not getting access to a faster queue per se, where there's greater capacity or more speed, just a more responsive queue with a shallower buffer. If you think about it from the perspective of the application developer, not every application will find that desirable. If you’re someone who is doing a big background file download – think about a game update or an operating system update – then you may want to build a queue, so that you can try to maximise the capacity of the link.”
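In practice, that marking happens at the packet level: L4S traffic identifies itself via the ECN field, while NQB traffic uses a DiffServ codepoint (the IETF draft recommends DSCP 45). A minimal sketch of an application opting a UDP socket into NQB-style marking, assuming a Linux-style sockets API; whether the mark survives end to end depends on the networks along the path:

```python
import socket

# DSCP 45 is the value the NQB draft recommends; the DSCP occupies the
# top six bits of the IP TOS byte, hence the shift by two.
NQB_DSCP = 45

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, NQB_DSCP << 2)

# A bulk-download socket would simply leave the default best-effort
# marking in place and take its chances in the deeper classic queue.
```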
But if LLD works as intended, it means latency-sensitive traffic such as Zoom calls or games streaming packets won’t be joining the back of that long queue. Those small cars will get to nip in front of the big lorries, to stretch our traffic metaphor, ensuring they arrive at their destination more quickly.
It’s early days for LLD, with Comcast still midway through its trials at the time of writing. But according to Jason Livingood, the initial results are very encouraging.
Comcast has its own measurement agent inside its XB7 and XB8 gateways, and it’s seen more than a 50% reduction in latency when running the LLD trials on those devices. “A 50% reduction is pretty great and, in particular, it also reduces the jitter – the variability of that delay – and makes it a lot smoother, which is much better for applications. We expect further improvements as we make adjustments,” he said.
One of the applications that stands to benefit most from LLD is games streaming, and Comcast has been working with NVIDIA to test its GeForce Now service as part of the trial. Previously, GeForce Now might see lag spikes on the Comcast network of 225ms or beyond. Now those spikes are down to 20ms. Any game streamer will tell you that could be the difference between life and death in first-person shooters!
Apple’s latency measure
Then we come to the newly created latency metric, Apple’s RPM, which stands for round trips per minute. This works very differently to traditional latency metrics, as Ben Janoff explains. “The most important thing to know about RPM is that normally when we talk about latency we're talking in milliseconds, and the bigger the latency the worse it is. So, one millisecond is really good, five milliseconds is a bit worse, ten milliseconds is worse still.”
With RPM, the inverse is true: 800 RPM is much better than 200 RPM. And to give you a sense of scale, “5,000 RPM is a really great connection,” said Ben.
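The arithmetic behind the scale is simple: RPM counts how many round trips fit into a minute, so it is the reciprocal of the working latency. A quick sketch of the conversion (the real responsiveness test measures these round trips while the connection is deliberately saturated, which these helpers don't attempt):

```python
def latency_ms_to_rpm(latency_ms: float) -> float:
    """Round trips per minute implied by a round-trip latency under load."""
    return 60_000 / latency_ms

def rpm_to_latency_ms(rpm: float) -> float:
    """The working latency, in milliseconds, implied by an RPM figure."""
    return 60_000 / rpm

print(latency_ms_to_rpm(12))   # 5000.0 -- Ben's "really great connection"
print(rpm_to_latency_ms(200))  # 300.0 ms of working latency
```

So 800 RPM corresponds to 75 ms of latency under load, while 200 RPM means a sluggish 300 ms.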
Comcast has been measuring RPM in its trials. Jason Livingood said he’s seen customers shoot up from less than 100 RPM to between 1,000 and 2,400 RPM – what he describes as a “dramatic improvement”. In the trial, Comcast is also running the same RPM test with L4S turned on in macOS, which should increase the RPMs even further.
“What's really cool about the SamKnows Whiteboxes out there that have implemented this responsiveness test is that those are all controlled, so we know they're the exact same hardware, unlike in customer homes, where there's a lot of variability – and often a Wi-Fi connection between the test client and internet,” said Jason. “So, this is a great measurement platform for us.”
RPM is also a more realistic measure of latency than a single ping, according to Ben Janoff, because it more accurately models the load on a connection in a typical home. “RPM is all about latency when the connection is being heavily used,” he explained. “When we just talk about latency, we could be talking about latency under load, or we could be talking about latency not under load. A number in milliseconds doesn't really convey any information about what else was happening on the line. But when we talk about RPM, we know that we're talking about a connection that's being used as much as possible and then we measure the latency. It allows us to see how the network is going to perform under harsher conditions that reflect real-world use.”
Ben gives a real-world scenario that RPM would accurately model. “You have a situation where mum has started a backup on her laptop and that transfer is going along quite nicely, and then upstairs the child is trying to play online video games, but their experience is really degraded by the bulk transfer happening. That's one case under which we can use RPM to measure the latency, but not only does it measure the latency of the gaming session, it also measures the latency of the bulk transfer connection.”
“This is really important for situations such as video streaming,” Ben added. “If you imagine you're watching a video streaming service, and you come to the realisation that you've already seen this part of the episode and you want to fast forward, you grab the video scrubber and you sweep it to the right. Here, the latency is really important, because you also have a bulk transfer going on. There’s a lot of video streaming data coming down from the internet to your device, but you also want it to be responsive. You want it to not take too long to chew through all the data that was already sent before you start getting to the new data that you care about at the point you moved the playback scrubber. So, RPM is very interesting because it blends latency of the traffic that is not generating the load, and the latency of the load itself.”
Breach of net neutrality?
Even though LLD doesn’t create a “fast lane” and a “slow lane” – as Jason Livingood explained earlier, it simply moves different types of traffic into different queues – some might question whether it breaches net neutrality, the principle that all traffic should be treated equally.
Jason Livingood is adamant that this shouldn’t be a concern. “The way that this should properly be deployed is that the applications themselves are the only ones that should be doing the marking of their packets,” he said, adding that a network operator shouldn’t attempt deep-packet inspection of traffic to decide which queue it goes to, not least because that would be futile for the vast majority of data. “Almost all traffic is encrypted,” he said, estimating a figure as high as 98%, “so you're really making a leap of faith here from the standpoint of inference.”
“The problem is when you have a false positive,” said Jason, of any attempt to determine latency needs of an application flow at the network level. If a latency-critical app was wrongly placed in the incorrect queue, it would “get really poor quality of experience and we just don't think that's worth the trouble or expense and complexity, and the troubleshooting – and it obviously can pose policy issues. We've recommended against that, and our deployment design doesn't do any of that.”
“The other thing that we have had to do is really help explain to policymakers and regulators that low-latency queuing and this L4S networking isn't adding a higher level of priority or assigning a higher amount of capacity or speed,” Jason added. “It's carrying packets at the same best-effort level of priority and it's sharing the same overall bandwidth. It isn’t granting them more speed or priority, and so I think it immediately takes those net neutrality concerns off the table.”
Meeting consumer demand
Some might be wondering why broadband providers and others are going to such lengths to improve latency on their networks and applications. After all, the promise of shaving off milliseconds of latency is a hard sell to consumers – it certainly isn’t as easy to understand as bumping up download speeds.
However, Jason Livingood argues that consumers are becoming more latency aware. “Some core segments of broadband users are already fairly well sensitised to lag or delay, certainly gamers,” he said. “Often, lag or delay is shown right in the corner of their screen when they're gaming, so they’re very focused on it, especially high-end gamers who are really trying to find out every single way to eke out a few milliseconds here and there in their setups.”
Homeworkers are another core segment sensitive to latency problems, according to Jason. “They subscribe to the highest-end tier that they can get in terms of capacity and yet they still have this lag issue,” he said.
Ben Janoff agrees. “We’re seeing consumers become a lot more aware about how latency sensitive their applications are and about how much latency affects their experience, especially latency under load,” he said.
Then there is the next-generation hardware, such as headsets being designed for the metaverse or augmented reality. SamKnows CEO Alex Salter said at the time the company was devising its new latency tests there was a “lot of noise around headsets that companies such as Meta or Apple are releasing, that give real-time feedback through the goggles. For those applications to work, it needs to be extremely low latency and that became a very powerful new use case.”
Could latency become the next big metric in broadband performance, even superseding download speeds? “I think that if we switched from speed-only to latency-only, we'd be no better off,” said Ben. “What people want is their experiences to be good. They want to go to web pages and have them appear quickly. They want to move their video scrubbers and have their videos start streaming quickly. They want to scroll around on Google Earth and have the map tiles appear quickly – and that isn't quite speed, and it isn't quite latency, but it's a blend of both; the idea of RPM is to capture them both.”