We are often asked about what is acceptable network performance and what can be done to improve things when performance is sub-par. The unfortunate answer is that there is no one silver bullet for all network issues, but armed with enough knowledge (and good ol’ lead bullets, so to speak), you can make your customers' networks hum.
Here's an issue that’s come up recently:
"Netflix recommends 25 Mbps per stream in order to get 4K HDR quality video. This means that with a round-trip time (latency) of only 30ms, a user's achievable throughput can fall BELOW the recommendation, even if nothing else is happening on the network!"
Before we can unpack that claim, there are a few terms we need to define.
For the purpose of our discussions, we’ll define network latency as the time it takes for a packet to travel from one device to another. Latency is much like the time it takes for your voice to travel from your mouth to the ear of the person you are speaking with.
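One simple way to see latency for yourself is to time how long a connection takes to set up. The sketch below (Python, illustrative only) approximates round-trip time by timing a TCP handshake, which requires one round trip to complete; the host and port are placeholders you would swap for a device on your own network.

```python
import socket
import time

def tcp_rtt_ms(host, port=443, timeout=2.0):
    """Estimate round-trip latency by timing a TCP handshake.

    The three-way handshake takes one round trip, so the time from
    starting connect() to its completion approximates the RTT.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0

# Example: measure RTT to a router's web interface on your LAN
# print(tcp_rtt_ms("192.168.1.1", 80))
```

A plain `ping` gives you the same information; timing the handshake is just a way to measure it from code when ICMP is blocked.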
Where Does Latency Come From?
Latency is a cumulative effect of the individual latencies along the end-to-end network path. This includes every network segment and every piece of equipment (like a switch or access point) along the way between two devices. Every segment, or hop, represents another opportunity to introduce additional latency into the network.
Network routers are the devices that create the most latency of any device on the end-to-end path. Additionally, packet queuing due to link congestion is often the culprit for large amounts of latency. When a switch, access point, or router becomes loaded, the time it takes to process each packet increases, driving up latency. Some types of network technology, such as satellite communications, add large amounts of latency because of the time it takes for a packet to travel across the link. Since latency is cumulative, the more links and router hops there are, the larger end-to-end latency will be.
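The queuing effect described above can be illustrated with a simple model. The sketch below uses the classic M/M/1 queue formula as an illustration (real network devices are more complicated): average per-packet delay equals the idle-device processing time divided by the device's spare capacity, so delay grows steeply as a device approaches saturation.

```python
def avg_queue_delay_ms(service_time_ms, utilization):
    """Average per-packet delay (waiting + processing) in an M/M/1 queue.

    service_time_ms: time to process one packet when the device is idle
    utilization: fraction of the device's capacity in use (0 <= u < 1)
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    # M/M/1 result: total delay = service_time / (1 - utilization)
    return service_time_ms / (1.0 - utilization)

# Delay explodes as a device nears saturation:
for load in (0.5, 0.9, 0.99):
    print(f"{load:.0%} load -> {avg_queue_delay_ms(0.1, load):.1f} ms per packet")
```

Note how a device at 99% load adds fifty times the delay of one at 50% load, even though both are "working" — this is why a loaded switch or router drives up latency long before it drops packets.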
What Happens with High Latency?
TCP (Transmission Control Protocol) traffic represents a significant amount of the traffic on your local network. TCP is a "guaranteed" delivery protocol, meaning that the device sending the packets gets a confirmation for every packet that is sent. The receiving device sends back an acknowledgment packet to let the sender know that it received the information. If the sender does not receive an acknowledgment in a certain period of time, it will resend the "lost" packets.
For simplicity, let's call the amount of data the sender is allowed to have "in flight" — sent but not yet acknowledged — the "window size." Once a full window is outstanding, the sender must stop and wait for acknowledgments before sending new information. The window size is adjusted over time, and together with latency it determines how fast data can flow between the two devices.
As latency increases, the sending device spends more and more time waiting on acknowledgments rather than sending packets!
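The math behind this is straightforward: the sender can move at most one window of data per round trip, so maximum throughput is the window size divided by the round-trip time. A minimal Python sketch, assuming the classic 64 KB TCP window (without window scaling):

```python
def max_tcp_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound on TCP throughput: one window of data per round trip."""
    bits_per_round_trip = window_bytes * 8
    round_trips_per_second = 1000.0 / rtt_ms
    return bits_per_round_trip * round_trips_per_second / 1_000_000

# With a classic 64 KB window, a 30 ms RTT caps throughput
# below Netflix's 25 Mbps recommendation:
print(max_tcp_throughput_mbps(65_535, 30))  # roughly 17.5 Mbps
```

This is why the quote at the top of the article holds: at 30ms of round-trip latency, a single standard TCP connection tops out around 17 Mbps no matter how fast the link is.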
But Does It Really Affect Anything?
Since the sender can transmit at most one window of data per round trip, there is a direct inverse relationship between latency and throughput on the network. Let's look at an example of two devices that are directly connected via a 100 Mbps Ethernet network (with nothing in between). The theoretical max throughput of this network is 100 Mbps. Now take a look at what happens to that throughput as latency increases.
Notice how drastic the drop in throughput is — with round-trip times as low as 30ms!
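You can reproduce those numbers yourself. The sketch below models the 100 Mbps example from above: effective throughput is the lesser of the link rate and the window-per-round-trip limit (again assuming a classic 64 KB window as an illustration).

```python
WINDOW_BYTES = 65_535   # classic TCP window, no window scaling
LINK_MBPS = 100.0       # the directly connected Ethernet link above

def effective_throughput_mbps(rtt_ms):
    """Throughput is the lesser of the link rate and the window limit."""
    window_limit_mbps = WINDOW_BYTES * 8 * (1000.0 / rtt_ms) / 1_000_000
    return min(LINK_MBPS, window_limit_mbps)

for rtt in (1, 5, 10, 20, 30):
    print(f"{rtt:>2} ms RTT -> {effective_throughput_mbps(rtt):6.1f} Mbps")
```

At 1ms the link itself is the bottleneck; by 10ms the window limit has cut throughput roughly in half, and at 30ms it is down to about 17 Mbps — less than a fifth of what the wire can carry.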
It Gets Worse!
Remember when we mentioned that some packets become "lost"? These lost packets have to be resent, thus increasing the amount of data that must be transmitted. Packet loss will cause the sender to sit idle for longer periods of time waiting for acknowledgments to come back from the receiver. The lost packets might even be the acknowledgments themselves, meaning that the sender will be resending information that was already delivered successfully. The result is a further significant decrease in throughput.
Taking the same test system from above and introducing a 2% packet loss through a packet loss generator gives you the following results.
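A widely used rule of thumb for loss-limited TCP throughput is the Mathis et al. approximation: throughput ≈ (MSS / RTT) × (C / √p), where MSS is the packet payload size, p the loss rate, and C a constant near 1.22 for standard TCP. The Python sketch below applies it to the scenario above; it is a steady-state approximation, not a prediction for any specific test rig.

```python
import math

def mathis_throughput_mbps(mss_bytes, rtt_ms, loss_rate):
    """Approximate steady-state TCP throughput under random packet loss,
    per the Mathis model: rate ~= (MSS / RTT) * (C / sqrt(p))."""
    C = math.sqrt(1.5)  # constant for standard TCP congestion control
    rate_bps = (mss_bytes * 8 / (rtt_ms / 1000.0)) * (C / math.sqrt(loss_rate))
    return rate_bps / 1_000_000

# 2% loss at 30 ms RTT, with a typical 1460-byte Ethernet payload:
print(f"{mathis_throughput_mbps(1460, 30, 0.02):.1f} Mbps")
```

With 2% loss and 30ms of latency, the model lands in the low single-digit Mbps — which matches the intuition above: loss plus latency compounds into a drastic throughput collapse.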
Here's a great visual representation of the effect of packet loss and latency on network throughput.
What Network Performance Should We See?
This is a very difficult question to answer with a blanket rule. There are some situations where increased latency is unavoidable. What is critical is that you are monitoring that latency and packet loss so that you can identify what is typical and respond to issues quickly.
Here are some guidelines for acceptable performance on your networks:
- Latency on a local area wired Ethernet network should be 1-2ms.
- Wireless networks often have higher latency and packet loss. Maximize signal strength and coverage, and minimize RF interference, to keep latency and packet loss to a minimum.
- A round-trip latency of 30ms or less is healthy on a typical broadband cable modem or DSL WAN connection (fiber has much lower latency).
- Round-trip latency between 30ms and 50ms should be monitored closely. Consider looking deeper at the network for potential issues.
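For ongoing monitoring, those WAN guidelines are easy to turn into an automated check. A small sketch using the thresholds above (tune them for your own networks and circuits):

```python
def classify_wan_latency(rtt_ms):
    """Bucket a WAN round-trip time using the guideline thresholds above."""
    if rtt_ms <= 30:
        return "healthy"
    if rtt_ms <= 50:
        return "monitor closely"
    return "investigate"

for rtt in (12, 42, 85):
    print(f"{rtt} ms -> {classify_wan_latency(rtt)}")
```

The point is less the code than the practice: measure regularly, record what "typical" looks like for each site, and alert when readings drift out of their usual bucket.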
So What’s the Fix?
I often say that troubleshooting network problems can feel like chasing ghosts. There are a lot of complex, hidden issues and problems that present themselves sporadically; without the proper tools and training, resolving these issues can be impossible. A tip that I use in many of the CEDIA training courses is to use the OSI seven-layer model and root-cause analysis to bust those ghosts on the network. So, what causes packet loss and latency on a network, and how do you apply these fundamentals? Let's take a look at what could cause performance issues on the network, starting at the physical layer and moving up the stack.
Physical issues on the network
Bad cabling, improper terminations, and physical port failures can all cause packet loss and latency on a network. In the field, this can be caused by poor pre-wire, bad trim out, a stray nail, and other physical wiring issues.
Poor signals and interference
For both wired and wireless connections, bad signals can cause slow transmission times as well as packet loss. As you get physically further away from the source your signal weakens, and eventually, the transmission will fail. Also, interference, both RF and electromagnetic, can cause signal flow issues which result in loss, latency, and signal corruption.
Overloaded hardware
Too much traffic for a device to process is a big problem with older equipment, especially switches, access points, and routers. Much like a computer, when you have too many programs running, the processor and memory of the network device may become highly utilized. That over-utilization results in queuing, which increases latency, lowers throughput, and ultimately leads to packet timeouts and loss. As we add more devices and stream more content, equipment installed years ago may not have enough raw horsepower to keep up with the growing need for speed.
Bugs, viruses, and rogue traffic
As we move up the stack, there are applications that can impact the performance of the network. High traffic flow can force queuing and overload the switches, routers, and access points. Additionally, in a software-driven world, bugs will exist. It's an unfortunate reality, and it can lead to headaches when troubleshooting. Memory leaks, bad protocol support, and runaway processes can all cause problems on critical networking infrastructure. Following best practices, such as running only stable firmware releases in the field and using proven devices and manufacturers, can limit your exposure to pesky bugs.
These are some of the more common causes of packet loss and latency in the networks home technology professionals manage. Understanding the root cause and what to look for will help you chase down the ghosts that often pop up in your networks. It is also important to note that an ounce of prevention is worth a pound of cure. Taking the time to design, engineer, configure, install, and certify your networks and cabling will save you big headaches in the future.