If you don’t know me, I’ve been working on a peer-to-peer application as my full-time job for the past five years. I’ve written the same networking implementation twice. If there’s one thing I’ve noticed over the past five years, it is how little the average programmer knows about networking. To maybe help remedy this, here is a brain dump of what I’ve learned (either before or during these five years) about networking.
In the list below, I have assumed that I was talking to an average programmer who has created a socket a couple of times for experiments, but hasn’t actually worked on this professionally.
Here we go.
General knowledge
- The average programmer or end user barely knows what an IP address is. Don’t expect any base level of knowledge from them in your UX. This includes for example the concepts of opening and closing a port.
- The average programmer who has never written networking code overestimates how capable they would be of writing good networking code.
- The average programmer sees networking in general as a solved problem, something boring that just works if you know what you’re doing. It is very much not.
TCP
- A TCP connection has two unidirectional streams, one per direction. Each party can send a FIN to close their sending side (in which case the other party will receive an EOF) but still continue receiving data.
- A surprisingly low number of programmers know about the previous point.
- The TCP protocol is moderately complicated. Implementations of the TCP protocol are insanely complicated. It’s not even worth trying to understand how an implementation works unless you have a passion for it.
- Don’t tunnel TCP traffic through TCP.
- TCP could in principle be made better if it had more information from the higher-level protocol. Instead, TCP tries to “magically” work no matter how it’s used, which has led to trade-offs. This is why QUIC has been developed.
- Because of TCP slow start, if you stop sending data on the socket for more than a few seconds, the next data you write will be transferred at a very slow speed. This can be problematic if you want to send occasional bursts of data.
- You should probably disable the Nagle algorithm and make sure that you buffer as much data as possible before calling send.
Doing so requires designing your code so that you can know whether more data can be written or if the protocol is now blocked waiting for data from the remote. If you don’t consider this in your API design, you might have to refactor huge parts of your code.
- If your code buffers outgoing data and sends the content of the buffer either when it’s full or after a delay, then you’re doing exactly the same thing as what the Nagle algorithm does, and you’re probably better off leaving it enabled and removing that buffering.
- You can in theory implement a multiplexing protocol on top of TCP, but in practice this can’t be done properly due to lacking access to the implementation details of TCP. This didn’t prevent people from trying.
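To make the half-close and Nagle points above more concrete, here is a minimal Python sketch over the loopback interface. The function name `half_close_demo` and the tiny "got ..." reply are made up for this example; only the standard `socket` calls are real. The client disables Nagle with `TCP_NODELAY`, writes its whole request in one `sendall`, then closes only its sending side and keeps reading.

```python
import socket
import threading


def half_close_demo() -> bytes:
    """Client sends a request, half-closes, then still reads the reply."""
    server = socket.create_server(("127.0.0.1", 0))  # port 0: the OS picks a free port
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _ = server.accept()
        with conn:
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:  # EOF: the client has sent its FIN
                    break
                chunks.append(data)
            # The server-to-client direction is still open at this point.
            conn.sendall(b"got " + b"".join(chunks))

    thread = threading.Thread(target=serve)
    thread.start()

    with socket.create_connection(("127.0.0.1", port)) as client:
        # Disable the Nagle algorithm; we hand over the full request at once.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        client.sendall(b"hello")
        client.shutdown(socket.SHUT_WR)  # send FIN, but keep receiving
        reply = b""
        while True:
            data = client.recv(4096)
            if not data:  # EOF from the server
                break
            reply += data

    thread.join()
    server.close()
    return reply
```

Calling `half_close_demo()` should give back the server's reply even though the client closed its sending side before reading anything.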
IP
- In 2022 you still can’t rely on IPv6 being enabled on end users’ machines and by their ISPs. You have to put an IPv4 fallback everywhere.
- Some IP addresses are reserved. For example, the 203.0.113.0/24 range is reserved for documentation purposes.
- Do not rely on IP addresses to identify the machines you’re connected to. IP addresses can be spoofed. If you want to track the identities of the machines you’re connected to, you must use cryptography.
- Assume that your software is running in Kubernetes or behind some sort of proxy. Do not assume that your source IP address is globally accessible.
- Do not assume that IP addresses are transitive. If Alice receives a message from an IP address, one can assume that Alice will be able to send a message back to that IP address. However, if Alice gives that IP address to Bob, one cannot assume that Bob will be able to send a message to it.
- However, one can make the assumption that IP addresses are transitive when it comes to creating heuristics or games, as it is more often than not the case.
- Mobile phones in particular might change their IP address relatively frequently, for example when switching between WiFi and mobile Internet. In the case of TCP, this change requires reestablishing all existing connections. This might lead to bad UX if your application doesn’t react appropriately.
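As a small illustration of the points about reserved and non-global addresses, here is a sketch using Python's standard `ipaddress` module. The helper names `is_documentation` and `worth_advertising` are hypothetical, and the second one is only a heuristic: an address being globally routable says nothing about whether a given peer can actually reach it.

```python
import ipaddress

# Ranges reserved for documentation (RFC 5737 for IPv4, RFC 3849 for IPv6).
DOCUMENTATION_NETS = [
    ipaddress.ip_network(net)
    for net in ("192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24", "2001:db8::/32")
]


def is_documentation(addr: str) -> bool:
    """True if the address belongs to a documentation-only range."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in DOCUMENTATION_NETS)


def worth_advertising(addr: str) -> bool:
    """Heuristic only: is there any chance a remote peer could dial this?"""
    return ipaddress.ip_address(addr).is_global
```

For example, `worth_advertising("192.168.1.10")` is false, which is a hint that the local address of a machine behind a NAT or inside a container shouldn't be handed to other peers as-is.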
Back-pressure
- If you receive data at a faster speed than you can process it, you have a problem. You can solve this problem either by slowing down the sender (what TCP does) or silently discarding some of the data (what UDP does).
- This applies to networking traffic, but also to reading data from a disk, receiving data from an external process, or higher-level concepts such as receiving data from another part of the process. Any situation where data comes “from the outside” is subject to the problem of back-pressure.
- It is easy to ignore this problem because in practice you are rarely limited by your processing speed (processing data typically involves running some CPU computation or writing it to disk, both of which are usually faster than receiving data from the network).
But once a problem happens (such as a burst of traffic, or writes to the disk becoming slow because the disk is busy), the fact that you have ignored the problem will lead to a cascade of issues.
- If you ignore this problem early on, you might have to rewrite big parts of your code later on, which then might never happen for business reasons. Do not ignore this problem early on.
- If you go for the strategy of slowing down the sender, then you should avoid loops in your data flow entirely. Otherwise, you run the risk of a deadlock because two senders are each blocked waiting for the other to have finished sending.
- When receiving data, do not dynamically grow buffers infinitely. Put a limit on the maximum size of these buffers. Otherwise, if you receive data at a faster rate than you can process it, your memory usage grows without bound, which is effectively a memory leak.
- If the buffer of incoming data becomes too large, then there can be a long delay between when data is received and when it is processed. If a response is expected, this delay might be long enough to trigger timeouts. The size of the buffer of incoming data should be set according to the maximum acceptable delay.
- If the buffer of incoming data is too small, you will need more context switches or networking round-trips than required. This can be problematic, but less problematic than having a buffer too large.
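The two back-pressure strategies and the buffer-sizing rule above can be sketched with Python's standard `queue.Queue`. All function names here are made up for illustration, and the sizing formula (processing rate multiplied by acceptable delay) is my reading of the rule of thumb above rather than a precise law.

```python
import queue


def max_buffer_items(items_per_second: float, max_delay_seconds: float) -> int:
    """Size the buffer so that a full buffer drains within the acceptable delay."""
    return max(1, int(items_per_second * max_delay_seconds))


def offer_or_drop(inbox: queue.Queue, item) -> bool:
    """UDP-like strategy: discard the item if the buffer is full."""
    try:
        inbox.put_nowait(item)
        return True
    except queue.Full:
        return False


def offer_or_wait(inbox: queue.Queue, item, timeout: float) -> bool:
    """TCP-like strategy: block the producer until there is room (or give up)."""
    try:
        inbox.put(item, timeout=timeout)
        return True
    except queue.Full:
        return False
```

With a consumer that handles about 1000 items per second and a maximum acceptable delay of half a second, `queue.Queue(maxsize=max_buffer_items(1000, 0.5))` gives a buffer of 500 items. Whichever strategy you pick, the important part is that `maxsize` is never left unbounded.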
Multiplexing
- Multiplexing consists in combining multiple streams into one.
- While multiplexing is used everywhere in the Internet infrastructure, it is still a relatively unknown concept within applications, because it is not really possible to properly implement multiplexing on top of the TCP protocol.
- The purpose of multiplexing is to solve the problem of head-of-line blocking. If you want to request multiple things in parallel, you probably should use multiplexing.
- The number of substreams being open at any given time should have a limit, in order to avoid malicious remotes triggering memory leaks.
- Multiplexing allows you to send an urgent piece of data to the remote, a bit like an emergency vehicle driving past a traffic jam.
- In terms of API, multiplexing is typically the point where a bad API makes you lose control of whether you should flush your buffer of outgoing data or whether more data is about to be added to it. Easy-to-use high-level APIs typically don’t include a way for the multiplexing code to know this, and a more complicated API is unfortunately necessary. This is a problem that you should have in mind from the start of your API design.
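To give an idea of what multiplexing looks like on the wire, here is a deliberately naive Python sketch. The frame layout (a 4-byte substream id, a 4-byte length, then the payload), the limits, and the names `encode_frame` and `Demux` are all invented for this example and do not correspond to any real multiplexing protocol. It shows the two limits mentioned above, on frame size and on the number of substreams, but it has no per-substream flow control, which a real implementation would need.

```python
import struct

HEADER = struct.Struct("!II")  # hypothetical layout: substream id, payload length
MAX_PAYLOAD = 64 * 1024        # arbitrary limit chosen for this sketch
MAX_STREAMS = 4                # arbitrary limit chosen for this sketch


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with its substream id and length."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large")
    return HEADER.pack(stream_id, len(payload)) + payload


class Demux:
    """Splits one incoming byte stream back into per-substream data."""

    def __init__(self) -> None:
        self._buf = bytearray()
        self.streams = {}  # substream id -> bytearray of received data

    def feed(self, data: bytes) -> None:
        self._buf += data
        while len(self._buf) >= HEADER.size:
            stream_id, length = HEADER.unpack_from(self._buf)
            if length > MAX_PAYLOAD:
                # Reject before buffering anything the remote merely announced.
                raise ValueError("announced frame too large")
            end = HEADER.size + length
            if len(self._buf) < end:
                return  # wait for the rest of the frame
            payload = bytes(self._buf[HEADER.size:end])
            del self._buf[:end]
            if stream_id not in self.streams:
                if len(self.streams) >= MAX_STREAMS:
                    raise ValueError("too many substreams")
                self.streams[stream_id] = bytearray()
            # NOTE: a real implementation also needs a per-substream limit here.
            self.streams[stream_id] += payload
```

Frames from different substreams can be interleaved freely by the sender, which is what lets a small urgent message overtake a large transfer.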
Designing protocols
- Designing a single protocol that does everything can be incredibly complicated. Instead, use multiplexing to combine multiple individually simple protocols together.
- Don’t lose control of how data is transferred over the wire by stacking up abstraction layers. You always want to be able to fix low-level problems if any arises.
- Sometimes you will want to send a stream of data, and sometimes you will want to send individual packets. You can easily convert between one another, but avoid converting back and forth multiple times due to abstraction layers.
- Write a specification of your protocol, and write unit tests that check that your application conforms to this protocol.
- Assuming that you’re using multiplexing to combine multiple simple protocols together, make sure that in each protocol each party has a precise order in which they send their messages. Failure to ensure that might lead to both sides being deadlocked because the buffers on both sides are full and waiting for the other party to empty them.
- High-level protocols can typically be put into two categories: Alice sends a request to Bob and Bob sends back a response, or Alice asks (explicitly or implicitly) Bob to be kept up-to-date with what happens and Bob sends back a stream of events.
- In the case of Alice sending requests and Bob sending back responses, Alice can slow down the responses by not sending more requests.
- Requests and responses need a size limit, to avoid memory leaks.
- It can be difficult to differentiate between situations where the remote is a bit slow to send its message or has limited bandwidth and situations where the remote is not sending anything at all. In order to differentiate as quickly as possible between these two situations, the timeout and the maximum size of a request or response should be relatively small.
- Consequently, avoid big requests. Prefer splitting a big request into several small requests and combine the responses together. For example, prefer sending 50 requests (possibly in parallel) with a 1 MiB response each instead of one request with a 50 MiB response.
- In the case of a stream of events, the events in question are typically not generated for the sole purpose of networking but come from some external source.
The back-pressure strategy that consists in slowing down the sender thus has a flaw, as the sender needs a strategy in case the events are piling up on its sending side.
Design your events at a high level so that they can be safely ignored or merged. For example, a message saying “value is now 3” can be discarded if it is followed by a message saying “value is now 5”.
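Two of the ideas above lend themselves to a short Python sketch: splitting a big request into small ones, and merging events that supersede each other while they wait to be sent. The names `split_ranges` and `CoalescingOutbox` are hypothetical, and the sketch assumes that each event carries a key identifying what it is about, which is an assumption of mine and not something every protocol offers.

```python
from collections import OrderedDict


def split_ranges(total_size: int, chunk_size: int) -> list:
    """Return (offset, length) pairs that together cover total_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset, min(chunk_size, total_size - offset))
        for offset in range(0, total_size, chunk_size)
    ]


class CoalescingOutbox:
    """Pending events where a newer event for a key replaces the older one."""

    def __init__(self) -> None:
        self._pending = OrderedDict()

    def push(self, key, value) -> None:
        # "value is now 3" becomes useless once "value is now 5" is queued.
        self._pending.pop(key, None)
        self._pending[key] = value

    def drain(self) -> list:
        """Take everything that is ready to be written to the socket."""
        items = list(self._pending.items())
        self._pending.clear()
        return items

    def __len__(self) -> int:
        return len(self._pending)
```

With this design the outbox can never hold more entries than there are distinct keys, so a slow receiver cannot make the sender's memory grow without bound. `split_ranges(50 * 1024 * 1024, 1024 * 1024)` yields the 50 requests of 1 MiB from the example above.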
DoS attacks
- The average Internet commenter doesn’t properly understand what a DoS attack is.
- There are basically two kinds of DoS attacks: “low-level”, and application-level.
- “Low-level” DoS attacks (on the IP level or TCP/UDP level) are rarely discussed because they can’t be solved by application developers. They are generally considered as a solved problem, and the solution is named CloudFlare.
- Preventing application-level DoS attacks mostly consists in properly designing protocols to cover all corner cases, such as “what if the remote sends an insanely large number of <X> messages?”, “when a networking message leads to one or more elements being added to a buffer, does this buffer have a limit to its size?”, “what if the remote sends the data at 1 byte/second?”, or “what if the remote announces that it is about to send a packet containing 200 GB of data?”.
- Avoid algorithms that are O(n²) or worse, even if you measured them as being faster than an O(n) or O(1) equivalent. A malicious remote could send a payload that intentionally triggers a bad algorithmic complexity in your code.
- The objective of a DoS attack is to deny access to the service to legitimate users. Hence the name. It doesn’t necessarily imply crashing a server.
- If your application responds to a heavy load by returning errors to legitimate users, then you effectively make it easier for the attacker to attack you.
- An application can be said to be DoS-resilient if it guarantees a minimum level of service to all the peers it is connected to, no matter what malicious peers might possibly send to it.
- Because the objective is guaranteeing a minimum level of service, and because CPU and memory are limited, ultimately an application can only serve up to a certain number of peers, after which it must deny access to new ones.
DDoS attacks typically try to occupy all the available slots in order to prevent legitimate users from connecting.
- You should leave all the DoS-mitigation techniques such as blocking IP addresses to external applications (such as firewalls, or CloudFlare). Do not implement this yourself in your application.
- Vertical scaling consists in increasing the CPU and memory in order to increase the number of peers that can be served by a single application. Horizontal scaling consists in increasing the total amount of CPU and memory by spawning more machines and more instances of the application.
- Ultimately, resisting a DDoS attack involves paying more money for more CPU and memory, while the attacker is spending money performing the attack. The winner is whoever runs out of money last. The objective of programmers is to reduce the amount of money that would need to be spent by the defender.
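Two of the corner cases listed above, the remote that announces a 200 GB packet and the remote that sends at 1 byte/second, can be handled when reading a length-prefixed message. The following Python sketch assumes a hypothetical format with a 4-byte big-endian length prefix; `MAX_MESSAGE`, `recv_exact` and `read_message` are names made up for this example. The key detail is that the deadline covers the whole message rather than each individual `recv` call, otherwise a slow sender could keep the connection alive indefinitely.

```python
import socket
import struct
import time

MAX_MESSAGE = 1024 * 1024  # arbitrary limit chosen for this sketch


def recv_exact(sock: socket.socket, n: int, deadline: float) -> bytes:
    """Read exactly n bytes, or fail once the overall deadline has passed."""
    buf = bytearray()
    while len(buf) < n:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("message not received in time")
        sock.settimeout(remaining)
        try:
            chunk = sock.recv(n - len(buf))
        except socket.timeout:
            raise TimeoutError("message not received in time")
        if not chunk:
            raise ConnectionError("remote closed the connection mid-message")
        buf += chunk
    return bytes(buf)


def read_message(sock: socket.socket, timeout: float = 5.0) -> bytes:
    """Read one length-prefixed message, with a size limit and a total deadline."""
    deadline = time.monotonic() + timeout
    (length,) = struct.unpack("!I", recv_exact(sock, 4, deadline))
    if length > MAX_MESSAGE:
        # Refuse before allocating anything based on the announced size.
        raise ValueError("announced length exceeds limit")
    return recv_exact(sock, length, deadline)
```

A caller would typically close the connection on any of these exceptions, which frees the slot for another peer.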
Observability
- If you’re not familiar with observability, look at how Prometheus would do it. For example, measuring the bandwidth usage is simply done by keeping counters of the total number of bytes sent and received since the application started, and the rate calculation is done by the UI.
- You should measure almost everything: the bandwidth usage of your application, but also the number of connections, substreams, requests, request sizes, response sizes, number of errors, timeouts reached, etc.
- Due to the number of layers typically involved in networking code it can be easy to miss easy optimizations, and measuring everything is a good way to spot things that seem off.
- Just like in politics, you can make statistics say anything you want, and measuring the correct thing is often more complicated than fixing issues.
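The counter-based approach described above can be sketched in a few lines of plain Python. This doesn't use any real metrics library: the `Counter` class and the `rate` function are stand-ins that I made up to show the division of labor, where the application only ever increments totals and whoever reads them (a dashboard, for instance) derives rates from two samples.

```python
class Counter:
    """A value that only ever goes up, like a total of bytes sent."""

    def __init__(self) -> None:
        self.value = 0

    def inc(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


def rate(previous: int, current: int, elapsed_seconds: float) -> float:
    """What the UI computes from two samples of the same counter."""
    return (current - previous) / elapsed_seconds


# In the application: just count.
bytes_sent = Counter()
bytes_sent.inc(1500)
sample_a = bytes_sent.value
bytes_sent.inc(3000)
sample_b = bytes_sent.value

# In the UI: if the two samples were taken 2 seconds apart.
bandwidth = rate(sample_a, sample_b, 2.0)  # bytes per second
```

The nice property of this split is that the application never has to pick a time window, and a missed sample only reduces resolution instead of losing data.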
Peer-to-peer networks
- Because nodes can join and quit the network at any time, keeping track of the nodes of the network and knowing who to connect to is an incredibly complex open problem.
- Due to the previous point, providing user feedback as to why your peer-to-peer application doesn’t work properly is very challenging. Contrary to traditional client-server applications, failing to connect isn’t a hard error.
- It is relatively easy to solve some problems by designing “games”. Games, however, tend to lead to emergent behaviors where for example all the nodes on the network synchronize themselves. This can lead for example to huge CPU spikes, which are amplified if back-pressure isn’t done correctly.
- It can be very hard to differentiate between abnormal traffic (such as a DoS attack or a bug) and an emergent behavior.
- While peer-to-peer applications are in principle not more difficult to make DoS-resilient than traditional client-server applications, in practice they tend to use more complicated protocols. Additionally, in client-server protocols, the client often assumes that the server is trusted, while in peer-to-peer applications this can’t be assumed.
Misc
- Users of your software will run your software in Kubernetes even if you advise them to not use Kubernetes, and then complain to you when it doesn’t work properly.
- Many inexperienced tech-savvy people think that running a server at home 24/7 is an easier solution than using a cloud provider.
- It’s basically impossible to find and hire networking experts. They don’t exist on the job market.
- Yes, it’s all incredibly complicated.