In this article, we’ll explore various methods of establishing TCP connections and see that the well-known three-way handshake isn’t the only way. We’ll learn how a TCP self-connection is possible. We’ll run through some TCP fundamentals, discuss ephemeral ports, and examine how TCP can be unreliable. We’ll try out tcpdump, netem, and NFQUEUE. Using Scapy, we’ll write several custom servers that communicate over TCP and implement different connection-establishment techniques.

A Bit About TCP

TCP provides the user of the protocol with a (reasonably) reliable, full-duplex, byte-stream transmission service. Let’s break down what that means:

  • Full-duplex means that both sides can send and receive data simultaneously at any given time. In other words, the two directions are independent of each other. In contrast, a half-duplex connection requires the parties to take turns transmitting data, while a simplex connection only allows data transmission in one direction.
  • A byte-stream service means that the TCP user can send a sequence of bytes to the peer and receive a sequence of bytes from the peer in the exact order they were written. Dividing this byte stream into logical parts—for example, into the messages of a higher-layer protocol—is the responsibility of the protocol user.
  • Why TCP is considered reasonably reliable will be discussed at the end of this section.
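Because TCP delivers a byte stream rather than discrete messages, a higher-layer protocol has to mark message boundaries itself. The following is a minimal sketch of one common approach, length-prefix framing. The helper names (send_msg, recv_msg, recv_exactly) and the 4-byte big-endian length prefix are assumptions chosen for illustration; they are not part of TCP or of the servers in this article.

```python
import socket
import struct


def recv_exactly(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a single recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send one message: a 4-byte big-endian length followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_msg(sock: socket.socket) -> bytes:
    """Receive one message framed by send_msg()."""
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)
```

The receiver never assumes that one send corresponds to one recv; it reads the length first and then keeps reading until the whole message has arrived.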

The original TCP specification—RFC 793 from 1981—is now obsolete. A newer version—RFC 9293 from 2022—incorporates the updates and additions accumulated since 1981.

To support multiple destinations on the same host, TCP uses unsigned 16-bit ports, which means the maximum port number is 65535. This also enables multiple TCP connections between the same pair of hosts.

TCP protocol messages are called segments. They consist of a header—which contains service information necessary for TCP operation—and a payload—a contiguous portion of the byte sequence sent by the protocol user. Each byte of payload is assigned a sequence number, with numbering done independently in each direction—client to server and server to client. This allows TCP to:

  • track which parts of the data have been successfully received by the other side and retransmit data that was not received;
  • reconstruct the original order of segments that arrived out of order;
  • detect duplicate segments.

Therefore, each segment that contains a payload specifies the number of the first byte of the payload it carries. One might assume that numbering starts from 0 or 1, but that’s not the case. Using different initial sequence numbers complicates or prevents certain attacks where an attacker needs to guess the byte numbers that one party expects. If the sequence of initial sequence numbers is also increasing over time, it solves another issue: it allows distinguishing a segment from the current connection from a segment that was sent during a previous connection, got “lost” in the network, and only arrived after a delay.

When establishing a connection, both the server and the client choose their own ISN (Initial Sequence Number), a 32-bit unsigned number that defines the numbering of the bytes they will transmit. The first byte of payload sent will have the number ISN + 1, the second one ISN + 2, and so on (all modulo 2^32). The chosen ISN is communicated to the other party during the connection establishment process.
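The numbering rule can be written down in a few lines. This is only a sketch: the function names seq_of_byte and seq_before are made up for illustration, and the wraparound-aware comparison is the usual modulo-2^32 trick rather than code taken from any particular TCP stack.

```python
SEQ_MOD = 2**32


def seq_of_byte(isn: int, index: int) -> int:
    """Sequence number of the payload byte at zero-based position `index`."""
    return (isn + 1 + index) % SEQ_MOD


def seq_before(a: int, b: int) -> bool:
    """True if sequence number a comes before b, taking wraparound into account."""
    return 0 < (b - a) % SEQ_MOD < 2**31
```

For example, with an ISN of 2^32 - 1 the first payload byte gets sequence number 0, and seq_before still reports that the ISN comes before it.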

For more information about ISNs, how they are generated, and attacks targeting them, see RFC 6528: Defending Against Sequence Number Attacks.

A TCP segment header, among other things, includes:

  • A 32-bit Sequence Number (Seq) field, which holds the number of the first byte of the payload carried by this segment.
  • A 32-bit Acknowledgment Number (Ack) field, which holds the number of the first byte that the sender of the segment has not yet received from the other side. It is expressed in the other side’s numbering.

TCP segment headers also include a set of flags (boolean values). In the context of this article, the following flags are of interest:

  • SYN (synchronize sequence numbers) — This flag sets the semantics of the Seq field: if the flag is set, the Seq field contains the ISN. Otherwise, the Seq field contains the sequence number of the first byte of data in the segment.
  • ACK (acknowledgment field is significant) — This flag sets the semantics of the Ack field: if the flag is set, the Ack field contains a valid value. Otherwise, the field should be considered unset and its value ignored.
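To make the fields and flags above concrete, here is a small sketch that extracts them from the fixed 20-byte part of a raw TCP header. The TcpHeader type and the parse_tcp_header function are hypothetical names introduced for this example; TCP options are ignored.

```python
import struct
from typing import NamedTuple, Optional

SYN = 0x02  # bit mask of the SYN flag in the flags byte
ACK = 0x10  # bit mask of the ACK flag in the flags byte


class TcpHeader(NamedTuple):
    sport: int
    dport: int
    seq: int
    ack: Optional[int]  # None when the ACK flag is not set
    syn: bool


def parse_tcp_header(raw: bytes) -> TcpHeader:
    # Fixed part of the header: ports, Seq, Ack, data offset, flags,
    # window, checksum, urgent pointer. All fields are big-endian.
    sport, dport, seq, ack, _offset, flags, _win, _csum, _urg = struct.unpack(
        "!HHIIBBHHH", raw[:20]
    )
    # The Ack field is only meaningful when the ACK flag is set.
    return TcpHeader(sport, dport, seq, ack if flags & ACK else None, bool(flags & SYN))
```

Returning None for the Ack value when the ACK flag is absent mirrors the rule that the field must be ignored in that case.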

Why was TCP referred to as reasonably reliable rather than simply reliable? Typically, when people say “TCP is reliable”, they mean that it ensures reliable delivery of segments to the recipient: TCP solves the problems mentioned above, whereas lower-layer protocols do not.

However, when it comes to data integrity, there are some “rough edges”. To verify that the transmitted data has not been accidentally altered, TCP uses a 16-bit checksum, which is rather weak by modern standards. Since Ethernet uses the much more robust CRC32, the likelihood that corrupted data would pass both checks remains very low (note that IP does not check the integrity of its payload at all). Still, this probability is significantly higher than that of a random collision in a modern cryptographic hash function, or even in MD5, which is now considered weak. Moreover, there’s a risk that data passing the CRC32 check could still be corrupted due to bugs in protocol stack software (as happened, for example, at Twitter).
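One way to get a feel for how weak the 16-bit checksum is: it is a ones' complement sum of 16-bit words, so it cannot notice that two words were swapped. The sketch below implements the checksum algorithm described in RFC 1071 over a plain byte string (a real TCP checksum also covers a pseudo-header, which is omitted here) and compares it with CRC32 from zlib. The function names are illustrative.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over `data` (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def compare_checks(original: bytes, corrupted: bytes) -> tuple:
    """Return (checksum_detects, crc32_detects) for a corrupted copy."""
    return (
        internet_checksum(original) != internet_checksum(corrupted),
        zlib.crc32(original) != zlib.crc32(corrupted),
    )
```

Calling compare_checks(b"\x12\x34\xab\xcd", b"\xab\xcd\x12\x34") shows that swapping the two 16-bit words goes unnoticed by the 16-bit checksum but is caught by CRC32.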

So what can be done? One straightforward option is to use TLS, which uses cryptography. If you’re developing your own protocol and need to ensure data integrity, you can use cryptographic hash functions like SHA-256, or lighter-weight checksums such as CRC32C. In general, implementing end-to-end integrity checks is good practice.
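As a rough sketch of an end-to-end integrity check in a custom protocol, a sender could prepend a SHA-256 digest to each message and the receiver could verify it. The seal and unseal names and the "digest followed by payload" layout are assumptions made for this example. Note that a bare hash only detects accidental corruption; it does not protect against deliberate tampering, for which TLS or a keyed MAC would be needed.

```python
import hashlib

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def seal(payload: bytes) -> bytes:
    """Prepend the SHA-256 digest of the payload to the payload."""
    return hashlib.sha256(payload).digest() + payload


def unseal(message: bytes) -> bytes:
    """Verify the digest and return the payload, or raise ValueError."""
    digest, payload = message[:DIGEST_SIZE], message[DIGEST_SIZE:]
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("payload does not match its digest")
    return payload
```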

Further reading on this topic:

Note: This article does not cover TCP extensions, such as TCP Fast Open.

The commands and programs described in this article were run inside Docker containers based on Debian 12 with the Linux kernel 6.8.0-58-generic.

3-Way Handshake

A handshake is the process of exchanging messages to establish a connection between two hosts. This is necessary to pre-negotiate and agree on various parameters.

The side that sends the first TCP segment—thus initiating the connection setup—is called the client. The side that waits for the initial segment from the other side is called the server. In other words, the client performs an active open, and the server performs a passive open. [2]

The three-way handshake (3WHS) is how TCP connection setup happens in the vast majority of cases (essentially always). It’s called “three-way” because a successful TCP connection is established through three segments:

  1. The client sends the server a segment with the SYN flag set and its generated ISN, written in the Seq field.
  2. The server replies with a segment that has both the SYN and ACK flags set. The Seq field contains the server’s generated ISN. As an acknowledgment that it received the previous segment, the Ack field contains a value equal to the client’s ISN + 1.
  3. The client responds with a segment that has the ACK flag set and an Ack value equal to the server’s ISN + 1.
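The three steps translate into a simple consistency check on the Seq and Ack values. The sketch below uses a hypothetical Segment type and an is_three_way_handshake helper, both invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    flags: str                 # e.g. "S", "SA", "A"
    seq: Optional[int] = None
    ack: Optional[int] = None


def is_three_way_handshake(syn: Segment, syn_ack: Segment, ack: Segment) -> bool:
    """Check flags and acknowledgment numbers of the three handshake segments."""
    return (
        set(syn.flags) == {"S"}
        and set(syn_ack.flags) == {"S", "A"}
        and set(ack.flags) == {"A"}
        # The server acknowledges the client's ISN + 1 ...
        and syn_ack.ack == (syn.seq + 1) % 2**32
        # ... and the client acknowledges the server's ISN + 1.
        and ack.ack == (syn_ack.seq + 1) % 2**32
    )
```

For instance, a SYN with seq 3088454448, a SYN-ACK with seq 3878989831 and ack 3088454449, and an ACK with ack 3878989832 pass the check.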

Let’s see how this looks in practice. We’ll set up a test environment using Docker, consisting of the following services:

  • Client — Sends an HTTP GET request to the server (as TCP payload) and prints the body of the HTTP response it receives. We’ll use curl as the client.
  • Server — Listens for incoming connections on port 8000. Once a connection is established, it sends an HTTP response (as TCP payload).
  • Observer — Monitors the TCP segments exchanged between the client and server and displays them. We’ll use tcpdump for this.

To begin, let’s observe how communication typically works between a client and server using the standard socket library, relying on the operating system’s built-in TCP implementation. We perform a bind, listen, and accept, then send an HTTP response:

import socket

def create_http_response(body: str) -> bytes:
    template = f"""HTTP/1.1 200 OK\r
Connection: close\r
Content-Length: {len(body)}\r
Content-Type: text/html\r
Host: server\r
\r
{body}"""
    return template.encode("ascii")


if __name__ == "__main__":
    # SOCK_STREAM means using TCP (stream = byte-stream service)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("0.0.0.0", 8000))
        s.listen(1)
        conn, _ = s.accept()
        with conn:
            # Send an HTTP response over TCP.
            conn.sendall(create_http_response("Hello from the default server!"))

Now let’s examine the output from tcpdump:

15:12:20.742145 IP split-handshake-client-1.split-handshake_default.56016 > df4e7823c0b4.8000: Flags [S], seq 3088454448, win 64240, options [mss 1460,sackOK,TS val 592953866 ecr 0,nop,wscale 7], length 0
15:12:20.742197 IP df4e7823c0b4.8000 > split-handshake-client-1.split-handshake_default.56016: Flags [S.], seq 3878989831, ack 3088454449, win 65160, options [mss 1460,sackOK,TS val 140002617 ecr 592953866,nop,wscale 7], length 0
15:12:20.742252 IP split-handshake-client-1.split-handshake_default.56016 > df4e7823c0b4.8000: Flags [.], ack 3878989832, win 502, options [nop,nop,TS val 592953866 ecr 140002617], length 0
15:12:20.742423 IP split-handshake-client-1.split-handshake_default.56016 > df4e7823c0b4.8000: Flags [P.], seq 3088454449:3088454524, ack 3878989832, win 502, options [nop,nop,TS val 592953866 ecr 140002617], length 75
15:12:20.742448 IP df4e7823c0b4.8000 > split-handshake-client-1.split-handshake_default.56016: Flags [.], ack 3088454524, win 509, options [nop,nop,TS val 140002617 ecr 592953866], length 0
15:12:20.742543 IP df4e7823c0b4.8000 > split-handshake-client-1.split-handshake_default.56016: Flags [P.], seq 3878989832:3878989953, ack 3088454524, win 509, options [nop,nop,TS val 140002617 ecr 592953866], length 121
15:12:20.742606 IP split-handshake-client-1.split-handshake_default.56016 > df4e7823c0b4.8000: Flags [.], ack 3878989953, win 502, options [nop,nop,TS val 592953866 ecr 140002617], length 0
15:12:20.742674 IP df4e7823c0b4.8000 > split-handshake-client-1.split-handshake_default.56016: Flags [R.], seq 3878989953, ack 3088454524, win 509, options [nop,nop,TS val 140002617 ecr 592953866], length 0
tcpdump: pcap_loop: The interface disappeared
8 packets captured
8 packets received by filter
0 packets dropped by kernel

Let’s manually remove log messages unrelated to connection establishment and format the log for better readability:

client > server: Flags [S], seq 3088454448
server > client: Flags [SA], seq 3878989831, ack 3088454449
client > server: Flags [A], ack 3878989832

Flags [...] shows the flags set in the captured segment. S means the SYN flag is set, while A means the ACK flag is set.
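The manual simplification could also be scripted. Raw tcpdump output marks the ACK flag with a dot (for example, [S.]), whereas the simplified logs in this article write A. The sketch below extracts the flags, seq, and ack from a raw tcpdump line; the simplify function and its regular expression are illustrative helpers, written against the output format shown above.

```python
import re

# Matches "Flags [S.], seq 123, ack 456"; seq and ack are both optional,
# and a "first:last" seq range is reduced to its first number.
_SEGMENT_RE = re.compile(
    r"Flags \[([^\]]+)\](?:, seq (\d+)(?::\d+)?)?(?:, ack (\d+))?"
)


def simplify(line: str) -> str:
    """Turn a raw tcpdump line into the short 'Flags [..], seq .., ack ..' form."""
    m = _SEGMENT_RE.search(line)
    if m is None:
        raise ValueError("not a tcpdump TCP line")
    flags, seq, ack = m.groups()
    parts = [f"Flags [{flags.replace('.', 'A')}]"]  # tcpdump prints ACK as "."
    if seq is not None:
        parts.append(f"seq {seq}")
    if ack is not None:
        parts.append(f"ack {ack}")
    return ", ".join(parts)
```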

Now let’s implement our own server that doesn’t rely on the OS’s TCP stack. Instead, it will manually perform the three-way handshake and send the HTTP message itself. To send TCP segments directly, we need to drop at least one level lower—to the IP layer—and send IP packets with our TCP segments as payload.

We’ll use the Scapy library to receive and process incoming TCP segments. While we could use raw sockets to work with IP packets directly, Scapy offers a convenient interface for manipulating packets and segments.

The operating system doesn’t know what we’re doing. So, when a segment arrives at a port that the OS’s TCP stack considers closed, the OS replies with a segment containing the RST flag, indicating that the connection is refused. This behavior is undesirable: our custom server will handle the segments itself, and we don’t want the OS interfering. To solve this, we configure firewall rules that silently drop incoming and outgoing TCP segments in such a way that our application still sees these segments, but the OS TCP stack doesn’t.

iptables -t raw -I PREROUTING -p tcp --dport 8000 -j DROP
iptables -t raw -I OUTPUT -p tcp --sport 8000 -j DROP

These rules configure silent dropping of segments that have 8000 as either the source or destination port. Since the rules are applied in the raw table, they take effect before the TCP stack processes the segments.

From this point on, all server implementations include only the minimal amount of code and functionality needed to establish a TCP connection and send data. Below are some of the simplifications that were made:

  • All incoming segments after connection establishment are ignored.
  • The server sends a TCP segment containing an HTTP response immediately after the connection is established, without waiting for an HTTP request.
  • Most of the TCP protocol is not implemented, including graceful connection termination.
  • The ISN is generated using pure randomness.
  • Only one client can be served at a time.
  • Segments with unexpected flags are not handled.

All of this is done to keep the code minimal and focus purely on the TCP connection establishment process.

The HTTP response sent by the server makes it possible to visit http://localhost:8000 in a browser and visually confirm that the connection is established and working.

Here is the code for the server that performs the three-way handshake manually:

import ctypes
import random
from scapy.layers.inet import TCP, IP
from scapy.sendrecv import send, sniff
from scapy.packet import Packet, Raw


# Values outside the range [0, 2**32) wrap around modulo 2**32.
def uint32(x: int) -> int:
    return ctypes.c_uint32(x).value


def create_http_response(body: str) -> bytes:
    template = f"""HTTP/1.1 200 OK\r
Connection: close\r
Content-Length: {len(body)}\r
Content-Type: text/html\r
Host: server\r
\r
{body}"""
    return template.encode("ascii")


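class Server:
    def __init__(self) -> None:
        # No handshake is in progress yet, so stray ACKs are ignored
        # until a SYN arrives and resets this flag.
        self._connection_established = True
        self._isn = 0

    def run(self) -> None:
        # Process only segments addressed to port 8000; store=0 keeps no packets in memory.
        sniff(iface="eth0", prn=self.handle_packet, filter="tcp dst port 8000", store=0)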

    def handle_packet(self, p: Packet) -> None:
        # Check which flags are present in the incoming TCP segment.
        fs = set(p[TCP].flags)
        if fs == {"S"}:  # SYN
            self._connection_established = False
            self._isn = random.randint(0, 2**32 - 1)

            self._reply(
                p,
                flags="SA",
                ack=uint32(p[TCP].seq + 1),
                seq=self._isn,
            )
        elif fs == {"A"}:  # ACK
            if self._connection_established:
                return
            self._connection_established = True

            # Send the payload.
            self._reply(
                p,
                create_http_response("Hello from the 3-way handshake TCP server!"),
                flags="PA",  # PSH-ACK. PSH asks the receiver to pass the data to the application without buffering.
                ack=p[TCP].seq,
                seq=uint32(self._isn + 1),
            )

    def _reply(self, p: Packet, payload: bytes = b"", **kwargs) -> None:
        il = IP(src=p[IP].dst, dst=p[IP].src)
        tl = TCP(
            sport=p[TCP].dport,
            dport=p[TCP].sport,
            **kwargs,
        )
        send(il / tl / Raw(payload), verbose=False)


if __name__ == "__main__":
    Server().run()

Let’s look at the manually simplified tcpdump log that corresponds to the handshake:

client > server: Flags [S], seq 1417278038
server > client: Flags [SA], seq 3093614128, ack 1417278039
client > server: Flags [A], ack 3093614129

Exactly the same as with the OS’s TCP implementation. Additionally, we can verify the server’s functionality by connecting through a browser:

[Screenshot: http://localhost:8000 opened in the browser.]

Simultaneous Open

Let’s take another look at the TCP specification:

The “three-way handshake” is the procedure used to establish a connection. This procedure normally is initiated by one TCP peer and responded to by another TCP peer. The procedure also works if two TCP peers simultaneously initiate the procedure. When simultaneous open occurs, each TCP peer receives a SYN segment that carries no acknowledgment after it has sent a SYN.

And further, from the same section:

Simultaneous initiation is only slightly more complex:

    TCP Peer A                                       TCP Peer B
1.  CLOSED                                           CLOSED
2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...
3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT
4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED
5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...
6.  ESTABLISHED  <-- <SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED
7.               ... <SEQ=100><ACK=301><CTL=SYN,ACK> --> ESTABLISHED

On the left and right of the diagram are the states of the TCP state machine (omitted for simplicity in this article). In the middle, the segment flags and the values of Seq and Ack fields in the exchanged segments are shown.

So, another method for establishing a TCP connection is simultaneous open. This occurs when both endpoints send a SYN segment initiating a connection to each other before receiving the other’s SYN. In that case, the connection is established with four segments:

    A                  B
1.        SYN      -->
2.        SYN      <--
3–4.      SYN,ACK  -->
3–4.      SYN,ACK  <--

Note that in this method of connection setup, there is no client or server because both hosts perform an active open. That’s why we label them A and B. [2]
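The exchange can be modeled as a tiny state machine. The sketch below is a simplified model that follows the diagram above, not a full implementation of the TCP state machine from the RFC; the TRANSITIONS table and the function names are illustrative.

```python
from typing import List, Optional, Tuple

# (current state, flags of the received segment) -> (next state, flags of the reply)
TRANSITIONS = {
    ("SYN-SENT", "S"): ("SYN-RECEIVED", "SA"),   # simultaneous open: SYN crosses SYN
    ("SYN-SENT", "SA"): ("ESTABLISHED", "A"),    # ordinary three-way handshake
    ("SYN-RECEIVED", "SA"): ("ESTABLISHED", None),
    ("SYN-RECEIVED", "A"): ("ESTABLISHED", None),
}


def step(state: str, flags: str) -> Tuple[str, Optional[str]]:
    """Return the next state and the reply (if any) for a received segment."""
    return TRANSITIONS[(state, flags)]


def simultaneous_open() -> Tuple[str, str, List[str]]:
    """Simulate peers A and B that both perform an active open."""
    a = b = "SYN-SENT"  # both sides have already sent their SYN
    trace = ["A > B: S", "B > A: S"]
    a, a_reply = step(a, "S")  # A receives B's SYN
    b, b_reply = step(b, "S")  # B receives A's SYN
    trace += [f"A > B: {a_reply}", f"B > A: {b_reply}"]
    a, _ = step(a, b_reply)    # A receives B's SYN-ACK
    b, _ = step(b, a_reply)    # B receives A's SYN-ACK
    return a, b, trace
```

Running simultaneous_open() ends with both peers in ESTABLISHED after exactly four segments in the trace.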

This method of establishing a TCP connection is very rare in practice, because both sides need to know each other’s IP address and TCP port ahead of time. In the typical case, the client knows the server’s address and port, and the server learns the client’s address and port from the incoming segment. Simultaneous open is used, for example, in TCP hole punching. [2] In short: both A and B send a SYN segment to each other, which allows their respective NAT devices to permit incoming traffic on those ports—thus enabling a direct TCP connection between two NATed nodes.

Let’s try to perform a simultaneous open. We’ll run two applications, each of which performs a connect to the other and sends some payload. Normally, when connect is called, the source TCP port is not explicitly specified: it’s chosen by the OS TCP stack from the ephemeral port range (more on that in the next section). However, in our case, the IP addresses and ports must be known in advance, so ephemeral ports are unsuitable. Instead, we must explicitly bind to a specific address and port beforehand.
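In terms of the socket API, "bind beforehand" looks roughly like the sketch below. The connect_from helper is a hypothetical name; setting SO_REUSEADDR is an assumption added so that repeated runs do not fail while a previous connection on the same port lingers.

```python
import socket


def connect_from(local: tuple, remote: tuple) -> socket.socket:
    """Connect to `remote` using the explicitly chosen `local` (host, port)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind(local)      # fix the source address and port ...
        s.connect(remote)  # ... before initiating the connection
    except OSError:
        s.close()
        raise
    return s
```

For a simultaneous open, one process would call connect_from(("127.0.0.1", 2000), ("127.0.0.1", 3000)) and the other the mirror image, with the SYN segments delayed so that they cross.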

To ensure that each application sends its SYN before receiving the other’s, we’ll use netem to add a 400-millisecond delay in the delivery of each TCP segment:

tc qdisc add dev lo root handle 1:0 netem delay 400ms

To run the command above in a Docker container, add the --cap-add=NET_ADMIN or --privileged flag to your docker run or docker exec command.

To remove the delay, run:

tc qdisc del dev lo root

Thanks to this artificial delay, both clients will send their SYN segments, but neither will receive any segments from the other side for at least 400ms. This is exactly what’s needed to trigger a simultaneous open. The 400ms delay was chosen experimentally—any longer, and it risks triggering segment retransmissions due to missing acknowledgments within the expected timeframe.

As the communicating application, we’ll use nc (netcat):

nc -p 2000 127.0.0.1 3000 & nc -p 3000 127.0.0.1 2000 &

The -p NUMBER flag sets the source port — meaning that netcat performs a bind to the local IP address and specified TCP port before calling connect.

To capture packets on the loopback interface, we run tcpdump -i lo. After adding the delivery delay with netem and starting both nc instances, we observe the following (manually simplified) tcpdump output:

A > B: Flags [S], seq 2472141684
B > A: Flags [S], seq 2303140715
B > A: Flags [SA], seq 2303140715, ack 2472141685
A > B: Flags [SA], seq 2472141684, ack 2303140716

A > B: Flags [A], ack 2303140716
B > A: Flags [A], ack 2472141685

The first four segments (above the blank line) represent the simultaneous open, exactly as described in the RFC. The two ACK segments that follow appear not to be strictly necessary for connection establishment, though the Linux TCP implementation sends them. Let’s verify this claim: are the two ACK segments, sent after the simultaneous open completes according to the spec, actually necessary? Could it be that the Linux implementation requires them, even though the RFC doesn’t?

To clarify, we will drop all segments between the two peers (A and B) after the SYN and SYN-ACK segments, then check the socket state. If the connection reaches the “established” state in both containers, the ACK segments aren’t essential for connection setup. If it doesn’t, they are.

Our test setup consists of the following services:

  • A, B — two applications that bind to specific ports on the local host, then each attempts to connect to the other.
  • Interceptor — a service that intercepts all TCP segments exchanged between A and B, holding each segment until one appears in the opposite direction. Then, the segment pair (one from A to B and one from B to A) is released nearly simultaneously. Segments that are not SYN or SYN-ACK are dropped. This ensures SYN from A and SYN from B are sent simultaneously, as are the SYN-ACK segments.
  • Observer — using tcpdump once again.

Containers A and B are placed in different networks to ensure communication passes through a default gateway — the interceptor. In other words, the interceptor sits between A and B. To manipulate segments, we use iptables inside the interceptor container to redirect all incoming and outgoing packets to NFQUEUE. This halts packet transmission until our Python script (using the netfilterqueue library) decides whether to drop or forward the segment, following the pairwise release logic described above. Here’s the code for the interceptor:

#!/usr/bin/env python3
import os
from netfilterqueue import NetfilterQueue
from scapy.layers.inet import TCP, IP

SIDE_A_IP = os.environ["SIDE_A_IP"]
SIDE_A_PORT = int(os.environ["SIDE_A_PORT"])
SIDE_B_IP = os.environ["SIDE_B_IP"]
SIDE_B_PORT = int(os.environ["SIDE_B_PORT"])

buffer = {"a2b": [], "b2a": []}

def cb(packet) -> None:
    sc = packet.get_payload()
    il = IP(sc)
    tl = il[TCP]

    key = ""
    if (
        il.src == SIDE_A_IP and tl.sport == SIDE_A_PORT and
        il.dst == SIDE_B_IP and tl.dport == SIDE_B_PORT
    ):
        key = "a2b"
    elif (
        il.src == SIDE_B_IP and tl.sport == SIDE_B_PORT and
        il.dst == SIDE_A_IP and tl.dport == SIDE_A_PORT
    ):
        key = "b2a"
    else:
        # Let through any segment not between A and B.
        packet.accept()
        return

    # Only forward SYN and SYN-ACK segments; explicitly drop everything else.
    fs = set(tl.flags)
    if fs != {"S"} and fs != {"S", "A"}:
        packet.drop()
        return

    packet.retain()
    buffer[key].append(packet)
    try_release_in_pairs()

def log_str(packet) -> str:
    scapy_packet = IP(packet.get_payload())
    return str(scapy_packet)

def try_release_in_pairs() -> None:
    while buffer["a2b"] and buffer["b2a"]:
        p1 = buffer["a2b"].pop(0)
        p2 = buffer["b2a"].pop(0)
        print(f"Releasing 2 packets: {log_str(p1)} and {log_str(p2)}.", flush=True)
        p1.accept()
        p2.accept()

if __name__ == "__main__":
    print("Creating a queue.", flush=True)
    nfq = NetfilterQueue()
    nfq.bind(1, cb)
    print("Running filter.", flush=True)
    try:
        nfq.run()
    finally:
        nfq.unbind()

Let’s take a look at the simplified tcpdump log:

15:34:33.063076 A > B: Flags [S], seq 416807200
15:34:33.063124 B > A: Flags [S], seq 2087524263
15:34:33.064385 A > B: Flags [SA], seq 416807200, ack 2087524264
15:34:33.064433 B > A: Flags [SA], seq 2087524263, ack 416807201

You can see that both SYN segments were sent almost simultaneously; the same for the SYN-ACKs. No other segments were sent.

Next, let’s check the socket status inside containers A and B:

$ ss -tan
State   Recv-Q   Send-Q   Local Address:Port   Peer Address:Port
ESTAB   0        0        172.28.1.10:27777    172.29.1.10:27778

$ ss -tan
State   Recv-Q   Send-Q   Local Address:Port   Peer Address:Port
ESTAB   0        0        172.29.1.10:27778    172.28.1.10:27777

The connections are in the ESTAB (established) state, even though no segments besides SYN and SYN-ACK were exchanged. This confirms our assertion: the extra ACK segments sent by the Linux TCP implementation are not required for connection establishment itself.

In summary: Linux supports simultaneous TCP open as defined in the specification — involving only the SYN and SYN-ACK segments in both directions — despite also sending additional ACKs during normal operation.

Self-connect

Let’s consider the following Bash script:

while true
do
    telnet 127.0.0.1 50000 
done

Here, we repeatedly attempt to connect to port 50000 on the local host. If the connection fails, we try again. If it succeeds, telnet reports success in the terminal and opens an interactive session. Before running the script, port 50000 is closed.

If you let the script run for a bit, eventually (likely after a series of failed attempts), a connection will succeed:

telnet: can't connect to remote host (127.0.0.1): Connection refused
telnet: can't connect to remote host (127.0.0.1): Connection refused
telnet: can't connect to remote host (127.0.0.1): Connection refused
[ many lines like this ]
telnet: can't connect to remote host (127.0.0.1): Connection refused
telnet: can't connect to remote host (127.0.0.1): Connection refused
Connected to 127.0.0.1

Let’s try sending something:

Connected to 127.0.0.1
Hello there.
Hello there.

So, the connection is established. Anything we send gets echoed back to us. No one else is using port 50000. Everything suggests that we have connected to ourselves! But how did that happen?

Let’s consider the pseudocode for how a server socket is typically set up:

s = socket()
bind(s, IP_ADDRESS, TCP_PORT)
listen(s)
cs = accept(s)
# The cs socket is ready for use.

IP_ADDRESS specifies the interface on which to accept incoming connections: for example, 127.0.0.1, 0.0.0.0, 192.168.1.2, etc. Typically, TCP_PORT is a non-zero constant known in advance to both client and server. When the server starts, it reserves this port and waits for clients to connect on that interface and port.

Now here’s the typical client pseudocode:

s = socket()
connect(s, IP_ADDRESS, TCP_PORT)
# The s socket is ready for use.

When connecting, the client knows which TCP port to connect to. It usually doesn’t care what source port is used. In the pseudocode above, the operating system assigns a port to the socket from the ephemeral port range — a range of ports designated for short-lived connections. In contrast, reserved or prearranged ports are typically assigned to services. Once the socket is closed, the ephemeral port becomes available again.

The repeated use of “typically” implies that the client/server pseudocode reflects the most common socket usage. In practice, things may differ a bit. For instance, a server isn’t required to bind to a predefined port: it could bind to port 0, and the OS will assign an ephemeral port automatically. The assigned port can then be retrieved via getsockname and somehow shared with the client. Likewise, a client can use bind to choose a specific local port before connecting, so that a particular source port is used.
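Binding to port 0 can be sketched with Python's socket module as follows. The listen_on_any_port helper is a name made up for this example; the port chosen by the OS is read back with getsockname().

```python
import socket


def listen_on_any_port(host: str = "127.0.0.1"):
    """Start listening on a port chosen by the OS and return (socket, port)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, 0))           # port 0 asks the OS to pick a free port
    s.listen(1)
    port = s.getsockname()[1]   # the port that was actually assigned
    return s, port
```

The returned port then has to be communicated to clients by some other channel, since they cannot know it in advance.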

The OS selects a port from the ephemeral port range in a somewhat randomized manner [3][4]. The search stops when an unused port is found or when all ports have been tried and found busy. Note that, to reduce contention between connect and bind, the connect call prefers even-numbered ports, falling back to odd ones only if all even ports are busy. [4] [5]

On Linux systems, you can check the ephemeral port range by running: cat /proc/sys/net/ipv4/ip_local_port_range

So it’s just a matter of time before telnet happens to get port 50000 as its source port while attempting to connect to 127.0.0.1:50000. At that point, the connect call sends a SYN segment, which is received by the same socket. Having sent a SYN and not yet received a SYN-ACK, the socket receives its own SYN. As a result, telnet ends up performing a simultaneous open with itself!

We then send and receive our own SYN-ACK, and the connection is fully established. Any further data we send simply loops back to us — so all payloads arrive back in telnet.
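A self-connection does not have to be left to chance. The sketch below, which assumes a Linux TCP stack as used throughout this article, binds a socket to a loopback port and then connects it to its own address; self_connect is a hypothetical helper name.

```python
import socket


def self_connect() -> socket.socket:
    """Create a TCP socket that is connected to itself (Linux behavior assumed)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind(("127.0.0.1", 0))    # let the OS pick a local port
        s.connect(s.getsockname())  # connect to our own address and port
    except OSError:
        s.close()
        raise
    return s
```

On such a socket, the local and peer addresses are identical, and anything written with sendall() can be read back with recv(), just like the echoed "Hello there." in the telnet session.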

Split Handshake

Looking at RFC 9293, we find the following:

The synchronization requires each side to send its own initial sequence number and to receive a confirmation of it in acknowledgment from the remote TCP peer. Each side must also receive the remote peer's initial sequence number and send a confirming acknowledgment.

    1) A --> B  SYN my sequence number is X
    2) A <-- B  ACK your sequence number is X
    3) A <-- B  SYN my sequence number is Y
    4) A --> B  ACK your sequence number is Y

Because steps 2 and 3 can be combined in a single message this is called the three-way (or three message) handshake (3WHS).

Fundamentally, the three-way handshake performs four logical actions, as illustrated in the diagram. Let’s try splitting the SYN-ACK segment into two separate segments: one ACK and one SYN. Strictly speaking, this no longer conforms to the specification, since RFC 9293 explicitly requires that a SYN-ACK be sent in response to a SYN. Still, let’s see how the client behaves.

Here’s the modified portion of the 3-way handshake server code:

def handle_packet(self, p: Packet) -> None:
    fs = set(p[TCP].flags)
    if fs == { "S" }:  # SYN
        self._connection_established = False
        self._isn = random.randint(0, 2**32 - 1)

        self._reply(
            p,
            flags="A",
            ack=uint32(p[TCP].seq + 1),
        )

        self._reply(p, flags="S", seq=self._isn)
    elif fs == { "A" }:  # ACK
        if self._connection_established:
            return
        self._connection_established = True

        self._reply(
            p,
            flags="A",
            ack=uint32(p[TCP].seq + 1),
            seq=uint32(self._isn + 1),
        )

        self._reply(
            p,
            create_http_response(
                "Hello from the split handshake TCP server (4-way)!"
            ),
            flags="PA",  # PSH-ACK
            ack=uint32(p[TCP].seq + 1),
            seq=uint32(self._isn + 1),
        )

Upon receiving a SYN segment, the server sends two separate segments: one with the ACK flag and the client’s ISN + 1, and then one with the SYN flag containing its own ISN. When the server receives the client’s ACK, it considers the connection established and immediately sends an HTTP response (for simplicity, without waiting for the HTTP request).

We test the setup — but it doesn’t work. The client never receives our payload. Let’s examine the simplified tcpdump logs:

client > server: Flags [S], seq 778074755
server > client: Flags [A], ack 778074756
server > client: Flags [S], seq 1811503747
client > server: Flags [SA], seq 778074755, ack 1811503748

# Followed by retransmissions due to lack of ACK:
client > server: Flags [SA], seq 778074755, ack 1811503748
client > server: Flags [SA], seq 778074755, ack 1811503748

Or, as a schematic:

    Client                Server
1.         SYN      -->
2.         ACK      <--
3.         SYN      <--
4.         SYN,ACK  -->

4.         SYN,ACK  -->
4.         SYN,ACK  -->
4.         SYN,ACK  -->
4.         SYN,ACK  -->

After receiving the ACK and then the SYN, the client responds with a SYN-ACK, as if the server were actively initiating the connection. In a way, the roles reverse — as if the client’s original SYN never existed, and the server was the initiator of the connection.

Upon receiving the SYN, the client acknowledges it with the server’s ISN + 1 and re-sends its own ISN. From the server’s SYN segment onward (step 3), the exchange looks like a normal 3-way handshake.
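The sequence and acknowledgment numbers in the trace above can be checked with a little arithmetic. This is only a sketch: the `uint32` function below is assumed to behave like the helper of the same name used in the server code (wrapping into the 32-bit sequence space), and the variable names are illustrative.

```python
MOD = 2**32


def uint32(n: int) -> int:
    """Wrap n into the 32-bit sequence number space (assumed helper behavior)."""
    return n % MOD


# ISNs taken from the tcpdump output above.
client_isn = 778074755
server_isn = 1811503747

# Step 2: the server's bare ACK acknowledges the client's SYN.
server_ack = uint32(client_isn + 1)

# Step 4: the client's SYN-ACK re-sends its own ISN and acknowledges the server's SYN.
client_synack = {"seq": client_isn, "ack": uint32(server_isn + 1)}

# The ACK the client is still waiting for: one past each side's ISN.
server_final = {
    "seq": uint32(server_isn + 1),
    "ack": uint32(client_synack["seq"] + 1),
}

print(server_ack, client_synack, server_final)
```

The values match the `ack 778074756` and `ack 1811503748` fields seen in the capture.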

Let’s play along and reply with an ACK to the client’s SYN-ACK:

elif fs == { "S", "A" }:  # SYN-ACK
    self._reply(
        p,
        flags="A",
        ack=uint32(p[TCP].seq + 1),
        seq=uint32(self._isn + 1),
    )

    self._reply(
        p,
        create_http_response(
            "Hello from the split handshake TCP server (5-way)!"
        ),
        flags="PA",  # PSH-ACK
        ack=uint32(p[TCP].seq + 1),
        seq=uint32(self._isn + 1),
    )

Now curl successfully connects and receives our HTTP response. Let’s try accessing it from a browser at http://localhost:8000:

[Screenshot: http://localhost:8000 opened in a browser.]

We’ve now created a 5-segment handshake. Here’s how the connection setup looks:

    Client                Server
1.         SYN      -->
2.         ACK      <--
3.         SYN      <--
4.         SYN,ACK  -->
5.         ACK      <--

This approach is called a split handshake because the usual SYN-ACK is split into two segments. Although the server’s behavior does not conform to the specification, the connection is still established in practice.

According to RFC 9293, the ACK in step 2 of the 5-segment split handshake is acceptable to the client, which is still in the SYN-SENT state, but because the segment carries neither SYN nor RST it is ultimately discarded. The server can therefore send zero or more such ACK segments and the handshake still completes.
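To make the client-side behavior concrete, here is a toy model of how a client that has already sent its SYN might react to the segments discussed above. It is a heavily simplified sketch loosely following the SYN-SENT processing rules, not a real TCP implementation: `ToyClient`, `State`, and `receive` are hypothetical names, and RST handling, windows, and sequence-number wraparound checks are omitted.

```python
from enum import Enum, auto


class State(Enum):
    SYN_SENT = auto()
    SYN_RECEIVED = auto()
    ESTABLISHED = auto()


class ToyClient:
    """Hypothetical, heavily simplified client that has already sent its SYN."""

    def __init__(self, iss: int) -> None:
        self.iss = iss
        self.snd_nxt = (iss + 1) % 2**32  # the SYN consumed one sequence number
        self.rcv_nxt = None
        self.state = State.SYN_SENT

    def receive(self, flags: str, seq: int = 0, ack: int = 0):
        """Process one segment; return the flags of the reply, or None."""
        has_syn, has_ack = "S" in flags, "A" in flags
        if self.state is State.SYN_SENT:
            if has_ack and ack != self.snd_nxt:
                return "R"  # unacceptable ACK: answer with a reset
            if has_syn:
                self.rcv_nxt = (seq + 1) % 2**32
                if has_ack:
                    self.state = State.ESTABLISHED
                    return "A"  # ordinary 3-way handshake
                self.state = State.SYN_RECEIVED
                return "SA"  # bare SYN: reply as if the peer were the initiator
            return None  # neither SYN nor RST: the segment is dropped
        if self.state is State.SYN_RECEIVED:
            if has_ack and not has_syn and ack == self.snd_nxt:
                self.state = State.ESTABLISHED
            return None
        return None


client = ToyClient(iss=778074755)
# Any number of bare ACKs is silently dropped while in SYN-SENT.
replies = [client.receive("A", ack=778074756) for _ in range(3)]
# The bare SYN triggers the client's SYN-ACK.
synack = client.receive("S", seq=1811503747)
# The server's final ACK completes the handshake.
client.receive("A", seq=1811503748, ack=778074756)
print(replies, synack, client.state)
```

In this model the bare ACKs change nothing, which is why the 5-segment and 4-segment variants end up in the same state.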

Let’s now implement the canonical 4-segment version, which skips the intermediate ACK:

    def handle_packet(self, p: Packet) -> None:
        fs = set(p[TCP].flags)
        if fs == { "S" }:
            self._isn = random.randint(0, 2**32 - 1)
            self._reply(p, flags="S", seq=self._isn)
        elif fs == { "S", "A" }:
            self._reply(
                p,
                flags="A",
                ack=uint32(p[TCP].seq + 1),
                seq=uint32(self._isn + 1),
            )

            self._reply(
                p,
                create_http_response(
                    "Hello from the split handshake TCP server (4-way)!"
                ),
                flags="PA",
                ack=uint32(p[TCP].seq + 1),
                seq=uint32(self._isn + 1),
            )

Let’s verify it works:

[Screenshot: http://localhost:8000 opened in a browser.]

We’ve now implemented another method of establishing a TCP connection, using four segments:

    Client                Server
1.         SYN      -->
2.         SYN      <--
3.         SYN,ACK  -->
4.         ACK      <--

According to [1], both the 4-segment and 5-segment versions are referred to as split handshakes. The term fits the 5-segment case well, since the SYN-ACK is explicitly split in two. It is less intuitive for the 4-segment version, but the same name is used there as well.

How does this differ from a normal 3-way handshake or a simultaneous open? The split handshake combines elements of both. The beginning (SYN-SYN) mirrors a simultaneous open, while the completion (SYN, SYN-ACK, ACK) resembles a 3-way handshake.

Effectively, the server’s SYN makes the client “forget” the SYN it sent earlier, flipping the connection logic. From that point on, it behaves like a regular 3-way handshake with the roles reversed.
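The differences between the three establishment patterns can be summarized by looking at which side sends a bare SYN and which side sends a SYN-ACK. The function below is a rough illustrative sketch: `classify_handshake` is a hypothetical name, and the `"C"`/`"S"` labels and flag strings simply mirror the schematics above.

```python
def classify_handshake(segments):
    """Classify a handshake from (sender, flags) pairs, e.g. ("C", "SA").

    Sender is "C" (client) or "S" (server); flags is a string of flag letters.
    """
    syn_only = {side for side, flags in segments if set(flags) == {"S"}}
    syn_ack = {side for side, flags in segments if set(flags) == {"S", "A"}}

    if syn_only == {"C"} and syn_ack == {"S"}:
        return "three-way handshake"
    if syn_only == {"C", "S"} and syn_ack == {"C", "S"}:
        return "simultaneous open"
    if syn_only == {"C", "S"} and syn_ack == {"C"}:
        # Only the original initiator answers with SYN-ACK.
        return "split handshake"
    return "unknown"


three_way = [("C", "S"), ("S", "SA"), ("C", "A")]
simultaneous = [("C", "S"), ("S", "S"), ("C", "SA"), ("S", "SA")]
split_4 = [("C", "S"), ("S", "S"), ("C", "SA"), ("S", "A")]
split_5 = [("C", "S"), ("S", "A"), ("S", "S"), ("C", "SA"), ("S", "A")]

for name, trace in [("3-way", three_way), ("simultaneous", simultaneous),
                    ("4-segment", split_4), ("5-segment", split_5)]:
    print(name, "->", classify_handshake(trace))
```

Both split variants land in the same category because the extra bare ACK in the 5-segment version does not affect who sends SYN and SYN-ACK.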

After this phenomenon was discovered, several firewalls and intrusion detection systems were tested. Some of them did not recognize such a sequence of segments as a valid connection establishment at all. Others misidentified the roles of the parties, treating the server as the client and the client as the server. Because of the common logic of “expect potential malicious behavior from the client, not the server”, this allowed malicious traffic originating from the client to an attacker-controlled server to evade detection. [1]
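The role confusion can be illustrated with a deliberately naive heuristic. The sketch below does not describe how any tested product works: both functions and the example addresses are hypothetical, and they only show how a rule such as "whoever sends the SYN-ACK is the server" gives the wrong answer for a split handshake.

```python
def roles_by_synack(segments):
    """Naive rule: the sender of the first SYN-ACK is assumed to be the server."""
    for src, dst, flags in segments:
        if set(flags) == {"S", "A"}:
            return {"server": src, "client": dst}
    return None


def roles_by_first_syn(segments):
    """Alternative rule: the sender of the first bare SYN is the client."""
    for src, dst, flags in segments:
        if set(flags) == {"S"}:
            return {"client": src, "server": dst}
    return None


# Hypothetical addresses: 10.0.0.1 really is the client, 10.0.0.2 the server.
CLIENT, SERVER = "10.0.0.1", "10.0.0.2"

split_handshake = [
    (CLIENT, SERVER, "S"),
    (SERVER, CLIENT, "S"),
    (CLIENT, SERVER, "SA"),
    (SERVER, CLIENT, "A"),
]

print(roles_by_synack(split_handshake))
print(roles_by_first_syn(split_handshake))
```

With the split handshake trace, the first rule labels the real client as the server, while the second keeps the roles straight.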

References

  1. Macrothink Institute: The TCP Split Handshake: Practical Effects on Modern Network Equipment
  2. Stevens, Richard. TCP/IP Illustrated, Volume 1: The Protocols
  3. LWN.net: Fingerprinting systems with TCP source-port selection
  4. Cloudflare Blog: connect() — why are you so slow?
  5. Linux kernel source tree: tcp/dccp: better use of ephemeral ports in connect()