Using TLS ECH from Python


At first, the idea of encrypting more of the metadata found inside the initial packet (the “ClientHello”) of a TLS connection may seem simple and obvious, but there are of course reasons that this wasn’t done right from the start. In this post I will describe the flow of a connection using Encrypted Client Hello (ECH) to protect the metadata fields, and present a working code example that uses a fork of CPython, built with the DEfO project’s OpenSSL fork, to connect to ECH-enabled HTTPS servers.

To understand why this is an issue, let’s take a step back and look at how websites are hosted. Many websites are hosted on shared servers, which means that a single server machine is responsible for serving multiple websites, possibly hundreds or thousands. This is known as the shared hosting model. In this setup, when a user types in a URL or clicks on a link to visit a website and the browser connects to the server, the server needs to know which website the user is requesting. This is where the Server Name Indication (SNI) comes in - it’s a field in the initial packet of a TLS connection that tells the server which website the user is trying to access. The server can then send the correct certificate so that the browser can authenticate the connection, and then send the requested website content.

Because this field is sent unencrypted, anyone who can see the traffic between the user’s browser and the server can read the SNI and know which website the user is trying to visit. This can be a privacy concern, as it allows ISPs, network administrators, or other unwanted observers to build a profile of the user’s browsing history. It’s not just about the websites they visit, but also about the potential for censorship or targeted attacks. With the SNI being unencrypted, it’s like sending a postcard with the address visible to anyone who handles it - it may not be the end of the world for most browsing activity, but it’s certainly not private. Encrypted Client Hello aims to change this by encrypting the SNI and other metadata, making it much harder for third parties to intercept and exploit this information.

So, why wasn’t it easy to protect the SNI and other metadata from the start? The main challenge was that, in order to encrypt the SNI, the client (i.e., the user’s browser) needs to know the public key that the server wants the ClientHello to be encrypted with in advance. However, the server’s ECH public key is tied to the specific website being requested, and there wasn’t a straightforward way to discover a public key that could be used to talk to the server without revealing the SNI. This created a chicken-and-egg problem, where the client couldn’t encrypt the SNI without knowing the server’s public key, but it couldn’t know the server’s public key without sending the SNI in plaintext.

This problem is solved with ECH by introducing a new type of DNS record, called an HTTPS record. An HTTPS record is a special type of DNS record that contains the ECH public key of the server, along with other metadata, in a way that can be retrieved by the client without revealing the SNI (the website name is still leaked via the DNS request, but it is possible to protect your requests using DNS-over-TLS or DNS-over-HTTPS). The HTTPS record is typically retrieved by the client during the DNS lookup process, before the TLS connection is established.

The HTTPS record contains an ECH configuration, which is used to encrypt the SNI and other metadata. This is generated by the server and is tied to the specific configuration of the server, rather than to a specific website. By using HTTPS records to retrieve the server’s ECH public key, we are able to break the chicken-and-egg problem and provide a way to encrypt the SNI and other metadata.

Before we can look up the HTTPS record, it’s first necessary to work out where that record would live. These records have been designed to be quite flexible, so they can accommodate services running on non-default port numbers. If the default port number is in use then the HTTPS record will be on the same domain name as the website, but for non-default port numbers, there will be a prefix to the domain name:

import urllib.parse
from typing import Optional


def svcbname(url: str) -> Optional[str]:
    """Derive DNS name of SVCB/HTTPS record corresponding to target URL."""
    parsed = urllib.parse.urlparse(url)
    if parsed.scheme == "https":
        # The default HTTPS port uses the bare hostname.
        if (parsed.port or 443) == 443:
            return parsed.hostname
        else:
            return f"_{parsed.port}._https.{parsed.hostname}"
    elif parsed.scheme == "http":
        # Plain HTTP on its default port maps to the same record as HTTPS.
        if (parsed.port or 80) in (443, 80):
            return parsed.hostname
        else:
            return f"_{parsed.port}._https.{parsed.hostname}"
    else:
        # For now, no other scheme is supported
        return None
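As a quick illustration of the naming rule, here is a condensed, self-contained sketch with a few sample URLs. The helper name https_record_name and the PLAIN_PORTS table are made up for this example and are not part of the client; the behaviour is intended to mirror svcbname above.

```python
import urllib.parse
from typing import Optional

# Ports for which no "_port._https." prefix is added, per scheme
# (hypothetical table, mirroring the branches of svcbname).
PLAIN_PORTS = {"https": (443,), "http": (80, 443)}
DEFAULT_PORT = {"https": 443, "http": 80}


def https_record_name(url: str) -> Optional[str]:
    """Return the DNS name to query for the HTTPS record, or None."""
    parsed = urllib.parse.urlparse(url)
    if parsed.scheme not in PLAIN_PORTS:
        return None  # unsupported scheme
    port = parsed.port or DEFAULT_PORT[parsed.scheme]
    if port in PLAIN_PORTS[parsed.scheme]:
        return parsed.hostname
    return f"_{port}._https.{parsed.hostname}"


for sample_url in (
    "https://example.com/",
    "https://example.com:8443/index.html",
    "http://example.com/",
    "ftp://example.com/",
):
    print(sample_url, "->", https_record_name(sample_url))
```

Only the URL with the non-default port gains a prefix; the others map to the bare hostname, or to None for the unsupported scheme.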

To keep it simple, the examples in this post will use plain DNS but the technique is equally applicable to DNS-over-TLS and DNS-over-HTTPS. Now that we have the domain name to query, we can fetch the ECH configuration from the DNS using the dnspython library:

import logging
import sys
from typing import List

import dns.resolver


def get_ech_configs(domain: str) -> List[bytes]:
    try:
        answers = dns.resolver.resolve(domain, "HTTPS")
    except dns.resolver.NoAnswer:
        logging.warning(f"No HTTPS record found for {domain}")
        return []
    except Exception as e:
        logging.critical(f"DNS query failed: {e}")
        sys.exit(1)
    configs: List[bytes] = []
    for rdata in answers:
        if hasattr(rdata, "params"):
            params = rdata.params
            # SvcParamKey 5 is the "ech" parameter
            echconfig = params.get(5)
            if echconfig:
                configs.append(echconfig.ech)
    if len(configs) == 0:
        logging.warning(f"No echconfig found in HTTPS record for {domain}")
    return configs
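To get a feel for what one of these configurations holds, the following is a minimal sketch that walks the ECHConfigList wire format and pulls out the config ID, the KEM ID, and the public name (the name that ends up as the decoy SNI). It assumes each entry returned above is a raw ECHConfigList and that the layout is the one defined for ECH version 0xfe0d. The names parse_ech_config_list and EchConfigInfo are invented for this example, and the sample bytes are fabricated (a zeroed key and the name public.example), not taken from a real server.

```python
import struct
from typing import List, NamedTuple

ECH_VERSION = 0xFE0D  # ECHConfig version assumed by this sketch


class EchConfigInfo(NamedTuple):
    config_id: int
    kem_id: int
    public_name: str


def parse_ech_config_list(blob: bytes) -> List[EchConfigInfo]:
    """Extract a few fields from each ECHConfig in an ECHConfigList."""
    (total,) = struct.unpack_from("!H", blob, 0)
    if total != len(blob) - 2:
        raise ValueError("ECHConfigList length mismatch")
    infos: List[EchConfigInfo] = []
    pos = 2
    while pos < len(blob):
        version, length = struct.unpack_from("!HH", blob, pos)
        pos += 4
        body = blob[pos:pos + length]
        if len(body) != length:
            raise ValueError("truncated ECHConfig")
        pos += length
        if version != ECH_VERSION:
            continue  # skip versions this sketch does not understand
        # HpkeKeyConfig: config_id, kem_id, length-prefixed public key
        config_id, kem_id, key_len = struct.unpack_from("!BHH", body, 0)
        off = 5 + key_len
        # Length-prefixed list of cipher suites, skipped here
        (suites_len,) = struct.unpack_from("!H", body, off)
        off += 2 + suites_len
        # maximum_name_length, then the length-prefixed public_name
        _max_name_len, name_len = struct.unpack_from("!BB", body, off)
        off += 2
        public_name = body[off:off + name_len].decode("ascii")
        infos.append(EchConfigInfo(config_id, kem_id, public_name))
    return infos


# Fabricated sample: one config with a zeroed 32-byte key.
name = b"public.example"
contents = (
    struct.pack("!BHH", 7, 0x0020, 32) + bytes(32)  # config_id, kem_id, key
    + struct.pack("!HHH", 4, 0x0001, 0x0001)        # one (kdf_id, aead_id) suite
    + struct.pack("!BB", 0, len(name)) + name       # max name length, public_name
    + struct.pack("!H", 0)                          # no extensions
)
config = struct.pack("!HH", ECH_VERSION, len(contents)) + contents
sample = struct.pack("!H", len(config)) + config

infos = parse_ech_config_list(sample)
print(infos)
```

This is only for inspection and debugging; the client does not need to parse the configuration itself, as the raw bytes are handed to the TLS library unchanged.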

Once the ECH configurations are known, these can be used to establish the connection and fetch the website:

import socket
import ssl

import certifi


def get_http(url, ech_configs) -> bytes:
    parsed = urllib.parse.urlparse(url)
    # Fall back to the default HTTPS port and the root path if the URL omits them
    hostname, port, path = parsed.hostname, parsed.port or 443, parsed.path or "/"
    logging.debug(f"Performing GET request for https://{hostname}:{port}{path}")
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_verify_locations(certifi.where())
    for config in ech_configs:
        try:
            # Provided by the CPython fork; not available in stock CPython
            context.set_ech_config(config)
        except ssl.SSLError as e:
            logging.error(f"SSL error: {e}")
    with socket.create_connection((hostname, port)) as sock:
        with context.wrap_socket(sock, server_hostname=hostname, do_handshake_on_connect=False) as ssock:
            try:
                ssock.do_handshake()
                logging.debug("Handshake completed with ECH status: %s", ssock.get_ech_status().name)
                logging.debug("Inner SNI: %s, Outer SNI: %s", ssock.server_hostname, ssock.outer_server_hostname)
                request = f'GET {path} HTTP/1.1\r\nHost: {hostname}\r\nConnection: close\r\n\r\n'
                ssock.sendall(request.encode('utf-8'))
                # Read until the server closes the connection
                response = b''
                while True:
                    data = ssock.recv(4096)
                    if not data:
                        break
                    response += data
                return response
            except ssl.SSLError as e:
                logging.error(f"SSL error: {e}")
                raise

The important step here is the new set_ech_config method on the SSLContext that allows you to add the ECH configuration containing the public key. If there are multiple records, the underlying OpenSSL will determine which of the keys to use. There are also a few new methods that allow you to get the status information relating to ECH from the SSLSocket after the completion of the handshake.

In the simple case, that’s all there is to it. If you were to watch the connection with Wireshark you would not be able to see the true SNI being sent to the server and would only see the decoy SNI present in the unencrypted “ClientHelloOuter”. This decoy SNI is added to appease middleboxes that may block traffic, accidentally or deliberately, if that field is missing entirely. There are also further protections against such middleboxes from the application of GREASE:

If the client attempts to connect to a server and does not have an ECHConfig structure available for the server, it SHOULD send a GREASE “encrypted_client_hello” extension in the first ClientHello […]

This means that if your client supports ECH but does not have the configuration available to use it, the client should still send an ECH extension filled with nonsense. This helps to detect deployment issues early, as errors will be immediately obvious to users and won’t rely on servers having deployed ECH before the errors are triggered.

Finally, if the server sees this GREASE ECH extension then it can use this to know that you support ECH but didn’t have a configuration available. In its reply, it can send a “retry config” and then terminate the connection. You then have the configuration available to start the connection again with a real ECH extension this time, and can cache that for future requests too.
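The retry flow described above can be sketched roughly as follows. This is not how the fork exposes retry configs: the EchRetryRequired exception, the fetch_with_ech_retry helper, and the connect callable are all hypothetical stand-ins, used here only to show the shape of the logic (try once, cache any retry configs, then reconnect once with them).

```python
from typing import Callable, Dict, List


class EchRetryRequired(Exception):
    """Hypothetical signal that the server supplied ECH retry configs."""

    def __init__(self, retry_configs: List[bytes]):
        super().__init__("server supplied ECH retry configs")
        self.retry_configs = retry_configs


def fetch_with_ech_retry(
    connect: Callable[[List[bytes]], bytes],
    configs: List[bytes],
    cache: Dict[str, List[bytes]],
    cache_key: str,
) -> bytes:
    """Try once with the known configs, then once more with retry configs."""
    try:
        return connect(configs)
    except EchRetryRequired as exc:
        # Remember the fresh configs for future requests to this host.
        cache[cache_key] = exc.retry_configs
        # Retry exactly once; a second failure propagates to the caller.
        return connect(exc.retry_configs)


# Stand-in for a real connection function, for demonstration only.
def fake_connect(configs: List[bytes]) -> bytes:
    if configs != [b"fresh-config"]:
        raise EchRetryRequired([b"fresh-config"])
    return b"HTTP/1.1 200 OK"


cache: Dict[str, List[bytes]] = {}
body = fetch_with_ech_retry(fake_connect, [], cache, "example.com")
print(body, cache)
```

Limiting the loop to a single retry keeps a misconfigured server from sending the client around in circles.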

For a full client example including the use of retry configs, you can see our example Python client on GitHub. You’ll need to use this with our CPython fork and OpenSSL fork.