Internet-Draft | QUIC Retry Offload | March 2024 |
Duke & Banks | Expires 27 September 2024 | [Page] |
QUIC uses Retry packets to reduce load on stressed servers, by forcing the client to prove ownership of its address before the server commits state. QUIC also has an anti-tampering mechanism to prevent the unauthorized injection of Retry packets into a connection. However, a server operator may want to offload production of Retry packets to an anti-Denial-of-Service agent or hardware accelerator. "Retry Offload" is a mechanism for coordination between a server and an external generator of Retry packets that can succeed despite the anti-tampering mechanism.¶
Discussion of this document takes place on the QUIC Working Group mailing list (quic@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/quic/.¶
Source for this draft and an issue tracker can be found at https://github.com/quicwg/load-balancers.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 27 September 2024.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
QUIC [RFC9000] servers send Retry packets to avoid prematurely allocating resources when under stress, such as during a Denial of Service (DoS) attack. Because both Initial packets and Retry packets have weak authentication properties, the Retry packet contains an encrypted token that helps the client and server to validate, via transport parameters, that an attacker did not inject or modify a packet of either type for this connection attempt.¶
However, a server under stress is less inclined to process incoming Initial packets and compute the Retry token in the first place. An analogous mechanism for TCP is syncookies [RFC4987]. As TCP has weaker authentication properties than QUIC, syncookie generation can often be offloaded to a hardware device, or to an anti-Denial-of-Service provider that is topologically far from the protected server. As such an offload would behave exactly like an attacker, QUIC's authentication methods make such a capability impossible.¶
This document seeks to enable offloading of Retry generation in QUIC via explicit coordination between servers and the hardware or provider offload, which this document refers to as a "Retry Offload." It has two modes, each suited to a different use case.¶
The no-shared-state mode has minimal coordination and does not require key sharing. While operationally easier to configure and manage, it places severe constraints on the operational profile of the offload. In particular, the offload must control all ingress to the server and fail closed.¶
The shared-state mode removes the operational constraints, but also requires more sophisticated key management.¶
Both modes specify a common format for encoding information in the Retry token, so that the server can correctly populate the relevant transport parameter fields.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].¶
In this document, these words will appear with that interpretation only when in ALL CAPS. Lower case uses of these words are not to be interpreted as carrying significance described in RFC 2119.¶
For brevity, "Connection ID" will often be abbreviated as "CID".¶
A "Retry Offload" is a hardware or software device that is conceptually separate from a QUIC server that terminates QUIC connections. This document assumes that the Retry Offload and the server have an administrative relationship that allows them to accept common configuration.¶
A "configuration agent" is some entity that determines the common configuration to be distributed to the servers and the Retry Offload.¶
This document uses "QUIC" to refer to the protocol in QUIC version 1 [RFC9000]. Retry offloads can be applied to other versions of QUIC that use Retry packets and have identical information requirements for Retry validation. However, note that source and destination connection IDs are the only relevant data fields that are invariant across QUIC versions [RFC8999].¶
Regardless of mechanism, a Retry Offload has an active mode, where it is generating Retry packets, and an inactive mode, where it is not, based on its assessment of server load and the likelihood an attack is underway. The choice of mode MAY be made on a per-packet or per-connection basis, through a stochastic process or based on client address.¶
A configuration agent MUST distribute a list of QUIC versions the Retry Offload supports. It MAY also distribute either an "Allow-List" or a "Deny-List" of other QUIC versions. It MUST NOT distribute both an Allow-List and a Deny-List.¶
The Allow-List or Deny-List MUST NOT include any versions included for Retry Offload support.¶
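The constraints on the distributed version lists can be sketched as a small validation routine. This is a minimal illustration only: the function name `validate_version_config` and its parameters are hypothetical and are not defined by this document.

```python
from typing import Optional, Set


def validate_version_config(
    supported: Set[int],
    allow_list: Optional[Set[int]] = None,
    deny_list: Optional[Set[int]] = None,
) -> None:
    """Raise ValueError if the version configuration breaks the rules above.

    All names here are illustrative assumptions, not part of the draft.
    """
    # An Allow-List and a Deny-List are mutually exclusive.
    if allow_list is not None and deny_list is not None:
        raise ValueError("cannot distribute both an Allow-List and a Deny-List")
    # Whichever list is present must not mention a supported version.
    exceptions = allow_list if allow_list is not None else (deny_list or set())
    overlap = exceptions & supported
    if overlap:
        raise ValueError(f"supported versions present in list: {sorted(overlap)}")
```

A configuration agent could run such a check before distributing configuration, so that an inconsistent list is rejected rather than silently interpreted.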
The configuration agent MUST provide a means for the entity that controls the Retry Offload to report its supported version(s) to the configuration agent. If the entity has not reported this information, it MUST NOT activate the Retry Offload and the configuration agent MUST NOT distribute configuration that activates it.¶
The configuration agent MAY delete versions from the final supported version list if policy does not require the Retry Offload to operate on those versions.¶
The configuration agent MUST provide a means for the entities that control servers behind the Retry Offload to report either an Allow-List or a Deny-List.¶
If all entities supply Allow-Lists, the consolidated list MUST be the union of these sets. If all entities supply Deny-Lists, the consolidated list MUST be the intersection of these sets.¶
If entities provide a mixture of Allow-Lists and Deny-Lists, the consolidated list MUST be a Deny-List that is the intersection of all provided Deny-Lists and the inverses of all Allow-Lists.¶
If no entities that control servers have reported Allow-Lists or Deny-Lists, the default is a Deny-List with the null set (i.e., all unsupported versions will be admitted). This preserves the future extensibility of QUIC.¶
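The consolidation rules above reduce to a few set operations. The following is a minimal sketch under stated assumptions: the `Report` tuple shape and the function names `consolidate` and `admits` are hypothetical. It relies on the observation that intersecting a Deny-List with the inverse of an Allow-List is the same as removing the allowed versions from the Deny-List.

```python
from typing import Iterable, List, Set, Tuple

# A report is ("allow", versions) or ("deny", versions). This shape is an
# assumption made for illustration only.
Report = Tuple[str, Set[int]]


def consolidate(reports: Iterable[Report]) -> Report:
    """Merge per-server Allow-Lists and Deny-Lists into one list."""
    allows: List[Set[int]] = []
    denies: List[Set[int]] = []
    for kind, versions in reports:
        if kind == "allow":
            allows.append(versions)
        elif kind == "deny":
            denies.append(versions)
        else:
            raise ValueError(f"unknown list type: {kind!r}")

    allowed_anywhere: Set[int] = set().union(*allows)
    if not denies:
        if not allows:
            # No reports at all: empty Deny-List, so everything is admitted.
            return ("deny", set())
        # Only Allow-Lists: the union of the sets.
        return ("allow", allowed_anywhere)
    # Only Deny-Lists, or a mixture: intersect the Deny-Lists, then remove
    # every version that some server allows (the inverse of each Allow-List).
    return ("deny", set.intersection(*denies) - allowed_anywhere)


def admits(consolidated: Report, version: int) -> bool:
    """True if an unsupported version passes the consolidated list."""
    kind, versions = consolidated
    return version in versions if kind == "allow" else version not in versions
```

In every case the result admits a version if at least one server behind the Retry Offload is willing to receive it.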
A Retry Offload MUST forward all packets for a QUIC version it does not support if that version is absent from the Deny-List or present on the Allow-List, whichever is configured. Note that if servers support versions the Retry Offload does not, this may increase load on the servers.¶
Note that future versions of QUIC might not have Retry packets, require different information in Retry, or use different packet type indicators.¶
Retry Offloads SHOULD treat Initial packets from the same connection with a uniform policy. Initial packets of the first and second client flight can be difficult to distinguish without expensive decryption of the contents, which is unsuitable under the conditions of a DDoS attack. If the first packet of a connection is admitted without Retry, but the second triggers a Retry, that Retry packet will be ignored and the loss of an Initial coalesced with other packets can impair performance. In some situations, the client does not yet have handshake keys, and dropping further client Initial packets creates a deadlock where the connection cannot progress.¶
The simplest means to ensure this is to require, when active, a Retry Token for all incoming Initial packets, and send a Retry packet otherwise. If the Retry Offload is to be more selective, one technique keeps state on which address/port 4-tuples have been admitted. Another would be to apply a secure hash to the source IP address, port, and connection ID to deterministically compute whether the Initial requires a Retry Token or not. These source values remain consistent over the handshake.¶
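The hash-based technique can be sketched as follows. This is an illustration under assumptions rather than a specified algorithm: the use of HMAC-SHA256, the local `secret`, the `retry_fraction` parameter, and the function name are all hypothetical, and the connection ID is taken to be the client's Source Connection ID because the text notes that the source values stay constant over the handshake.

```python
import hashlib
import hmac
import ipaddress


def requires_retry_token(
    secret: bytes,
    src_ip: str,
    src_port: int,
    src_cid: bytes,
    retry_fraction: float,
) -> bool:
    """Deterministically decide whether this flow must present a Retry token.

    retry_fraction is the share of flows (0.0 to 1.0) subjected to Retry.
    """
    ip_bytes = ipaddress.ip_address(src_ip).packed
    # Length-prefix the variable-length fields so inputs cannot be confused.
    msg = (
        bytes([len(ip_bytes)]) + ip_bytes
        + src_port.to_bytes(2, "big")
        + bytes([len(src_cid)]) + src_cid
    )
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    # Compare the first 64 bits of the digest against an integer threshold.
    threshold = int(retry_fraction * 2**64)
    return int.from_bytes(digest[:8], "big") < threshold
```

Because the decision depends only on values that do not change during the handshake, every Initial from the same connection attempt receives the same treatment without per-flow state. Keying the hash with a secret is a design choice of this sketch, intended to keep clients from predicting which flows are exempt.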
However, even with these techniques there is a potential problem when a Retry Offload switches from inactive to active mode. The Retry Offload could admit the first packet while in inactive mode, and then drop subsequent Initials in active mode.¶
If the Retry Offload is always on-path, it MAY keep state on incoming connections while in inactive mode to avoid this problem. If it cannot or will not keep such state, it SHOULD implement "transition mode" for an interval chosen to include the likely Initial packet exchange of most clients (200ms is a sensible default).¶
In transition mode, Retry Offloads process Initial packets with Retry tokens as in active mode. When the Retry Offload receives an Initial packet with no token, it issues a Retry AND forwards the packet to the server. If the client has already received a packet from the server, it will ignore the Retry and the connection will progress normally. If not, the client will reconnect based on the Retry, the server's response to the first Initial will be discarded, and the connection will progress normally based on the client's second Initial. Appendix B explores the various possible packet sequences in transition mode.¶
Note that transition mode provides no actual DDoS relief to the server, so its duration should be as short as possible. The Retry Offload can choose not to implement transition mode and cause some client connections to fail.¶
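The handling of an Initial packet in each of the three modes can be summarized in a short decision function. This is a minimal sketch: the `Mode` enum, the action strings, and the function name are hypothetical, and dropping an Initial whose token fails validation is an assumption of the sketch, since this section does not say what happens to such packets.

```python
from enum import Enum
from typing import List


class Mode(Enum):
    INACTIVE = "inactive"
    TRANSITION = "transition"
    ACTIVE = "active"


def handle_initial(mode: Mode, has_token: bool, token_valid: bool) -> List[str]:
    """Return the actions a Retry Offload takes for one Initial packet."""
    if mode is Mode.INACTIVE:
        # No Retry generation: everything goes to the server.
        return ["forward"]
    if has_token:
        # Transition mode treats tokens exactly as active mode does.
        # ASSUMPTION: an invalid token results in a drop.
        return ["forward"] if token_valid else ["drop"]
    if mode is Mode.TRANSITION:
        # Issue a Retry AND let the server see the packet.
        return ["send_retry", "forward"]
    # Active mode: the client must come back with a token.
    return ["send_retry"]
```

The only difference between transition mode and active mode is the extra `forward` for token-less Initials, which is why transition mode offers the server no relief.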
Servers operating behind a Retry Offload SHOULD implement a mechanism that operates whenever a client Initial arrives with a valid Retry token. If there is another connection with identical client Connection ID, IP, and Port, but with an unvalidated address, that connection is immediately and silently terminated. This mechanism eliminates incorrect connection state that is an artifact of transition mode, as explained in Appendix B.¶
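One way a server might implement this cleanup is to index connections by the client's Connection ID, IP address, and port. The sketch below is illustrative only: `ConnectionTable`, `Connection`, and their fields are hypothetical names, and a real server would also release whatever other state the stale connection holds.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List, Tuple

# (client Connection ID, client IP, client port); an illustrative key shape.
FlowKey = Tuple[bytes, str, int]


@dataclass
class Connection:
    server_cid: bytes
    address_validated: bool = False


class ConnectionTable:
    def __init__(self) -> None:
        self._by_flow: DefaultDict[FlowKey, List[Connection]] = defaultdict(list)

    def add(self, flow: FlowKey, conn: Connection) -> None:
        self._by_flow[flow].append(conn)

    def connections(self, flow: FlowKey) -> List[Connection]:
        return list(self._by_flow.get(flow, []))

    def on_initial_with_valid_token(self, flow: FlowKey, new_conn: Connection) -> int:
        """Register new_conn and silently drop unvalidated siblings.

        Returns the number of connections terminated. Nothing is sent to
        the peer for the terminated connections.
        """
        existing = self._by_flow[flow]
        survivors = [c for c in existing if c.address_validated]
        terminated = len(existing) - len(survivors)
        new_conn.address_validated = True  # the token proved the address
        survivors.append(new_conn)
        self._by_flow[flow] = survivors
        return terminated
```

Connections whose address is already validated are left alone, so a replayed Initial cannot be used to tear down an established connection.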
Initial Packets are especially effective at consuming server resources because they cause the server to create connection state. Even when mitigating this load with Retry Packets, the act of validating an Initial Token and sending a Retry Packet is more expensive than the response to a non-Initial packet with an unknown Connection ID: simply dropping it and/or sending a Stateless Reset.¶
Nevertheless, a Retry Offload in Active Mode might desire to shield servers from non-Initial packets that do not correspond to a previously admitted Initial Packet. This has a number of considerations.¶
If a Retry Offload maintains no per-flow state, it cannot distinguish between valid and invalid non-Initial packets and MUST forward all non-Initial Packets to the server.¶
For QUIC versions that the Retry Offload does not support and that are present on the Allow-List (or absent from the Deny-List), the Retry Offload cannot distinguish Initial Packets from other long headers and therefore MUST admit all long headers.¶
If a Retry Offload keeps per-flow state, it can identify 4-tuples that have been previously approved, admit non-Initial packets from those flows, and drop all others. However, dropping short headers will effectively break Address Migration and NAT Rebinding when in Active Mode, as post-migration packets will arrive with a previously unknown 4-tuple. This policy will also break connection attempts using any new QUIC versions that begin connections with a short header.¶
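The per-flow policy, and the migration problem it causes, can be seen in a few lines. This is a minimal sketch with hypothetical names (`FlowFilter`, `FourTuple`); a deployed filter would also need to expire entries.

```python
from typing import Set, Tuple

# (source IP, source port, destination IP, destination port)
FourTuple = Tuple[str, int, str, int]


class FlowFilter:
    """Admit non-Initial packets only from previously approved 4-tuples."""

    def __init__(self) -> None:
        self._admitted: Set[FourTuple] = set()

    def approve(self, flow: FourTuple) -> None:
        # Called when an Initial from this 4-tuple has been admitted.
        self._admitted.add(flow)

    def admits_non_initial(self, flow: FourTuple) -> bool:
        return flow in self._admitted


f = FlowFilter()
f.approve(("192.0.2.1", 50000, "198.51.100.7", 443))
# Same client after a NAT rebinding: new source port, unknown 4-tuple.
rebound = ("192.0.2.1", 50001, "198.51.100.7", 443)
migrated_ok = f.admits_non_initial(rebound)  # False: the packet is dropped
```

The rebinding case is rejected because the filter has no way to link the new 4-tuple to the approved one, which is exactly the breakage described above.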
If a Retry Offload is integrated with a QUIC-LB routable load balancer [I-D.ietf-quic-load-balancers], it can verify that the Destination Connection ID is routable, and only admit non-Initial packets with routable DCIDs. As the Connection ID encoding is invariant across QUIC versions, the Retry Offload can do this for all short headers.¶
Nothing in this section prevents Retry Offloads from making basic syntax correctness checks on packets with QUIC versions that it understands (e.g., enforcing the Initial Packet datagram size minimum in version 1).¶
There are no IANA requirements.¶
These YANG models conform to [RFC6020] and express a complete Retry Offload configuration.¶
module ietf-retry-offload {
  yang-version "1.1";
  namespace "urn:ietf:params:xml:ns:yang:ietf-quic-lb";
  prefix "quic-lb";

  import ietf-yang-types {
    prefix yang;
    reference
      "RFC 6991: Common YANG Data Types.";
  }

  import ietf-inet-types {
    prefix inet;
    reference
      "RFC 6991: Common YANG Data Types.";
  }

  organization
    "IETF QUIC Working Group";

  contact
    "WG Web:   <http://datatracker.ietf.org/wg/quic>
     WG List:  <quic@ietf.org>

     Authors: Martin Duke (martin.h.duke at gmail dot com)
              Nick Banks (nibanks at microsoft dot com)
              Christian Huitema (huitema at huitema.net)";

  description
    "This module enables the explicit cooperation of QUIC servers
     with offloads that generate Retry packets on their behalf.

     Copyright (c) 2022 IETF Trust and the persons identified as
     authors of the code. All rights reserved.

     Redistribution and use in source and binary forms, with or
     without modification, is permitted pursuant to, and subject to
     the license terms contained in, the Simplified BSD License set
     forth in Section 4.c of the IETF Trust's Legal Provisions
     Relating to IETF Documents
     (https://trustee.ietf.org/license-info).

     This version of this YANG module is part of RFC XXXX
     (https://www.rfc-editor.org/info/rfcXXXX); see the RFC itself
     for full legal notices.

     The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL
     NOT', 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'NOT RECOMMENDED',
     'MAY', and 'OPTIONAL' in this document are to be interpreted as
     described in BCP 14 (RFC 2119) (RFC 8174) when, and only when,
     they appear in all capitals, as shown here.";

  revision "2022-02-11" {
    description
      "Initial version";
    reference
      "RFC XXXX, QUIC Retry Offloads";
  }

  container retry-offload-config {
    description
      "Configuration of Retry Offload. If supported-versions is
       empty, there is no Retry Offload. If token-keys is empty, it
       uses the non-shared-state offload. If present, it uses
       shared-state tokens.";

    leaf-list supported-versions {
      type uint32;
      description
        "QUIC versions that the Retry Offload supports. If empty,
         there is no Retry Offload.";
    }

    leaf unsupported-version-default {
      type enumeration {
        enum allow {
          description
            "Unsupported versions admitted by default";
        }
        enum deny {
          description
            "Unsupported versions denied by default";
        }
      }
      default allow;
      description
        "Are unsupported versions not in version-exceptions allowed
         or denied?";
    }

    leaf-list version-exceptions {
      type uint32;
      description
        "Exceptions to the default-deny or default-allow rule.";
    }

    list token-keys {
      key "key-sequence-number";
      description
        "list of active keys, for key rotation purposes. Existence
         implies shared-state format";

      leaf key-sequence-number {
        type uint8 {
          range "0..127";
        }
        mandatory true;
        description
          "Identifies the key used to encrypt the token";
      }

      leaf token-key {
        type retry-offload-key;
        mandatory true;
        description
          "16-byte key to encrypt the token";
      }

      leaf token-iv {
        type yang:hex-string {
          length 23;
        }
        mandatory true;
        description
          "8-byte IV to encrypt the token, encoded in 23 bytes";
      }
    }
  }
}
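The "length 23" restriction on token-iv follows from the yang:hex-string encoding, which writes each octet as two hexadecimal digits separated by colons: 8 octets give 16 digits plus 7 colons. The helper below is a small sketch of that arithmetic; the function name `encode_token_iv` is hypothetical.

```python
def encode_token_iv(iv: bytes) -> str:
    """Encode an 8-byte IV as a colon-separated hex-string."""
    if len(iv) != 8:
        raise ValueError("token-iv must be exactly 8 bytes")
    # Two hex digits per octet, joined by colons: 8 * 2 + 7 = 23 characters.
    return ":".join(f"{octet:02x}" for octet in iv)
```

For example, the eight octets 0x00 through 0x07 encode as `00:01:02:03:04:05:06:07`.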
This summary of the YANG models uses the notation in [RFC8340].¶
module: retry-offload-config
  +--rw retry-offload-config
     +--rw supported-versions*            uint32
     +--rw unsupported-version-default?   enumeration
     +--rw version-exceptions*            uint32
     +--rw token-keys* [key-sequence-number]
        +--rw key-sequence-number    uint8
        +--rw token-key              quic-lb-key
        +--rw token-iv               yang:hex-string
The logic motivating transition mode behavior involves detailed reasoning about endpoint behavior during the handshake. This non-normative appendix walks through the scenarios.¶
Dropping Initial packets in the client's second flight can cause performance problems or deadlocks. In the case where the client and server first flights end with both sides having handshake keys, there will generally be no impact on performance. However, if an Initial ACK is critical to progress, as it can be in the case of multiple-packet TLS messages, Hello Retry Requests, and similar cases, dropping subsequent Initial ACKs results in deadlock.¶
In transition mode, the Retry Offload forwards Initials with no token while also generating a Retry. This allows handshakes to progress without further incident.¶
If the client hello was admitted in inactive mode, then the client has already received a packet from the server. Although subsequent client Initial packets will trigger a Retry, the client will ignore these packets. Those Initials will also be processed by the server to continue the handshake.¶
After sending a Client Hello in Initial Packet A, a client will rapidly receive a Retry Packet from the Offload and attempt to reconnect accordingly with Initial Packet B.¶
The client will discard any server response to Initial A. If the response is a Retry, it is a second Retry on the connection. If it is an Initial, it is encrypted with keys derived from Initial A, which the client has already discarded, so decryption fails.¶
Initial B's destination connection ID will be new, so the server will process it as a new connection and proceed normally.¶
Unfortunately, the server connection state initiated by Initial A will remain. For this reason, this document suggests that servers silently terminate the older connection. Requiring the address to be validated avoids cases where an attacker simply replays a client Initial with a new Destination Connection ID to terminate a valid connection.¶
Note that there are corner cases involving further packet loss that result in connection timeout. For instance, if the Retry Offload's response to Initial A is lost, then the connection will proceed based on Initial A. If the Retry Offload then switches from transition mode to active mode before the client's second flight arrives, the Retry Offload will drop the Initial packet in that flight, and the connection might deadlock.¶
Christian Huitema, Ling Tao Nju, and William Zeng Ke all provided useful input to this document.¶
RFC Editor's Note: Please remove this section prior to publication of a final version of this document.¶
Fixed mistakes in test vectors¶
Rearranged the shared-state retry token to simplify token processing¶
More compact timestamp in shared-state retry token¶
Revised server requirements for shared-state retries¶
Eliminated zero padding from the test vectors¶
Added server use bytes to the test vectors¶
Additional compliant DCID criteria¶
Replaced stream cipher algorithm with three-pass version¶
Updated Retry format to encode info for required TPs¶
Added discussion of version invariance¶
Cleaned up text about config rotation¶
Added Reset Oracle and limited configuration considerations¶
Allow dropped long-header packets for known QUIC versions¶
Removed in-band protocol from the document¶
Switch to IETF WG draft.¶
Added standard for Retry Offloads¶