< draft-ietf-tsvwg-ecn-l4s-id-28.txt   draft-ietf-tsvwg-ecn-l4s-id-29b.txt >
Transport Services (tsv) K. De Schepper Transport Services (tsv) K. De Schepper
Internet-Draft Nokia Bell Labs Internet-Draft Nokia Bell Labs
Intended status: Experimental B. Briscoe, Ed. Intended status: Experimental B. Briscoe, Ed.
Expires: 9 February 2023 Independent Expires: 1 March 2023 Independent
8 August 2022 28 August 2022
Explicit Congestion Notification (ECN) Protocol for Very Low Queuing Explicit Congestion Notification (ECN) Protocol for Very Low Queuing
Delay (L4S) Delay (L4S)
draft-ietf-tsvwg-ecn-l4s-id-28 draft-ietf-tsvwg-ecn-l4s-id-29
Abstract Abstract
This specification defines the protocol to be used for a new network This specification defines the protocol to be used for a new network
service called low latency, low loss and scalable throughput (L4S). service called low latency, low loss and scalable throughput (L4S).
L4S uses an Explicit Congestion Notification (ECN) scheme at the IP L4S uses an Explicit Congestion Notification (ECN) scheme at the IP
layer that is similar to the original (or 'Classic') ECN approach, layer that is similar to the original (or 'Classic') ECN approach,
except as specified within. L4S uses 'scalable' congestion control, except as specified within. L4S uses 'scalable' congestion control,
which induces much more frequent control signals from the network and which induces much more frequent control signals from the network and
it responds to them with much more fine-grained adjustments, so that it responds to them with much more fine-grained adjustments, so that
skipping to change at page 2, line 10 skipping to change at page 2, line 10
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 9 February 2023. This Internet-Draft will expire on 1 March 2023.
Copyright Notice Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
skipping to change at page 2, line 50 skipping to change at page 2, line 50
5. Network Node Behaviour . . . . . . . . . . . . . . . . . . . 20 5. Network Node Behaviour . . . . . . . . . . . . . . . . . . . 20
5.1. Classification and Re-Marking Behaviour . . . . . . . . . 20 5.1. Classification and Re-Marking Behaviour . . . . . . . . . 20
5.2. The Strength of L4S CE Marking Relative to Drop . . . . . 22 5.2. The Strength of L4S CE Marking Relative to Drop . . . . . 22
5.3. Exception for L4S Packet Identification by Network Nodes 5.3. Exception for L4S Packet Identification by Network Nodes
with Transport-Layer Awareness . . . . . . . . . . . . . 23 with Transport-Layer Awareness . . . . . . . . . . . . . 23
5.4. Interaction of the L4S Identifier with other 5.4. Interaction of the L4S Identifier with other
Identifiers . . . . . . . . . . . . . . . . . . . . . . . 23 Identifiers . . . . . . . . . . . . . . . . . . . . . . . 23
5.4.1. DualQ Examples of Other Identifiers Complementing L4S 5.4.1. DualQ Examples of Other Identifiers Complementing L4S
Identifiers . . . . . . . . . . . . . . . . . . . . . 23 Identifiers . . . . . . . . . . . . . . . . . . . . . 23
5.4.1.1. Inclusion of Additional Traffic with L4S . . . . 23 5.4.1.1. Inclusion of Additional Traffic with L4S . . . . 23
5.4.1.2. Exclusion of Traffic From L4S Treatment . . . . . 25 5.4.1.2. Exclusion of Traffic From L4S Treatment . . . . . 26
5.4.1.3. Generalized Combination of L4S and Other 5.4.1.3. Generalized Combination of L4S and Other
Identifiers . . . . . . . . . . . . . . . . . . . . 26 Identifiers . . . . . . . . . . . . . . . . . . . . 26
5.4.2. Per-Flow Queuing Examples of Other Identifiers 5.4.2. Per-Flow Queuing Examples of Other Identifiers
Complementing L4S Identifiers . . . . . . . . . . . . 28 Complementing L4S Identifiers . . . . . . . . . . . . 28
5.5. Limiting Packet Bursts from Links . . . . . . . . . . . . 28 5.5. Limiting Packet Bursts from Links . . . . . . . . . . . . 28
5.5.1. Limiting Packet Bursts from Links Fed by an L4S 5.5.1. Limiting Packet Bursts from Links Fed by an L4S
AQM . . . . . . . . . . . . . . . . . . . . . . . . . 28 AQM . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.5.2. Limiting Packet Bursts from Links Upstream of an L4S 5.5.2. Limiting Packet Bursts from Links Upstream of an L4S
AQM . . . . . . . . . . . . . . . . . . . . . . . . . 29 AQM . . . . . . . . . . . . . . . . . . . . . . . . . 29
6. Behaviour of Tunnels and Encapsulations . . . . . . . . . . . 29 6. Behaviour of Tunnels and Encapsulations . . . . . . . . . . . 29
6.1. No Change to ECN Tunnels and Encapsulations in General . 29 6.1. No Change to ECN Tunnels and Encapsulations in General . 29
6.2. VPN Behaviour to Avoid Limitations of Anti-Replay . . . . 30 6.2. VPN Behaviour to Avoid Limitations of Anti-Replay . . . . 30
7. L4S Experiments . . . . . . . . . . . . . . . . . . . . . . . 31 7. L4S Experiments . . . . . . . . . . . . . . . . . . . . . . . 31
7.1. Open Questions . . . . . . . . . . . . . . . . . . . . . 31 7.1. Open Questions . . . . . . . . . . . . . . . . . . . . . 32
7.2. Open Issues . . . . . . . . . . . . . . . . . . . . . . . 33 7.2. Open Issues . . . . . . . . . . . . . . . . . . . . . . . 33
7.3. Future Potential . . . . . . . . . . . . . . . . . . . . 33 7.3. Future Potential . . . . . . . . . . . . . . . . . . . . 33
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 34 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 34
9. Security Considerations . . . . . . . . . . . . . . . . . . . 34 9. Security Considerations . . . . . . . . . . . . . . . . . . . 35
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 35 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 35
10.1. Normative References . . . . . . . . . . . . . . . . . . 35 10.1. Normative References . . . . . . . . . . . . . . . . . . 35
10.2. Informative References . . . . . . . . . . . . . . . . . 36 10.2. Informative References . . . . . . . . . . . . . . . . . 36
Appendix A. Rationale for the 'Prague L4S Requirements' . . . . 45 Appendix A. Rationale for the 'Prague L4S Requirements' . . . . 46
A.1. Rationale for the Requirements for Scalable Transport A.1. Rationale for the Requirements for Scalable Transport
Protocols . . . . . . . . . . . . . . . . . . . . . . . . 46 Protocols . . . . . . . . . . . . . . . . . . . . . . . . 47
A.1.1. Use of L4S Packet Identifier . . . . . . . . . . . . 46 A.1.1. Use of L4S Packet Identifier . . . . . . . . . . . . 47
A.1.2. Accurate ECN Feedback . . . . . . . . . . . . . . . . 46 A.1.2. Accurate ECN Feedback . . . . . . . . . . . . . . . . 47
A.1.3. Capable of Replacement by Classic Congestion A.1.3. Capable of Replacement by Classic Congestion
Control . . . . . . . . . . . . . . . . . . . . . . . 47 Control . . . . . . . . . . . . . . . . . . . . . . . 47
A.1.4. Fall back to Classic Congestion Control on Packet A.1.4. Fall back to Classic Congestion Control on Packet
Loss . . . . . . . . . . . . . . . . . . . . . . . . 47 Loss . . . . . . . . . . . . . . . . . . . . . . . . 48
A.1.5. Coexistence with Classic Congestion Control at Classic A.1.5. Coexistence with Classic Congestion Control at Classic
ECN bottlenecks . . . . . . . . . . . . . . . . . . . 48 ECN bottlenecks . . . . . . . . . . . . . . . . . . . 49
A.1.6. Reduce RTT dependence . . . . . . . . . . . . . . . . 51 A.1.6. Reduce RTT dependence . . . . . . . . . . . . . . . . 52
A.1.7. Scaling down to fractional congestion windows . . . . 53 A.1.7. Scaling down to fractional congestion windows . . . . 53
A.1.8. Measuring Reordering Tolerance in Time Units . . . . 54 A.1.8. Measuring Reordering Tolerance in Time Units . . . . 54
A.2. Scalable Transport Protocol Optimizations . . . . . . . . 57 A.2. Scalable Transport Protocol Optimizations . . . . . . . . 57
A.2.1. Setting ECT in Control Packets and Retransmissions . 57 A.2.1. Setting ECT in Control Packets and Retransmissions . 57
A.2.2. Faster than Additive Increase . . . . . . . . . . . . 57 A.2.2. Faster than Additive Increase . . . . . . . . . . . . 58
A.2.3. Faster Convergence at Flow Start . . . . . . . . . . 58 A.2.3. Faster Convergence at Flow Start . . . . . . . . . . 58
Appendix B. Compromises in the Choice of L4S Identifier . . . . 58 Appendix B. Compromises in the Choice of L4S Identifier . . . . 59
Appendix C. Potential Competing Uses for the ECT(1) Codepoint . 63 Appendix C. Potential Competing Uses for the ECT(1) Codepoint . 64
C.1. Integrity of Congestion Feedback . . . . . . . . . . . . 63 C.1. Integrity of Congestion Feedback . . . . . . . . . . . . 64
C.2. Notification of Less Severe Congestion than CE . . . . . 65 C.2. Notification of Less Severe Congestion than CE . . . . . 65
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 65 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 66
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 66 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 66
1. Introduction 1. Introduction
This experimental track specification defines the protocol to be used This experimental track specification defines the protocol to be used
for a new network service called low latency, low loss and scalable for a new network service called low latency, low loss and scalable
throughput (L4S). L4S uses an Explicit Congestion Notification (ECN) throughput (L4S). L4S uses an Explicit Congestion Notification (ECN)
scheme at the IP layer with the same set of codepoint transitions as scheme at the IP layer with the same set of codepoint transitions as
the original (or 'Classic') Explicit Congestion Notification the original (or 'Classic') Explicit Congestion Notification
(ECN [RFC3168]). RFC 3168 required an ECN mark to be equivalent to a (ECN [RFC3168]). RFC 3168 required an ECN mark to be equivalent to a
skipping to change at page 6, line 5 skipping to change at page 6, line 5
growth in latency-sensitive applications continues, periods with growth in latency-sensitive applications continues, periods with
solely latency-sensitive traffic will become increasingly common on solely latency-sensitive traffic will become increasingly common on
links where traffic aggregation is low. During these periods, if all links where traffic aggregation is low. During these periods, if all
the traffic were marked for the same treatment, Diffserv would make the traffic were marked for the same treatment, Diffserv would make
no difference. The links with low aggregation also tend to become no difference. The links with low aggregation also tend to become
the path bottleneck under load, for instance, the access links the path bottleneck under load, for instance, the access links
dedicated to individual sites (homes, small enterprises or mobile dedicated to individual sites (homes, small enterprises or mobile
devices). So, instead of differentiation, it becomes imperative to devices). So, instead of differentiation, it becomes imperative to
remove the underlying causes of any unnecessary delay. remove the underlying causes of any unnecessary delay.
The bufferbloat project has shown that excessively-large buffering The Bufferbloat project has shown that excessively-large buffering
('bufferbloat') has been introducing significantly more delay than ('bufferbloat') has been introducing significantly more delay than
the underlying propagation time. These delays appear only the underlying propagation time [Bufferbloat]. These delays appear
intermittently -- only when a capacity-seeking (e.g. TCP) flow is only intermittently -- only when a capacity-seeking (e.g. TCP) flow
long enough for the queue to fill the buffer, causing every packet in is long enough for the queue to fill the buffer, causing every packet
other flows sharing the buffer to have to work its way through the in other flows sharing the buffer to have to work its way through the
queue. queue.
Active queue management (AQM) was originally developed to solve this Active queue management (AQM) was originally developed to solve this
problem (and others). Unlike Diffserv, which gives low latency to problem (and others). Unlike Diffserv, which gives low latency to
some traffic at the expense of others, AQM controls latency for _all_ some traffic at the expense of others, AQM controls latency for _all_
traffic in a class. In general, AQM methods introduce an increasing traffic in a class. In general, AQM methods introduce an increasing
level of discard from the buffer the longer the queue persists above level of discard from the buffer, the longer the queue persists above
a shallow threshold. This gives sufficient signals to capacity- a shallow threshold. This gives sufficient signals to capacity-
seeking (aka. greedy) flows to keep the buffer empty for its intended seeking (aka. greedy) flows to keep the buffer empty for its intended
purpose: absorbing bursts. However, RED [RFC2309] and other purpose: absorbing bursts. However, RED [RFC2309] and other
algorithms from the 1990s were sensitive to their configuration and algorithms from the 1990s were sensitive to their configuration and
hard to set correctly. So, this form of AQM was not widely deployed. hard to set correctly. So, this form of AQM was not widely deployed.
More recent state-of-the-art AQM methods, such as FQ-CoDel [RFC8290], More recent state-of-the-art AQM methods, such as FQ-CoDel [RFC8290],
PIE [RFC8033] or Adaptive RED [ARED01], are easier to configure, PIE [RFC8033] or Adaptive RED [ARED01], are easier to configure,
because they define the queuing threshold in time not bytes, so because they define the queuing threshold in time not bytes, so
configuration is invariant whatever the link rate. However, the configuration is invariant whatever the link rate. However, the
sawtoothing window of a Classic congestion control creates a dilemma sawtoothing window of a Classic congestion control creates a dilemma
for the operator: i) either configure a shallow AQM operating point, for the operator: i) either configure a shallow AQM operating point,
so the tips of the sawteeth cause minimal queue delay but the troughs so the tips of the sawteeth cause minimal queue delay, but the
underutilize the link, or ii) configure the operating point deeper troughs underutilize the link; or ii) configure the operating point
into the buffer, so the troughs utilize the link better but then the deeper into the buffer, so the troughs utilize the link better, but
tips cause more delay variation. Even with a perfectly tuned AQM, then the tips cause more delay variation. Even with a perfectly
the additional queuing delay at the tips of the sawteeth will be of tuned AQM, the additional queuing delay at the tips of the sawteeth
the same order as the underlying speed-of-light delay across the will be of the same order as the underlying base round trip time
network, thereby roughly doubling the total round-trip time. (RTT), thereby roughly doubling the total round-trip time.
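
The dilemma described above can be illustrated with a small calculation. The sketch below is not from the draft: it assumes a single Reno-like flow whose window halves on each congestion signal, and the function name and parameters are hypothetical. The peak window is expressed in units of the bandwidth-delay product (BDP) of the path.

```python
def reno_sawtooth(base_rtt_s, peak_window_bdp):
    """Simplified single-flow model (an assumption, not from the draft).

    peak_window_bdp is the congestion window at the tip of the sawtooth,
    in units of the path's bandwidth-delay product (BDP).
    Returns (queuing delay at the tip in seconds, link utilization at the trough).
    """
    # A Reno-like flow halves its window after a congestion signal.
    trough_window_bdp = peak_window_bdp / 2.0
    # Any window in excess of one BDP sits in the queue; draining one BDP
    # of queue takes one base RTT.
    tip_queue_delay_s = max(0.0, peak_window_bdp - 1.0) * base_rtt_s
    # With less than one BDP in flight, the link is only partly used.
    trough_utilization = min(1.0, trough_window_bdp)
    return tip_queue_delay_s, trough_utilization


# Deep operating point: the troughs just fill the link, but the tips add
# a queuing delay equal to the base RTT.
deep = reno_sawtooth(base_rtt_s=0.02, peak_window_bdp=2.0)
# Shallow operating point: no queuing delay at the tips, but the link is
# half idle at the troughs.
shallow = reno_sawtooth(base_rtt_s=0.02, peak_window_bdp=1.0)
```

Under these assumptions, keeping the link fully utilized at the troughs requires a tip window of two BDPs, which is where the rough doubling of the total round-trip time comes from.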
If a sender's own behaviour is introducing queuing delay variation, If a sender's own behaviour is introducing queuing delay variation,
no AQM in the network can 'un-vary' the delay without significantly no AQM in the network can 'un-vary' the delay without significantly
compromising link utilization. Even flow-queuing (e.g. [RFC8290]), compromising link utilization. Even flow-queuing (e.g. [RFC8290]),
which isolates one flow from another, cannot isolate a flow from the which isolates one flow from another, cannot isolate a flow from the
delay variations it inflicts on itself. Therefore those applications delay variations it inflicts on itself. Therefore, those
that need to seek out high bandwidth but also need low latency will applications that need to seek out high bandwidth but also need low
have to migrate to scalable congestion control, which uses much latency will have to migrate to scalable congestion control, which
smaller sawtooth variations. uses much smaller sawtooth variations.
Altering host behaviour is not enough on its own though. Even if Altering host behaviour is not enough on its own though. Even if
hosts adopt low latency scalable congestion controls, they need to be hosts adopt low latency scalable congestion controls, they need to be
isolated from the large queue variations induced by existing Classic isolated from the large queue variations induced by existing Classic
congestion controls. L4S AQMs provide that latency isolation in the congestion controls. L4S AQMs provide that latency isolation in the
network and the L4S identifier enables the AQMs to distinguish the network and the L4S identifier enables the AQMs to distinguish the
two types of packet that need to be isolated: L4S and Classic. L4S two types of packet that need to be isolated: L4S and Classic. L4S
isolation can be achieved with a queue per flow (e.g. [RFC8290]) but isolation can be achieved with a queue per flow (e.g. [RFC8290]) but
a DualQ [I-D.ietf-tsvwg-aqm-dualq-coupled] is sufficient, and a DualQ [I-D.ietf-tsvwg-aqm-dualq-coupled] is sufficient, and
actually gives better tail latency [DCttH19]. Both approaches are actually gives better tail latency [DCttH19]. Both approaches are
skipping to change at page 7, line 40 skipping to change at page 7, line 40
It turns out that these scalable congestion control algorithms that It turns out that these scalable congestion control algorithms that
solve the latency problem can also solve the scalability problem of solve the latency problem can also solve the scalability problem of
Classic congestion controls. The finer sawteeth in the congestion Classic congestion controls. The finer sawteeth in the congestion
window have low amplitude, so they cause very little queuing delay window have low amplitude, so they cause very little queuing delay
variation and the average time to recover from one congestion signal variation and the average time to recover from one congestion signal
to the next (the average duration of each sawtooth) remains to the next (the average duration of each sawtooth) remains
invariant, which maintains constant tight control as flow-rate invariant, which maintains constant tight control as flow-rate
scales. A background paper [DCttH19] gives the full explanation of scales. A background paper [DCttH19] gives the full explanation of
why the design solves both the latency and the scaling problems, both why the design solves both the latency and the scaling problems, both
in plain English and in more precise mathematical form. The in plain English and in more precise mathematical form. The
explanation is summarised without the mathematics in Section 4 of the explanation is summarized without the mathematics in Section 4 of the
L4S architecture [I-D.ietf-tsvwg-l4s-arch]. L4S architecture [I-D.ietf-tsvwg-l4s-arch].
1.2. Terminology 1.2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in BCP
[RFC2119]. In this document, these words will appear with that 14 [RFC2119] [RFC8174] when, and only when, they appear in all
interpretation only when in ALL CAPS. Lower case uses of these words capitals, as shown here.
are not to be interpreted as carrying RFC-2119 significance.
Note: The L4S architecture [I-D.ietf-tsvwg-l4s-arch] repeats the [Note to the RFC Editor (to be removed before publication as an RFC):
following definitions, but if there are accidental differences those The L4S architecture [I-D.ietf-tsvwg-l4s-arch] repeats the following
below take precedence. definitions, which should be identical, except in the architecture
Classic CC and Scalable CC are condensed because they refer to a
section later in the architecture.]
Classic Congestion Control: A congestion control behaviour that can Classic Congestion Control: A congestion control behaviour that can
co-exist with standard Reno [RFC5681] without causing co-exist with standard Reno [RFC5681] without causing
significantly negative impact on its flow rate [RFC5033]. With significantly negative impact on its flow rate [RFC5033]. With
Classic congestion controls, such as Reno or Cubic, because flow Classic congestion controls, such as Reno or Cubic, because flow
rate has scaled since TCP congestion control was first designed in rate has scaled since TCP congestion control was first designed in
1988, it now takes hundreds of round trips (and growing) to 1988, it now takes hundreds of round trips (and growing) to
recover after a congestion signal (whether a loss or an ECN mark) recover after a congestion signal (whether a loss or an ECN mark)
as shown in the examples in section 5.1 of the L4S as shown in the examples in section 5.1 of the L4S
architecture [I-D.ietf-tsvwg-l4s-arch] and in [RFC3649]. architecture [I-D.ietf-tsvwg-l4s-arch] and in [RFC3649].
Therefore control of queuing and utilization becomes very slack, Therefore, control of queuing and utilization becomes very slack,
and the slightest disturbances (e.g. from new flows starting) and the slightest disturbances (e.g. from new flows starting)
prevent a high rate from being attained. prevent a high rate from being attained.
Scalable Congestion Control: A congestion control where the average Scalable Congestion Control: A congestion control where the average
time from one congestion signal to the next (the recovery time) time from one congestion signal to the next (the recovery time)
remains invariant as the flow rate scales, all other factors being remains invariant as the flow rate scales, all other factors being
equal. This maintains the same degree of control over queueing equal. This maintains the same degree of control over queueing
and utilization whatever the flow rate, as well as ensuring that and utilization whatever the flow rate, as well as ensuring that
high throughput is robust to disturbances. For instance, DCTCP high throughput is robust to disturbances. For instance, DCTCP
averages 2 congestion signals per round-trip whatever the flow averages 2 congestion signals per round-trip whatever the flow
rate, as do other recently developed scalable congestion controls, rate, as do other recently developed scalable congestion controls,
e.g. Relentless TCP [Mathis09], TCP Prague e.g. Relentless TCP [I-D.mathis-iccrg-relentless-tcp], TCP Prague
[I-D.briscoe-iccrg-prague-congestion-control], [PragueLinux], [I-D.briscoe-iccrg-prague-congestion-control], [PragueLinux],
BBRv2 [BBRv2], [I-D.cardwell-iccrg-bbr-congestion-control] and the BBRv2 [BBRv2], [I-D.cardwell-iccrg-bbr-congestion-control] and the
L4S variant of SCReAM for real-time media [SCReAM-L4S], [RFC8298]. L4S variant of SCReAM for real-time media [SCReAM-L4S], [RFC8298].
See Section 4.3 for more explanation. See Section 4.3 for more explanation.
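
The contrast between the two definitions above can be sketched numerically. This is an illustrative model only, not from the draft: it assumes an idealized Reno flow that grows by one segment per round trip and halves on each congestion signal, and it approximates the peak window by the window needed to sustain the given rate. The function names and the 1500-byte segment size are assumptions.

```python
def reno_recovery_rtts(rate_bps, rtt_s, mss_bytes=1500):
    """Round trips between congestion signals for an idealized Reno flow.

    Assumption: the peak window is approximated as rate * RTT in segments.
    After halving, Reno adds one segment per round trip, so it needs
    about (peak window / 2) round trips to climb back.
    """
    peak_window_segments = rate_bps * rtt_s / (mss_bytes * 8)
    return peak_window_segments / 2.0


def scalable_recovery_rtts(signals_per_rtt=2.0):
    """Round trips between congestion signals for a scalable control.

    The default of 2 signals per round trip is the DCTCP figure quoted in
    the definition above; it does not depend on the flow rate.
    """
    return 1.0 / signals_per_rtt


# Recovery time grows with flow rate for Reno but stays fixed for a
# scalable congestion control.
for rate_bps in (12e6, 120e6, 1.2e9):
    print(rate_bps, reno_recovery_rtts(rate_bps, 0.03), scalable_recovery_rtts())
```

In this model the Reno recovery time grows linearly with the flow rate, reaching hundreds of round trips at rates above about 100 Mb/s with a 30 ms RTT, while the scalable case remains at half a round trip.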
Classic service: The Classic service is intended for all the Classic service: The Classic service is intended for all the
congestion control behaviours that co-exist with Reno [RFC5681] congestion control behaviours that co-exist with Reno [RFC5681]
(e.g. Reno itself, Cubic [RFC8312], (e.g. Reno itself, Cubic [RFC8312],
Compound [I-D.sridharan-tcpm-ctcp], TFRC [RFC5348]). The term Compound [I-D.sridharan-tcpm-ctcp], TFRC [RFC5348]). The term
'Classic queue' means a queue providing the Classic service. 'Classic queue' means a queue providing the Classic service.
Low-Latency, Low-Loss Scalable throughput (L4S) service: The 'L4S' Low-Latency, Low-Loss Scalable throughput (L4S) service: The 'L4S'
service is intended for traffic from scalable congestion control service is intended for traffic from scalable congestion control
algorithms, such as TCP Prague algorithms, such as the Prague congestion control
[I-D.briscoe-iccrg-prague-congestion-control], which was derived [I-D.briscoe-iccrg-prague-congestion-control], which was derived
from DCTCP [RFC8257]. The L4S service is for more general traffic from DCTCP [RFC8257]. The L4S service is for more general traffic
than just TCP Prague -- it allows the set of congestion controls than just Prague -- it allows the set of congestion controls with
with similar scaling properties to Prague to evolve, such as the similar scaling properties to Prague to evolve, such as the
examples listed above (Relentless, SCReAM, etc.). The term 'L4S examples listed above (Relentless, SCReAM, etc.). The term 'L4S
queue' means a queue providing the L4S service. queue' means a queue providing the L4S service.
The terms Classic or L4S can also qualify other nouns, such as The terms Classic or L4S can also qualify other nouns, such as
'queue', 'codepoint', 'identifier', 'classification', 'packet', 'queue', 'codepoint', 'identifier', 'classification', 'packet',
'flow'. For example: an L4S packet means a packet with an L4S 'flow'. For example: an L4S packet means a packet with an L4S
identifier sent from an L4S congestion control. identifier sent from an L4S congestion control.
Both Classic and L4S services can cope with a proportion of Both Classic and L4S services can cope with a proportion of
unresponsive or less-responsive traffic as well, but in the L4S unresponsive or less-responsive traffic as well, but in the L4S
case its rate has to be smooth enough or low enough not to build a case its rate has to be smooth enough or low enough not to build a
queue (e.g. DNS, VoIP, game sync datagrams, etc). queue (e.g. DNS, VoIP, game sync datagrams, etc.).
Reno-friendly: The subset of Classic traffic that is friendly to the Reno-friendly: The subset of Classic traffic that is friendly to the
standard Reno congestion control defined for TCP in [RFC5681]. standard Reno congestion control defined for TCP in [RFC5681].
The TFRC spec [RFC5348] indirectly implies that 'friendly' is The TFRC spec [RFC5348] indirectly implies that 'friendly' is
defined as "generally within a factor of two of the sending rate defined as "generally within a factor of two of the sending rate
of a TCP flow under the same conditions". Reno-friendly is used of a TCP flow under the same conditions". Reno-friendly is used
here in place of 'TCP-friendly', given the latter has become here in place of 'TCP-friendly', given the latter has become
imprecise, because the TCP protocol is now used with so many imprecise, because the TCP protocol is now used with so many
different congestion control behaviours, and Reno can be used in different congestion control behaviours, and Reno can be used in
non-TCP transports such as QUIC [RFC9000]. non-TCP transports such as QUIC [RFC9000].
skipping to change at page 10, line 6 skipping to change at page 10, line 6
The new L4S identifier defined in this specification is applicable The new L4S identifier defined in this specification is applicable
for IPv4 and IPv6 packets (as for Classic ECN [RFC3168]). It is for IPv4 and IPv6 packets (as for Classic ECN [RFC3168]). It is
applicable for the unicast, multicast and anycast forwarding modes. applicable for the unicast, multicast and anycast forwarding modes.
The L4S identifier is an orthogonal packet classification to the The L4S identifier is an orthogonal packet classification to the
Differentiated Services Code Point (DSCP) [RFC2474]. Section 5.4 Differentiated Services Code Point (DSCP) [RFC2474]. Section 5.4
explains what this means in practice. explains what this means in practice.
This document is intended for experimental status, so it does not This document is intended for experimental status, so it does not
update any standards track RFCs. Therefore it depends on [RFC8311], update any standards track RFCs. Therefore, it depends on [RFC8311],
which is a standards track specification that: which is a standards track specification that:
* updates the ECN proposed standard [RFC3168] to allow experimental * updates the ECN proposed standard [RFC3168] to allow experimental
track RFCs to relax the requirement that an ECN mark must be track RFCs to relax the requirement that an ECN mark must be
equivalent to a drop (when the network applies markings and/or equivalent to a drop (when the network applies markings and/or
when the sender responds to them). For instance, in the ABE when the sender responds to them). For instance, in the ABE
experiment [RFC8511] this permits a sender to respond less to ECN experiment [RFC8511] this permits a sender to respond less to ECN
marks than to drops; marks than to drops;
* changes the status of the experimental ECN nonce [RFC3540] to * changes the status of the experimental ECN nonce [RFC3540] to
skipping to change at page 12, line 25 skipping to change at page 12, line 25
* it SHOULD be consistent on all the packets of a transport layer * it SHOULD be consistent on all the packets of a transport layer
flow, so that some packets of a flow are not served by a different flow, so that some packets of a flow are not served by a different
queue to others. queue to others.
Whether the identifier would be recoverable if the experiment failed Whether the identifier would be recoverable if the experiment failed
is a factor that could be taken into account. However, this has not is a factor that could be taken into account. However, this has not
been made a requirement, because that would favour schemes that would been made a requirement, because that would favour schemes that would
be easier to fail, rather than those more likely to succeed. be easier to fail, rather than those more likely to succeed.
It is recognised that any choice of identifier is unlikely to satisfy It is recognized that any choice of identifier is unlikely to satisfy
all these requirements, particularly given the limited space left in all these requirements, particularly given the limited space left in
the IP header. Therefore a compromise will always be necessary, the IP header. Therefore, a compromise will always be necessary,
which is why all the above requirements are expressed with the word which is why all the above requirements are expressed with the word
'SHOULD' not 'MUST'. 'SHOULD' not 'MUST'.
After extensive assessment of alternative schemes, "ECT(1) and CE After extensive assessment of alternative schemes, "ECT(1) and CE
codepoints" was chosen as the best compromise. Therefore this scheme codepoints" was chosen as the best compromise. Therefore, this
is defined in detail in the following sections, while Appendix B scheme is defined in detail in the following sections, while
records its pros and cons against the above requirements. Appendix B records its pros and cons against the above requirements.
4. Transport Layer Behaviour (the 'Prague Requirements') 4. Transport Layer Behaviour (the 'Prague Requirements')
This section defines L4S behaviour at the transport layer, also known This section defines L4S behaviour at the transport layer, also known
as the Prague L4S Requirements (see Appendix A for the origin of the as the Prague L4S Requirements (see Appendix A for the origin of the
name). name).
4.1. Codepoint Setting 4.1. Codepoint Setting
A sender that wishes a packet to receive L4S treatment as it is A sender that wishes a packet to receive L4S treatment as it is
skipping to change at page 13, line 28 skipping to change at page 13, line 28
(such as that provided by AccECN [I-D.ietf-tcpm-accurate-ecn]) by (such as that provided by AccECN [I-D.ietf-tcpm-accurate-ecn]) by
both ends is a prerequisite for scalable congestion control in both ends is a prerequisite for scalable congestion control in
TCP. Therefore, the presence of ECT(1) in the IP headers even in TCP. Therefore, the presence of ECT(1) in the IP headers even in
one direction of a TCP connection will imply that both ends one direction of a TCP connection will imply that both ends
support accurate ECN feedback. However, the converse does not support accurate ECN feedback. However, the converse does not
apply. So even if both ends support AccECN, either of the two apply. So even if both ends support AccECN, either of the two
ends can choose not to use a scalable congestion control, whatever ends can choose not to use a scalable congestion control, whatever
the other end's choice. the other end's choice.
SCTP: A suitable ECN feedback mechanism for SCTP could add a chunk SCTP: A suitable ECN feedback mechanism for SCTP could add a chunk
to report the number of received CE marks to report the number of received CE marks (as described in a long-
(e.g. [I-D.stewart-tsvwg-sctpecn]), and update the ECN feedback expired draft [I-D.stewart-tsvwg-sctpecn] or as sketched out in
protocol sketched out in Appendix A of the original standards Appendix A of the now obsolete second standards track
track specification of SCTP [RFC4960]. specification of SCTP [RFC4960]).
RTP over UDP: A prerequisite for scalable congestion control is for RTP over UDP: A prerequisite for scalable congestion control is for
both (all) ends of one media-level hop to signal ECN both (all) ends of one media-level hop to signal ECN
support [RFC6679] and use the new generic RTCP feedback format of support [RFC6679] and use the new generic RTCP feedback format of
[RFC8888]. The presence of ECT(1) implies that both (all) ends of [RFC8888]. The presence of ECT(1) implies that both (all) ends of
that media-level hop support ECN. However, the converse does not that media-level hop support ECN. However, the converse does not
apply. So each end of a media-level hop can independently choose apply. So each end of a media-level hop can independently choose
not to use a scalable congestion control, even if both ends not to use a scalable congestion control, even if both ends
support ECN. support ECN.
skipping to change at page 14, line 19 skipping to change at page 14, line 19
ensures that, in steady state, the average duration between induced ensures that, in steady state, the average duration between induced
ECN marks does not increase as flow rate scales up, all other factors ECN marks does not increase as flow rate scales up, all other factors
being equal. This is termed a scalable congestion control. This being equal. This is termed a scalable congestion control. This
invariant duration ensures that, as flow rate scales, the average invariant duration ensures that, as flow rate scales, the average
period with no feedback information about capacity does not become period with no feedback information about capacity does not become
excessive. It also ensures that queue variations remain small, excessive. It also ensures that queue variations remain small,
without having to sacrifice utilization. without having to sacrifice utilization.
With a congestion control that sawtooths to probe capacity, this With a congestion control that sawtooths to probe capacity, this
duration is called the recovery time, because each time the sawtooth duration is called the recovery time, because each time the sawtooth
yields, on average it take this time to recover to its previous high yields, on average it takes this time to recover to its previous high
point. A scalable congestion control does not have to sawtooth, but point. A scalable congestion control does not have to sawtooth, but
it has to coexist with scalable congestion controls that do. it has to coexist with scalable congestion controls that do.
For instance, for DCTCP [RFC8257], TCP Prague For instance, for DCTCP [RFC8257], TCP Prague
[I-D.briscoe-iccrg-prague-congestion-control], [PragueLinux] and the [I-D.briscoe-iccrg-prague-congestion-control], [PragueLinux] and the
L4S variant of SCReAM [SCReAM-L4S], [RFC8298], the average recovery L4S variant of SCReAM [SCReAM-L4S], [RFC8298], the average recovery
time is always half a round trip (or half a reference round trip), time is always half a round trip (or half a reference round trip),
whatever the flow rate. whatever the flow rate.
As with all transport behaviours, a detailed specification (probably As with all transport behaviours, a detailed specification (probably
an experimental RFC) is expected for each congestion control, an experimental RFC) is expected for each congestion control,
following the guidelines for specifying new congestion control following the guidelines for specifying new congestion control
algorithms in [RFC5033]. In addition it is expected to document algorithms in [RFC5033]. In addition, it is expected to document
these L4S-specific matters, specifically the timescale over which the these L4S-specific matters, specifically the timescale over which the
proportionality is averaged, and control of burstiness. The recovery proportionality is averaged, and control of burstiness. The recovery
time requirement above is worded as a 'SHOULD' rather than a 'MUST' time requirement above is worded as a 'SHOULD' rather than a 'MUST'
to allow reasonable flexibility for such implementations. to allow reasonable flexibility for such implementations.
The condition 'all other factors being equal' allows the recovery The condition 'all other factors being equal' allows the recovery
time to be different for different round trip times, as long as it time to be different for different round trip times, as long as it
does not increase with flow rate for any particular RTT. does not increase with flow rate for any particular RTT.
Saying that the recovery time remains roughly invariant is equivalent Saying that the recovery time remains roughly invariant is equivalent
to saying that the number of ECN CE marks per round trip remains to saying that the number of ECN CE marks per round trip remains
invariant as flow rate scales, all other factors being equal. For invariant as flow rate scales, all other factors being equal. For
instance, an average recovery time of half of 1 RTT is equivalent to instance, an average recovery time of half of 1 RTT is equivalent to
2 ECN marks per round trip. For those familiar with steady-state 2 ECN marks per round trip. For those familiar with steady-state
congestion response functions, it is also equivalent to say that the congestion response functions, it is also equivalent to say that the
congestion window is inversely proportional to the proportion of congestion window is inversely proportional to the proportion of
bytes in packets marked with the CE codepoint (see section 2 of bytes in packets marked with the CE codepoint (see section 2 of
[PI2]). [PI2]).
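The equivalence described above can be checked numerically. The following Python sketch is illustrative only and is not taken from either draft version. It assumes idealized steady-state response functions: cwnd = 2/p for a scalable congestion control (matching the "2 ECN marks per round trip" example) and the commonly quoted approximation cwnd = 1.22/sqrt(p) for Reno. All function names are hypothetical.

```python
import math


def scalable_cwnd(p):
    """Assumed idealized scalable response: window inversely proportional to p."""
    return 2.0 / p


def reno_cwnd(p):
    """Assumed idealized Reno response: window inversely proportional to sqrt(p)."""
    return 1.22 / math.sqrt(p)


def marks_per_rtt(cwnd, p):
    """Average number of congestion signals per round trip (packets * probability)."""
    return cwnd * p


def recovery_time_rtts(cwnd, p):
    """Average time between congestion signals, in round trips."""
    return 1.0 / marks_per_rtt(cwnd, p)


if __name__ == "__main__":
    # A lower signalling probability corresponds to a higher flow rate.
    for p in (0.1, 0.01, 0.001):
        s, r = scalable_cwnd(p), reno_cwnd(p)
        print(
            f"p={p}: scalable marks/RTT={marks_per_rtt(s, p):.2f}, "
            f"Reno recovery={recovery_time_rtts(r, p):.1f} RTT"
        )
```

Under these assumptions the scalable control always sees 2 marks per round trip (a recovery time of half a round trip), whereas the Reno recovery time grows as the flow rate scales up.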
In order to coexist safely with other Internet traffic, a scalable In order to coexist safely with other Internet traffic, a scalable
congestion control MUST NOT tag its packets with the ECT(1) codepoint congestion control is not allowed to tag its packets with the ECT(1)
unless it complies with the following bulleted requirements: codepoint unless it complies with the following numbered requirements
and recommendations:
1. A scalable congestion control MUST be capable of being replaced 1. A scalable congestion control MUST be capable of being replaced
by a Classic congestion control (by application and/or by by a Classic congestion control (by application and/or by
administrative control). If a Classic congestion control is administrative control). If a Classic congestion control is
activated, it will not tag its packets with the ECT(1) codepoint activated, it will not tag its packets with the ECT(1) codepoint
(see Appendix A.1.3 for rationale). (see Appendix A.1.3 for rationale).
2. As well as responding to ECN markings, a scalable congestion 2. As well as responding to ECN markings, a scalable congestion
control MUST react to packet loss in a way that will coexist control MUST react to packet loss in a way that will coexist
safely with Classic congestion controls such as standard safely with Classic congestion controls such as standard
Reno [RFC5681], as required by [RFC5033] (see Appendix A.1.4 for Reno [RFC5681], as required by [RFC5033] (see Appendix A.1.4 for
rationale). rationale).
3. In uncontrolled environments, monitoring MUST be implemented to 3. In uncontrolled environments, monitoring MUST be implemented to
support detection of problems with an ECN-capable AQM at the path support detection of problems with an ECN-capable AQM at the path
bottleneck that appears not to support L4S and might be in a bottleneck that appears not to support L4S and might be in a
shared queue. Such monitoring SHOULD be applied to live traffic shared queue. Such monitoring SHOULD be applied to live traffic
that is using Scalable congestion control. Alternatively, that is using Scalable congestion control. Alternatively,
monitoring need not be applied to live traffic, if monitoring has monitoring need not be applied to live traffic, if monitoring
been arranged to cover the paths that live traffic takes through with test traffic has been arranged to cover the paths that live
uncontrolled environments. traffic takes through uncontrolled environments.
A function to detect the above problems with an ECN-capable AQM A function to detect the above problems with an ECN-capable AQM
MUST also be implemented and used. The detection function SHOULD MUST also be implemented and used. The detection function SHOULD
be capable of making the congestion control adapt its ECN-marking be capable of making the congestion control adapt its ECN-marking
response in real-time to coexist safely with Classic congestion response in real-time to coexist safely with Classic congestion
controls such as standard Reno [RFC5681], as required by controls such as standard Reno [RFC5681], as required by
[RFC5033]. This could be complemented by more detailed offline [RFC5033]. This could be complemented by more detailed offline
detection of potential problems. If only offline detection is detection of potential problems. If only offline detection is
used and potential problems with such an AQM are detected on used and potential problems with such an AQM are detected on
certain paths, the scalable congestion control MUST be replaced certain paths, the scalable congestion control MUST be replaced
by a Classic congestion control, at least for the problem paths. by a Classic congestion control, at least for the problem paths.
See Section 4.3.1, Appendix A.1.5 and the L4S operational See Section 4.3.1, Appendix A.1.5 and the L4S operational
guidance [I-D.ietf-tsvwg-l4sops] for rationale. guidance [I-D.ietf-tsvwg-l4sops] for rationale and explanation.
Note that a scalable congestion control is not expected to change Note that a scalable congestion control is not expected to change
to setting ECT(0) while it transiently adapts to coexist with to setting ECT(0) while it transiently adapts to coexist with
Classic congestion controls, whereas a replacement congestion Classic congestion controls, whereas a replacement congestion
control that solely behaves in the Classic way will set ECT(0). control that solely behaves in the Classic way will set ECT(0).
4. In the range between the minimum likely RTT and typical RTTs 4. In the range between the minimum likely RTT and typical RTTs
expected in the intended deployment scenario, a scalable expected in the intended deployment scenario, a scalable
congestion control MUST converge towards a rate that is as congestion control MUST converge towards a rate that is as
independent of RTT as is possible without compromising stability independent of RTT as is possible without compromising stability
skipping to change at page 16, line 50 skipping to change at page 16, line 50
packets by multiplying by the current average packet rate. Then, packets by multiplying by the current average packet rate. Then,
the queue caused by each burst at the bottleneck link would not the queue caused by each burst at the bottleneck link would not
exceed 250us (under the worst-case assumption that the flow is exceed 250us (under the worst-case assumption that the flow is
filling the capacity). No normative requirement to limit bursts filling the capacity). No normative requirement to limit bursts
is given here and, until there is more industry experience from is given here and, until there is more industry experience from
the L4S experiment, it is not even known whether one is needed - the L4S experiment, it is not even known whether one is needed -
it seems to be in an L4S sender's self-interest to limit bursts. it seems to be in an L4S sender's self-interest to limit bursts.
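The conversion from a burst duration to a number of packets mentioned above can be sketched as follows. This is a minimal Python illustration, not a normative algorithm. The 250us default comes from the text above, while the function name, the parameter names and the choice to always allow at least one packet are assumptions made for the example.

```python
import math


def max_burst_packets(packet_rate_pps, burst_limit_s=250e-6):
    """Convert a burst-duration allowance into a whole number of packets.

    packet_rate_pps: current average packet rate of the flow, in packets/s.
    burst_limit_s:   queuing delay that one burst may cause (assumed 250us).

    Assumption for this sketch: at least one packet is always allowed,
    otherwise a slow flow could never send anything.
    """
    return max(1, math.floor(packet_rate_pps * burst_limit_s))


if __name__ == "__main__":
    packet_bits = 1500 * 8  # assumed packet size of 1500 bytes
    for rate_bps in (10e6, 100e6, 1e9):
        pps = rate_bps / packet_bits
        print(f"{rate_bps / 1e6:.0f} Mb/s -> burst of {max_burst_packets(pps)} packet(s)")
```

With 1500-byte packets this gives a burst of 1 packet at 10 Mb/s, 2 packets at 100 Mb/s and 20 packets at 1 Gb/s.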
Each sender in a session can use a scalable congestion control Each sender in a session can use a scalable congestion control
independently of the congestion control used by the receiver(s) when independently of the congestion control used by the receiver(s) when
they send data. Therefore there might be ECT(1) packets in one they send data. Therefore, there might be ECT(1) packets in one
direction and ECT(0) or Not-ECT in the other. direction and ECT(0) or Not-ECT in the other.
Later (Section 5.4.1.1) this document discusses the conditions for Later (Section 5.4.1.1) this document discusses the conditions for
mixing other "'Safe' Unresponsive Traffic" (e.g. DNS, LDAP, NTP, mixing other "'Safe' Unresponsive Traffic" (e.g. DNS, LDAP, NTP,
voice, game sync packets) with L4S traffic. To be clear, although voice, game sync packets) with L4S traffic. To be clear, although
such traffic can share the same queue as L4S traffic, it is not such traffic can share the same queue as L4S traffic, it is not
appropriate for the sender to tag it as ECT(1), except in the appropriate for the sender to tag it as ECT(1), except in the
(unlikely) case that it satisfies the above conditions. (unlikely) case that it satisfies the above conditions.
4.3.1. Guidance on Congestion Response in the RFC Series 4.3.1. Guidance on Congestion Response in the RFC Series
skipping to change at page 18, line 24 skipping to change at page 18, line 24
* Classic ECN [RFC3168]: The compromises centre around cases * Classic ECN [RFC3168]: The compromises centre around cases
where the bottleneck supports Classic ECN but not L4S. But it where the bottleneck supports Classic ECN but not L4S. But it
depends on which sub-case: depends on which sub-case:
- Shared Queue with Classic ECN: At the time of writing, the - Shared Queue with Classic ECN: At the time of writing, the
members of the Transport Area Working Group are not aware of any members of the Transport Area Working Group are not aware of any
current deployments of single-queue Classic ECN bottlenecks current deployments of single-queue Classic ECN bottlenecks
in the Internet. Nonetheless, at the scale of the Internet, in the Internet. Nonetheless, at the scale of the Internet,
rarity need not imply small numbers, nor that there will be rarity need not imply small numbers, nor that there will be
rarity in future. rarity in the future.
- Per-Flow-queues with Classic ECN: Most AQMs with per-flow- - Per-Flow-queues with Classic ECN: Most AQMs with per-flow-
queuing (FQ) deployed from 2012 onwards had Classic ECN queuing (FQ) deployed from 2012 onwards had Classic ECN
enabled by default, specifically FQ-CoDel [RFC8290] and enabled by default, specifically FQ-CoDel [RFC8290] and
COBALT [COBALT]. But the compromises only apply to the COBALT [COBALT]. But the compromises only apply to the
second of two further sub-cases: second of two further sub-cases:
o With per-flow-queuing, co-existence between Classic and o With per-flow-queuing, co-existence between Classic and
L4S flows is not normally a problem, because different L4S flows is not normally a problem, because different
flows are not meant to be in the same queue flows are not meant to be in the same queue
skipping to change at page 20, line 31 skipping to change at page 20, line 31
This shift of responsibility has the advantage that each sender can This shift of responsibility has the advantage that each sender can
smooth variations over a timescale proportionate to its own RTT. smooth variations over a timescale proportionate to its own RTT.
Whereas, in the Classic approach, the network doesn't know the RTTs Whereas, in the Classic approach, the network doesn't know the RTTs
of any of the flows, so it has to smooth out variations for a worst- of any of the flows, so it has to smooth out variations for a worst-
case RTT to ensure stability. For all the typical flows with shorter case RTT to ensure stability. For all the typical flows with shorter
RTT than the worst-case, this makes congestion control unnecessarily RTT than the worst-case, this makes congestion control unnecessarily
sluggish. sluggish.
This also gives an L4S sender the choice not to smooth, depending on This also gives an L4S sender the choice not to smooth, depending on
its context (start-up, congestion avoidance, etc). Therefore, this its context (start-up, congestion avoidance, etc.). Therefore, this
document places no requirement on an L4S congestion control to smooth document places no requirement on an L4S congestion control to smooth
out variations in any particular way. Implementers are encouraged to out variations in any particular way. Implementers are encouraged to
openly publish the approach they take to smoothing, and the results openly publish the approach they take to smoothing, and the results
and experience they gain during the L4S experiment. and experience they gain during the L4S experiment.
5. Network Node Behaviour 5. Network Node Behaviour
5.1. Classification and Re-Marking Behaviour 5.1. Classification and Re-Marking Behaviour
A network node that implements the L4S service: A network node that implements the L4S service:
* MUST classify arriving ECT(1) packets for L4S treatment, unless * MUST classify arriving ECT(1) packets for L4S treatment, unless
overridden by another classifier (e.g., see Section 5.4.1.2); overridden by another classifier (e.g., see Section 5.4.1.2);
* MUST classify arriving CE packets for L4S treatment as well, * MUST classify arriving CE packets for L4S treatment as well,
unless overridden by a another classifier or unless the exception unless overridden by another classifier or unless the exception
referred to next applies; referred to next applies;
CE packets might have originated as ECT(1) or ECT(0), but the CE packets might have originated as ECT(1) or ECT(0), but the
above rule to classify them as if they originated as ECT(1) is the above rule to classify them as if they originated as ECT(1) is the
safe choice (see Appendix B for rationale). The exception is safe choice (see Appendix B for rationale). The exception is
where some flow-aware in-network mechanism happens to be available where some flow-aware in-network mechanism happens to be available
for distinguishing CE packets that originated as ECT(0), as for distinguishing CE packets that originated as ECT(0), as
described in Section 5.3, but there is no implication that such a described in Section 5.3, but there is no implication that such a
mechanism is necessary. mechanism is necessary.
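The two classification rules above can be sketched in Python as follows. The codepoint values are those of the ECN field defined in [RFC3168] (the two least-significant bits of the IPv4 TOS or IPv6 Traffic Class octet). Everything else is hypothetical: the function names, the string return values and the originated_as_ect0 flag, which stands in for the optional flow-aware mechanism mentioned above. Overrides by other classifiers are not modelled.

```python
from enum import IntEnum


class Ecn(IntEnum):
    """ECN field codepoints from RFC 3168."""
    NOT_ECT = 0b00
    ECT1 = 0b01
    ECT0 = 0b10
    CE = 0b11


def ecn_field(tos_byte):
    """Extract the ECN field from the IPv4 TOS / IPv6 Traffic Class octet."""
    return Ecn(tos_byte & 0b11)


def classify(tos_byte, originated_as_ect0=False):
    """Return "L4S" or "Classic" for an arriving packet.

    originated_as_ect0 is a hypothetical input supplied by some flow-aware
    mechanism; without such a mechanism it stays False and every CE packet
    is classified as L4S, which is the safe choice.
    """
    ecn = ecn_field(tos_byte)
    if ecn is Ecn.ECT1:
        return "L4S"
    if ecn is Ecn.CE:
        return "Classic" if originated_as_ect0 else "L4S"
    return "Classic"  # Not-ECT and ECT(0)
```

For example, classify(0x03) returns "L4S" for a CE packet of unknown origin, whereas classify(0x03, originated_as_ect0=True) returns "Classic".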
An L4S AQM treatment follows similar codepoint transition rules to An L4S AQM treatment follows similar codepoint transition rules to
skipping to change at page 22, line 22 skipping to change at page 22, line 22
signals in a DualQ Coupled AQM [I-D.ietf-tsvwg-aqm-dualq-coupled], as signals in a DualQ Coupled AQM [I-D.ietf-tsvwg-aqm-dualq-coupled], as
below. below.
Unless an AQM node schedules application flows explicitly, the Unless an AQM node schedules application flows explicitly, the
likelihood that the AQM drops a Not-ECT Classic packet (p_C) MUST be likelihood that the AQM drops a Not-ECT Classic packet (p_C) MUST be
roughly proportional to the square of the likelihood that it would roughly proportional to the square of the likelihood that it would
have marked it if it had been an L4S packet (p_L). That is have marked it if it had been an L4S packet (p_L). That is
p_C ~= (p_L / k)^2 p_C ~= (p_L / k)^2
The constant of proportionality (k) does not have to be standardised The constant of proportionality (k) does not have to be standardized
for interoperability, but a value of 2 is RECOMMENDED. The term for interoperability, but a value of 2 is RECOMMENDED. The term
'likelihood' is used above to allow for marking and dropping to be 'likelihood' is used above to allow for marking and dropping to be
either probabilistic or deterministic. either probabilistic or deterministic.
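As an illustration of the coupling formula, the following Python sketch derives the Classic drop likelihood from the L4S marking likelihood. The function name, the input validation and the clamping to 1.0 are assumptions added for the example; only the relationship p_C ~= (p_L / k)^2 and the recommended k = 2 come from the text above.

```python
def classic_drop_probability(p_l, k=2.0):
    """Return p_C ~= (p_L / k)^2 for an L4S marking probability p_L.

    k is the coupling factor (a value of 2 is RECOMMENDED in the text).
    The result is clamped to 1.0 so that it stays a valid probability
    even if a coupling factor below 1 is configured.
    """
    if not 0.0 <= p_l <= 1.0:
        raise ValueError("p_l must be a probability between 0 and 1")
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, (p_l / k) ** 2)


if __name__ == "__main__":
    for p_l in (0.02, 0.2, 1.0):
        print(f"p_L={p_l}: p_C ~= {classic_drop_probability(p_l):.4f}")
```

With k = 2, an L4S marking probability of 0.2 corresponds to a Classic drop probability of approximately 0.01, and even 100% L4S marking corresponds to only 25% Classic drop.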
This formula ensures that Scalable and Classic flows will converge to This formula ensures that Scalable and Classic flows will converge to
roughly equal congestion windows, for the worst case of Reno roughly equal congestion windows, for the worst case of Reno
congestion control. This is because the congestion windows of congestion control. This is because the congestion windows of
Scalable and Classic congestion controls are inversely proportional Scalable and Classic congestion controls are inversely proportional
to p_L and sqrt(p_C) respectively. So squaring p_C in the above to p_L and sqrt(p_C) respectively. So squaring p_C in the above
formula counterbalances the square root that characterizes Reno- formula counterbalances the square root that characterizes Reno-
skipping to change at page 23, line 49 skipping to change at page 23, line 49
5.4.1. DualQ Examples of Other Identifiers Complementing L4S 5.4.1. DualQ Examples of Other Identifiers Complementing L4S
Identifiers Identifiers
5.4.1.1. Inclusion of Additional Traffic with L4S 5.4.1.1. Inclusion of Additional Traffic with L4S
In a typical case for the public Internet, a network element that In a typical case for the public Internet, a network element that
implements L4S in a shared queue might want to classify some low-rate implements L4S in a shared queue might want to classify some low-rate
but unresponsive traffic (e.g. DNS, LDAP, NTP, voice, game sync but unresponsive traffic (e.g. DNS, LDAP, NTP, voice, game sync
packets) into the low latency queue to mix with L4S traffic. In this packets) into the low latency queue to mix with L4S traffic. In this
case it would not be appropriate to call the queue an L4S queue, case it would not be appropriate to call the queue an L4S queue,
because it is shared by L4S and non-L4S traffic. Instead it will be because it is shared by L4S and non-L4S traffic. Instead, it will be
called the low latency or L queue. The L queue then offers two called the low latency or L queue. The L queue then offers two
different treatments: different treatments:
* The L4S treatment, which is a combination of the L4S AQM treatment * The L4S treatment, which is a combination of the L4S AQM treatment
and a priority scheduling treatment; and a priority scheduling treatment;
* The low latency treatment, which is solely the priority scheduling * The low latency treatment, which is solely the priority scheduling
treatment, without ECN-marking by the AQM. treatment, without ECN-marking by the AQM.
To identify packets for just the scheduling treatment, it would be To identify packets for just the scheduling treatment, it would be
skipping to change at page 24, line 28 skipping to change at page 24, line 28
intensity traffic; intensity traffic;
* certain low data-volume applications or protocols (e.g. ARP, DNS); * certain low data-volume applications or protocols (e.g. ARP, DNS);
* specific Diffserv codepoints that indicate traffic with limited * specific Diffserv codepoints that indicate traffic with limited
burstiness such as the EF (Expedited Forwarding [RFC3246]), Voice- burstiness such as the EF (Expedited Forwarding [RFC3246]), Voice-
Admit [RFC5865] or proposed NQB (Non-Queue- Admit [RFC5865] or proposed NQB (Non-Queue-
Building [I-D.ietf-tsvwg-nqb]) service classes or equivalent Building [I-D.ietf-tsvwg-nqb]) service classes or equivalent
local-use DSCPs (see [I-D.briscoe-tsvwg-l4s-diffserv]). local-use DSCPs (see [I-D.briscoe-tsvwg-l4s-diffserv]).
To be clear, classifying into the L queue based on application layer
identification (e.g. DNS) is an example of a local optimization, not
a recommendation. Applications will not be able to rely on such
unsolicited optimization. A more reliable approach would be for the
sender to set an appropriate IP layer identifier, such as one of the
above Diffserv codepoints.
In summary, a network element that implements L4S in a shared queue In summary, a network element that implements L4S in a shared queue
MAY classify additional types of packets into the L queue based on MAY classify additional types of packets into the L queue based on
identifiers other than the ECN field, but the types SHOULD be 'safe' identifiers other than the ECN field, but the types SHOULD be 'safe'
to mix with L4S traffic, where 'safe' is explained in to mix with L4S traffic, where 'safe' is explained in
Section 5.4.1.1.1. Section 5.4.1.1.1.
A packet that carries one of these non-ECN identifiers to classify it A packet that carries one of these non-ECN identifiers to classify it
into the L queue would not be subject to the L4S ECN marking into the L queue would not be subject to the L4S ECN marking
treatment, unless it also carried an ECT(1) or CE codepoint. The treatment, unless it also carried an ECT(1) or CE codepoint. The
specification of an L4S AQM MUST define the behaviour for packets specification of an L4S AQM MUST define the behaviour for packets
skipping to change at page 29, line 28 skipping to change at page 29, line 38
redesign work relevant to the most problematic link types. Such redesign work relevant to the most problematic link types. Such
knock-on effects of initial L4S deployment would all be part of the knock-on effects of initial L4S deployment would all be part of the
learning from the L4S experiment. learning from the L4S experiment.
The details of such link changes are beyond the scope of the present The details of such link changes are beyond the scope of the present
document. Nonetheless, where L4S technology is being implemented on document. Nonetheless, where L4S technology is being implemented on
an outgoing interface of a device, it would make sense to consider an outgoing interface of a device, it would make sense to consider
opportunities for reducing bursts arriving at other incoming opportunities for reducing bursts arriving at other incoming
interface(s). For instance, where an L4S AQM is implemented to feed interface(s). For instance, where an L4S AQM is implemented to feed
into the upstream WAN interface of a home gateway, there would be into the upstream WAN interface of a home gateway, there would be
opportunities to alter the WiFi profiles sent out of any WiFi opportunities to alter the Wi-Fi profiles sent out of any Wi-Fi
interfaces from the same device, in order to mitigate incoming bursts interfaces from the same device, in order to mitigate incoming bursts
of aggregated WiFi frames from other WiFi stations. of aggregated Wi-Fi frames from other Wi-Fi stations.
6. Behaviour of Tunnels and Encapsulations 6. Behaviour of Tunnels and Encapsulations
6.1. No Change to ECN Tunnels and Encapsulations in General 6.1. No Change to ECN Tunnels and Encapsulations in General
The L4S identifier is expected to work through and within any tunnel The L4S identifier is expected to work through and within any tunnel
without modification, as long as the tunnel propagates the ECN field without modification, as long as the tunnel propagates the ECN field
in any of the ways that have been defined since the first variant in in any of the ways that have been defined since the first variant in
the year 2001 [RFC3168]. L4S will also work with (but does not rely the year 2001 [RFC3168]. L4S will also work with (but does not rely
on) any of the more recent updates to ECN propagation in [RFC4301], on) any of the more recent updates to ECN propagation in [RFC4301],
skipping to change at page 30, line 21 skipping to change at page 30, line 30
6.2. VPN Behaviour to Avoid Limitations of Anti-Replay 6.2. VPN Behaviour to Avoid Limitations of Anti-Replay
If a mix of L4S and Classic packets is sent into the same security If a mix of L4S and Classic packets is sent into the same security
association (SA) of a virtual private network (VPN), and if the VPN association (SA) of a virtual private network (VPN), and if the VPN
egress is employing the optional anti-replay feature, it could egress is employing the optional anti-replay feature, it could
inappropriately discard Classic packets (or discard the records in inappropriately discard Classic packets (or discard the records in
Classic packets) by mistaking their greater queuing delay for a Classic packets) by mistaking their greater queuing delay for a
replay attack (see "Dropped Packets for Tunnels with Replay replay attack (see "Dropped Packets for Tunnels with Replay
Protection Enabled" in [Heist21] for the potential performance Protection Enabled" in [Heist21] for the potential performance
impact). This known problem is common to both IPsec [RFC4301] and impact). This known problem is common to both IPsec [RFC4301] and
DTLS [RFC6347] VPNs, given they use similar anti-replay window DTLS [RFC9147] VPNs, given they use similar anti-replay window
mechanisms. The mechanism used can only check for replay within its mechanisms. The mechanism used can only check for replay within its
window, so if the window is smaller than the degree of reordering, it window, so if the window is smaller than the degree of reordering, it
can only assume there might be a replay attack and discard all the can only assume there might be a replay attack and discard all the
packets behind the trailing edge of the window. The specifications packets behind the trailing edge of the window. The specifications
of IPsec AH [RFC4302] and ESP [RFC4303] suggest that an implementer of IPsec AH [RFC4302] and ESP [RFC4303] suggest that an implementer
scales the size of the anti-replay window with interface speed, and scales the size of the anti-replay window with interface speed, and
DTLS 1.3 [I-D.ietf-tls-dtls13] says "The receiver SHOULD pick a DTLS v1.3 [RFC9147] says "The receiver SHOULD pick a window large
window large enough to handle any plausible reordering, which depends enough to handle any plausible reordering, which depends on the data
on the data rate." However, in practice, the size of a VPN's anti- rate." However, in practice, the size of a VPN's anti-replay window
replay window is not always scaled appropriately. is not always scaled appropriately.
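To give a feel for why the window needs to scale with data rate, the following Python sketch estimates how many packets can overtake a delayed Classic packet. The model is an assumption made for illustration and does not come from the IPsec or DTLS specifications: it treats the degree of reordering as the number of packets the SA carries during the extra time a Classic packet spends queuing. The function and parameter names are hypothetical.

```python
import math


def min_anti_replay_window(rate_bps, packet_size_bytes, extra_classic_delay_s):
    """Estimate the anti-replay window (in packets) needed to tolerate reordering.

    rate_bps:              data rate of the security association, in bits/s.
    packet_size_bytes:     assumed typical packet size.
    extra_classic_delay_s: how much longer a Classic packet queues than an
                           L4S packet sent at the same time (an assumed input).
    """
    packets_per_second = rate_bps / (8 * packet_size_bytes)
    return math.ceil(packets_per_second * extra_classic_delay_s)


if __name__ == "__main__":
    for rate_bps in (100e6, 1e9):
        window = min_anti_replay_window(rate_bps, 1500, 0.020)
        print(f"{rate_bps / 1e6:.0f} Mb/s, 20 ms extra delay -> window >= {window} packets")
```

With 1500-byte packets and 20 ms of additional Classic queuing delay, this estimate gives 167 packets at 100 Mb/s and 1667 packets at 1 Gb/s, which can be compared against the window size actually configured at the VPN egress.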
If a VPN carrying traffic participating in the L4S experiment If a VPN carrying traffic participating in the L4S experiment
experiences inappropriate replay detection, the foremost remedy would experiences inappropriate replay detection, the foremost remedy would
be to ensure that the egress is configured to comply with the above be to ensure that the egress is configured to comply with the above
window-sizing requirements. window-sizing requirements.
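As a rough worked example of scaling the window with data rate, the sketch below estimates how many packets could overtake a delayed Classic packet. It is a rule of thumb under stated assumptions, not a formula from [RFC4302], [RFC4303] or [RFC9147]: the function name min_replay_window, the 1500-byte packet size and the 20 ms delay difference between the Classic and L4S queues are all hypothetical values chosen for illustration.

```python
import math


def min_replay_window(rate_bps, reorder_delay_s, packet_bytes=1500):
    """Estimate the anti-replay window (in packets) needed to cover
    reordering of up to reorder_delay_s at a given data rate.

    Assumes, pessimistically, that the whole link rate is available to
    the packets that overtake the delayed one.
    """
    packets_per_second = rate_bps / (packet_bytes * 8)
    return math.ceil(packets_per_second * reorder_delay_s)


# Hypothetical example: 20 ms extra Classic queuing delay.
window_100m = min_replay_window(100e6, 0.020)   # 100 Mb/s link
window_1g = min_replay_window(1e9, 0.020)       # 1 Gb/s link
```

Under these assumptions a 100 Mb/s tunnel would need a window of about 167 packets and a 1 Gb/s tunnel about 1667 packets, which illustrates why a fixed small window can become inadequate as the data rate rises.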
If an implementation of a VPN egress does not support a sufficiently If an implementation of a VPN egress does not support a sufficiently
large anti-replay window, e.g. due to hardware limitations, one of large anti-replay window, e.g. due to hardware limitations, one of
the temporary alternatives listed in order of preference below might the temporary alternatives listed in order of preference below might
be feasible instead: be feasible instead:
skipping to change at page 33, line 4 skipping to change at page 33, line 13
remedies)? remedies)?
* Was per-flow queue protection typically (un)necessary? * Was per-flow queue protection typically (un)necessary?
- How well did overload protection or queue protection work? - How well did overload protection or queue protection work?
* How well did L4S flows coexist with Classic flows when sharing a * How well did L4S flows coexist with Classic flows when sharing a
bottleneck? bottleneck?
- How frequently did problems arise? - How frequently did problems arise?
- What caused any coexistence problems, and were any problems due - What caused any coexistence problems, and were any problems due
to single-queue Classic ECN AQMs (this assumes single-queue to single-queue Classic ECN AQMs (this assumes single-queue
Classic ECN AQMs can be distinguished from FQ ones)? Classic ECN AQMs can be distinguished from FQ ones)?
* How prevalent were problems with the L4S service due to tunnels / * How prevalent were problems with the L4S service due to tunnels /
encapsulations that do not support ECN decapsulation? encapsulations that do not support ECN decapsulation?
* How easy was it to implement a fully compliant L4S congestion * How easy was it to implement a fully compliant L4S congestion
control, over various different transport protocols (TCP, QUIC, control, over various different transport protocols (TCP, QUIC,
RMCAT, etc)? RMCAT, etc.)?
Monitoring for harm to other traffic, specifically bandwidth Monitoring for harm to other traffic, specifically bandwidth
starvation or excess queuing delay, will need to be conducted starvation or excess queuing delay, will need to be conducted
alongside all early L4S experiments. It is hard, if not impossible, alongside all early L4S experiments. It is hard, if not impossible,
for an individual flow to measure its impact on other traffic. So for an individual flow to measure its impact on other traffic. So
such monitoring will need to be conducted using bespoke monitoring such monitoring will need to be conducted using bespoke monitoring
across flows and/or across classes of traffic. across flows and/or across classes of traffic.
7.2. Open Issues 7.2. Open Issues
skipping to change at page 34, line 8 skipping to change at page 34, line 18
* Potential for improvements to particular link technologies, and * Potential for improvements to particular link technologies, and
cross-layer interactions with them; cross-layer interactions with them;
* Potential for using virtual queues, e.g. to further reduce latency * Potential for using virtual queues, e.g. to further reduce latency
jitter, or to leave headroom for capacity variation in radio jitter, or to leave headroom for capacity variation in radio
networks; networks;
* Development and specification of reverse path congestion control * Development and specification of reverse path congestion control
using L4S building blocks (e.g. AccECN, QUIC); using L4S building blocks (e.g. AccECN, QUIC);
* Once queuing delay is cut down, what becomes the 'second longest * Once queuing delay is cut down, what becomes the 'second-longest
pole in the tent' (other than the speed of light)? pole in the tent' (other than the speed of light)?
* Novel alternatives to the existing set of L4S AQMs; * Novel alternatives to the existing set of L4S AQMs;
* Novel applications enabled by L4S. * Novel applications enabled by L4S.
8. IANA Considerations 8. IANA Considerations
The 01 codepoint of the ECN Field of the IP header is specified by The 01 codepoint of the ECN Field of the IP header is specified by
the present Experimental RFC. The process for an experimental RFC to the present Experimental RFC. The process for an experimental RFC to
skipping to change at page 36, line 15 skipping to change at page 36, line 30
[RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P., [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P.,
and K. Carlberg, "Explicit Congestion Notification (ECN) and K. Carlberg, "Explicit Congestion Notification (ECN)
for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August
2012, <https://www.rfc-editor.org/info/rfc6679>. 2012, <https://www.rfc-editor.org/info/rfc6679>.
10.2. Informative References 10.2. Informative References
[A2DTCP] Zhang, T., Wang, J., Huang, J., Huang, Y., Chen, J., and [A2DTCP] Zhang, T., Wang, J., Huang, J., Huang, Y., Chen, J., and
Y. Pan, "Adaptive-Acceleration Data Center TCP", IEEE Y. Pan, "Adaptive-Acceleration Data Center TCP", IEEE
Transactions on Computers 64(6):1522-1533, June 2015, Transactions on Computers 64(6):1522-1533, June 2015,
<http://ieeexplore.ieee.org/xpl/ <https://ieeexplore.ieee.org/xpl/
articleDetails.jsp?arnumber=6871352>. articleDetails.jsp?arnumber=6871352>.
[Ahmed19] Ahmed, A.S., "Extending TCP for Low Round Trip Delay", [Ahmed19] Ahmed, A.S., "Extending TCP for Low Round Trip Delay",
Masters Thesis, Uni Oslo , August 2019, Master's Thesis, Uni Oslo , August 2019,
<https://www.duo.uio.no/handle/10852/70966>. <https://www.duo.uio.no/handle/10852/70966>.
[Alizadeh-stability] [Alizadeh-stability]
Alizadeh, M., Javanmard, A., and B. Prabhakar, "Analysis Alizadeh, M., Javanmard, A., and B. Prabhakar, "Analysis
of DCTCP: Stability, Convergence, and Fairness", ACM of DCTCP: Stability, Convergence, and Fairness", ACM
SIGMETRICS 2011 , June 2011, SIGMETRICS 2011 , June 2011,
<https://people.csail.mit.edu/alizadeh/papers/ <https://people.csail.mit.edu/alizadeh/papers/
dctcp_analysis-sigmetrics11.pdf>. dctcp_analysis-sigmetrics11.pdf>.
[ARED01] Floyd, S., Gummadi, R., and S. Shenker, "Adaptive RED: An [ARED01] Floyd, S., Gummadi, R., and S. Shenker, "Adaptive RED: An
Algorithm for Increasing the Robustness of RED's Active Algorithm for Increasing the Robustness of RED's Active
Queue Management", ACIRI Technical Report , August 2001, Queue Management", ACIRI Technical Report , August 2001,
<http://www.icir.org/floyd/red.html>. <https://www.icir.org/floyd/red.html>.
[BBRv2] Cardwell, N., "TCP BBR v2 Alpha/Preview Release", github [BBRv2] Cardwell, N., "TCP BBR v2 Alpha/Preview Release", GitHub
repository; Linux congestion control module, repository; Linux congestion control module,
<https://github.com/google/bbr/blob/v2alpha/README.md>. <https://github.com/google/bbr/blob/v2alpha/README.md>.
[Bufferbloat]
"Bufferbloat", <https://bufferbloat.net/>. (last accessed
27 Aug 2022)
[COBALT] Palmei, J., Gupta, S., Imputato, P., Morton, J., [COBALT] Palmei, J., Gupta, S., Imputato, P., Morton, J.,
Tahiliani, M., Avallone, S., and D. Taht, "Design and Tahiliani, M., Avallone, S., and D. Taht, "Design and
Evaluation of COBALT Queue Discipline", In Proc. IEEE Evaluation of COBALT Queue Discipline", In Proc. IEEE
Int'l Symp. on Local and Metropolitan Area Networks 2019, Int'l Symp. on Local and Metropolitan Area Networks 2019,
pp1--6, 2019, pp1--6, 2019,
<https://doi.org/10.1109/LANMAN.2019.8847054>. <https://doi.org/10.1109/LANMAN.2019.8847054>.
[DCttH19] De Schepper, K., Bondarenko, O., Tilmans, O., and B. [DCttH19] De Schepper, K., Bondarenko, O., Tilmans, O., and B.
Briscoe, "`Data Centre to the Home': Ultra-Low Latency for Briscoe, "`Data Centre to the Home': Ultra-Low Latency for
All", Updated RITE project Technical Report , July 2019, All", Updated RITE project Technical Report , July 2019,
skipping to change at page 37, line 20 skipping to change at page 37, line 40
the Right Metric for Congestion Control", ACM CCR the Right Metric for Congestion Control", ACM CCR
36(1):59--62, January 2006, 36(1):59--62, January 2006,
<https://dl.acm.org/doi/10.1145/1111322.1111336>. <https://dl.acm.org/doi/10.1145/1111322.1111336>.
[ecn-fallback] [ecn-fallback]
Briscoe, B. and A.S. Ahmed, "TCP Prague Fall-back on Briscoe, B. and A.S. Ahmed, "TCP Prague Fall-back on
Detection of a Classic ECN AQM", bobbriscoe.net Technical Detection of a Classic ECN AQM", bobbriscoe.net Technical
Report TR-BB-2019-002, April 2020, Report TR-BB-2019-002, April 2020,
<https://arxiv.org/abs/1911.00710>. <https://arxiv.org/abs/1911.00710>.
[Heist21] Heist, P. and J. Morton, "L4S Tests", github README, May [Heist21] Heist, P. and J. Morton, "L4S Tests", GitHub README, May
2021, <https://github.com/heistp/l4s-tests/>. 2021, <https://github.com/heistp/l4s-tests/>.
[I-D.briscoe-docsis-q-protection] [I-D.briscoe-docsis-q-protection]
Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection
Algorithm to Preserve Low Latency", Work in Progress, Algorithm to Preserve Low Latency", Work in Progress,
Internet-Draft, draft-briscoe-docsis-q-protection-06, 13 Internet-Draft, draft-briscoe-docsis-q-protection-06, 13
May 2022, <https://www.ietf.org/archive/id/draft-briscoe- May 2022,
docsis-q-protection-06.txt>. <https://datatracker.ietf.org/api/v1/doc/document/draft-
briscoe-docsis-q-protection/>.
[I-D.briscoe-iccrg-prague-congestion-control] [I-D.briscoe-iccrg-prague-congestion-control]
Schepper, K. D., Tilmans, O., and B. Briscoe, "Prague Schepper, K. D., Tilmans, O., and B. Briscoe, "Prague
Congestion Control", Work in Progress, Internet-Draft, Congestion Control", Work in Progress, Internet-Draft,
draft-briscoe-iccrg-prague-congestion-control-01, 11 July draft-briscoe-iccrg-prague-congestion-control-01, 11 July
2022, <https://www.ietf.org/archive/id/draft-briscoe- 2022, <https://datatracker.ietf.org/api/v1/doc/document/
iccrg-prague-congestion-control-01.txt>. draft-briscoe-iccrg-prague-congestion-control/>.
[I-D.briscoe-tsvwg-l4s-diffserv] [I-D.briscoe-tsvwg-l4s-diffserv]
Briscoe, B., "Interactions between Low Latency, Low Loss, Briscoe, B., "Interactions between Low Latency, Low Loss,
Scalable Throughput (L4S) and Differentiated Services", Scalable Throughput (L4S) and Differentiated Services",
Work in Progress, Internet-Draft, draft-briscoe-tsvwg-l4s- Work in Progress, Internet-Draft, draft-briscoe-tsvwg-l4s-
diffserv-02, November 2018, diffserv-02, 2 July 2018,
<https://www.ietf.org/archive/id/draft-briscoe-tsvwg-l4s- <https://datatracker.ietf.org/api/v1/doc/document/draft-
diffserv-02.txt>. briscoe-tsvwg-l4s-diffserv/>.
[I-D.cardwell-iccrg-bbr-congestion-control] [I-D.cardwell-iccrg-bbr-congestion-control]
Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V. Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V.
Jacobson, "BBR Congestion Control", Work in Progress, Jacobson, "BBR Congestion Control", Work in Progress,
Internet-Draft, draft-cardwell-iccrg-bbr-congestion- Internet-Draft, draft-cardwell-iccrg-bbr-congestion-
control-02, March 2022, <https://www.ietf.org/archive/id/ control-02, 7 March 2022,
draft-cardwell-iccrg-bbr-congestion-control-02.txt>. <https://datatracker.ietf.org/api/v1/doc/document/draft-
cardwell-iccrg-bbr-congestion-control/>.
[I-D.ietf-tcpm-accurate-ecn] [I-D.ietf-tcpm-accurate-ecn]
Briscoe, B., Kühlewind, M., and R. Scheffenegger, "More Briscoe, B., Kühlewind, M., and R. Scheffenegger, "More
Accurate ECN Feedback in TCP", Work in Progress, Internet- Accurate ECN Feedback in TCP", Work in Progress, Internet-
Draft, draft-ietf-tcpm-accurate-ecn-20, 25 July 2022, Draft, draft-ietf-tcpm-accurate-ecn-20, 25 July 2022,
<https://www.ietf.org/archive/id/draft-ietf-tcpm-accurate- <https://datatracker.ietf.org/api/v1/doc/document/draft-
ecn-20.txt>. ietf-tcpm-accurate-ecn/>.
[I-D.ietf-tcpm-generalized-ecn] [I-D.ietf-tcpm-generalized-ecn]
Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit
Congestion Notification (ECN) to TCP Control Packets", Congestion Notification (ECN) to TCP Control Packets",
Work in Progress, Internet-Draft, draft-ietf-tcpm- Work in Progress, Internet-Draft, draft-ietf-tcpm-
generalized-ecn-10, 27 July 2022, generalized-ecn-10, 27 July 2022,
<https://www.ietf.org/archive/id/draft-ietf-tcpm- <https://datatracker.ietf.org/api/v1/doc/document/draft-
generalized-ecn-10.txt>. ietf-tcpm-generalized-ecn/>.
[I-D.ietf-tls-dtls13]
Rescorla, E., Tschofenig, H., and N. Modadugu, "The
Datagram Transport Layer Security (DTLS) Protocol Version
1.3", Work in Progress, Internet-Draft, draft-ietf-tls-
dtls13-43, 30 April 2021,
<https://www.ietf.org/archive/id/draft-ietf-tls-
dtls13-43.txt>.
[I-D.ietf-trill-ecn-support] [I-D.ietf-trill-ecn-support]
Eastlake, D. E. and B. Briscoe, "TRILL (TRansparent Eastlake, D. E. and B. Briscoe, "TRILL (TRansparent
Interconnection of Lots of Links): ECN (Explicit Interconnection of Lots of Links): ECN (Explicit
Congestion Notification) Support", Work in Progress, Congestion Notification) Support", Work in Progress,
Internet-Draft, draft-ietf-trill-ecn-support-07, 25 Internet-Draft, draft-ietf-trill-ecn-support-07, 25
February 2018, <https://www.ietf.org/archive/id/draft- February 2018,
ietf-trill-ecn-support-07.txt>. <https://datatracker.ietf.org/api/v1/doc/document/draft-
ietf-trill-ecn-support/>.
[I-D.ietf-tsvwg-aqm-dualq-coupled] [I-D.ietf-tsvwg-aqm-dualq-coupled]
Schepper, K. D., Briscoe, B., and G. White, "DualQ Coupled Schepper, K. D., Briscoe, B., and G. White, "DualQ Coupled
AQMs for Low Latency, Low Loss and Scalable Throughput AQMs for Low Latency, Low Loss and Scalable Throughput
(L4S)", Work in Progress, Internet-Draft, draft-ietf- (L4S)", Work in Progress, Internet-Draft, draft-ietf-
tsvwg-aqm-dualq-coupled-24, July 2022, tsvwg-aqm-dualq-coupled-24, 7 July 2022,
<https://www.ietf.org/archive/id/draft-ietf-tsvwg-aqm- <https://datatracker.ietf.org/api/v1/doc/document/draft-
dualq-coupled-24.txt>. ietf-tsvwg-aqm-dualq-coupled/>.
[I-D.ietf-tsvwg-ecn-encap-guidelines] [I-D.ietf-tsvwg-ecn-encap-guidelines]
Briscoe, B. and J. Kaippallimalil, "Guidelines for Adding Briscoe, B. and J. Kaippallimalil, "Guidelines for Adding
Congestion Notification to Protocols that Encapsulate IP", Congestion Notification to Protocols that Encapsulate IP",
Work in Progress, Internet-Draft, draft-ietf-tsvwg-ecn- Work in Progress, Internet-Draft, draft-ietf-tsvwg-ecn-
encap-guidelines-17, 11 July 2022, encap-guidelines-17, 11 July 2022,
<https://www.ietf.org/archive/id/draft-ietf-tsvwg-ecn- <https://datatracker.ietf.org/api/v1/doc/document/draft-
encap-guidelines-17.txt>. ietf-tsvwg-ecn-encap-guidelines/>.
[I-D.ietf-tsvwg-l4s-arch] [I-D.ietf-tsvwg-l4s-arch]
Briscoe, B., Schepper, K. D., Bagnulo, M., and G. White, Briscoe, B., Schepper, K. D., Bagnulo, M., and G. White,
"Low Latency, Low Loss, Scalable Throughput (L4S) Internet "Low Latency, Low Loss, Scalable Throughput (L4S) Internet
Service: Architecture", Work in Progress, Internet-Draft, Service: Architecture", Work in Progress, Internet-Draft,
draft-ietf-tsvwg-l4s-arch-19, 27 July 2022, draft-ietf-tsvwg-l4s-arch-19, 27 July 2022,
<https://www.ietf.org/archive/id/draft-ietf-tsvwg-l4s- <https://datatracker.ietf.org/api/v1/doc/document/draft-
arch-19.txt>. ietf-tsvwg-l4s-arch/>.
[I-D.ietf-tsvwg-l4sops] [I-D.ietf-tsvwg-l4sops]
White, G., "Operational Guidance for Deployment of L4S in White, G., "Operational Guidance for Deployment of L4S in
the Internet", Work in Progress, Internet-Draft, draft- the Internet", Work in Progress, Internet-Draft, draft-
ietf-tsvwg-l4sops-03, 28 April 2022, ietf-tsvwg-l4sops-03, 28 April 2022,
<https://www.ietf.org/archive/id/draft-ietf-tsvwg-l4sops- <https://datatracker.ietf.org/api/v1/doc/document/draft-
03.txt>. ietf-tsvwg-l4sops/>.
[I-D.ietf-tsvwg-nqb] [I-D.ietf-tsvwg-nqb]
White, G. and T. Fossati, "A Non-Queue-Building Per-Hop White, G. and T. Fossati, "A Non-Queue-Building Per-Hop
Behavior (NQB PHB) for Differentiated Services", Work in Behavior (NQB PHB) for Differentiated Services", Work in
Progress, Internet-Draft, draft-ietf-tsvwg-nqb-10, March Progress, Internet-Draft, draft-ietf-tsvwg-nqb-10, 4 March
2022, <https://www.ietf.org/archive/id/draft-ietf-tsvwg- 2022, <https://datatracker.ietf.org/api/v1/doc/document/
nqb-10.txt>. draft-ietf-tsvwg-nqb/>.
[I-D.ietf-tsvwg-rfc6040update-shim] [I-D.ietf-tsvwg-rfc6040update-shim]
Briscoe, B., "Propagating Explicit Congestion Notification Briscoe, B., "Propagating Explicit Congestion Notification
Across IP Tunnel Headers Separated by a Shim", Work in Across IP Tunnel Headers Separated by a Shim", Work in
Progress, Internet-Draft, draft-ietf-tsvwg-rfc6040update- Progress, Internet-Draft, draft-ietf-tsvwg-rfc6040update-
shim-15, 11 July 2022, <https://www.ietf.org/archive/id/ shim-15, 11 July 2022,
draft-ietf-tsvwg-rfc6040update-shim-15.txt>. <https://datatracker.ietf.org/api/v1/doc/document/draft-
ietf-tsvwg-rfc6040update-shim/>.
[I-D.mathis-iccrg-relentless-tcp]
Mathis, M., "Relentless Congestion Control", Work in
Progress, Internet-Draft, draft-mathis-iccrg-relentless-
tcp-00, 4 March 2009, <https://www.ietf.org/archive/id/
draft-mathis-iccrg-relentless-tcp-00.txt>.
[I-D.sridharan-tcpm-ctcp] [I-D.sridharan-tcpm-ctcp]
Sridharan, M., Tan, K., Bansal, D., and D. Thaler, Sridharan, M., Tan, K., Bansal, D., and D. Thaler,
"Compound TCP: A New TCP Congestion Control for High-Speed "Compound TCP: A New TCP Congestion Control for High-Speed
and Long Distance Networks", Work in Progress, Internet- and Long Distance Networks", Work in Progress, Internet-
Draft, draft-sridharan-tcpm-ctcp-02, 11 November 2008, Draft, draft-sridharan-tcpm-ctcp-02, 29 October 2007,
<https://www.ietf.org/archive/id/draft-sridharan-tcpm- <https://datatracker.ietf.org/api/v1/doc/document/draft-
ctcp-02.txt>. sridharan-tcpm-ctcp/>.
[I-D.stewart-tsvwg-sctpecn] [I-D.stewart-tsvwg-sctpecn]
Stewart, R. R., Tuexen, M., and X. Dong, "ECN for Stream Stewart, R. R., Tuexen, M., and X. Dong, "ECN for Stream
Control Transmission Protocol (SCTP)", Work in Progress, Control Transmission Protocol (SCTP)", Work in Progress,
Internet-Draft, draft-stewart-tsvwg-sctpecn-05, 15 January Internet-Draft, draft-stewart-tsvwg-sctpecn-05, 15 January
2014, <https://www.ietf.org/archive/id/draft-stewart- 2014, <https://www.ietf.org/archive/id/draft-stewart-
tsvwg-sctpecn-05.txt>. tsvwg-sctpecn-05.txt>.
[LinuxPacedChirping] [LinuxPacedChirping]
Misund, J. and B. Briscoe, "Paced Chirping - Rethinking Misund, J. and B. Briscoe, "Paced Chirping - Rethinking
TCP start-up", Proc. Linux Netdev 0x13 , March 2019, TCP start-up", Proc. Linux Netdev 0x13 , March 2019,
<https://www.netdevconf.org/0x13/session.html?talk-chirp>. <https://www.netdevconf.org/0x13/session.html?talk-chirp>.
[Mathis09] Mathis, M., "Relentless Congestion Control", PFLDNeT'09 ,
May 2009, <http://www.hpcc.jp/pfldnet2009/
Program_files/1569198525.pdf>.
[Paced-Chirping] [Paced-Chirping]
Misund, J., "Rapid Acceleration in TCP Prague", Masters Misund, J., "Rapid Acceleration in TCP Prague", Master's
Thesis , May 2018, Thesis , May 2018,
<https://riteproject.files.wordpress.com/2018/07/ <https://riteproject.files.wordpress.com/2018/07/
misundjoakimmastersthesissubmitted180515.pdf>. misundjoakimmastersthesissubmitted180515.pdf>.
[PI2] De Schepper, K., Bondarenko, O., Tsang, I., and B. [PI2] De Schepper, K., Bondarenko, O., Tsang, I., and B.
Briscoe, "PI^2 : A Linearized AQM for both Classic and Briscoe, "PI^2 : A Linearized AQM for both Classic and
Scalable TCP", Proc. ACM CoNEXT 2016 pp.105-119, December Scalable TCP", Proc. ACM CoNEXT 2016 pp.105-119, December
2016, 2016,
<http://dl.acm.org/citation.cfm?doid=2999572.2999578>. <https://dl.acm.org/citation.cfm?doid=2999572.2999578>.
[PragueLinux] [PragueLinux]
Briscoe, B., De Schepper, K., Albisser, O., Misund, J., Briscoe, B., De Schepper, K., Albisser, O., Misund, J.,
Tilmans, O., Kühlewind, M., and A.S. Ahmed, "Implementing Tilmans, O., Kühlewind, M., and A.S. Ahmed, "Implementing
the `TCP Prague' Requirements for Low Latency Low Loss the `TCP Prague' Requirements for Low Latency Low Loss
Scalable Throughput (L4S)", Proc. Linux Netdev 0x13 , Scalable Throughput (L4S)", Proc. Linux Netdev 0x13 ,
March 2019, <https://www.netdevconf.org/0x13/ March 2019, <https://www.netdevconf.org/0x13/
session.html?talk-tcp-prague-l4s>. session.html?talk-tcp-prague-l4s>.
[QV] Briscoe, B. and P. Hurtig, "Up to Speed with Queue View", [QV] Briscoe, B. and P. Hurtig, "Up to Speed with Queue View",
skipping to change at page 40, line 48 skipping to change at page 41, line 24
Queue Management and Congestion Avoidance in the Queue Management and Congestion Avoidance in the
Internet", RFC 2309, DOI 10.17487/RFC2309, April 1998, Internet", RFC 2309, DOI 10.17487/RFC2309, April 1998,
<https://www.rfc-editor.org/info/rfc2309>. <https://www.rfc-editor.org/info/rfc2309>.
[RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black,
"Definition of the Differentiated Services Field (DS "Definition of the Differentiated Services Field (DS
Field) in the IPv4 and IPv6 Headers", RFC 2474, Field) in the IPv4 and IPv6 Headers", RFC 2474,
DOI 10.17487/RFC2474, December 1998, DOI 10.17487/RFC2474, December 1998,
<https://www.rfc-editor.org/info/rfc2474>. <https://www.rfc-editor.org/info/rfc2474>.
[RFC3246] Davie, B., Charny, A., Bennet, J C R., Benson, K., Le [RFC3246] Davie, B., Charny, A., Bennet, J.C.R., Benson, K., Le
Boudec, J Y., Courtney, W., Davari, S., Firoiu, V., and D. Boudec, J.Y., Courtney, W., Davari, S., Firoiu, V., and D.
Stiliadis, "An Expedited Forwarding PHB (Per-Hop Stiliadis, "An Expedited Forwarding PHB (Per-Hop
Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002, Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002,
<https://www.rfc-editor.org/info/rfc3246>. <https://www.rfc-editor.org/info/rfc3246>.
[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
Congestion Notification (ECN) Signaling with Nonces", Congestion Notification (ECN) Signaling with Nonces",
RFC 3540, DOI 10.17487/RFC3540, June 2003, RFC 3540, DOI 10.17487/RFC3540, June 2003,
<https://www.rfc-editor.org/info/rfc3540>. <https://www.rfc-editor.org/info/rfc3540>.
[RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows", [RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows",
skipping to change at page 43, line 5 skipping to change at page 43, line 32
[RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion
Notification", RFC 6040, DOI 10.17487/RFC6040, November Notification", RFC 6040, DOI 10.17487/RFC6040, November
2010, <https://www.rfc-editor.org/info/rfc6040>. 2010, <https://www.rfc-editor.org/info/rfc6040>.
[RFC6077] Papadimitriou, D., Ed., Welzl, M., Scharf, M., and B. [RFC6077] Papadimitriou, D., Ed., Welzl, M., Scharf, M., and B.
Briscoe, "Open Research Issues in Internet Congestion Briscoe, "Open Research Issues in Internet Congestion
Control", RFC 6077, DOI 10.17487/RFC6077, February 2011, Control", RFC 6077, DOI 10.17487/RFC6077, February 2011,
<https://www.rfc-editor.org/info/rfc6077>. <https://www.rfc-editor.org/info/rfc6077>.
[RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer
Security Version 1.2", RFC 6347, DOI 10.17487/RFC6347,
January 2012, <https://www.rfc-editor.org/info/rfc6347>.
[RFC6660] Briscoe, B., Moncaster, T., and M. Menth, "Encoding Three [RFC6660] Briscoe, B., Moncaster, T., and M. Menth, "Encoding Three
Pre-Congestion Notification (PCN) States in the IP Header Pre-Congestion Notification (PCN) States in the IP Header
Using a Single Diffserv Codepoint (DSCP)", RFC 6660, Using a Single Diffserv Codepoint (DSCP)", RFC 6660,
DOI 10.17487/RFC6660, July 2012, DOI 10.17487/RFC6660, July 2012,
<https://www.rfc-editor.org/info/rfc6660>. <https://www.rfc-editor.org/info/rfc6660>.
[RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M., [RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M.,
and Y. Nishida, "A Conservative Loss Recovery Algorithm and Y. Nishida, "A Conservative Loss Recovery Algorithm
Based on Selective Acknowledgment (SACK) for TCP", Based on Selective Acknowledgment (SACK) for TCP",
RFC 6675, DOI 10.17487/RFC6675, August 2012, RFC 6675, DOI 10.17487/RFC6675, August 2012,
skipping to change at page 44, line 5 skipping to change at page 44, line 30
[RFC8083] Perkins, C. and V. Singh, "Multimedia Congestion Control: [RFC8083] Perkins, C. and V. Singh, "Multimedia Congestion Control:
Circuit Breakers for Unicast RTP Sessions", RFC 8083, Circuit Breakers for Unicast RTP Sessions", RFC 8083,
DOI 10.17487/RFC8083, March 2017, DOI 10.17487/RFC8083, March 2017,
<https://www.rfc-editor.org/info/rfc8083>. <https://www.rfc-editor.org/info/rfc8083>.
[RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
March 2017, <https://www.rfc-editor.org/info/rfc8085>. March 2017, <https://www.rfc-editor.org/info/rfc8085>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., [RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
and G. Judd, "Data Center TCP (DCTCP): TCP Congestion and G. Judd, "Data Center TCP (DCTCP): TCP Congestion
Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257, Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257,
October 2017, <https://www.rfc-editor.org/info/rfc8257>. October 2017, <https://www.rfc-editor.org/info/rfc8257>.
[RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys, [RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys,
J., and E. Dumazet, "The Flow Queue CoDel Packet Scheduler J., and E. Dumazet, "The Flow Queue CoDel Packet Scheduler
and Active Queue Management Algorithm", RFC 8290, and Active Queue Management Algorithm", RFC 8290,
DOI 10.17487/RFC8290, January 2018, DOI 10.17487/RFC8290, January 2018,
<https://www.rfc-editor.org/info/rfc8290>. <https://www.rfc-editor.org/info/rfc8290>.
skipping to change at page 45, line 5 skipping to change at page 45, line 34
[RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
Multiplexed and Secure Transport", RFC 9000, Multiplexed and Secure Transport", RFC 9000,
DOI 10.17487/RFC9000, May 2021, DOI 10.17487/RFC9000, May 2021,
<https://www.rfc-editor.org/info/rfc9000>. <https://www.rfc-editor.org/info/rfc9000>.
[RFC9001] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure [RFC9001] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure
QUIC", RFC 9001, DOI 10.17487/RFC9001, May 2021, QUIC", RFC 9001, DOI 10.17487/RFC9001, May 2021,
<https://www.rfc-editor.org/info/rfc9001>. <https://www.rfc-editor.org/info/rfc9001>.
[RFC9147] Rescorla, E., Tschofenig, H., and N. Modadugu, "The
Datagram Transport Layer Security (DTLS) Protocol Version
1.3", RFC 9147, DOI 10.17487/RFC9147, April 2022,
<https://www.rfc-editor.org/info/rfc9147>.
[Savage-TCP] [Savage-TCP]
Savage, S., Cardwell, N., Wetherall, D., and T. Anderson, Savage, S., Cardwell, N., Wetherall, D., and T. Anderson,
"TCP Congestion Control with a Misbehaving Receiver", ACM "TCP Congestion Control with a Misbehaving Receiver", ACM
SIGCOMM Computer Communication Review 29(5):71--78, SIGCOMM Computer Communication Review 29(5):71--78,
October 1999. October 1999.
[SCReAM-L4S] [SCReAM-L4S]
Johansson, I., "SCReAM", github repository; , Johansson, I., "SCReAM", GitHub repository; ,
<https://github.com/EricssonResearch/scream/blob/master/ <https://github.com/EricssonResearch/scream/blob/master/
README.md>. README.md>.
[sub-mss-prob] [sub-mss-prob]
Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion
Window for Small Round Trip Times", BT Technical Report Window for Small Round Trip Times", BT Technical Report
TR-TUB8-2015-002, May 2015, TR-TUB8-2015-002, May 2015,
<https://arxiv.org/abs/1904.07598>. <https://arxiv.org/abs/1904.07598>.
[TCP-CA] Jacobson, V. and M.J. Karels, "Congestion Avoidance and [TCP-CA] Jacobson, V. and M.J. Karels, "Congestion Avoidance and
Control", Lawrence Berkeley Labs Technical Report , Control", Lawrence Berkeley Labs Technical Report ,
November 1988, <http://ee.lbl.gov/papers/congavoid.pdf>. November 1988, <https://ee.lbl.gov/papers/congavoid.pdf>.
[TCPPrague] [TCPPrague]
Briscoe, B., "Notes: DCTCP evolution 'bar BoF': Tue 21 Jul Briscoe, B., "Notes: DCTCP evolution 'bar BoF': Tue 21 Jul
2015, 17:40, Prague", tcpprague mailing list archive , 2015, 17:40, Prague", tcpprague mailing list archive ,
July 2015, <https://www.ietf.org/mail- July 2015, <https://www.ietf.org/mail-
archive/web/tcpprague/current/msg00001.html>. archive/web/tcpprague/current/msg00001.html>.
[VCP] Xia, Y., Subramanian, L., Stoica, I., and S. Kalyanaraman, [VCP] Xia, Y., Subramanian, L., Stoica, I., and S. Kalyanaraman,
"One more bit is enough", Proc. SIGCOMM'05, ACM CCR "One more bit is enough", Proc. SIGCOMM'05, ACM CCR
35(4):37--48, 2005, 35(4):37--48, 2005,
<http://doi.acm.org/10.1145/1080091.1080098>. <https://doi.acm.org/10.1145/1080091.1080098>.
Appendix A. Rationale for the 'Prague L4S Requirements' Appendix A. Rationale for the 'Prague L4S Requirements'
This appendix is informative, not normative. It gives a list of This appendix is informative, not normative. It gives a list of
modifications to current scalable congestion controls so that they modifications to current scalable congestion controls so that they
can be deployed over the public Internet and coexist safely with can be deployed over the public Internet and coexist safely with
existing traffic. The list complements the normative requirements in existing traffic. The list complements the normative requirements in
Section 4 that a sender has to comply with before it can set the L4S Section 4 that a sender has to comply with before it can set the L4S
identifier in packets it sends into the Internet. As well as identifier in packets it sends into the Internet. As well as
rationale for safety improvements (the requirements in Section 4), rationale for safety improvements (the requirements in Section 4),
this appendix also includes preferable performance improvements this appendix also includes preferable performance improvements
(optimizations). (optimizations).
The requirements and recommendations in Section 4 have become know The requirements and recommendations in Section 4 have become known
as the Prague L4S Requirements, because they were originally as the Prague L4S Requirements, because they were originally
identified at an ad hoc meeting during IETF-94 in Prague [TCPPrague]. identified at an ad hoc meeting during IETF-94 in Prague [TCPPrague].
They were originally called the 'TCP Prague Requirements', but they They were originally called the 'TCP Prague Requirements', but they
are not solely applicable to TCP, so the name and wording have been are not solely applicable to TCP, so the name and wording have been
generalized for all transport protocols, and the name 'TCP Prague' is generalized for all transport protocols, and the name 'TCP Prague' is
now used for a specific implementation of the requirements. now used for a specific implementation of the requirements.
At the time of writing, DCTCP [RFC8257] is the most widely used At the time of writing, DCTCP [RFC8257] is the most widely used
scalable transport protocol. In its current form, DCTCP is specified scalable transport protocol. In its current form, DCTCP is specified
to be deployable only in controlled environments. Deploying it in to be deployable only in controlled environments. Deploying it in
skipping to change at page 55, line 9 skipping to change at page 55, line 35
occurred by counting reordered packets, _all_ networks will have occurred by counting reordered packets, _all_ networks will have
to keep reducing the time over which they keep packets in order. to keep reducing the time over which they keep packets in order.
If some link technologies keep the time within which reordering If some link technologies keep the time within which reordering
occurs roughly unchanged, then loss over these links, as perceived occurs roughly unchanged, then loss over these links, as perceived
by these hosts, will appear to continually rise over the years. by these hosts, will appear to continually rise over the years.
* In contrast, if all senders detect loss in units of time, the time * In contrast, if all senders detect loss in units of time, the time
over which the network has to keep packets in order stays roughly over which the network has to keep packets in order stays roughly
invariant. invariant.
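The contrast drawn in the two bullets above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of either draft revision: it computes how long the network may hold a packet out of order before a sender that counts duplicate ACKs would infer a loss. The threshold of 3 duplicate ACKs, the 1500-byte packet size, and the 1 ms time-based window are illustrative assumptions.

```python
DUPACK_THRESHOLD = 3      # assumed packet-count threshold (classic 3 dupACKs)
PKT_BYTES = 1500          # assumed packet size, for illustration only


def reorder_tolerance_s(rate_bps, pkt_bytes=PKT_BYTES,
                        threshold=DUPACK_THRESHOLD):
    """Time taken for `threshold` packets to pass at `rate_bps`.

    A packet-counting sender declares a loss once this many later
    packets have overtaken the missing one, so the network has to
    restore order within this time.
    """
    return threshold * pkt_bytes * 8 / rate_bps


def compare(rates_bps, time_window_s=0.001):
    """Return (rate, packet-count tolerance, time-based tolerance) rows.

    The time-based tolerance is a constant (assumed 1 ms here), so it
    does not shrink as the flow rate grows.
    """
    return [(r, reorder_tolerance_s(r), time_window_s) for r in rates_bps]


if __name__ == "__main__":
    for rate, by_count, by_time in compare([100e6, 1e9, 10e9]):
        print(f"{rate / 1e6:8.0f} Mb/s  count-based: {by_count * 1e6:8.2f} us"
              f"  time-based: {by_time * 1e6:8.2f} us")
```

Under these assumptions, each tenfold increase in flow rate cuts the packet-count tolerance by a factor of ten, whereas the time-based tolerance stays the same.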
Therefore hosts have an incentive to detect loss in time units (so as Therefore, hosts have an incentive to detect loss in time units (so
not to fool themselves too often into detecting losses when there are as not to fool themselves too often into detecting losses when there
none). And for hosts that are changing their congestion control are none). And for hosts that are changing their congestion control
implementation to L4S, there is no downside to including time-based implementation to L4S, there is no downside to including time-based
loss detection code in the change (loss recovery implemented in loss detection code in the change (loss recovery implemented in
hardware is an exception, covered later). Therefore requiring L4S hardware is an exception, covered later). Therefore, requiring L4S
hosts to detect loss in time-based units would not be a burden. hosts to detect loss in time-based units would not be a burden.
If the requirement in Section 4.3 were not placed on L4S hosts, even If the requirement in Section 4.3 were not placed on L4S hosts, even
though it would be no burden on hosts to comply, all networks would though it would be no burden on hosts to comply, all networks would
face unnecessary uncertainty over whether some L4S hosts might be face unnecessary uncertainty over whether some L4S hosts might be
detecting loss by counting packets. Then _all_ link technologies detecting loss by counting packets. Then _all_ link technologies
will have to unnecessarily keep reducing the time within which will have to unnecessarily keep reducing the time within which
reordering occurs. That is not a problem for some link technologies, reordering occurs. That is not a problem for some link technologies,
but it becomes increasingly challenging for other link technologies but it becomes increasingly challenging for other link technologies
to continue to scale, particularly those relying on channel bonding to continue to scale, particularly those relying on channel bonding
skipping to change at page 57, line 12 skipping to change at page 57, line 27
Detecting loss in time units also prevents the ACK-splitting attacks Detecting loss in time units also prevents the ACK-splitting attacks
described in [Savage-TCP]. described in [Savage-TCP].
A.2. Scalable Transport Protocol Optimizations A.2. Scalable Transport Protocol Optimizations
A.2.1. Setting ECT in Control Packets and Retransmissions A.2.1. Setting ECT in Control Packets and Retransmissions
Description: This item concerns TCP and its derivatives (e.g. SCTP) Description: This item concerns TCP and its derivatives (e.g. SCTP)
as well as RTP/RTCP [RFC6679]. The original specification of ECN for as well as RTP/RTCP [RFC6679]. The original specification of ECN for
TCP precluded the use of ECN on control packets and retransmissions. TCP precluded the use of ECN on control packets and retransmissions.
Similarly RFC 6679 precludes the use of ECT on RTCP datagrams, in Similarly, RFC 6679 precludes the use of ECT on RTCP datagrams, in
case the path changes after it has been checked for ECN traversal. case the path changes after it has been checked for ECN traversal.
To improve performance, scalable transport protocols ought to enable To improve performance, scalable transport protocols ought to enable
ECN at the IP layer in TCP control packets (SYN, SYN-ACK, pure ACKs, ECN at the IP layer in TCP control packets (SYN, SYN-ACK, pure ACKs,
etc.) and in retransmitted packets. The same is true for other etc.) and in retransmitted packets. The same is true for other
transports, e.g. SCTP, RTCP. transports, e.g. SCTP, RTCP.
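As a rough illustration of the difference described above, the sketch below contrasts a simplified reading of the original rule (ECT only on new data packets) with the behaviour suggested for scalable transports (ECT on control packets and retransmissions as well). The function name, the packet-kind labels and the two-way `scalable` switch are hypothetical. The sketch also ignores ECN negotiation, fallback when ECN traversal fails, and later updates to the original rule.

```python
# Hypothetical labels for the TCP packet kinds named in the text.
CONTROL_KINDS = {"SYN", "SYN-ACK", "PURE-ACK"}


def ecn_marking(kind, is_retransmission, scalable):
    """Return the ECN codepoint name a sender would set on a packet.

    kind              -- "DATA" or one of CONTROL_KINDS
    is_retransmission -- True if the packet repeats earlier data
    scalable          -- True for the behaviour suggested for scalable
                         transports, False for the simplified
                         original rule
    """
    if scalable:
        # Scalable transports ought to make every packet ECN-capable,
        # using the L4S identifier ECT(1).
        return "ECT(1)"
    if kind in CONTROL_KINDS or is_retransmission:
        # Simplified original rule: no ECN on these packets.
        return "Not-ECT"
    return "ECT(0)"


if __name__ == "__main__":
    for kind, rexmit in [("SYN", False), ("PURE-ACK", False),
                         ("DATA", False), ("DATA", True)]:
        print(kind, rexmit,
              ecn_marking(kind, rexmit, scalable=False),
              ecn_marking(kind, rexmit, scalable=True))
```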
Motivation (TCP): RFC 3168 prohibits the use of ECN on these types of Motivation (TCP): RFC 3168 prohibits the use of ECN on these types of
TCP packet, based on a number of arguments. This means these packets TCP packet, based on a number of arguments. This means these packets
are not protected from congestion loss by ECN, which considerably are not protected from congestion loss by ECN, which considerably
harms performance, particularly for short flows. harms performance, particularly for short flows.
skipping to change at page 57, line 45 skipping to change at page 58, line 15
A.2.2. Faster than Additive Increase A.2.2. Faster than Additive Increase
Description: It would improve performance if scalable congestion Description: It would improve performance if scalable congestion
controls did not limit their congestion window increase to the controls did not limit their congestion window increase to the
standard additive increase of 1 SMSS per round trip [RFC5681] during standard additive increase of 1 SMSS per round trip [RFC5681] during
congestion avoidance. The same is true for derivatives of TCP congestion avoidance. The same is true for derivatives of TCP
congestion control, including similar approaches used for real-time congestion control, including similar approaches used for real-time
media. media.
Motivation: As currently defined [RFC8257], DCTCP uses the Motivation: As currently defined [RFC8257], DCTCP uses the
traditional Reno additive increase in congestion avoidance phase. conventional Reno additive increase in congestion avoidance phase.
When the available capacity suddenly increases (e.g. when another When the available capacity suddenly increases (e.g. when another
flow finishes, or if radio capacity increases) it can take very many flow finishes, or if radio capacity increases) it can take very many
round trips to take advantage of the new capacity. TCP round trips to take advantage of the new capacity. TCP
Cubic [RFC8312] was designed to solve this problem, but as flow rates Cubic [RFC8312] was designed to solve this problem, but as flow rates
have continued to increase, the delay accelerating into available have continued to increase, the delay accelerating into available
capacity has become prohibitive. See, for instance, the examples in capacity has become prohibitive. See, for instance, the examples in
Section 5.1 of the L4S architecture [I-D.ietf-tsvwg-l4s-arch]. Even Section 5.1 of the L4S architecture [I-D.ietf-tsvwg-l4s-arch]. Even
when out of its Reno-compatibility mode, every 8x scaling of Cubic's when out of its Reno-compatibility mode, every 8x scaling of Cubic's
flow rate leads to 2x more acceleration delay. flow rate leads to 2x more acceleration delay.
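The 8x-to-2x relationship can be checked with a short calculation. This is a sketch under stated assumptions rather than text from either draft revision: it uses the Cubic constants C = 0.4 and beta_cubic = 0.7 and the expression K = cubic_root(W_max * (1 - beta_cubic) / C) from [RFC8312], and for comparison it assumes a Reno flow that halves its window and then regains 1 SMSS per round trip. Window sizes are in segments and are purely illustrative.

```python
C = 0.4            # Cubic scaling constant, as given in RFC 8312
BETA_CUBIC = 0.7   # Cubic multiplicative decrease factor, per RFC 8312


def cubic_k_seconds(w_max_segments):
    """Time for Cubic's window to grow back to w_max after a reduction."""
    return (w_max_segments * (1 - BETA_CUBIC) / C) ** (1.0 / 3.0)


def reno_recovery_rtts(w_max_segments):
    """Round trips for Reno to regain w_max after halving its window,
    growing by 1 segment per round trip (assumed halving response)."""
    return w_max_segments / 2


if __name__ == "__main__":
    for w in (1_000, 8_000, 64_000):
        print(f"W_max={w:6d} segments  Cubic K={cubic_k_seconds(w):6.2f} s"
              f"  Reno={reno_recovery_rtts(w):8.0f} round trips")
```

Because K grows with the cube root of the window, multiplying the window (and hence the flow rate at a fixed round-trip time) by 8 doubles K, while the Reno recovery time grows in direct proportion to the window.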
skipping to change at page 65, line 8 skipping to change at page 65, line 25
TCP authentication option (TCP-AO [RFC5925]), QUIC's end-to-end TCP authentication option (TCP-AO [RFC5925]), QUIC's end-to-end
protection [RFC9001] or end-to-end IPsec integrity protection protection [RFC9001] or end-to-end IPsec integrity protection
[RFC4303] can be used to detect any tampering with congestion [RFC4303] can be used to detect any tampering with congestion
feedback (whether malicious or accidental), respectively in TCP, feedback (whether malicious or accidental), respectively in TCP,
QUIC or any transport. TCP-AO covers the main TCP header and TCP QUIC or any transport. TCP-AO covers the main TCP header and TCP
options by default, but it is often too brittle to use on many options by default, but it is often too brittle to use on many
end-to-end paths, where middleboxes can make verification fail in end-to-end paths, where middleboxes can make verification fail in
their attempts to improve performance or security, e.g. by their attempts to improve performance or security, e.g. by
resegmentation or shifting the sequence space. resegmentation or shifting the sequence space.
At the time of writing, it is not common to protect the integrity of
congestion feedback, whether loss or Classic ECN. If this position
changes during the L4S experiment, one or more of the above
techniques might need to be developed and deployed.
C.2. Notification of Less Severe Congestion than CE C.2. Notification of Less Severe Congestion than CE
Various researchers have proposed to use ECT(1) as a less severe Various researchers have proposed to use ECT(1) as a less severe
congestion notification than CE, particularly to enable flows to fill congestion notification than CE, particularly to enable flows to fill
available capacity more quickly after an idle period, when another available capacity more quickly after an idle period, when another
flow departs or when a flow starts, e.g. VCP [VCP], Queue View flow departs or when a flow starts, e.g. VCP [VCP], Queue View
(QV) [QV]. (QV) [QV].
Before assigning ECT(1) as an identifier for L4S, we must carefully Before assigning ECT(1) as an identifier for L4S, we must carefully
consider whether it might be better to hold ECT(1) in reserve for consider whether it might be better to hold ECT(1) in reserve for
future standardisation of rapid flow acceleration, which is an future standardisation of rapid flow acceleration, which is an
important and enduring problem [RFC6077]. important and enduring problem [RFC6077].
Pre-Congestion Notification (PCN) is another scheme that assigns Pre-Congestion Notification (PCN) is another scheme that assigns
alternative semantics to the ECN field. It uses ECT(1) to signify a alternative semantics to the ECN field. It uses ECT(1) to signify a
less severe level of pre-congestion notification than CE [RFC6660]. less severe level of pre-congestion notification than CE [RFC6660].
However, the ECN field only takes on the PCN semantics if packets However, the ECN field only takes on the PCN semantics if packets
carry a Diffserv codepoint defined to indicate PCN marking within a carry a Diffserv codepoint defined to indicate PCN marking within a
controlled environment. PCN is required to be applied solely to the controlled environment. PCN is required to be applied solely to the
outer header of a tunnel across the controlled region in order not to outer header of a tunnel across the controlled region in order not to
interfere with any end-to-end use of the ECN field. Therefore a PCN interfere with any end-to-end use of the ECN field. Therefore, a PCN
region on the path would not interfere with the L4S service region on the path would not interfere with the L4S service
identifier defined in Section 2. identifier defined in Section 2.
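For readers less familiar with the bit layout, the following sketch shows how the 2-bit ECN field and the 6-bit Diffserv codepoint share the former IPv4 TOS / IPv6 Traffic Class octet, which is why a DSCP can scope alternative ECN semantics such as PCN. The codepoint values are those of the original ECN specification [RFC3168]. The helper names are hypothetical, and treating CE as well as ECT(1) as L4S is an assumption based on the default classifier described in the body of the specification, which is not shown in this excerpt.

```python
# ECN field codepoints (low-order two bits of the octet), per RFC 3168.
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11


def ecn_field(traffic_class):
    """Extract the 2-bit ECN field from the TOS / Traffic Class octet."""
    return traffic_class & 0x03


def dscp(traffic_class):
    """Extract the 6-bit Diffserv codepoint from the same octet."""
    return (traffic_class >> 2) & 0x3F


def is_l4s(traffic_class):
    """Assumed default classifier: ECT(1) and CE select L4S treatment."""
    return ecn_field(traffic_class) in (ECT_1, CE)


if __name__ == "__main__":
    for octet in (0x00, 0x01, 0x02, 0x03, 0xB9):
        print(f"octet=0x{octet:02X} dscp={dscp(octet):2d} "
              f"ecn={ecn_field(octet):02b} l4s={is_l4s(octet)}")
```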
Acknowledgements Acknowledgements
Thanks to Richard Scheffenegger, John Leslie, David Taeht, Jonathan Thanks to Richard Scheffenegger, John Leslie, David Taeht, Jonathan
Morton, Gorry Fairhurst, Michael Welzl, Mikael Abrahamsson and Andrew Morton, Gorry Fairhurst, Michael Welzl, Mikael Abrahamsson and Andrew
McGregor for the discussions that led to this specification. Ing-jyh McGregor for the discussions that led to this specification. Ing-jyh
(Inton) Tsang was a contributor to the early drafts of this document. (Inton) Tsang was a contributor to the early drafts of this document.
And thanks to Mikael Abrahamsson, Lloyd Wood, Nicolas Kuhn, Greg And thanks to Mikael Abrahamsson, Lloyd Wood, Nicolas Kuhn, Greg
White, Tom Henderson, David Black, Gorry Fairhurst, Brian Carpenter, White, Tom Henderson, David Black, Gorry Fairhurst, Brian Carpenter,
Jake Holland, Rod Grimes, Richard Scheffenegger, Sebastian Moeller, Jake Holland, Rod Grimes, Richard Scheffenegger, Sebastian Moeller,
Neal Cardwell, Praveen Balasubramanian, Reza Marandian Hagh, Pete Neal Cardwell, Praveen Balasubramanian, Reza Marandian Hagh, Pete
Heist, Stuart Cheshire, Vidhi Goel, Mirja Kuehlewind, Ermin Sakic and Heist, Stuart Cheshire, Vidhi Goel, Mirja Kuehlewind, Ermin Sakic and
Martin Duke for providing help and reviewing this draft and thanks to Martin Duke for providing help and reviewing this draft, and thanks
Ingemar Johansson for reviewing and providing substantial text. to Ingemar Johansson for reviewing and providing substantial text.
Thanks also to the area reviewers: Valery Smyslov, Ines Robles and Thanks also to the area reviewers: Valery Smyslov, Maria Ines Robles,
Bernard Aboba. Thanks to Sebastian Moeller for identifying the Bernard Aboba, Lars Eggert, Roman Danyliw and Eric Vyncke. Thanks to
interaction with VPN anti-replay and to Jonathan Morton for Sebastian Moeller for identifying the interaction with VPN anti-
identifying the attack based on this. Particular thanks to tsvwg replay and to Jonathan Morton for identifying the attack based on
chairs Gorry Fairhurst, David Black and Wes Eddy for patiently this. Particular thanks to tsvwg chairs Gorry Fairhurst, David Black
helping this and the other L4S drafts through the IETF process. and Wes Eddy for patiently helping this and the other L4S drafts
Appendix A listing the Prague L4S Requirements is based on text through the IETF process. Appendix A listing the Prague L4S
authored by Marcelo Bagnulo Braun that was originally an appendix to Requirements is based on text authored by Marcelo Bagnulo Braun that
[I-D.ietf-tsvwg-l4s-arch]. That text was in turn based on the was originally an appendix to [I-D.ietf-tsvwg-l4s-arch]. That text
collective output of the attendees listed in the minutes of a 'bar was in turn based on the collective output of the attendees listed in
BoF' on DCTCP Evolution during IETF-94 [TCPPrague]. the minutes of a 'bar BoF' on DCTCP Evolution during
IETF-94 [TCPPrague].
The authors' contributions were part-funded by the European Community The authors' contributions were part-funded by the European Community
under its Seventh Framework Programme through the Reducing Internet under its Seventh Framework Programme through the Reducing Internet
Transport Latency (RITE) project (ICT-317700). The contribution of Transport Latency (RITE) project (ICT-317700). The contribution of
Koen De Schepper was also part-funded by the 5Growth and DAEMON EU Koen De Schepper was also part-funded by the 5Growth and DAEMON EU
H2020 projects. Bob Briscoe was also funded partly by the Research H2020 projects. Bob Briscoe was also funded partly by the Research
Council of Norway through the TimeIn project, partly by CableLabs and Council of Norway through the TimeIn project, partly by CableLabs and
partly by the Comcast Innovation Fund. The views expressed here are partly by the Comcast Innovation Fund. The views expressed here are
solely those of the authors. solely those of the authors.
skipping to change at page 66, line 19 skipping to change at page 67, line 4
Authors' Addresses Authors' Addresses
Koen De Schepper Koen De Schepper
Nokia Bell Labs Nokia Bell Labs
Antwerp Antwerp
Belgium Belgium
Email: koen.de_schepper@nokia.com Email: koen.de_schepper@nokia.com
URI: https://www.bell-labs.com/usr/koen.de_schepper URI: https://www.bell-labs.com/about/researcher-profiles/
koende_schepper/
Bob Briscoe (editor) Bob Briscoe (editor)
Independent Independent
United Kingdom United Kingdom
Email: ietf@bobbriscoe.net Email: ietf@bobbriscoe.net
URI: http://bobbriscoe.net/ URI: https://bobbriscoe.net/
 End of changes. 93 change blocks. 
168 lines changed or deleted 191 lines changed or added
