| draft-ietf-tsvwg-aqm-dualq-coupled-25g.txt | draft-ietf-tsvwg-aqm-dualq-coupled-25h.txt | |||
|---|---|---|---|---|
| Transport Area working group (tsvwg) K. De Schepper | Transport Area working group (tsvwg) K. De Schepper | |||
| Internet-Draft Nokia Bell Labs | Internet-Draft Nokia Bell Labs | |||
| Intended status: Experimental B. Briscoe, Ed. | Intended status: Experimental B. Briscoe, Ed. | |||
| Expires: 28 February 2023 Independent | Expires: 1 March 2023 Independent | |||
| G. White | G. White | |||
| CableLabs | CableLabs | |||
| 27 August 2022 | 28 August 2022 | |||
| DualQ Coupled AQMs for Low Latency, Low Loss and Scalable Throughput | DualQ Coupled AQMs for Low Latency, Low Loss and Scalable Throughput | |||
| (L4S) | (L4S) | |||
| draft-ietf-tsvwg-aqm-dualq-coupled-25 | draft-ietf-tsvwg-aqm-dualq-coupled-25 | |||
| Abstract | Abstract | |||
| This specification defines a framework for coupling the Active Queue | This specification defines a framework for coupling the Active Queue | |||
| Management (AQM) algorithms in two queues intended for flows with | Management (AQM) algorithms in two queues intended for flows with | |||
| different responses to congestion. This provides a way for the | different responses to congestion. This provides a way for the | |||
| skipping to change at page 1, line 49 | skipping to change at page 1, line 49 | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on 28 February 2023. | This Internet-Draft will expire on 1 March 2023. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2022 IETF Trust and the persons identified as the | Copyright (c) 2022 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents (https://trustee.ietf.org/ | |||
| license-info) in effect on the date of publication of this document. | license-info) in effect on the date of publication of this document. | |||
| Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
| skipping to change at page 4, line 33 | skipping to change at page 4, line 33 | |||
| to induce an average queue that roughly doubles the base RTT, | to induce an average queue that roughly doubles the base RTT, | |||
| adding 5-15 ms of queuing on average (cf. 500 microseconds with | adding 5-15 ms of queuing on average (cf. 500 microseconds with | |||
| L4S for the same mix of long-running and web traffic). However, | L4S for the same mix of long-running and web traffic). However, | |||
| for many applications low delay is not useful unless it is | for many applications low delay is not useful unless it is | |||
| consistently low. With these AQMs, 99th percentile queuing delay | consistently low. With these AQMs, 99th percentile queuing delay | |||
| is 20-30 ms (cf. 2 ms with the same traffic over L4S). | is 20-30 ms (cf. 2 ms with the same traffic over L4S). | |||
| * Similarly, recent research into using e2e congestion control | * Similarly, recent research into using e2e congestion control | |||
| without needing an AQM in the network (e.g. BBR | without needing an AQM in the network (e.g. BBR | |||
| [I-D.cardwell-iccrg-bbr-congestion-control]) seems to have hit a | [I-D.cardwell-iccrg-bbr-congestion-control]) seems to have hit a | |||
| similar lower limit to queuing delay of about 20ms on average but | similar lower limit to queuing delay of about 20ms on average, but | |||
| there are also regular 25ms delay spikes due to bandwidth probes | there are also regular 25ms delay spikes due to bandwidth probes | |||
| and 60ms spikes due to flow-starts. | and 60ms spikes due to flow-starts. | |||
| L4S learns from the experience of Data Center TCP [RFC8257], which | L4S learns from the experience of Data Center TCP [RFC8257], which | |||
| shows the power of complementary changes both in the network and on | shows the power of complementary changes both in the network and on | |||
| end-systems. DCTCP teaches us that two small but radical changes to | end-systems. DCTCP teaches us that two small but radical changes to | |||
| congestion control are needed to cut the two major outstanding causes | congestion control are needed to cut the two major outstanding causes | |||
| of queuing delay variability: | of queuing delay variability: | |||
| 1. Far smaller rate variations (sawteeth) than Reno-friendly | 1. Far smaller rate variations (sawteeth) than Reno-friendly | |||
| skipping to change at page 6, line 43 | skipping to change at page 6, line 43 | |||
| intervention, applications can exploit this new network capability as | intervention, applications can exploit this new network capability as | |||
| their operating systems migrate to Scalable congestion controls, | their operating systems migrate to Scalable congestion controls, | |||
| which can then evolve _while_ their benefits are being enjoyed by | which can then evolve _while_ their benefits are being enjoyed by | |||
| everyone on the Internet. | everyone on the Internet. | |||
| The DualQ Coupled AQM framework can incorporate any AQM designed for | The DualQ Coupled AQM framework can incorporate any AQM designed for | |||
| a single queue that generates a statistical or deterministic mark/ | a single queue that generates a statistical or deterministic mark/ | |||
| drop probability driven by the queue dynamics. Pseudocode examples | drop probability driven by the queue dynamics. Pseudocode examples | |||
| of two different DualQ Coupled AQMs are given in the appendices. In | of two different DualQ Coupled AQMs are given in the appendices. In | |||
| many cases the framework simplifies the basic control algorithm, and | many cases the framework simplifies the basic control algorithm, and | |||
| requires little extra processing. Therefore it is believed the | requires little extra processing. Therefore, it is believed the | |||
| Coupled AQM would be applicable and easy to deploy in all types of | Coupled AQM would be applicable and easy to deploy in all types of | |||
| buffers; buffers in cost-reduced mass-market residential equipment; | buffers; buffers in cost-reduced mass-market residential equipment; | |||
| buffers in end-system stacks; buffers in carrier-scale equipment | buffers in end-system stacks; buffers in carrier-scale equipment | |||
| including remote access servers, routers, firewalls and Ethernet | including remote access servers, routers, firewalls and Ethernet | |||
| switches; buffers in network interface cards, buffers in virtualized | switches; buffers in network interface cards, buffers in virtualized | |||
| network appliances, hypervisors, and so on. | network appliances, hypervisors, and so on. | |||
| For the public Internet, nearly all the benefit will typically be | For the public Internet, nearly all the benefit will typically be | |||
| achieved by deploying the Coupled AQM into either end of the access | achieved by deploying the Coupled AQM into either end of the access | |||
| link between a 'site' and the Internet, which is invariably the | link between a 'site' and the Internet, which is invariably the | |||
| skipping to change at page 7, line 45 | skipping to change at page 7, line 45 | |||
| The main results have been validated independently when using the | The main results have been validated independently when using the | |||
| Prague congestion control [Boru20] (experiments are run using Prague | Prague congestion control [Boru20] (experiments are run using Prague | |||
| and DCTCP, but only the former are relevant for validation, because | and DCTCP, but only the former are relevant for validation, because | |||
| Prague fixes a number of problems with the Linux DCTCP code that make | Prague fixes a number of problems with the Linux DCTCP code that make | |||
| it unsuitable for the public Internet). | it unsuitable for the public Internet). | |||
| 1.3. Terminology | 1.3. Terminology | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in [RFC2119] when, and | document are to be interpreted as described in [RFC2119] [RFC8174] | |||
| only when, they appear in all capitals, as shown here. | when, and only when, they appear in all capitals, as shown here. | |||
| The DualQ Coupled AQM uses two queues for two services. Each of the | The DualQ Coupled AQM uses two queues for two services. Each of the | |||
| following terms identifies both the service and the queue that | following terms identifies both the service and the queue that | |||
| provides the service: | provides the service: | |||
| Classic service/queue: The Classic service is intended for all the | Classic service/queue: The Classic service is intended for all the | |||
| congestion control behaviours that co-exist with Reno [RFC5681] | congestion control behaviours that co-exist with Reno [RFC5681] | |||
| (e.g. Reno itself, Cubic [RFC8312], TFRC [RFC5348]). | (e.g. Reno itself, Cubic [RFC8312], TFRC [RFC5348]). | |||
| Low-Latency, Low-Loss Scalable throughput (L4S) service/queue: The | Low-Latency, Low-Loss Scalable throughput (L4S) service/queue: The | |||
| 'L4S' service is intended for traffic from scalable congestion | 'L4S' service is intended for traffic from scalable congestion | |||
| control algorithms, such as TCP Prague | control algorithms, such as TCP Prague | |||
| [I-D.briscoe-iccrg-prague-congestion-control], which was derived | [I-D.briscoe-iccrg-prague-congestion-control], which was derived | |||
| from Data Center TCP [RFC8257]. The L4S service is for more | from Data Center TCP [RFC8257]. The L4S service is for more | |||
| general traffic than just TCP Prague -- it allows the set of | general traffic than just TCP Prague -- it allows the set of | |||
| congestion controls with similar scaling properties to Prague to | congestion controls with similar scaling properties to Prague to | |||
| evolve, such as the examples listed earlier (Relentless, SCReAM, | evolve, such as the examples of Scalable congestion controls | |||
| etc.). | listed below (Relentless, SCReAM, etc.). | |||
| Classic Congestion Control: A congestion control behaviour that can | Classic Congestion Control: A congestion control behaviour that can | |||
| co-exist with standard TCP Reno [RFC5681] without causing | co-exist with standard TCP Reno [RFC5681] without causing | |||
| significantly negative impact on its flow rate [RFC5033]. With | significantly negative impact on its flow rate [RFC5033]. With | |||
| Classic congestion controls, such as Reno or Cubic, because flow | Classic congestion controls, such as Reno or Cubic, because flow | |||
| rate has scaled since TCP congestion control was first designed in | rate has scaled since TCP congestion control was first designed in | |||
| 1988, it now takes hundreds of round trips (and growing) to | 1988, it now takes hundreds of round trips (and growing) to | |||
| recover after a congestion signal (whether a loss or an ECN mark) | recover after a congestion signal (whether a loss or an ECN mark) | |||
| as shown in the examples in section 5.1 of the L4S | as shown in the examples in section 5.1 of the L4S | |||
| architecture [I-D.ietf-tsvwg-l4s-arch] and in [RFC3649]. | architecture [I-D.ietf-tsvwg-l4s-arch] and in [RFC3649]. | |||
| Therefore control of queuing and utilization becomes very slack, | Therefore, control of queuing and utilization becomes very slack, | |||
| and the slightest disturbances (e.g. from new flows starting) | and the slightest disturbances (e.g. from new flows starting) | |||
| prevent a high rate from being attained. | prevent a high rate from being attained. | |||
| Scalable Congestion Control: A congestion control where the average | Scalable Congestion Control: A congestion control where the average | |||
| time from one congestion signal to the next (the recovery time) | time from one congestion signal to the next (the recovery time) | |||
| remains invariant as the flow rate scales, all other factors being | remains invariant as the flow rate scales, all other factors being | |||
| equal. This maintains the same degree of control over queueing | equal. This maintains the same degree of control over queueing | |||
| and utilization whatever the flow rate, as well as ensuring that | and utilization whatever the flow rate, as well as ensuring that | |||
| high throughput is robust to disturbances. For instance, DCTCP | high throughput is robust to disturbances. For instance, DCTCP | |||
| averages 2 congestion signals per round-trip whatever the flow | averages 2 congestion signals per round-trip whatever the flow | |||
| rate, as do other recently developed scalable congestion controls, | rate, as do other recently developed scalable congestion controls, | |||
| e.g. Relentless TCP [Mathis09], TCP Prague | e.g. Relentless TCP [I-D.mathis-iccrg-relentless-tcp], TCP Prague | |||
| [I-D.briscoe-iccrg-prague-congestion-control], [PragueLinux], | [I-D.briscoe-iccrg-prague-congestion-control], [PragueLinux], | |||
| BBRv2 [BBRv2], [I-D.cardwell-iccrg-bbr-congestion-control] and the | BBRv2 [BBRv2], [I-D.cardwell-iccrg-bbr-congestion-control] and the | |||
| L4S variant of SCReAM for real-time media [SCReAM], [RFC8298]. | L4S variant of SCReAM for real-time media [SCReAM], [RFC8298]. | |||
| For the public Internet a Scalable transport has to comply with | For the public Internet a Scalable transport has to comply with | |||
| the requirements in Section 4 of [I-D.ietf-tsvwg-ecn-l4s-id] | the requirements in Section 4 of [I-D.ietf-tsvwg-ecn-l4s-id] | |||
| (aka. the 'Prague L4S requirements'). | (aka. the 'Prague L4S requirements'). | |||
| C: Abbreviation for Classic, e.g. when used as a subscript. | C: Abbreviation for Classic, e.g. when used as a subscript. | |||
| L: Abbreviation for L4S, e.g. when used as a subscript. | L: Abbreviation for L4S, e.g. when used as a subscript. | |||
| The terms Classic or L4S can also qualify other nouns, such as | The terms Classic or L4S can also qualify other nouns, such as | |||
| 'codepoint', 'identifier', 'classification', 'packet', 'flow'. | 'codepoint', 'identifier', 'classification', 'packet', 'flow'. | |||
| For example: an L4S packet means a packet with an L4S identifier | For example: an L4S packet means a packet with an L4S identifier | |||
| sent from an L4S congestion control. | sent from an L4S congestion control. | |||
| Both Classic and L4S services can cope with a proportion of | Both Classic and L4S services can cope with a proportion of | |||
| unresponsive or less-responsive traffic as well, but in the L4S | unresponsive or less-responsive traffic as well, but in the L4S | |||
| case its rate has to be smooth enough or low enough not to build a | case its rate has to be smooth enough or low enough not to build a | |||
| queue (e.g. DNS, VoIP, game sync datagrams, etc). The DualQ | queue (e.g. DNS, VoIP, game sync datagrams, etc.). The DualQ | |||
| Coupled AQM behaviour is defined to be similar to a single FIFO | Coupled AQM behaviour is defined to be similar to a single FIFO | |||
| queue with respect to unresponsive and overload traffic. | queue with respect to unresponsive and overload traffic. | |||
| Reno-friendly: The subset of Classic traffic that is friendly to the | Reno-friendly: The subset of Classic traffic that is friendly to the | |||
| standard Reno congestion control defined for TCP in [RFC5681]. | standard Reno congestion control defined for TCP in [RFC5681]. | |||
| Reno-friendly is used in place of 'TCP-friendly', given the latter | Reno-friendly is used in place of 'TCP-friendly', given the latter | |||
| has become imprecise, because the TCP protocol is now used with so | has become imprecise, because the TCP protocol is now used with so | |||
| many different congestion control behaviours, and Reno is used in | many different congestion control behaviours, and Reno is used in | |||
| non-TCP transports such as QUIC. | non-TCP transports such as QUIC. | |||
| skipping to change at page 9, line 40 | skipping to change at page 9, line 40 | |||
| ECN field are unchanged from those defined in [RFC3168]: Not ECT, | ECN field are unchanged from those defined in [RFC3168]: Not ECT, | |||
| ECT(0), ECT(1) and CE, where ECT stands for ECN-Capable Transport | ECT(0), ECT(1) and CE, where ECT stands for ECN-Capable Transport | |||
| and CE stands for Congestion Experienced. A packet marked with | and CE stands for Congestion Experienced. A packet marked with | |||
| the CE codepoint is termed 'ECN-marked' or sometimes just 'marked' | the CE codepoint is termed 'ECN-marked' or sometimes just 'marked' | |||
| where the context makes ECN obvious. | where the context makes ECN obvious. | |||
| 1.4. Features | 1.4. Features | |||
| The AQM couples marking and/or dropping from the Classic queue to the | The AQM couples marking and/or dropping from the Classic queue to the | |||
| L4S queue in such a way that a flow will get roughly the same | L4S queue in such a way that a flow will get roughly the same | |||
| throughput whichever it uses. Therefore both queues can feed into | throughput whichever it uses. Therefore, both queues can feed into | |||
| the full capacity of a link and no rates need to be configured for | the full capacity of a link and no rates need to be configured for | |||
| the queues. The L4S queue enables Scalable congestion controls like | the queues. The L4S queue enables Scalable congestion controls like | |||
| DCTCP or TCP Prague to give very low and predictably low latency, | DCTCP or TCP Prague to give very low and predictably low latency, | |||
| without compromising the performance of competing 'Classic' Internet | without compromising the performance of competing 'Classic' Internet | |||
| traffic. | traffic. | |||
| Thousands of tests have been conducted in a typical fixed residential | Thousands of tests have been conducted in a typical fixed residential | |||
| broadband setting. Experiments used a range of base round trip | broadband setting. Experiments used a range of base round trip | |||
| delays up to 100ms and link rates up to 200 Mb/s between the data | delays up to 100ms and link rates up to 200 Mb/s between the data | |||
| centre and home network, with varying amounts of background traffic | centre and home network, with varying amounts of background traffic | |||
| skipping to change at page 10, line 16 | skipping to change at page 10, line 16 | |||
| the extensive experiments are available [DualPI2Linux], [PI2], | the extensive experiments are available [DualPI2Linux], [PI2], | |||
| [DCttH19]. Subjective testing using very demanding high bandwidth | [DCttH19]. Subjective testing using very demanding high bandwidth | |||
| low latency applications over a single shared access link is also | low latency applications over a single shared access link is also | |||
| described in [L4Sdemo16] and summarized in the section about | described in [L4Sdemo16] and summarized in the section about | |||
| applications in the L4S architecture [I-D.ietf-tsvwg-l4s-arch]. | applications in the L4S architecture [I-D.ietf-tsvwg-l4s-arch]. | |||
| In all these experiments, the host was connected to the home network | In all these experiments, the host was connected to the home network | |||
| by fixed Ethernet, in order to quantify the queuing delay that can be | by fixed Ethernet, in order to quantify the queuing delay that can be | |||
| achieved by a user who cares about delay. It should be emphasized | achieved by a user who cares about delay. It should be emphasized | |||
| that L4S support at the bottleneck link cannot 'undelay' bursts | that L4S support at the bottleneck link cannot 'undelay' bursts | |||
| introduced by another link on the path, for instance by legacy WiFi | introduced by another link on the path, for instance by legacy Wi-Fi | |||
| equipment. However, if L4S support is added to the queue feeding the | equipment. However, if L4S support is added to the queue feeding the | |||
| _outgoing_ WAN link of a home gateway, it would be counterproductive | _outgoing_ WAN link of a home gateway, it would be counterproductive | |||
| not to also reduce the burstiness of the _incoming_ WiFi. Also, | not to also reduce the burstiness of the _incoming_ Wi-Fi. Also, | |||
| trials of WiFi equipment with an L4S DualQ Coupled AQM on the | trials of Wi-Fi equipment with an L4S DualQ Coupled AQM on the | |||
| _outgoing_ WiFi interface are in progress, and early results of an | _outgoing_ Wi-Fi interface are in progress, and early results of an | |||
| L4S DualQ Coupled AQM in a 5G radio access network testbed with | L4S DualQ Coupled AQM in a 5G radio access network testbed with | |||
| emulated outdoor cell edge radio fading are given in [L4S_5G]. | emulated outdoor cell edge radio fading are given in [L4S_5G]. | |||
| Unlike Diffserv Expedited Forwarding, the L4S queue does not have to | Unlike Diffserv Expedited Forwarding, the L4S queue does not have to | |||
| be limited to a small proportion of the link capacity in order to | be limited to a small proportion of the link capacity in order to | |||
| achieve low delay. The L4S queue can be filled with a heavy load of | achieve low delay. The L4S queue can be filled with a heavy load of | |||
| capacity-seeking flows (TCP Prague etc.) and still achieve low delay. | capacity-seeking flows (TCP Prague etc.) and still achieve low delay. | |||
| The L4S queue does not rely on the presence of other traffic in the | The L4S queue does not rely on the presence of other traffic in the | |||
| Classic queue that can be 'overtaken'. It gives low latency to L4S | Classic queue that can be 'overtaken'. It gives low latency to L4S | |||
| traffic whether or not there is Classic traffic. The tail latency of | traffic whether or not there is Classic traffic. The tail latency of | |||
| skipping to change at page 12, line 33 | skipping to change at page 12, line 33 | |||
| the form: | the form: | |||
| p_C = ( p_CL / k )^2 (1) | p_C = ( p_CL / k )^2 (1) | |||
| where k is the constant of proportionality, which is termed the | where k is the constant of proportionality, which is termed the | |||
| coupling factor. | coupling factor. | |||
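
As a rough numeric illustration of equation (1), the Python sketch below computes the Classic probability p_C from the coupled L4S probability p_CL. The function name `classic_probability`, the input checks, the clamp to 1.0 and the example value k = 2 are illustrative assumptions; none of them is taken from the draft text shown in this diff.

```python
def classic_probability(p_cl: float, k: float) -> float:
    """Equation (1): p_C = (p_CL / k)^2.

    p_cl is the coupled L4S marking probability and k is the coupling
    factor. The validation and the clamp are assumptions added here.
    """
    if k <= 0:
        raise ValueError("coupling factor k must be positive")
    if not 0.0 <= p_cl <= 1.0:
        raise ValueError("p_cl must be a probability in [0, 1]")
    # Clamp so the result stays a valid probability when k < 1.
    return min((p_cl / k) ** 2, 1.0)


if __name__ == "__main__":
    # With k = 2 (an example value), 10% coupled L4S marking maps to
    # roughly 0.25% Classic drop or marking.
    print(classic_probability(0.10, 2.0))
```

Because of the squaring, the Classic probability stays much smaller than the coupled L4S probability across the normal operating range.
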
| 2.2. Dual Queue | 2.2. Dual Queue | |||
| Classic traffic needs to build a large queue to prevent under- | Classic traffic needs to build a large queue to prevent under- | |||
| utilization. Therefore a separate queue is provided for L4S traffic, | utilization. Therefore, a separate queue is provided for L4S | |||
| and it is scheduled with priority over the Classic queue. Priority | traffic, and it is scheduled with priority over the Classic queue. | |||
| is conditional to prevent starvation of Classic traffic in certain | Priority is conditional to prevent starvation of Classic traffic in | |||
| conditions (see Section 2.4). | certain conditions (see Section 2.4). | |||
| Nonetheless, coupled marking ensures that giving priority to L4S | Nonetheless, coupled marking ensures that giving priority to L4S | |||
| traffic still leaves the right amount of spare scheduling time for | traffic still leaves the right amount of spare scheduling time for | |||
| Classic flows to each get equivalent throughput to DCTCP flows (all | Classic flows to each get equivalent throughput to DCTCP flows (all | |||
| other factors such as RTT being equal). | other factors such as RTT being equal). | |||
| 2.3. Traffic Classification | 2.3. Traffic Classification | |||
| Both the Coupled AQM and DualQ mechanisms need an identifier to | Both the Coupled AQM and DualQ mechanisms need an identifier to | |||
| distinguish L4S (L) and Classic (C) packets. Then the coupling | distinguish L4S (L) and Classic (C) packets. Then the coupling | |||
| skipping to change at page 14, line 34 | skipping to change at page 14, line 34 | |||
| p_L = max(p'_L, p_CL), (4) | p_L = max(p'_L, p_CL), (4) | |||
| which has also been found to work very well in practice. | which has also been found to work very well in practice. | |||
| The two transformations of p' in equations (2) and (3) implement the | The two transformations of p' in equations (2) and (3) implement the | |||
| required coupling given in equation (1) earlier. | required coupling given in equation (1) earlier. | |||
| The constant of proportionality or coupling factor, k, in equation | The constant of proportionality or coupling factor, k, in equation | |||
| (1) determines the ratio between the congestion probabilities (loss | (1) determines the ratio between the congestion probabilities (loss | |||
| or marking) experienced by L4S and Classic traffic. Thus k | or marking) experienced by L4S and Classic traffic. Thus, k | |||
| indirectly determines the ratio between L4S and Classic flow rates, | indirectly determines the ratio between L4S and Classic flow rates, | |||
| because flows (assuming they are responsive) adjust their rate in | because flows (assuming they are responsive) adjust their rate in | |||
| response to congestion probability. Appendix C.2 gives guidance on | response to congestion probability. Appendix C.2 gives guidance on | |||
| the choice of k and its effect on relative flow rates. | the choice of k and its effect on relative flow rates. | |||
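
The relationships among equations (1) to (4) can be sketched in a few lines of Python. Equations (2) and (3) fall outside the hunks shown in this diff; the sketch assumes they take the form p_C = (p')^2 and p_CL = k * p', which is the pair of transformations that makes equation (1) hold. The function name `couple`, its parameter names, the clamp to 1.0 and the example values are hypothetical.

```python
def couple(p_prime: float, p_prime_l: float, k: float):
    """Derive the Classic and L4S probabilities from the base AQM output.

    p_prime   -- internal base AQM probability p' (assumed in [0, 1])
    p_prime_l -- native L4S AQM probability p'_L
    k         -- coupling factor
    Returns (p_C, p_CL, p_L).
    """
    p_c = p_prime ** 2               # assumed form of equation (2)
    p_cl = min(k * p_prime, 1.0)     # assumed form of equation (3), clamped
    p_l = max(p_prime_l, p_cl)       # equation (4)
    return p_c, p_cl, p_l


if __name__ == "__main__":
    k = 2.0                          # example value only
    p_c, p_cl, p_l = couple(p_prime=0.05, p_prime_l=0.02, k=k)
    # Equation (1) should hold while the clamp is inactive.
    assert abs(p_c - (p_cl / k) ** 2) < 1e-12
    print(p_c, p_cl, p_l)
```

In this example the coupled probability exceeds the native L4S probability, so the Classic queue drives L4S marking. When the native L4S probability is the larger of the two, equation (4) selects it instead.
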
| _________ | _________ | |||
| | | ,------. | | | ,------. | |||
| L4S (L) queue | |===>| ECN | | L4S (L) queue | |===>| ECN | | |||
| ,'| _______|_| |marker|\ | ,'| _______|_| |marker|\ | |||
| <' | | `------'\\ | <' | | `------'\\ | |||
| skipping to change at page 15, line 43 | skipping to change at page 15, line 43 | |||
| forwards their packets to the link. Even though the scheduler gives | forwards their packets to the link. Even though the scheduler gives | |||
| priority to the L queue, it is not as strong as the coupling from the | priority to the L queue, it is not as strong as the coupling from the | |||
| C queue. This is because, as the C queue grows, the base AQM applies | C queue. This is because, as the C queue grows, the base AQM applies | |||
| more congestion signals to L traffic (as well as C). As L flows | more congestion signals to L traffic (as well as C). As L flows | |||
| reduce their rate in response, they use less than the scheduling | reduce their rate in response, they use less than the scheduling | |||
| share for L traffic. So, because the scheduler is work preserving, | share for L traffic. So, because the scheduler is work preserving, | |||
| it schedules any C traffic in the gaps. | it schedules any C traffic in the gaps. | |||
| Giving priority to the L queue has the benefit of very low L queue | Giving priority to the L queue has the benefit of very low L queue | |||
| delay, because the L queue is kept empty whenever L traffic is | delay, because the L queue is kept empty whenever L traffic is | |||
| controlled by the coupling. Also there only has to be a coupling in | controlled by the coupling. Also, there only has to be a coupling in | |||
| one direction - from Classic to L4S. Priority has to be conditional | one direction - from Classic to L4S. Priority has to be conditional | |||
| in some way to prevent the C queue being starved in the short-term | in some way to prevent the C queue being starved in the short-term | |||
| (see Section 4.2.2) to give C traffic a means to push in, as | (see Section 4.2.2) to give C traffic a means to push in, as | |||
| explained next. With normal responsive L traffic, the coupled ECN | explained next. With normal responsive L traffic, the coupled ECN | |||
| marking gives C traffic the ability to push back against even strict | marking gives C traffic the ability to push back against even strict | |||
| priority, by congestion marking the L traffic to make it yield some | priority, by congestion marking the L traffic to make it yield some | |||
| space. However, if there is just a small finite set of C packets | space. However, if there is just a small finite set of C packets | |||
| (e.g. a DNS request or an initial window of data) some Classic AQMs | (e.g. a DNS request or an initial window of data) some Classic AQMs | |||
| will not induce enough ECN marking in the L queue, no matter how long | will not induce enough ECN marking in the L queue, no matter how long | |||
| the small set of C packets waits. Then, if the L queue happens to | the small set of C packets waits. Then, if the L queue happens to | |||
| skipping to change at page 16, line 27 | skipping to change at page 16, line 27 | |||
| DualPI2 uses a Proportional-Integral (PI) controller as the Base AQM. | DualPI2 uses a Proportional-Integral (PI) controller as the Base AQM. | |||
| Indeed, this Base AQM with just the squared output and no L4S queue | Indeed, this Base AQM with just the squared output and no L4S queue | |||
| can be used as a drop-in replacement for PIE [RFC8033], in which case | can be used as a drop-in replacement for PIE [RFC8033], in which case | |||
| it is just called PI2 [PI2]. PI2 is a principled simplification of | it is just called PI2 [PI2]. PI2 is a principled simplification of | |||
| PIE that is both more responsive and more stable in the face of | PIE that is both more responsive and more stable in the face of | |||
| dynamically varying load. | dynamically varying load. | |||
| Curvy RED is derived from RED [RFC2309], except its configuration | Curvy RED is derived from RED [RFC2309], except its configuration | |||
| parameters are delay-based to make them insensitive to link rate and | parameters are delay-based to make them insensitive to link rate and | |||
| it requires less operations per packet than RED. However, DualPI2 is | it requires fewer operations per packet than RED. However, DualPI2 | |||
| more responsive and stable over a wider range of RTTs than Curvy RED. | is more responsive and stable over a wider range of RTTs than Curvy | |||
| As a consequence, at the time of writing, DualPI2 has attracted more | RED. As a consequence, at the time of writing, DualPI2 has attracted | |||
| development and evaluation attention than Curvy RED, leaving the | more development and evaluation attention than Curvy RED, leaving the | |||
| Curvy RED design not so fully evaluated. | Curvy RED design not so fully evaluated. | |||
| Both AQMs regulate their queue against targets configured in units of | Both AQMs regulate their queue against targets configured in units of | |||
| time rather than bytes. As already explained, this ensures | time rather than bytes. As already explained, this ensures | |||
| configuration can be invariant for different drain rates. With AQMs | configuration can be invariant for different drain rates. With AQMs | |||
| in a dualQ structure this is particularly important because the drain | in a dualQ structure this is particularly important because the drain | |||
| rate of each queue can vary rapidly as flows for the two queues | rate of each queue can vary rapidly as flows for the two queues | |||
| arrive and depart, even if the combined link rate is constant. | arrive and depart, even if the combined link rate is constant. | |||
| It would be possible to control the queues with other alternative | It would be possible to control the queues with other alternative | |||
| skipping to change at page 21, line 36 ¶ | skipping to change at page 21, line 36 ¶ | |||
| two will measure proactive AQM discard; | two will measure proactive AQM discard; | |||
| * ECN packets marked, non-ECN packets dropped, ECN packets dropped, | * ECN packets marked, non-ECN packets dropped, ECN packets dropped, | |||
| which can be combined with the three total packet counts above to | which can be combined with the three total packet counts above to | |||
| calculate marking and dropping probabilities; | calculate marking and dropping probabilities; | |||
| * Queue delay (not including serialization delay of the head packet | * Queue delay (not including serialization delay of the head packet | |||
| or medium acquisition delay) - see further notes below. | or medium acquisition delay) - see further notes below. | |||
| Unlike the other statistics, queue delay cannot be captured in a | Unlike the other statistics, queue delay cannot be captured in a | |||
| simple accumulating counter. Therefore the type of queue delay | simple accumulating counter. Therefore, the type of queue delay | |||
| statistics produced (mean, percentiles, etc.) will depend on | statistics produced (mean, percentiles, etc.) will depend on | |||
| implementation constraints. To facilitate comparative evaluation | implementation constraints. To facilitate comparative evaluation | |||
| of different implementations and approaches, an implementation | of different implementations and approaches, an implementation | |||
| SHOULD allow mean and 99th percentile queue delay to be derived | SHOULD allow mean and 99th percentile queue delay to be derived | |||
| (per queue per sample interval). A relatively simple way to do | (per queue per sample interval). A relatively simple way to do | |||
| this would be to store a coarse-grained histogram of queue delay. | this would be to store a coarse-grained histogram of queue delay. | |||
| This could be done with a small number of bins with configurable | This could be done with a small number of bins with configurable | |||
| edges that represent contiguous ranges of queue delay. Then, over | edges that represent contiguous ranges of queue delay. Then, over | |||
| a sample interval, each bin would accumulate a count of the number | a sample interval, each bin would accumulate a count of the number | |||
| of packets that had fallen within each range. The maximum queue | of packets that had fallen within each range. The maximum queue | |||
| skipping to change at page 22, line 16 ¶ | skipping to change at page 22, line 16 ¶ | |||
| An experimental DualQ Coupled AQM SHOULD asynchronously report the | An experimental DualQ Coupled AQM SHOULD asynchronously report the | |||
| following data about anomalous conditions: | following data about anomalous conditions: | |||
| * Start-time and duration of overload state. | * Start-time and duration of overload state. | |||
| A hysteresis mechanism SHOULD be used to prevent flapping in and | A hysteresis mechanism SHOULD be used to prevent flapping in and | |||
| out of overload causing an event storm. For instance, exit from | out of overload causing an event storm. For instance, exit from | |||
| overload state could trigger one report, but also latch a timer. | overload state could trigger one report, but also latch a timer. | |||
| Then, during that time, if the AQM enters and exits overload state | Then, during that time, if the AQM enters and exits overload state | |||
| any number of times, the duration in overload state is accumulated | any number of times, the duration in overload state is | |||
| but no new report is generated until the first time the AQM is out | accumulated, but no new report is generated until the first time | |||
| of overload once the timer has expired. | the AQM is out of overload once the timer has expired. | |||
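
The hysteresis behaviour described above could be sketched as follows. This is only an illustration under stated assumptions, not part of the specification: the class name `OverloadReporter`, the `hold_time` parameter, the `poll()` method and the report format (start time, accumulated duration) are all hypothetical. The sketch issues one report on the first exit from overload, latches a timer, and then accumulates further overload time until the AQM is out of overload after the timer has expired.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class OverloadReporter:
    """Hypothetical event reporter with hysteresis (illustrative only)."""

    hold_time: float  # assumed hold-off period in seconds after each report
    reports: List[Tuple[float, float]] = field(default_factory=list)
    _entered_at: Optional[float] = None   # start of the current overload episode
    _first_start: Optional[float] = None  # start of first unreported episode
    _accumulated: float = 0.0             # unreported time spent in overload
    _timer_expiry: float = float("-inf")  # no timer latched initially

    def enter(self, now: float) -> None:
        """The AQM has entered overload state at time `now`."""
        self._entered_at = now
        if self._first_start is None:
            self._first_start = now

    def exit(self, now: float) -> None:
        """The AQM has left overload state at time `now`."""
        if self._entered_at is None:
            return
        self._accumulated += now - self._entered_at
        self._entered_at = None
        self._maybe_report(now)

    def poll(self, now: float) -> None:
        """Periodic check: flush accumulated time once the timer has expired."""
        if self._entered_at is None:
            self._maybe_report(now)

    def _maybe_report(self, now: float) -> None:
        # Report only when out of overload and the latched timer has expired.
        if now >= self._timer_expiry and self._accumulated > 0:
            self.reports.append((self._first_start, self._accumulated))
            self._first_start = None
            self._accumulated = 0.0
            self._timer_expiry = now + self.hold_time  # latch the timer


reporter = OverloadReporter(hold_time=10.0)
reporter.enter(0.0)
reporter.exit(1.0)    # first exit: one report, timer latched until t=11
reporter.enter(2.0)
reporter.exit(3.0)    # within the timer: accumulated, no report
reporter.enter(5.0)
reporter.exit(7.0)    # within the timer: accumulated, no report
reporter.poll(12.0)   # timer expired and not in overload: one combined report
print(reporter.reports)
```

Whatever the flapping rate, at most one report is generated per hold-off period, and the overload durations are summed rather than lost.
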
| 2.5.2.4. Deployment, Coexistence and Scaling | 2.5.2.4. Deployment, Coexistence and Scaling | |||
| [RFC5706] suggests that deployment, coexistence and scaling should | [RFC5706] suggests that deployment, coexistence and scaling should | |||
| also be covered as management requirements. The raison d'etre of the | also be covered as management requirements. The raison d'etre of the | |||
| DualQ Coupled AQM is to enable deployment and coexistence of Scalable | DualQ Coupled AQM is to enable deployment and coexistence of Scalable | |||
| congestion controls - as incremental replacements for today's Reno- | congestion controls - as incremental replacements for today's Reno- | |||
| friendly controls that do not scale with bandwidth-delay product. | friendly controls that do not scale with bandwidth-delay product. | |||
| Therefore there is no need to repeat these motivating issues here | Therefore, there is no need to repeat these motivating issues here | |||
| given they are already explained in the Introduction and detailed in | given they are already explained in the Introduction and detailed in | |||
| the L4S architecture [I-D.ietf-tsvwg-l4s-arch]. | the L4S architecture [I-D.ietf-tsvwg-l4s-arch]. | |||
| The descriptions of specific DualQ Coupled AQM algorithms in the | The descriptions of specific DualQ Coupled AQM algorithms in the | |||
| appendices cover scaling of their configuration parameters, e.g. with | appendices cover scaling of their configuration parameters, e.g. with | |||
| respect to RTT and sampling frequency. | respect to RTT and sampling frequency. | |||
| 3. IANA Considerations (to be removed by RFC Editor) | 3. IANA Considerations (to be removed by RFC Editor) | |||
| This specification contains no IANA considerations. | This specification contains no IANA considerations. | |||
| skipping to change at page 23, line 21 ¶ | skipping to change at page 23, line 21 ¶ | |||
| (a 'zero-sum game'), whereas queuing delay can be reduced for | (a 'zero-sum game'), whereas queuing delay can be reduced for | |||
| everyone, without any need for someone else to lose out. It also | everyone, without any need for someone else to lose out. It also | |||
| explains that, on the current Internet, scheduling usually enforces | explains that, on the current Internet, scheduling usually enforces | |||
| separation of bandwidth between 'sites' (e.g. households, businesses | separation of bandwidth between 'sites' (e.g. households, businesses | |||
| or mobile users), but it is not common to need to schedule or police | or mobile users), but it is not common to need to schedule or police | |||
| the bandwidth used by individual application flows. | the bandwidth used by individual application flows. | |||
| By the above arguments, per-flow rate policing might not be necessary | By the above arguments, per-flow rate policing might not be necessary | |||
| and in trusted environments (e.g. private data centres) it is | and in trusted environments (e.g. private data centres) it is | |||
| certainly unlikely to be needed. Therefore, because it is hard to | certainly unlikely to be needed. Therefore, because it is hard to | |||
| avoid complexity and unintended side-effects with per-flow rate | avoid complexity and unintended side effects with per-flow rate | |||
| policing, it needs to be separable from a basic AQM, as an option, | policing, it needs to be separable from a basic AQM, as an option, | |||
| under policy control. On this basis, the DualQ Coupled AQM provides | under policy control. On this basis, the DualQ Coupled AQM provides | |||
| low delay without prejudging the question of per-flow rate policing. | low delay without prejudging the question of per-flow rate policing. | |||
| Nonetheless, the interests of users or flows might conflict, e.g. in | Nonetheless, the interests of users or flows might conflict, e.g. in | |||
| case of accident or malice. Then per-flow rate control could be | case of accident or malice. Then per-flow rate control could be | |||
| necessary. If flow-rate control is needed, it can be provided as a | necessary. If flow-rate control is needed, it can be provided as a | |||
| modular addition to a DualQ. And similarly, if protection against | modular addition to a DualQ. And similarly, if protection against | |||
| excessive queue delay is needed, a per-flow queue protection option | excessive queue delay is needed, a per-flow queue protection option | |||
| can be added to a DualQ (e.g. [I-D.briscoe-docsis-q-protection]). | can be added to a DualQ (e.g. [I-D.briscoe-docsis-q-protection]). | |||
| skipping to change at page 25, line 19 ¶ | skipping to change at page 25, line 19 ¶ | |||
| Section 2.5.1) to avoid short-term starvation of Classic. Otherwise, | Section 2.5.1) to avoid short-term starvation of Classic. Otherwise, | |||
| as explained in Section 2.4, even a lone responsive L4S flow could | as explained in Section 2.4, even a lone responsive L4S flow could | |||
| temporarily block a small finite set of C packets (e.g. an initial | temporarily block a small finite set of C packets (e.g. an initial | |||
| window or DNS request). The blockage would only be brief, but it | window or DNS request). The blockage would only be brief, but it | |||
| could be longer for certain AQM implementations that can only | could be longer for certain AQM implementations that can only | |||
| increase the congestion signal coupled from the C queue when C | increase the congestion signal coupled from the C queue when C | |||
| packets are actually being dequeued. There is then the question of | packets are actually being dequeued. There is then the question of | |||
| whether to sacrifice L4S throughput or L4S delay (or some other | whether to sacrifice L4S throughput or L4S delay (or some other | |||
| policy) to make the priority conditional: | policy) to make the priority conditional: | |||
| Sacrifice L4S throughput: By using weighted round robin as the | Sacrifice L4S throughput: By using weighted round-robin as the | |||
| conditional priority scheduler, the L4S service can sacrifice some | conditional priority scheduler, the L4S service can sacrifice some | |||
| throughput during overload. This can either be thought of as | throughput during overload. This can either be thought of as | |||
| guaranteeing a minimum throughput service for Classic traffic, or | guaranteeing a minimum throughput service for Classic traffic, or | |||
| as guaranteeing a maximum delay for a packet at the head of the | as guaranteeing a maximum delay for a packet at the head of the | |||
| Classic queue. | Classic queue. | |||
| Cautionary note: a WRR scheduler can only guarantee Classic | Cautionary note: a WRR scheduler can only guarantee Classic | |||
| throughput if Classic sources are sending enough to use it -- | throughput if Classic sources are sending enough to use it -- | |||
| congestion signals can undermine scheduling because they determine | congestion signals can undermine scheduling because they determine | |||
| how much responsive traffic of each class arrives for scheduling | how much responsive traffic of each class arrives for scheduling | |||
| skipping to change at page 29, line 33 ¶ | skipping to change at page 29, line 33 ¶ | |||
| [AQMmetrics] | [AQMmetrics] | |||
| Kwon, M. and S. Fahmy, "A Comparison of Load-based and | Kwon, M. and S. Fahmy, "A Comparison of Load-based and | |||
| Queue-based Active Queue Management Algorithms", Proc. | Queue-based Active Queue Management Algorithms", Proc. | |||
| Int'l Soc. for Optical Engineering (SPIE) 4866:35--46 DOI: | Int'l Soc. for Optical Engineering (SPIE) 4866:35--46 DOI: | |||
| 10.1117/12.473021, 2002, | 10.1117/12.473021, 2002, | |||
| <https://www.cs.purdue.edu/homes/fahmy/papers/ldc.pdf>. | <https://www.cs.purdue.edu/homes/fahmy/papers/ldc.pdf>. | |||
| [ARED01] Floyd, S., Gummadi, R., and S. Shenker, "Adaptive RED: An | [ARED01] Floyd, S., Gummadi, R., and S. Shenker, "Adaptive RED: An | |||
| Algorithm for Increasing the Robustness of RED's Active | Algorithm for Increasing the Robustness of RED's Active | |||
| Queue Management", ACIRI Technical Report, August 2001, | Queue Management", ACIRI Technical Report, August 2001, | |||
| <http://www.icir.org/floyd/red.html>. | <https://www.icir.org/floyd/red.html>. | |||
| [BBRv2] Cardwell, N., "TCP BBR v2 Alpha/Preview Release", github | [BBRv2] Cardwell, N., "TCP BBR v2 Alpha/Preview Release", GitHub | |||
| repository; Linux congestion control module, | repository; Linux congestion control module, | |||
| <https://github.com/google/bbr/blob/v2alpha/README.md>. | <https://github.com/google/bbr/blob/v2alpha/README.md>. | |||
| [Boru20] Boru Oljira, D., Grinnemo, K-J., Brunstrom, A., and J. | [Boru20] Boru Oljira, D., Grinnemo, K-J., Brunstrom, A., and J. | |||
| Taheri, "Validating the Sharing Behavior and Latency | Taheri, "Validating the Sharing Behavior and Latency | |||
| Characteristics of the L4S Architecture", ACM CCR | Characteristics of the L4S Architecture", ACM CCR | |||
| 50(2):37--44, May 2020, | 50(2):37--44, May 2020, | |||
| <https://dl.acm.org/doi/abs/10.1145/3402413.3402419>. | <https://dl.acm.org/doi/abs/10.1145/3402413.3402419>. | |||
| [CCcensus19] | [CCcensus19] | |||
| Mishra, A., Sun, X., Jain, A., Pande, S., Joshi, R., and | Mishra, A., Sun, X., Jain, A., Pande, S., Joshi, R., and | |||
| B. Leong, "The Great Internet TCP Congestion Control | B. Leong, "The Great Internet TCP Congestion Control | |||
| Census", Proc. ACM on Measurement and Analysis of | Census", Proc. ACM on Measurement and Analysis of | |||
| Computing Systems 3(3), December 2019, | Computing Systems 3(3), December 2019, | |||
| <https://doi.org/10.1145/3366693>. | <https://doi.org/10.1145/3366693>. | |||
| [CoDel] Nichols, K. and V. Jacobson, "Controlling Queue Delay", | [CoDel] Nichols, K. and V. Jacobson, "Controlling Queue Delay", | |||
| ACM Queue 10(5), May 2012, | ACM Queue 10(5), May 2012, | |||
| <http://queue.acm.org/issuedetail.cfm?issue=2208917>. | <https://queue.acm.org/issuedetail.cfm?issue=2208917>. | |||
| [CRED_Insights] | [CRED_Insights] | |||
| Briscoe, B., "Insights from Curvy RED (Random Early | Briscoe, B., "Insights from Curvy RED (Random Early | |||
| Detection)", BT Technical Report TR-TUB8-2015-003 | Detection)", BT Technical Report TR-TUB8-2015-003 | |||
| arXiv:1904.07339 [cs.NI], July 2015, | arXiv:1904.07339 [cs.NI], July 2015, | |||
| <https://arxiv.org/abs/1904.07339>. | <https://arxiv.org/abs/1904.07339>. | |||
| [DCttH19] De Schepper, K., Bondarenko, O., Tilmans, O., and B. | [DCttH19] De Schepper, K., Bondarenko, O., Tilmans, O., and B. | |||
| Briscoe, "`Data Centre to the Home': Ultra-Low Latency for | Briscoe, "`Data Centre to the Home': Ultra-Low Latency for | |||
| All", Updated RITE project Technical Report, July 2019, | All", Updated RITE project Technical Report, July 2019, | |||
| skipping to change at page 30, line 36 ¶ | skipping to change at page 30, line 36 ¶ | |||
| [DualPI2Linux] | [DualPI2Linux] | |||
| Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O., | Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O., | |||
| and H. Steen, "DUALPI2 - Low Latency, Low Loss and | and H. Steen, "DUALPI2 - Low Latency, Low Loss and | |||
| Scalable (L4S) AQM", Proc. Linux Netdev 0x13, March 2019, | Scalable (L4S) AQM", Proc. Linux Netdev 0x13, March 2019, | |||
| <https://www.netdevconf.org/0x13/session.html?talk- | <https://www.netdevconf.org/0x13/session.html?talk- | |||
| DUALPI2-AQM>. | DUALPI2-AQM>. | |||
| [DualQ-Test] | [DualQ-Test] | |||
| Steen, H., "Destruction Testing: Ultra-Low Delay using | Steen, H., "Destruction Testing: Ultra-Low Delay using | |||
| Dual Queue Coupled Active Queue Management", Masters | Dual Queue Coupled Active Queue Management", Master's | |||
| Thesis, Dept of Informatics, Uni Oslo, May 2017, | Thesis, Dept of Informatics, Uni Oslo, May 2017, | |||
| <https://www.duo.uio.no/bitstream/handle/10852/57424/ | <https://www.duo.uio.no/bitstream/handle/10852/57424/ | |||
| thesis-henrste.pdf?sequence=1>. | thesis-henrste.pdf?sequence=1>. | |||
| [Dukkipati06] | [Dukkipati06] | |||
| Dukkipati, N. and N. McKeown, "Why Flow-Completion Time is | Dukkipati, N. and N. McKeown, "Why Flow-Completion Time is | |||
| the Right Metric for Congestion Control", ACM CCR | the Right Metric for Congestion Control", ACM CCR | |||
| 36(1):59--62, January 2006, | 36(1):59--62, January 2006, | |||
| <https://dl.acm.org/doi/10.1145/1111322.1111336>. | <https://dl.acm.org/doi/10.1145/1111322.1111336>. | |||
| [Heist21] Heist, P. and J. Morton, "L4S Tests", github README, | [Heist21] Heist, P. and J. Morton, "L4S Tests", GitHub README, | |||
| August 2021, <https://github.com/heistp/l4s- | August 2021, <https://github.com/heistp/l4s- | |||
| tests/#underutilization-with-bursty-traffic>. | tests/#underutilization-with-bursty-traffic>. | |||
| [I-D.briscoe-docsis-q-protection] | [I-D.briscoe-docsis-q-protection] | |||
| Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection | Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection | |||
| Algorithm to Preserve Low Latency", Work in Progress, | Algorithm to Preserve Low Latency", Work in Progress, | |||
| Internet-Draft, draft-briscoe-docsis-q-protection-06, 13 | Internet-Draft, draft-briscoe-docsis-q-protection-06, 13 | |||
| May 2022, | May 2022, | |||
| <https://datatracker.ietf.org/api/v1/doc/document/draft- | <https://datatracker.ietf.org/api/v1/doc/document/draft- | |||
| briscoe-docsis-q-protection/>. | briscoe-docsis-q-protection/>. | |||
| skipping to change at page 31, line 44 ¶ | skipping to change at page 31, line 44 ¶ | |||
| cardwell-iccrg-bbr-congestion-control/>. | cardwell-iccrg-bbr-congestion-control/>. | |||
| [I-D.ietf-tsvwg-l4s-arch] | [I-D.ietf-tsvwg-l4s-arch] | |||
| Briscoe, B., Schepper, K. D., Bagnulo, M., and G. White, | Briscoe, B., Schepper, K. D., Bagnulo, M., and G. White, | |||
| "Low Latency, Low Loss, Scalable Throughput (L4S) Internet | "Low Latency, Low Loss, Scalable Throughput (L4S) Internet | |||
| Service: Architecture", Work in Progress, Internet-Draft, | Service: Architecture", Work in Progress, Internet-Draft, | |||
| draft-ietf-tsvwg-l4s-arch-19, 27 July 2022, | draft-ietf-tsvwg-l4s-arch-19, 27 July 2022, | |||
| <https://datatracker.ietf.org/api/v1/doc/document/draft- | <https://datatracker.ietf.org/api/v1/doc/document/draft- | |||
| ietf-tsvwg-l4s-arch/>. | ietf-tsvwg-l4s-arch/>. | |||
| [I-D.mathis-iccrg-relentless-tcp] | ||||
| Mathis, M., "Relentless Congestion Control", Work in | ||||
| Progress, Internet-Draft, draft-mathis-iccrg-relentless- | ||||
| tcp-00, 4 March 2009, <https://www.ietf.org/archive/id/ | ||||
| draft-mathis-iccrg-relentless-tcp-00.txt>. | ||||
| [L4Sdemo16] | [L4Sdemo16] | |||
| Bondarenko, O., De Schepper, K., Tsang, I., and B. | Bondarenko, O., De Schepper, K., Tsang, I., and B. | |||
| Briscoe, "Ultra-Low Delay for All: Live Experience, Live | Briscoe, "Ultra-Low Delay for All: Live Experience, Live | |||
| Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016, | Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016, | |||
| <http://dl.acm.org/citation.cfm?doid=2910017.2910633 | <https://dl.acm.org/citation.cfm?doid=2910017.2910633 | |||
| (videos of demos: | (videos of demos: | |||
| https://riteproject.eu/dctth/#1511dispatchwg )>. | https://riteproject.eu/dctth/#1511dispatchwg )>. | |||
| [L4S_5G] Willars, P., Wittenmark, E., Ronkainen, H., Östberg, C., | [L4S_5G] Willars, P., Wittenmark, E., Ronkainen, H., Östberg, C., | |||
| Johansson, I., Strand, J., Lédl, P., and D. Schnieders, | Johansson, I., Strand, J., Lédl, P., and D. Schnieders, | |||
| "Enabling time-critical applications over 5G with rate | "Enabling time-critical applications over 5G with rate | |||
| adaptation", Ericsson - Deutsche Telekom White Paper BNEW- | adaptation", Ericsson - Deutsche Telekom White Paper BNEW- | |||
| 21:025455 Uen, May 2021, <https://www.ericsson.com/en/ | 21:025455 Uen, May 2021, <https://www.ericsson.com/en/ | |||
| reports-and-papers/white-papers/enabling-time-critical- | reports-and-papers/white-papers/enabling-time-critical- | |||
| applications-over-5g-with-rate-adaptation>. | applications-over-5g-with-rate-adaptation>. | |||
| skipping to change at page 32, line 24 ¶ | skipping to change at page 32, line 28 ¶ | |||
| Labovitz, C., Iekel-Johnson, S., McPherson, D., Oberheide, | Labovitz, C., Iekel-Johnson, S., McPherson, D., Oberheide, | |||
| J., and F. Jahanian, "Internet Inter-Domain Traffic", Proc | J., and F. Jahanian, "Internet Inter-Domain Traffic", Proc | |||
| ACM SIGCOMM; ACM CCR 40(4):75--86, August 2010, | ACM SIGCOMM; ACM CCR 40(4):75--86, August 2010, | |||
| <https://doi.org/10.1145/1851275.1851194>. | <https://doi.org/10.1145/1851275.1851194>. | |||
| [LLD] White, G., Sundaresan, K., and B. Briscoe, "Low Latency | [LLD] White, G., Sundaresan, K., and B. Briscoe, "Low Latency | |||
| DOCSIS: Technology Overview", CableLabs White Paper, | DOCSIS: Technology Overview", CableLabs White Paper, | |||
| February 2019, <https://cablela.bs/low-latency-docsis- | February 2019, <https://cablela.bs/low-latency-docsis- | |||
| technology-overview-february-2019>. | technology-overview-february-2019>. | |||
| [Mathis09] Mathis, M., "Relentless Congestion Control", PFLDNeT'09 , | ||||
| May 2009, <http://www.hpcc.jp/pfldnet2009/ | ||||
| Program_files/1569198525.pdf>. | ||||
| [MEDF] Menth, M., Schmid, M., Heiss, H., and T. Reim, "MEDF - a | [MEDF] Menth, M., Schmid, M., Heiss, H., and T. Reim, "MEDF - a | |||
| simple scheduling algorithm for two real-time transport | simple scheduling algorithm for two real-time transport | |||
| service classes with application in the UTRAN", Proc. IEEE | service classes with application in the UTRAN", Proc. IEEE | |||
| Conference on Computer Communications (INFOCOM'03) Vol.2 | Conference on Computer Communications (INFOCOM'03) Vol.2 | |||
| pp.1116-1122, March 2003, | pp.1116-1122, March 2003, | |||
| <http://infocom2003.ieee-infocom.org/papers/27_04.PDF>. | <https://infocom2003.ieee-infocom.org/papers/27_04.PDF>. | |||
| [PI2] De Schepper, K., Bondarenko, O., Briscoe, B., and I. | [PI2] De Schepper, K., Bondarenko, O., Briscoe, B., and I. | |||
| Tsang, "PI2: A Linearized AQM for both Classic and | Tsang, "PI2: A Linearized AQM for both Classic and | |||
| Scalable TCP", ACM CoNEXT'16, December 2016, | Scalable TCP", ACM CoNEXT'16, December 2016, | |||
| <https://riteproject.files.wordpress.com/2015/10/ | <https://riteproject.files.wordpress.com/2015/10/ | |||
| pi2_conext.pdf>. | pi2_conext.pdf>. | |||
| [PI2param] Briscoe, B., "PI2 Parameters", Technical Report TR-BB- | [PI2param] Briscoe, B., "PI2 Parameters", Technical Report TR-BB- | |||
| 2021-001 arXiv:2107.01003 [cs.NI], July 2021, | 2021-001 arXiv:2107.01003 [cs.NI], July 2021, | |||
| <https://arxiv.org/abs/2107.01003>. | <https://arxiv.org/abs/2107.01003>. | |||
| skipping to change at page 34, line 22 ¶ | skipping to change at page 34, line 22 ¶ | |||
| Lightweight Control Scheme to Address the Bufferbloat | Lightweight Control Scheme to Address the Bufferbloat | |||
| Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017, | Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017, | |||
| <https://www.rfc-editor.org/info/rfc8033>. | <https://www.rfc-editor.org/info/rfc8033>. | |||
| [RFC8034] White, G. and R. Pan, "Active Queue Management (AQM) Based | [RFC8034] White, G. and R. Pan, "Active Queue Management (AQM) Based | |||
| on Proportional Integral Controller Enhanced (PIE) for | on Proportional Integral Controller Enhanced (PIE) for | |||
| Data-Over-Cable Service Interface Specifications (DOCSIS) | Data-Over-Cable Service Interface Specifications (DOCSIS) | |||
| Cable Modems", RFC 8034, DOI 10.17487/RFC8034, February | Cable Modems", RFC 8034, DOI 10.17487/RFC8034, February | |||
| 2017, <https://www.rfc-editor.org/info/rfc8034>. | 2017, <https://www.rfc-editor.org/info/rfc8034>. | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | ||||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | ||||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | ||||
| [RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., | [RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., | |||
| and G. Judd, "Data Center TCP (DCTCP): TCP Congestion | and G. Judd, "Data Center TCP (DCTCP): TCP Congestion | |||
| Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257, | Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257, | |||
| October 2017, <https://www.rfc-editor.org/info/rfc8257>. | October 2017, <https://www.rfc-editor.org/info/rfc8257>. | |||
| [RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys, | [RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys, | |||
| J., and E. Dumazet, "The Flow Queue CoDel Packet Scheduler | J., and E. Dumazet, "The Flow Queue CoDel Packet Scheduler | |||
| and Active Queue Management Algorithm", RFC 8290, | and Active Queue Management Algorithm", RFC 8290, | |||
| DOI 10.17487/RFC8290, January 2018, | DOI 10.17487/RFC8290, January 2018, | |||
| <https://www.rfc-editor.org/info/rfc8290>. | <https://www.rfc-editor.org/info/rfc8290>. | |||
| skipping to change at page 34, line 47 ¶ | skipping to change at page 35, line 5 ¶ | |||
| [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and | [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and | |||
| R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", | R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", | |||
| RFC 8312, DOI 10.17487/RFC8312, February 2018, | RFC 8312, DOI 10.17487/RFC8312, February 2018, | |||
| <https://www.rfc-editor.org/info/rfc8312>. | <https://www.rfc-editor.org/info/rfc8312>. | |||
| [RFC8404] Moriarty, K., Ed. and A. Morton, Ed., "Effects of | [RFC8404] Moriarty, K., Ed. and A. Morton, Ed., "Effects of | |||
| Pervasive Encryption on Operators", RFC 8404, | Pervasive Encryption on Operators", RFC 8404, | |||
| DOI 10.17487/RFC8404, July 2018, | DOI 10.17487/RFC8404, July 2018, | |||
| <https://www.rfc-editor.org/info/rfc8404>. | <https://www.rfc-editor.org/info/rfc8404>. | |||
| [SCReAM] Johansson, I., "SCReAM", github repository, | [SCReAM] Johansson, I., "SCReAM", GitHub repository, | |||
| <https://github.com/EricssonResearch/scream/blob/master/ | <https://github.com/EricssonResearch/scream/blob/master/ | |||
| README.md>. | README.md>. | |||
| [SigQ-Dyn] Briscoe, B., "Rapid Signalling of Queue Dynamics", | [SigQ-Dyn] Briscoe, B., "Rapid Signalling of Queue Dynamics", | |||
| Technical Report TR-BB-2017-001 arXiv:1904.07044 [cs.NI], | Technical Report TR-BB-2017-001 arXiv:1904.07044 [cs.NI], | |||
| September 2017, <https://arxiv.org/abs/1904.07044>. | September 2017, <https://arxiv.org/abs/1904.07044>. | |||
| Appendix A. Example DualQ Coupled PI2 Algorithm | Appendix A. Example DualQ Coupled PI2 Algorithm | |||
| As a first concrete example, the pseudocode below gives the DualPI2 | As a first concrete example, the pseudocode below gives the DualPI2 | |||
| skipping to change at page 39, line 9 ¶ | skipping to change at page 39, line 9 ¶ | |||
| 28: } | 28: } | |||
| 29: return FALSE | 29: return FALSE | |||
| 30: } | 30: } | |||
| Figure 4: Example Dequeue Pseudocode for DualQ Coupled PI2 AQM | Figure 4: Example Dequeue Pseudocode for DualQ Coupled PI2 AQM | |||
| When packets arrive, first a common queue limit is checked as shown | When packets arrive, first a common queue limit is checked as shown | |||
| in line 2 of the enqueuing pseudocode in Figure 3. This assumes a | in line 2 of the enqueuing pseudocode in Figure 3. This assumes a | |||
| shared buffer for the two queues (Note b discusses the merits of | shared buffer for the two queues (Note b discusses the merits of | |||
| separate buffers). In order to avoid any bias against larger | separate buffers). In order to avoid any bias against larger | |||
| packets, 1 MTU of space is always allowed and the limit is | packets, 1 MTU of space is always allowed, and the limit is | |||
| deliberately tested before enqueue. | deliberately tested before enqueue. | |||
| If the limit is not exceeded, the packet is timestamped in line 4 (only | If the limit is not exceeded, the packet is timestamped in line 4 (only | |||
| if the sojourn time technique is being used to measure queue delay; | if the sojourn time technique is being used to measure queue delay; | |||
| see Note a for alternatives). | see Note a for alternatives). | |||
| At lines 5-9, the packet is classified and enqueued to the Classic or | At lines 5-9, the packet is classified and enqueued to the Classic or | |||
| L4S queue dependent on the least significant bit of the ECN field in | L4S queue dependent on the least significant bit of the ECN field in | |||
| the IP header (line 6). Packets with a codepoint having an LSB of 0 | the IP header (line 6). Packets with a codepoint having an LSB of 0 | |||
| (Not-ECT and ECT(0)) will be enqueued in the Classic queue. | (Not-ECT and ECT(0)) will be enqueued in the Classic queue. | |||
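
The classification step described above can be illustrated with a short sketch. It is not the draft's pseudocode: the function name `classify` and the queue labels are hypothetical. It assumes the standard 2-bit ECN codepoints (Not-ECT = 00, ECT(1) = 01, ECT(0) = 10, CE = 11) carried in the two least significant bits of the IPv4 TOS or IPv6 Traffic Class octet, and it assumes that codepoints with an LSB of 1 (ECT(1) and CE) go to the L4S queue.

```python
# ECN codepoints (2-bit field) as defined for the IP header.
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11


def classify(traffic_class_octet: int) -> str:
    """Pick a queue from the least significant bit of the ECN field.

    `traffic_class_octet` is the IPv4 TOS or IPv6 Traffic Class octet;
    the ECN field occupies its two least significant bits.
    """
    ecn = traffic_class_octet & 0b11
    # LSB 0 (Not-ECT, ECT(0)) -> Classic; LSB 1 (ECT(1), CE) -> L4S (assumed).
    return "L4S" if ecn & 0b01 else "Classic"


for name, codepoint in [("Not-ECT", NOT_ECT), ("ECT(0)", ECT_0),
                        ("ECT(1)", ECT_1), ("CE", CE)]:
    print(name, "->", classify(codepoint))
```

Testing a single bit keeps the per-packet cost of classification minimal, which suits a fast-path enqueue routine.
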
| skipping to change at page 42, line 32 ¶ | skipping to change at page 42, line 32 ¶ | |||
| significant outlier and, on reflection, the experimental technique | significant outlier and, on reflection, the experimental technique | |||
| seemed inappropriate to the CDN market in China. | seemed inappropriate to the CDN market in China. | |||
| * g is taken as 0.38. The factor g is a geometry factor that | * g is taken as 0.38. The factor g is a geometry factor that | |||
| characterizes the shape of the sawteeth of prevalent Classic | characterizes the shape of the sawteeth of prevalent Classic | |||
| congestion controllers. The geometry factor is the fraction of | congestion controllers. The geometry factor is the fraction of | |||
| the amplitude of the sawtooth variability in queue delay that lies | the amplitude of the sawtooth variability in queue delay that lies | |||
| below the AQM's target. For instance, at low bit rate, the | below the AQM's target. For instance, at low bit rate, the | |||
| geometry factor of standard Reno is 0.5, but at higher rates it | geometry factor of standard Reno is 0.5, but at higher rates it | |||
| tends to just under 1. According to the census of congestion | tends to just under 1. According to the census of congestion | |||
| controllers conducted by Mishra _et al_ in Jul-Oct | controllers conducted by Mishra et al. in Jul-Oct | |||
| 2019 [CCcensus19], most Classic TCP traffic uses Cubic. And, | 2019 [CCcensus19], most Classic TCP traffic uses Cubic. And, | |||
| according to the analysis in [PI2param], if running over a PI2 | according to the analysis in [PI2param], if running over a PI2 | |||
| AQM, a large proportion of this Cubic traffic would be in its | AQM, a large proportion of this Cubic traffic would be in its | |||
| Reno-Friendly mode, which has a geometry factor of ~0.39 (all | Reno-Friendly mode, which has a geometry factor of ~0.39 (all | |||
| known implementations). The rest of the Cubic traffic would be in | known implementations). The rest of the Cubic traffic would be in | |||
| true Cubic mode, which has a geometry factor of ~0.36. Without | true Cubic mode, which has a geometry factor of ~0.36. Without | |||
| modelling the sawtooth profiles from all the other less prevalent | modelling the sawtooth profiles from all the other less prevalent | |||
| congestion controllers, we estimate a 7:3 weighted average of | congestion controllers, we estimate a 7:3 weighted average of | |||
| these two, resulting in an average geometry factor of 0.38. | these two, resulting in an average geometry factor of 0.38. | |||
| * f is taken as 2. The factor f is a safety factor that increases | * f is taken as 2. The factor f is a safety factor that increases | |||
| the target queue to allow for the distribution of RTT_typ around | the target queue to allow for the distribution of RTT_typ around | |||
| its mean. Otherwise the target queue would only avoid | its mean. Otherwise, the target queue would only avoid | |||
| underutilization for those users below the mean. It also provides | underutilization for those users below the mean. It also provides | |||
| a safety margin for the proportion of paths in use that span | a safety margin for the proportion of paths in use that span | |||
| beyond the distance between a user and their local CDN. Currently | beyond the distance between a user and their local CDN. | |||
| no data is available on the variance of queue delay around the | Currently, no data is available on the variance of queue delay | |||
| mean in each region, so there is plenty of room for this guess to | around the mean in each region, so there is plenty of room for | |||
| become more educated. | this guess to become more educated. | |||
| * [PI2param] recommends target = RTT_typ * g * f = 25ms * 0.38 * 2 = | * [PI2param] recommends target = RTT_typ * g * f = 25ms * 0.38 * 2 = | |||
| 19 ms. However a further adjustment is warranted, because target | 19 ms. However, a further adjustment is warranted, because target | |||
| is moving year on year. The paper is based on data collected in | is moving year-on-year. The paper is based on data collected in | |||
| 2019, and it mentions evidence from speedtest.net that suggests | 2019, and it mentions evidence from speedtest.net that suggests | |||
| RTT_typ reduced by 17% (fixed) or 12% (mobile) between 2020 and | RTT_typ reduced by 17% (fixed) or 12% (mobile) between 2020 and | |||
| 2021. Therefore we recommend a default of target = 15 ms at the | 2021. Therefore, we recommend a default of target = 15 ms at the | |||
| time of writing (2021). | time of writing (2021). | |||
| Operators can always use the data and discussion in [PI2param] to | Operators can always use the data and discussion in [PI2param] to | |||
| configure a more appropriate target for their environment. For | configure a more appropriate target for their environment. For | |||
| instance, an operator might wish to question the assumptions called | instance, an operator might wish to question the assumptions called | |||
| out in that paper, such as the goal of no underutilization for a | out in that paper, such as the goal of no underutilization for a | |||
| large majority of single flow transfers (given many large transfers | large majority of single flow transfers (given many large transfers | |||
| use multiple flows to avoid the scaling limitations of Classic | use multiple flows to avoid the scaling limitations of Classic | |||
| flows). | flows). | |||
| skipping to change at page 44, line 4 ¶ | skipping to change at page 44, line 4 ¶ | |||
| The choice of alpha and beta also determines the AQM's stable | The choice of alpha and beta also determines the AQM's stable | |||
| operating range. The AQM ought to change p' as fast as possible in | operating range. The AQM ought to change p' as fast as possible in | |||
| response to changes in load without over-compensating and therefore | response to changes in load without over-compensating and therefore | |||
| causing oscillations in the queue. Therefore, the values of alpha | causing oscillations in the queue. Therefore, the values of alpha | |||
| and beta also depend on the RTT of the expected worst-case flow | and beta also depend on the RTT of the expected worst-case flow | |||
| (RTT_max). | (RTT_max). | |||
| The maximum RTT of a PI controller (RTT_max in line 10 of Figure 2) | The maximum RTT of a PI controller (RTT_max in line 10 of Figure 2) | |||
| is not an absolute maximum, but more instability (more queue | is not an absolute maximum, but more instability (more queue | |||
| variability) sets in for long-running flows with an RTT above this | variability) sets in for long-running flows with an RTT above this | |||
| value. The propagation delay half way round the planet and back in | value. The propagation delay halfway round the planet and back in | |||
| glass fibre is 200 ms. However, hardly any traffic traverses such | glass fibre is 200 ms. However, hardly any traffic traverses such | |||
| extreme paths and, since the significant consolidation of Internet | extreme paths and, since the significant consolidation of Internet | |||
| traffic between 2007 and 2009 [Labovitz10], a high and growing | traffic between 2007 and 2009 [Labovitz10], a high and growing | |||
| proportion of all Internet traffic (roughly two-thirds at the time of | proportion of all Internet traffic (roughly two-thirds at the time of | |||
| writing) has been served from content distribution networks (CDNs) or | writing) has been served from content distribution networks (CDNs) or | |||
| 'cloud' services distributed close to end-users. The Internet might | 'cloud' services distributed close to end-users. The Internet might | |||
| change again, but for now, designing for a maximum RTT of 100ms is a | change again, but for now, designing for a maximum RTT of 100ms is a | |||
| good compromise between faster queue control at low RTT and some | good compromise between faster queue control at low RTT and some | |||
| instability on the occasions when a longer path is necessary. | instability on the occasions when a longer path is necessary. | |||
| skipping to change at page 45, line 29 ¶ | skipping to change at page 45, line 29 ¶ | |||
| Notes: | Notes: | |||
| a. The drain rate of the queue can vary if it is scheduled relative | a. The drain rate of the queue can vary if it is scheduled relative | |||
| to other queues, or to cater for fluctuations in a wireless | to other queues, or to cater for fluctuations in a wireless | |||
| medium. To auto-adjust to changes in drain rate, the queue needs | medium. To auto-adjust to changes in drain rate, the queue needs | |||
| to be measured in time, not bytes or packets [AQMmetrics], | to be measured in time, not bytes or packets [AQMmetrics], | |||
| [CoDel]. Queuing delay could be measured directly as the sojourn | [CoDel]. Queuing delay could be measured directly as the sojourn | |||
| time (aka. service time) of the queue, by storing a per-packet | time (aka. service time) of the queue, by storing a per-packet | |||
| time-stamp as each packet is enqueued, and subtracting this from | time-stamp as each packet is enqueued, and subtracting this from | |||
| the system time when the packet is dequeued. If time- stamping | the system time when the packet is dequeued. If time-stamping is | |||
| is not easy to introduce with certain hardware, queuing delay | not easy to introduce with certain hardware, queuing delay could | |||
| could be predicted indirectly by dividing the size of the queue | be predicted indirectly by dividing the size of the queue by the | |||
| by the predicted departure rate, which might be known precisely | predicted departure rate, which might be known precisely for some | |||
| for some link technologies (see for example in DOCSIS PIE | link technologies (see for example in DOCSIS PIE [RFC8034]). | |||
| [RFC8034]). | ||||
| However, sojourn time is slow to detect bursts. For instance, if | However, sojourn time is slow to detect bursts. For instance, if | |||
| a burst arrives at an empty queue, the sojourn time only fully | a burst arrives at an empty queue, the sojourn time only fully | |||
| measures the burst's delay when its last packet is dequeued, even | measures the burst's delay when its last packet is dequeued, even | |||
| though the queue has known the size of the burst since its last | though the queue has known the size of the burst since its last | |||
| packet was enqueued - so it could have signalled congestion | packet was enqueued - so it could have signalled congestion | |||
| earlier. To remedy this, each head packet can be marked when it | earlier. To remedy this, each head packet can be marked when it | |||
| is dequeued based on the expected delay of the tail packet behind | is dequeued based on the expected delay of the tail packet behind | |||
| it, as explained below, rather than based on the head packet's | it, as explained below, rather than based on the head packet's | |||
| own delay due to the packets in front of it. [Heist21] identifies | own delay due to the packets in front of it. [Heist21] identifies | |||
| skipping to change at page 46, line 20 ¶ | skipping to change at page 46, line 19 ¶ | |||
| memory than the otherwise equivalent 'scaled sojourn time' | memory than the otherwise equivalent 'scaled sojourn time' | |||
| metric, which is the sojourn time of a packet scaled by the ratio | metric, which is the sojourn time of a packet scaled by the ratio | |||
| of the queue sizes when the packet departed and | of the queue sizes when the packet departed and | |||
| arrived [SigQ-Dyn]. | arrived [SigQ-Dyn]. | |||
| b. Line 2 of the dualpi2_enqueue() function (Figure 3) assumes an | b. Line 2 of the dualpi2_enqueue() function (Figure 3) assumes an | |||
| implementation where lq and cq share common buffer memory. An | implementation where lq and cq share common buffer memory. An | |||
| alternative implementation could use separate buffers for each | alternative implementation could use separate buffers for each | |||
| queue, in which case the arriving packet would have to be | queue, in which case the arriving packet would have to be | |||
| classified first to determine which buffer to check for available | classified first to determine which buffer to check for available | |||
| space. The choice is a trade off; a shared buffer can use less | space. The choice is a trade-off; a shared buffer can use less | |||
| memory whereas separate buffers isolate the L4S queue from tail- | memory whereas separate buffers isolate the L4S queue from tail- | |||
| drop due to large bursts of Classic traffic (e.g. a Classic Reno | drop due to large bursts of Classic traffic (e.g. a Classic Reno | |||
| TCP during slow-start over a long RTT). | TCP during slow-start over a long RTT). | |||
| c. There has been some concern that using the step function of DCTCP | c. There has been some concern that using the step function of DCTCP | |||
| for the Native L4S AQM requires end-systems to smooth the signal | for the Native L4S AQM requires end-systems to smooth the signal | |||
| for an unnecessarily large number of round trips to ensure | for an unnecessarily large number of round trips to ensure | |||
| sufficient fidelity. A ramp is no worse than a step in initial | sufficient fidelity. A ramp is no worse than a step in initial | |||
| experiments with existing DCTCP. Therefore, it is recommended | experiments with existing DCTCP. Therefore, it is recommended | |||
| that a ramp is configured in place of a step, which will allow | that a ramp is configured in place of a step, which will allow | |||
| skipping to change at page 46, line 45 ¶ | skipping to change at page 46, line 44 ¶ | |||
| effectively turn the ramp into a step function, as used by DCTCP, | effectively turn the ramp into a step function, as used by DCTCP, | |||
| by setting the range to zero. There will not be a divide by zero | by setting the range to zero. There will not be a divide by zero | |||
| problem at line 5 of Figure 5 because, if minTh is equal to | problem at line 5 of Figure 5 because, if minTh is equal to | |||
| maxTh, the condition for this ramp calculation cannot arise. | maxTh, the condition for this ramp calculation cannot arise. | |||
| A.2. Pass #2: Edge-Case Details | A.2. Pass #2: Edge-Case Details | |||
| This section takes a second pass through the pseudocode adding | This section takes a second pass through the pseudocode adding | |||
| details of two edge-cases: low link rate and overload. Figure 7 | details of two edge-cases: low link rate and overload. Figure 7 | |||
| repeats the dequeue function of Figure 4, but with details of both | repeats the dequeue function of Figure 4, but with details of both | |||
| edge-cases added. Similarly Figure 8 repeats the core PI algorithm | edge-cases added. Similarly, Figure 8 repeats the core PI algorithm | |||
| of Figure 6, but with overload details added. The initialization, | of Figure 6, but with overload details added. The initialization, | |||
| enqueue, L4S AQM and recur functions are unchanged. | enqueue, L4S AQM and recur functions are unchanged. | |||
| The link rate can be so low that it takes a single packet queue | The link rate can be so low that it takes a single packet queue | |||
| longer to serialize than the threshold delay at which ECN marking | longer to serialize than the threshold delay at which ECN marking | |||
| starts to be applied in the L queue. Therefore, a minimum marking | starts to be applied in the L queue. Therefore, a minimum marking | |||
| threshold parameter in units of packets rather than time is necessary | threshold parameter in units of packets rather than time is necessary | |||
| (Th_len, default 1 packet in line 19 of Figure 2) to ensure that the | (Th_len, default 1 packet in line 19 of Figure 2) to ensure that the | |||
| ramp does not trigger excessive marking on slow links. Where an | ramp does not trigger excessive marking on slow links. Where an | |||
| implementation knows the link rate, it can set up this minimum at the | implementation knows the link rate, it can set up this minimum at the | |||
| skipping to change at page 50, line 10 ¶ | skipping to change at page 50, line 10 ¶ | |||
| 4: p_CL = p' * k % Coupled L4S prob = base prob * coupling factor | 4: p_CL = p' * k % Coupled L4S prob = base prob * coupling factor | |||
| 5: p_C = p'^2 % Classic prob = (base prob)^2 | 5: p_C = p'^2 % Classic prob = (base prob)^2 | |||
| 6: prevq = curq | 6: prevq = curq | |||
| 7: } | 7: } | |||
| Figure 8: Example PI-Update Pseudocode for DualQ Coupled PI2 AQM | Figure 8: Example PI-Update Pseudocode for DualQ Coupled PI2 AQM | |||
| (Including Overload Code) | (Including Overload Code) | |||
| The choice of scheduler technology is critical to overload protection | The choice of scheduler technology is critical to overload protection | |||
| (see Section 4.2.2). | (see Section 4.2.2). | |||
| * A well-understood weighted scheduler such as weighted round robin | * A well-understood weighted scheduler such as weighted round-robin | |||
| (WRR) is recommended. As long as the scheduler weight for Classic | (WRR) is recommended. As long as the scheduler weight for Classic | |||
| is small (e.g. 1/16), its exact value is unimportant because it | is small (e.g. 1/16), its exact value is unimportant because it | |||
| does not normally determine capacity shares. The weight is only | does not normally determine capacity shares. The weight is only | |||
| important to prevent unresponsive L4S traffic starving Classic | important to prevent unresponsive L4S traffic starving Classic | |||
| traffic in the short term (see Section 4.2.2). This is because | traffic in the short term (see Section 4.2.2). This is because | |||
| capacity sharing between the queues is normally determined by the | capacity sharing between the queues is normally determined by the | |||
| coupled congestion signal, which overrides the scheduler, by | coupled congestion signal, which overrides the scheduler, by | |||
| making L4S sources leave roughly equal per-flow capacity available | making L4S sources leave roughly equal per-flow capacity available | |||
| for Classic flows. | for Classic flows. | |||
| skipping to change at page 61, line 38 ¶ | skipping to change at page 61, line 38 ¶ | |||
| p_C = ( p_CL / k )^2 (1) | p_C = ( p_CL / k )^2 (1) | |||
| k* = 1.64 * (R_C / R_L) (7) | k* = 1.64 * (R_C / R_L) (7) | |||
| We say that this coupling factor is theoretical, because it is in | We say that this coupling factor is theoretical, because it is in | |||
| terms of two RTTs, which raises two practical questions: i) for | terms of two RTTs, which raises two practical questions: i) for | |||
| multiple flows with different RTTs, the RTT for each traffic class | multiple flows with different RTTs, the RTT for each traffic class | |||
| would have to be derived from the RTTs of all the flows in that class | would have to be derived from the RTTs of all the flows in that class | |||
| (actually the harmonic mean would be needed); ii) a network node | (actually the harmonic mean would be needed); ii) a network node | |||
| cannot easily know the RTT of any of the flows anyway. | cannot easily know the RTT of the flows anyway. | |||
| RTT-dependence is caused by window-based congestion control, so it | RTT-dependence is caused by window-based congestion control, so it | |||
| ought to be reversed there, not in the network. Therefore, we use a | ought to be reversed there, not in the network. Therefore, we use a | |||
| fixed coupling factor in the network, and reduce RTT-dependence in | fixed coupling factor in the network, and reduce RTT-dependence in | |||
| L4S senders. We cannot expect Classic senders to all be updated to | L4S senders. We cannot expect Classic senders to all be updated to | |||
| reduce their RTT-dependence. But solely addressing the problem in | reduce their RTT-dependence. But solely addressing the problem in | |||
| L4S senders at least makes RTT-dependence no worse - not just between | L4S senders at least makes RTT-dependence no worse - not just between | |||
| L4S senders, but also between L4S and Classic senders. | L4S senders, but also between L4S and Classic senders. | |||
| Traditionally, throughput equivalence has been defined for flows | Traditionally, throughput equivalence has been defined for flows | |||
| skipping to change at page 63, line 34 ¶ | skipping to change at page 63, line 34 ¶ | |||
| ~= 0.85 * (R_bC + target) / (1.22 * max(R_bL, R_typ)) | ~= 0.85 * (R_bC + target) / (1.22 * max(R_bL, R_typ)) | |||
| ~= (R_bC + target) / (1.4 * max(R_bL, R_typ)) | ~= (R_bC + target) / (1.4 * max(R_bL, R_typ)) | |||
| It can be seen that, for base RTTs below target (15 ms), both the | It can be seen that, for base RTTs below target (15 ms), both the | |||
| numerator and the denominator plateau, which has the desired effect | numerator and the denominator plateau, which has the desired effect | |||
| of limiting RTT-dependence. | of limiting RTT-dependence. | |||
| At the start of the above derivations, an explanation was promised | At the start of the above derivations, an explanation was promised | |||
| for why the L4S throughput equation in equation (6) did not need to | for why the L4S throughput equation in equation (6) did not need to | |||
| model RTT-independence. This is because we only use one point - at | model RTT-independence. This is because we only use one point - at | |||
| the the typical base RTT where the operator chooses to calculate the | the typical base RTT where the operator chooses to calculate the | |||
| coupling factor. Then, throughput equivalence will at least hold at | coupling factor. Then, throughput equivalence will at least hold at | |||
| that chosen point. Nonetheless, assuming Prague senders implement | that chosen point. Nonetheless, assuming Prague senders implement | |||
| RTT-independence over a range of RTTs below this, the throughput | RTT-independence over a range of RTTs below this, the throughput | |||
| equivalence will then extend over that range as well. | equivalence will then extend over that range as well. | |||
| Congestion control designers can choose different ways to reduce RTT- | Congestion control designers can choose different ways to reduce RTT- | |||
| dependence. And each operator can make a policy choice to decide on | dependence. And each operator can make a policy choice to decide on | |||
| a different base RTT, and therefore a different k, at which it wants | a different base RTT, and therefore a different k, at which it wants | |||
| throughput equivalence. Nonetheless, for the Internet, it makes | throughput equivalence. Nonetheless, for the Internet, it makes | |||
| sense to choose what is believed to be the typical RTT most users | sense to choose what is believed to be the typical RTT most users | |||
| skipping to change at page 65, line 12 ¶ | skipping to change at page 65, line 12 ¶ | |||
| Centre to the Home broadband testbed on which DualQ Coupled AQM | Centre to the Home broadband testbed on which DualQ Coupled AQM | |||
| implementations were tested. | implementations were tested. | |||
| Authors' Addresses | Authors' Addresses | |||
| Koen De Schepper | Koen De Schepper | |||
| Nokia Bell Labs | Nokia Bell Labs | |||
| Antwerp | Antwerp | |||
| Belgium | Belgium | |||
| Email: koen.de_schepper@nokia.com | Email: koen.de_schepper@nokia.com | |||
| URI: https://www.bell-labs.com/usr/koen.de_schepper | URI: https://www.bell-labs.com/about/researcher-profiles/ | |||
| koende_schepper/ | ||||
| Bob Briscoe (editor) | Bob Briscoe (editor) | |||
| Independent | Independent | |||
| United Kingdom | United Kingdom | |||
| Email: ietf@bobbriscoe.net | Email: ietf@bobbriscoe.net | |||
| URI: http://bobbriscoe.net/ | URI: https://bobbriscoe.net/ | |||
| Greg White | Greg White | |||
| CableLabs | CableLabs | |||
| Louisville, CO, | Louisville, CO, | |||
| United States of America | United States of America | |||
| Email: G.White@CableLabs.com | Email: G.White@CableLabs.com | |||
| End of changes. 48 change blocks. | ||||
| 70 lines changed or deleted | 76 lines changed or added | |||
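
The arithmetic behind the recommended Classic target (the g and f bullets and the [PI2param] formula) and the coupling relation in lines 4-5 of Figure 8 can be checked with a short Python sketch. This is only an illustration, not part of the draft's pseudocode: the function names are hypothetical, and the coupling factor k = 2 and base probability 0.1 in the usage lines are example inputs chosen here, not values taken from the text above.

```python
# Illustrative sketch only; function names are hypothetical and not from the draft.

def weighted_geometry_factor(g_reno_friendly=0.39, g_true_cubic=0.36, weights=(7, 3)):
    """7:3 weighted average of the two Cubic geometry factors quoted in the text."""
    w_rf, w_tc = weights
    return (w_rf * g_reno_friendly + w_tc * g_true_cubic) / (w_rf + w_tc)


def pi2_target_ms(rtt_typ_ms=25.0, g=0.38, f=2.0):
    """target = RTT_typ * g * f, before the year-on-year adjustment to 15 ms."""
    return rtt_typ_ms * g * f


def coupled_probabilities(p_base, k):
    """Lines 4-5 of Figure 8: p_CL = p' * k and p_C = p'^2.

    No clamping or overload handling is modelled here.
    """
    p_cl = p_base * k      # coupled L4S marking probability
    p_c = p_base ** 2      # Classic drop/mark probability
    return p_cl, p_c


if __name__ == "__main__":
    g = weighted_geometry_factor()
    print(f"weighted g  = {g:.3f}")                  # rounds to the 0.38 used in the text
    print(f"target      = {pi2_target_ms():.1f} ms")
    p_cl, p_c = coupled_probabilities(0.1, k=2.0)    # k = 2 is an assumed example value
    print(f"p_CL = {p_cl:.3f}, p_C = {p_c:.4f}")
    # Equation (1) should hold: p_C = (p_CL / k)^2
    print(f"(p_CL / k)^2 = {(p_cl / 2.0) ** 2:.4f}")
```

Keeping the weighted average separate from the target calculation makes it easy for an operator to substitute a different traffic mix or RTT_typ, as the text suggests, without touching the coupling relation.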