<?xml  version="1.0"  encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
  <!ENTITY rfc6733 PUBLIC ""
    "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6733.xml">
  <!ENTITY rfc5390 PUBLIC ""
    "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5390.xml">
  <!ENTITY draft-ietf-dime-overload-reqs PUBLIC ""
    "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-dime-overload-reqs.xml">
  <!ENTITY draft-roach-dime-overload-ctrl PUBLIC ""
    "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.roach-dime-overload-ctrl.xml">
  <!ENTITY draft-korhonen-dime-ovl PUBLIC ""
    "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.korhonen-dime-ovl.xml">
]>
<?xml-stylesheet  type='text/xsl'   href='rfc2629.xslt'  ?>
<?rfc   toc="yes"?>
<?rfc compact="yes" ?>
<!-- conserve vertical whitespace -->
<?rfc subcompact="no" ?>
<!-- but keep a blank line between list items -->
<?rfc  sortrefs="no"  ?>
<?rfc  symrefs="yes"  ?>
<rfc  category="info" ipr="trust200902" docName="draft-campbell-dime-overload-issues-01">
  <front>
    <title>Diameter Overload Control Solution Issues</title>
    <author fullname="Ben Campbell" initials="B." surname="Campbell">
      <organization>Tekelec</organization>
      <address>
        <postal>
          <street>17210 Campbell Rd.</street>
          <street>Suite 250</street>
          <city>Dallas</city>
          <region>TX</region>
          <code>75252</code>
          <country>US</country>
        </postal>
        <email>ben@nostrum.com</email>
      </address>
    </author>
    <date/>
    <area>Operations</area>
    <abstract>
      <t>
       The Diameter Maintenance and Extensions (DIME) working group has
       undertaken an "overload control" work item, with the goal of
       standardizing a mechanism to allow Diameter nodes to report overload
       information among themselves. Requirements currently include, among
       others, the need to accurately report the scope of overload conditions,
       and the ability to report overload information between nodes that are
       not directly connected at the transport layer. These requirements
       introduce complex issues. This document describes those issues, in the
       hope that it will assist the working group's decision process.
      </t>
    </abstract>
  </front>
  <middle>
    <section  anchor = "intro" title="Introduction">
      <t>
       When a <xref target = "RFC6733">Diameter</xref> server or agent becomes
       overloaded, it needs to be able to gracefully reduce its load,
       typically by requesting other nodes to reduce the number of Diameter
       requests for some period of time.
      </t>
      <t>
       The Diameter Overload Control
       <xref target="I-D.ietf-dime-overload-reqs">Requirements</xref> describe
       requirements for overload control mechanisms. Requirement 31 states
       that Diameter nodes must be able to report overload with sufficient
       granularity to avoid forcing available capacity to go unused.
       Requirement 34 requires the ability to report overload across Diameter
       nodes that do not support the mechanism. These requirements introduce
       significant and interrelated complexities to potential solutions. This
       document describes the related issues. The author hopes that this
       document will assist the working group's decision process related to
       these requirements.
      </t>
      <t>
       At the time of this writing, there have been two proposals for Diameter
       overload control solutions.
       <xref target="I-D.roach-dime-overload-ctrl">"A Mechanism for Diameter
       Overload Control" (MDOC)</xref> defines a solution that piggybacks
       overload and load state information over existing Diameter messages.
       <xref target="I-D.korhonen-dime-ovl">"The Diameter Overload Control
       Application" (DOCA)</xref> defines a solution that uses a new dedicated
       Diameter application to communicate similar information.
      </t>
      <t>
       <list>
         <t>
          While there are significant differences between the two proposals,
          they carry similar information. In many ways, the issues related to
          Requirements 31 and 34 apply to both proposals. This discussion is
          not specific to one proposal or the other, unless explicitly
          mentioned.
         </t>
       </list>
      </t>
      <t>
       This document serves two purposes. The primary purpose is to explore
       the issues related to Requirement 34, that is, the requirement for the
       overload control mechanism to support sending load and overload
       information across intermediaries that do not support the mechanism
       (referred to herein as "non-adjacent" overload reporting). The document
       describes two use cases for non-adjacent overload reporting. It does
       not, however, attempt to describe the use cases for Diameter agents in
       general. For a more thorough treatment of Diameter agent use cases in
       the context of overload control, please see
       <xref target="I-D.ietf-dime-overload-reqs"/>.
      </t>
      <t>
       The secondary purpose is to help the reader understand the concept of
       overload scopes, and make recommendations about what kinds of overload
       scope should be supported by the mechanism. These purposes are
       interrelated, since an understanding of overload scopes is necessary to
       fully understand some of the issues with non-adjacent overload
       reporting.
      </t>
    </section>
    <section title="Document Conventions">
      <t>
       This document uses terms defined in <xref target="RFC6733" /> and
       <xref target="I-D.ietf-dime-overload-reqs"/>. In particular, the terms
       "client", "server", "upstream", and "downstream" are used as defined in
       RFC 6733. In addition, this document uses the following terms:
      </t>
      <t>
       <list style='hanging' hangIndent='10'>
         <t hangText="Overload:">
          A condition where a Diameter node needs a reduction in the number of
          requests that it must handle.
         </t>
         <t hangText="Overload Report:">
          A request to reduce traffic that contributes to an overload
          condition.
         </t>
         <t hangText="Overload Scope:">
          A classifier that defines the set of requests that may contribute to
          particular overload conditions. Alternatively, the purposes for
          which a node may be overloaded. For example, if a server is
          overloaded for the purposes of one Diameter application but not
          another, the overload condition can be considered "scoped" to that
          application.
         </t>
         <t hangText="Reporting Node:">
          The node that sends an overload report. Also known as an "overloaded
          node".
         </t>
         <t hangText="Reacting Node:">
          A node that consumes and possibly acts on an overload report.
         </t>
         <t hangText="Adjacent Overload Reporting:">
          Overload reports exchanged between adjacent Diameter peers.
         </t>
         <t hangText="Non-Adjacent Overload Reporting:">
          Overload reports sent between Diameter nodes separated by one or
          more intermediate Diameter agents (i.e., relays or proxies).
         </t>
         <t hangText="Piggybacked Overload Reporting:">
          The inclusion of overload reports in existing Diameter messages.
         </t>
         <t hangText="Application-Based Overload Reporting:">
          The sending of overload reports in a separate, dedicated Diameter
          application.
         </t>
       </list>
      </t>
    </section>
    <section title="Non-adjacent Overload Information" anchor="nonadjacent">
      <t>
       Requirement 34 of <xref target="I-D.ietf-dime-overload-reqs"/> says
       that the selected Diameter overload control mechanism "SHOULD" be able
       to communicate overload and load information across intermediaries that
       do not support the mechanism. This requirement complicates the
       solution effort in several ways: how Diameter nodes negotiate support
       for overload control, how they address and route overload reports to
       the right places, and how they act on received overload reports.
      </t>
      <t>
       While the requirement does not explicitly say it, we interpret
       "intermediaries" in this context to mean Diameter agents. The
       requirement is irrelevant for lower layer intermediaries (e.g.
       routers), and cannot be reasonably applied for non-Diameter entities,
       or hybrid entities such as gateways between Diameter and other
       protocols.
      </t>
      <t>
       The requirement to traverse non-supporting intermediaries is not
       necessarily the same thing as a requirement for end-to-end
       communication of overload reports between Diameter clients and servers.
       Non-adjacent reporting can include client-to-server scenarios. It can
       also include server-to-agent scenarios and agent-to-client scenarios.
       All such scenarios may include one or more intervening agents. Since
       Diameter allows transactions to be sent from server to client, all
       scenarios may be reversed. Therefore, we refer to this requirement as
       "Non-adjacent Overload Control".
      </t>
      <section title="Use-Cases for Non-adjacent Overload Control">
        <t>
         There are two primary use-cases for non-adjacent overload control.
        </t>
        <section title="Interconnect" anchor="interconnect">
          <t>
           The first significant non-adjacent use-case is the interconnect
           scenario described in section 2.3 of the
           <xref target="I-D.ietf-dime-overload-reqs">overload control
           requirements</xref>. Two or more Diameter network operators
           communicate with each other across a third-party interconnect
           provider that brokers Diameter traffic between the operators.
           <xref target="fig-interconnect" /> illustrates the interconnect use
           case.
          </t>
          <figure anchor="fig-interconnect" title="Two Operator Interconnect Scenario">
            <artwork>
                +-------------------------------------------+
                |               Interconnect                |
                |                                           |
                |   +--------------+      +--------------+  |
                |   |     Agent    |------|     Agent    |  |
                |   +--------------+      +--------------+  |
                |         .'                      `.        |
                +------.-'--------------------------`.------+
                     .'                               `.
                  .-'                                   `.
    ------------.'-----+                             +----`.------------
          +----------+ |                             | +----------+
          |Edge Agent| |                             | |Edge Agent|
          +----------+ |                             | +----------+
                       |                             |
            Operator 1 |                             |  Operator 2
    -------------------+                             +------------------
		 </artwork>
          </figure>
          <t>
           If the interconnect provider does not support Diameter overload
           control, each operator network becomes an island of overload
           control, similar to those in the
           <xref target="nonsupporting-agents">non-supporting agent
           use-case</xref>. Even if the interconnect provider does support
           overload control, the operators may not trust it to generate and
           act on overload reports on the operators' behalf, and may prefer
           to exchange overload and load information directly with each other.
          </t>
          <t>
           The interconnect use-case may introduce additional security
           concerns. While the non-supporting agent use case typically (but
           not necessarily) occurs inside a single administrative domain, the
           interconnect case will almost always involve sending overload
           reports across multiple administrative domains. Since a malicious
           or incorrect overload report can effectively shut down Diameter
           processing, the current lack of a viable solution for end-to-end
           integrity protection of Diameter messages may be a problem.
          </t>
        </section>
        <section title="Non-Supporting Agents" anchor="nonsupporting-agents">
          <t>
           <xref target="I-D.ietf-dime-overload-reqs"/> requires the solution
           to function in networks where not all Diameter elements support it.
           That is, the solution must allow gradual deployment, and must not
           require a flag-day cutover. If non-adjacent overload control is not
           supported, one or more non-supporting Diameter Agents can divide a
           network into overload control islands, where overload information
           is communicated inside each island, but not among separate islands.
          </t>
          <t>
           <list>
             <t>
              In the author's strictly personal opinion, the non-supporting
              agent use case is less compelling than the interconnect case.
              The non-supporting agent case would typically occur inside one
              administrative domain. The operator of that domain has
              considerably more control over the implementations used in the
              domain than it might have for third-party domains.
             </t>
           </list>
          </t>
        </section>
      </section>
      <section title="Issues with Non-Adjacent Overload Control">
        <section title="Topology Issues" anchor="topology">
          <t>
           Many of the issues with non-adjacent overload control derive from
           the fact that a Diameter node is unlikely to know the topology of
           the Diameter network past its immediate peers. In a trivial
           topology, that is, a Diameter network with only clients and
           servers, this is not a problem. But if the immediate peer is a
           Diameter agent, a node is unlikely to know what next hop the agent
           will select for a given Diameter message. This is particularly
           difficult if the agent hides topology in either direction, or uses
           dynamic peer discovery. While a node may be able to infer the path
           a given message will take in some specific cases (e.g. for
           mid-session messages), it cannot do this in general. And even
           those specific cases may fail if an agent on the message path
           performs topology hiding.
          </t>
          <t>
           This lack of topology knowledge impacts the way that nodes can
           negotiate overload-control support, the ways they send overload
           reports, and the ways a reacting node can act to mitigate overload.
           A non-adjacent overload-control mechanism will need to solve the
           topology issues, either by offering ways to discover non-adjacent
           topologies, or offering ways to constrain overload-control relevant
           parts of such topologies in ways where a node could reasonably know
           them in advance.
          </t>
        </section>
        <section title="Support Negotiation">
          <t>
           Diameter nodes need to negotiate or otherwise indicate their
           support for overload control to other nodes. This includes
           indicating support for overload control in general, as well as
           potentially indicating support of certain parameters of the
           overload control solution. For example, a node may need to indicate
           which overload algorithms it supports. This becomes complex if two
           non-adjacent nodes need to negotiate support.
          </t>
          <t>
           In a Diameter application-based solution, support for the overload
           control application would occur during the capabilities exchange
           between peers. Diameter capabilities exchange occurs strictly
           between peers; Diameter offers no mechanism for indicating support
           of a given Application-ID between non-adjacent nodes.
          </t>
          <t>
           Diameter allows non-negotiated use of an arbitrary Application-Id
           between non-adjacent nodes across Diameter agents that implement
           the Diameter Relay application. In theory, this means that an
           application-based, non-adjacent overload control mechanism could only
           traverse Diameter relays, or Diameter proxies that explicitly
           support the overload-control Application-Id. In the latter case, we
           assume that a proxy will not indicate support for the
           overload-control Application-Id unless it supports the
           overload-control mechanism; such a proxy cannot be considered a
           non-supporting agent.
          </t>
          <t>
           In practice, a Diameter agent can act as a proxy for some purposes
           and a relay for others. If a Diameter proxy indicates support for
           the Diameter relay application, we assume that it will relay any
           arbitrary application. This means it can be considered a relay for
           the purposes of overload control.
          </t>
          <t>
           For both application-based and piggybacked solutions, a supporting
           node needs to know the other nodes with which it should negotiate. For
           overload-control between Diameter peers, this is easy; a node
           exchanges support information with its immediate peers. But for
           non-adjacent overload control, this is more difficult for reasons
           discussed in <xref target="topology"/>.
          </t>
          <t>
           Therefore, for non-adjacent overload control negotiation, each
           supporting node either needs advance knowledge of all nodes with
           which it may negotiate overload-control support, or it needs a
           mechanism for discovering that knowledge dynamically.
          </t>
        </section>
        <section title="Overload Report Delivery">
          <t>
           With adjacent overload control reporting, overload report
           addressing and delivery is relatively simple. A node sends overload
           reports directly to its peers. This becomes more complex for
           non-adjacent overload-control.
          </t>
          <t>
           For application-based overload control, nodes could address
           overload reports to specific endpoint nodes using the
           Destination-Host AVP. Doing so would be subject to the same
           non-adjacent topology issues described in
           <xref target="topology"/>. That is, a node can only send overload
           reports to non-adjacent clients or servers that it knows about,
           either from prior knowledge (i.e., provisioning) or because it has
           previously observed Diameter messages from them.
          </t>
          <t>
           An application-based mechanism could possibly address reports to
           non-adjacent Diameter agents using the Destination-Host AVP. This
           would effectively make the agent into an endpoint for the
           overload-control application.
          </t>
          <t>
           A piggy-backed mechanism will have more difficulty addressing
           non-adjacent overload reports. A piggy-backed mechanism sends
           overload reports in already existing Diameter requests; that is,
           requests that have their own purposes and destinations independent
           of the overload-report. Thus, nodes can only select the destination
           of an overload report by bundling it into a Diameter message that
           was already going to that destination. While a piggy-backed
           mechanism might be able to send overload-reports across quiescent
           transport connections using watchdog (DWR/DWA) messages, these
           messages cannot be exchanged between non-adjacent nodes.
          </t>
          <t>
           <list>
             <t>
              In some cases, the limit of sending overload reports to
              destinations to which existing traffic is bound may be
              acceptable. If a node is contributing to an overload condition,
              then it's reasonable to assume that node is regularly exchanging
              traffic with the overloaded node. However, there may be cases
              where an overload report causes a connection to become quiescent.
              If the reporting node needed to tell a reacting node that the
              condition has resolved or improved, it would need to send a new
              report across the now quiescent connection. There may also be
              cases where a reacting node redirects traffic along a different
              path, causing a previously quiescent node to suddenly start
              sending requests to the overloaded node. Thus, without careful
              selection of the overload report scope, an overloaded node may
              find itself engaged in a game of
              <xref target="Whac-a-Mole">Whack-a-Mole</xref> with previously
              quiescent non-adjacent nodes.
             </t>
           </list>
          </t>
          <t>
           For both piggy-backed and application-based solutions, non-adjacent
           overload control introduces a need to identify the sender of a
           report, or at least determine whether the report is from an
           adjacent or non-adjacent node. This is not required for purely
           adjacent solutions, since the sender could always be assumed to be
           the peer.
          </t>
          <t>
           For example, a non-adjacent report with a "Connection" scope does
           not make sense. If a node receives one, it should ignore it. But in
           order to make that decision, it must be able to distinguish a
           non-adjacent report from an adjacent one. For example, in an
           application-based mechanism,
          </t>
        </section>
        <section title="Non-Adjacent Overload Scopes">
          <t>
           A reacting node will typically attempt to mitigate an overload
           condition by either reducing the number of requests that contribute
           to the condition, or by rerouting part of that traffic to avoid the
           problem. In both cases, the reacting node is limited by its
           ability to determine which Diameter requests contribute to the
           overload condition in the first place. The
           <xref target="scopes">overload scope concept</xref> offers a way
           for overloaded nodes to indicate what traffic is likely to
           contribute to an overload condition and should be abated.
          </t>
          <t>
           Not all of the scope-types described in <xref target="scopes"/>
           make sense for non-adjacent overload control. The "Connection"
           scope-type is an obvious example, since the reacting node will
           never share a transport connection with a non-adjacent node; this
           is the very definition of non-adjacent nodes.
          </t>
          <t>
           Since a Diameter node cannot control how requests are forwarded to
           non-adjacent nodes, the "Peer" scope-type also does not work well,
           especially when there are multiple possible destinations up or
           downstream from the adjacent peer. For example in
           <xref target="non-adjacent-routing"/>, Node A sends Diameter
           requests to Nodes B and C across a non-supporting agent. If Node B
           becomes overloaded but Node C does not, Node A cannot reroute
           requests to Node C, since it has very little way to influence where
           the agent will forward any given request. If Node A tries to reduce
           traffic by 50%, the agent will likely still send half of the
           remaining traffic to Node B. If B and C are endpoints, Node A may
           in some cases be able to use the Destination-Host AVP for this
           purpose (in which case the "Destination-Host" scope-type would be
           more appropriate), but this does not help if B and C are also
           agents rather than servers.
          </t>
          <figure anchor="non-adjacent-routing" title="Non-Adjacent Routing">
            <artwork>
                      +--------+       +--------+
                      | Node B |       | Node C |
                      +----+---+       +---+----+
                           |               |
                           +-------+-------+
                                   |
                           +-------+--------+
                           | Non-Supporting | 
                           |  Agent         |
                           +-------+--------+
                                   |
                                   |
                              +----+----+
                              | Node  A |
                              +---------+
		</artwork>
          </figure>
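          <t>
           The Node A example can be checked with a few lines of arithmetic.
           The following is a minimal sketch in Python and is not part of
           either proposal. The function name, the request rates, and the
           assumption that the agent splits traffic evenly between Nodes B
           and C are illustrative only.
          </t>
          <figure>
            <artwork><![CDATA[
```python
def split_after_throttle(offered, reduction, share_to_b=0.5):
    """Return (rate_to_b, rate_to_c) after Node A throttles.

    offered    -- requests/second Node A would otherwise send
    reduction  -- fraction of requests Node A drops (0.0 to 1.0)
    share_to_b -- fraction the agent forwards to Node B; Node A
                  cannot observe or change this (assumed fixed)
    """
    remaining = offered * (1.0 - reduction)
    rate_to_b = remaining * share_to_b
    rate_to_c = remaining - rate_to_b
    return rate_to_b, rate_to_c


# Hypothetical numbers: 1000 requests/second, even split.
print(split_after_throttle(1000.0, 0.0))  # (500.0, 500.0)
print(split_after_throttle(1000.0, 0.5))  # (250.0, 250.0)
```
]]></artwork>
          </figure>
          <t>
           Under these assumptions, Node C loses half of its traffic even
           though it was never overloaded, while Node B still receives half
           of whatever Node A continues to send.
          </t>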
          <t>
           Scope-types that classify traffic by origin or final destinations,
           such as "Origin-Host", "Destination-Realm", "Application-ID", and
           "Destination-Host" can be used for non-adjacent overload control.
           In general, scope-types that may denote non-adjacent intermediary
           devices, such as "Peer", cannot, nor can scope-types that refer
           only to peers, e.g. "Connection".
          </t>
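          <t>
           One way a reacting node might apply this rule is sketched below
           in Python. This is an illustration under stated assumptions, not
           behavior defined by MDOC or DOCA: the function name, the grouping
           of scope-types into two sets, and the report_is_adjacent flag are
           hypothetical, and how a node determines adjacency is left open.
          </t>
          <figure>
            <artwork><![CDATA[
```python
# Scope-type names as used in this document; the grouping into
# two sets is an assumption made for illustration.
END_TO_END_SCOPES = frozenset({
    "Origin-Host", "Destination-Host",
    "Destination-Realm", "Application-ID",
})
ADJACENT_ONLY_SCOPES = frozenset({"Peer", "Connection"})


def accept_scope(scope_type, report_is_adjacent):
    """Decide whether a reacting node should honor a scope-type.

    report_is_adjacent -- True if the report came from a direct
    peer; how a node learns this is left open here.
    """
    if scope_type in END_TO_END_SCOPES:
        return True
    if scope_type in ADJACENT_ONLY_SCOPES:
        return report_is_adjacent
    return False  # unknown scope-types are ignored


print(accept_scope("Connection", False))         # False
print(accept_scope("Destination-Realm", False))  # True
```
]]></artwork>
          </figure>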
          <t>
           Even for destination-oriented scope-types, the sender of an
           overload report must be authoritative for the indicated scope. That
           is, it must have full knowledge of the congestion state for the
           scope. For example, if Nodes B and C both serve the realm
           "example.com", and B becomes 50% overloaded while C does not, B
           cannot simply report 50% overload at realm scope. If it did, Node A
           would reduce its generated traffic by 50%. Assuming B and C
           normally share the realm's traffic equally, the overall realm is
           really only overloaded by 25%, so this would leave the realm
           operating beneath available capacity.
          </t>
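          <t>
           The realm-wide figure in this example can be derived as a
           traffic-weighted average of the per-node reductions. The
           following Python sketch is illustrative only; the function name
           and the request rates are assumptions, and it treats "50%
           overloaded" as "needs a 50% reduction in requests".
          </t>
          <figure>
            <artwork><![CDATA[
```python
def realm_reduction(nodes):
    """Traffic-weighted reduction needed across a realm.

    nodes -- list of (offered_rate, needed_reduction) pairs, one
             per server in the realm; needed_reduction is the
             fraction of requests that server wants shed
    """
    total = sum(rate for rate, _ in nodes)
    if total == 0:
        return 0.0
    shed = sum(rate * reduction for rate, reduction in nodes)
    return shed / total


# Node B needs a 50% reduction, Node C needs none; both are
# assumed to carry 500 requests/second.
print(realm_reduction([(500.0, 0.5), (500.0, 0.0)]))  # 0.25
```
]]></artwork>
          </figure>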
          <t>
           <list>
             <t>
              The need to be authoritative for an indicated scope is also true
              for strictly adjacent reporting mechanisms. But in an adjacent
              mechanism, it is easier for an intervening agent to learn the
              overload state of upstream nodes. In the example, if the agent
              supported the overload control mechanism, it would most likely
              receive reports from Nodes B and C, and could then construct
              downstream reports that incorporate the state of B, C, and its
              own local state. This contrasts with the non-adjacent case where
              B must understand the current state of C even though it is not
              in the path of overload reports from C.
             </t>
           </list>
          </t>
          <t>
           Therefore, a given node must only report overload for scopes for
           which it has full knowledge of the load and overload state. That
           is, it must be a "scope authority" for any scope it reports. In the
           example, nodes B and C (and any other nodes serving "example.com")
           would be required to share current load and overload state. The
           state-sharing requirement could be substantial for high-capacity
           nodes.
          </t>
          <t>
           When a node reports overload for a certain scope, reacting nodes
           will treat the overload condition as uniform across the entire
           scope. For example, if a node reports overload for an entire realm,
           reacting nodes will reduce traffic equally for all servers that
           serve that realm. If the servers are unequally overloaded, they
           must use a more granular scope-type, for example,
           "Destination-Host".
          </t>
        </section>
      </section>
      <section title="Non-adjacent Overload Control Recommendations">
        <t>
         An adjacent reporting mechanism allows for very flexible and fine
         grained overload control. It solves or simplifies a number of issues,
         such as negotiation of support and parameters, requirements for
         topology knowledge, end-to-end security, etc., by avoiding them in the
         first place. Adding non-adjacent support to such a mechanism would
         complicate it considerably.
        </t>
        <t>
         Non-adjacent overload control mechanisms are better for connecting
         islands of overload control. Such mechanisms work well for larger
         scopes and relatively static topologies.
        </t>
        <t>
         The author believes that we are unlikely to find a single solution
         that works well for both adjacent and non-adjacent overload control.
         While a single solution is more desirable in general, a single
         solution that works well for both cases is likely to be extremely
         complicated. Therefore, the working group should consider a separate
         mechanism for the non-adjacent delivery of overload reports.
        </t>
        <t>
         If the group chooses to accept two separate solutions, we should be
         able to specify a single data model and set of AVPs that work for
         both, with some restrictions. (For example, the non-adjacent solution
         would likely forbid the use of the "Connection" scope-type.)
        </t>
        <t>
         If the working group chooses to add non-adjacent features to MDOC or
         DOCA, we will need to change the support negotiation mechanisms to
         allow for the non-adjacent case, specify how a node can determine
         whether a report is adjacent or non-adjacent, and state what subset
         of scope-types are allowed in non-adjacent reports. We will also
         need to study how we can meet the
         <xref target="I-D.ietf-dime-overload-reqs">security-related
         requirements</xref> given the current lack of end-to-end security
         features in Diameter.
        </t>
      </section>
    </section>
    <section title="Overload Scopes" anchor="scopes">
      <t>
       Diameter overload does not necessarily affect all kinds of Diameter
       traffic. A node may become overloaded for some requests but not others.
       For example, a Diameter agent may handle requests for more than one
       Diameter Application, and may route requests to a different set of
       servers for each application. If one server set becomes overloaded, but
       the other does not, then the agent itself is effectively overloaded for
       one application, but can process the other at normal capacity.
      </t>
      <t>
       The Diameter overload
       <xref target="I-D.ietf-dime-overload-reqs">requirements</xref> list
       several scenarios that illustrate overload that affects some requests
       but not others. We refer to the set of requests affected by a
       particular overload event as the "scope" of the overload event. The
       overload requirements require the mechanism to be able to send
       overload reports that are "scoped" to (that is, they affect requests
       targeted to) a particular Diameter node, a Realm, or a Diameter
       Application.
      </t>
      <t>
       <list>
         <t>
          The concept of scope may also be useful when applied to reported
          load even without an overload condition. This usage is out of
          "scope" for this document.
         </t>
       </list>
      </t>
      <t>
       A scope indication in an overload report is a set of classifiers that
       identify requests likely to contribute to the overload condition. In
       general, this could include any aspect of a Diameter message that a
       reacting node can observe. For example, requests could be classified by
       Attribute Value Pair (AVP) values or next-hop routing decisions.
      </t>
      <t>
       The ability to express the scope of an overload condition is only
       useful when reacting nodes can act on the information. There are only a
       small number of actions a reacting node may take to mitigate overload.
       Essentially these actions boil down to reducing the number of requests
       that "match" the scope, either by sending fewer requests in the first
       place, or by routing around the problem. The former is limited by the
       node's ability to distinguish between requests that match the overload
       scope and requests that do not. The latter is limited by the node's
       ability to predict or influence how a request will be routed.
      </t>
      <t>
       <list>
         <t>
          Reacting nodes will most likely take additional application-specific
          actions to mitigate overload conditions. If a client reduces the
          number of messages it sends, it almost certainly has to take
          additional application-specific steps that affect its own client
          application. Depending on the application, it might refuse some
          client application requests, redirect some of its own clients to
          different services (e.g. offloading mobile data sessions to local
          WiFi networks), or assert an overload condition in the client
          application protocol (e.g., the Session Initiation Protocol (SIP)).
         </t>
       </list>
      </t>
      <t>
       This section discusses the meanings of the required scope-types, and
       analyses their implications for the selected mechanism.
      </t>
      <section title="Explicit vs Implicit Indication of Scopes" anchor="explicit">
        <t>
         Both MDOC and DOCA use explicit scope indication. That is, the scope
         of an overload report is not, in general, implied by the type of
         message that carries the report. For example, if an overload report
         is scoped to a particular Diameter Application-Id, the report
         explicitly indicates the affected Application-Id, rather than leaving
         the reacting node to infer the Application-Id based on that of the
         message that carries the report. There are a few exceptions to this;
         for example MDOC supports a "Connection" scope that, when specified,
         pertains to requests to be sent over the same transport connection
         over which the overload report arrived.
        </t>
        <t>
         <list>
           <t>
            List discussions have shown a common assumption that overload
            reports sent over a piggy-backed solution such as MDOC would only
            affect requests associated with the same Diameter Application-Id.
            For MDOC, this is a false assumption. MDOC's explicit use of
            scopes allows overload reports sent over one application to affect
            requests for any arbitrary application. On the other hand,
            solutions that use a dedicated Application-Id (such as DOCA)
            necessarily require the ability to report overload for arbitrary
            applications; otherwise it would only be possible for an overload
            control application to report overload on itself.
           </t>
         </list>
        </t>
        <t>
         Some list participants have suggested that the solution include a
         concept of a default scope, that is, a scope that is implied if no
         other scope is explicitly indicated. The concept of default or
         implicit scopes requires further study by the working group.
        </t>
      </section>
      <section title="Types of Overload Scopes">
        <t>
         There are several different kinds, or types, of overload scopes. The
         type of a scope defines how the reacting node interprets it.
         <xref target="scope_table" /> gives a summary of the scope types
         discussed in this document. The "Scope Type" column gives the name of
         the scope. The "Affected Traffic" column describes what Diameter
         requests are impacted by the scope-type. The "Reacting-Node" column
         describes which Diameter nodes may be able to take action on an
         overload report with the respective scope-type. Finally, the "Draft"
         column describes which proposed solution includes the respective
         scope-type.
        </t>
        <texttable title="Summary of Overload Scope Types" anchor="scope_table">
          <ttcol>Scope Type</ttcol>
          <ttcol>Affected Traffic</ttcol>
          <ttcol>Reacting-Node</ttcol>
          <ttcol>Draft</ttcol>
          <c>Connection</c>
          <c>Requests sent directly to the reporting-node on a particular transport connection</c>
          <c>Adjacent Peer</c>
          <c>MDOC, DOCA</c>
          <c>Peer</c>
          <c>Requests routed directly to the reporting-node</c>
          <c>Adjacent Peer</c>
          <c>MDOC, DOCA</c>
          <c>Destination-Host</c>
          <c>Requests with a matching Destination-Host AVP</c>
          <c>Any</c>
          <c>MDOC</c>
          <c>Origin-Host</c>
          <c>Requests including a matching Origin-Host AVP</c>
          <c>Any</c>
          <c>DOCA?</c>
          <c>Diameter Application</c>
          <c>Requests with a matching Application-Id AVP</c>
          <c>Any</c>
          <c>MDOC, DOCA</c>
          <c>Destination-Realm</c>
          <c>Requests with a matching Destination-Realm AVP</c>
          <c>Any</c>
          <c>MDOC, DOCA</c>
          <c>Session</c>
          <c>Requests with a matching Session-Id AVP</c>
          <c>Any</c>
          <c>MDOC</c>
          <c>Session-Group</c>
          <c>Requests belonging to sessions assigned matching labels</c>
          <c>Any</c>
          <c>MDOC</c>
        </texttable>
        <section title="Connection Scope-Type">
          <t>
           The "Connection" scope-type indicates that the reacting node should
           reduce traffic sent on the transport connection on which it
           received the overload report. A Connection scope indication does not
           include an explicit value; rather it implies "this connection".
          </t>
        </section>
        <section title="Peer Scope-Type">
          <t>
           The "Peer" scope-type indicates that a particular Diameter node is
           overloaded. Other nodes should mitigate the overload by reducing
           the number of requests that will land on the overloaded node,
           either by sending fewer requests, or by attempting to route
           requests around the overloaded node.
          </t>
          <t>
           <list>
             <t>
              In both MDOC and DOCA, the "Peer" scope-type is named "Host". In
              practice, only immediate peers can act as the reacting node for
              a Host-scoped overload report, because non-adjacent nodes have
              limited ability to influence routing decisions beyond the
              immediate next hop. This document uses the term "Peer" to
              illustrate that fact.
             </t>
           </list>
          </t>
          <t>
           Large-scale Diameter nodes are often implemented as clusters of IP
           hosts, which may or may not share their knowledge about upstream
           overload conditions. Certain IP hosts in a cluster could become
           overloaded when others do not. Furthermore, if the reacting-node is
           also clustered, it may be difficult for the cluster members to
           share real-time knowledge of the reporting-node's overload state.
           This can make it difficult for a node to know conclusively whether
           any two connections that appear to connect to the same peer can be
           treated as such for the purposes of overload control. The working
           group should study whether the Peer scope-type should be deprecated
           in favor of the "Connection" scope-type.
          </t>
        </section>
        <section title="Destination-Host Scope-Type">
          <t>
           The "Destination-Host" scope type pertains to requests that contain
           a Destination-Host AVP that matches the indicated Destination-Host
           value. Destination-Host always refers to the endpoint for a given
           Diameter request.
          </t>
          <t>
           The best the reacting node can do is reduce the number of requests
           that contain a Destination-Host AVP matching the overloaded node.
           Rerouting will not help in general, since the requests will simply
           take different routes to arrive at the same overloaded server.
           Unless the destination node is also a direct peer, the reacting node
           cannot do much about requests that don't contain a Destination-Host
           AVP in the first place, since it cannot predict whether these
           requests will land on the overloaded endpoint. The Destination-Host
           scope type is useful for requests bound to a particular server, for
           example, mid-session requests for a session-stateful application.
          </t>
        </section>
        <section title="Origin-Host Scope-Type">
          <t>
           While most scope-types refer to where a request is likely to go,
           the "Origin-Host" scope-type refers to where the request
           originates. That is, any request with a matching Origin-Host AVP
           would match. The Origin-Host scope type is useful for situations
           where a specific client or set of clients sends an excessive number
           of requests. An overload report with an Origin-Host scope would
           tell matching clients to reduce traffic, or agents to throttle
           requests that came from matching clients.
          </t>
          <t>
           <list>
             <t>
              Note that the Origin-Host scope-type is not explicitly mentioned
              in the requirements document. The author includes it here
              because others have mentioned the need in conversation.
             </t>
           </list>
          </t>
        </section>
        <section title="Diameter-Application Scope-Type">
          <t>
           The "Diameter Application" scope-type indicates overload for a
           particular Diameter application. That is, it impacts all requests
           with the matching value in an Application-Id AVP.
          </t>
          <t>
           The Diameter Application scope-type is useful for declaring an
           overload condition that affects a specific Diameter service,
           typically, but not necessarily, in a specific realm.
          </t>
          <t>
           Since the Diameter Application scope-type indicates overload for an
           entire application, reacting nodes should reduce the number of
           requests sent for that application. As with the Realm
           scope-type, it will rarely if ever make sense for a Diameter node
           to reroute traffic to a different Diameter application.
          </t>
        </section>
        <section title="Destination-Realm Scope-Type">
          <t>
           The "Destination-Realm" scope-type indicates overload for all
           servers that handle requests for the particular Diameter realm.
           That is, it impacts all requests with the particular realm in the
           Destination-Realm AVP.
          </t>
          <t>
           The Realm scope-type is useful for declaring a global overload
           condition within a network serving a single realm. It is also
           useful for requesting third-parties to reduce Diameter traffic sent
           to a particular realm, for example, in roaming scenarios.
          </t>
          <t>
           Since the Realm scope-type indicates overload for an entire realm,
           reacting nodes should reduce the number of messages sent for the
           realm. Rerouting traffic does not make sense for the Realm scope
           type, since it would probably never be useful for Diameter nodes to
           reroute traffic destined for an overloaded realm to a different,
           non-overloaded realm. Client applications might, however, be able
           to choose to use services from a different operator if the Diameter
           realm of one operator reports an overload condition.
          </t>
          <t>
           MDOC currently makes the Realm scope-type mandatory to implement.
           List participants have indicated that there may be use cases where
           all Diameter traffic on a network uses the same Realm, and that the
           use of the Realm scope-type would be redundant in such networks.
           Whether the Realm scope-type should remain mandatory or become
           optional to implement requires further study.
          </t>
        </section>
        <section title="Session Scope-Type">
          <t>
           MDOC currently includes a "Session" scope-type. This scope-type
           refers to messages that include a matching Session-Id.
           Conceptually, this applies to all requests that are part of a
           previously established session. This scope-type could potentially
           be useful for a session-stateful agent that assigns
           session-establishing requests to a certain server, and then sends
           all future requests in that session to the same server. If that
           server became overloaded, the agent could send an overload report
           scoped to the assigned session.
          </t>
          <t>
           However, the Session scope-type will become unwieldy for anything
           other than very small-scale installations. The number of sessions
           assigned to any specific server is likely to be quite large.
           Therefore, the number of Session scope values would probably become
           quite large. The working group should consider deprecating the
           Session scope-type. For agents that do not hide topology, the
           Destination-Host scope-type can be used to affect all sessions
           assigned to a particular server. For topology-hiding agents, the
           session-group mechanism can do the same.
          </t>
        </section>
        <section title="Session-Group Scope-Type">
          <t>
           Diameter agents that implement certain topology-hiding schemes may
           modify Origin-Host AVPs inserted by servers, and use some local
           mechanism to bind sessions to specific servers. The
           "Destination-Host" type may not function correctly in this case.
           MDOC specifies a "session-group" scope-type, where an agent or
           server can assign a common identifier to sessions that are
           fate-shared in some way, such as being bound to the same server. If
           that server becomes overloaded, the agent can send an overload
           report that matches requests in all sessions with the matching
           identifier.
          </t>
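          <t>
           As a rough illustration of the bookkeeping a reacting node might
           need, the following Python sketch indexes sessions by their
           session-group label. The class name, the method names, the label
           strings, and the assumption that the reacting node learns a
           session's label when the session is established are all
           hypothetical; none of them is taken from MDOC.
          </t>
          <figure>
            <artwork><![CDATA[
```python
from collections import defaultdict


class SessionGroupIndex:
    """Reacting-node view of session-group labels (hypothetical)."""

    def __init__(self):
        # label -> set of Session-Id values carrying that label
        self._by_label = defaultdict(set)

    def bind(self, session_id, label):
        """Record the label assigned to a session at setup time."""
        self._by_label[label].add(session_id)

    def release(self, session_id, label):
        """Forget a session once it terminates."""
        self._by_label[label].discard(session_id)

    def affected(self, label):
        """Sessions matched by a report scoped to this label."""
        return set(self._by_label.get(label, ()))


index = SessionGroupIndex()
index.bind("client.example.com;1;1", "group-a")
index.bind("client.example.com;1;2", "group-a")
index.bind("client.example.com;1;3", "group-b")

# One report scoped to "group-a" covers both of its sessions
# without listing each Session-Id individually.
hit = index.affected("group-a")
```
]]></artwork>
          </figure>
          <t>
           The sketch suggests where the implementation cost lies: every
           reacting node has to maintain per-session state for the lifetime of
           each session, in exchange for overload reports that stay small no
           matter how many sessions are bound to the overloaded server.
          </t>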
          <t>
           This scope-type may be useful under certain circumstances, but may
           also be complex to implement. Since the mechanism is required to
           allow extensible scope-types, session-groups could still be added
           in the future. The working group should study whether the
           Session-Group scope-type should be included in the base overload
           control solution, or removed, with the potential to add it back
           later as an extension scope-type.
          </t>
        </section>
      </section>
      <section title="Scope Values">
        <t>
         Scope labels in an overload report will typically take the form of a
         scope-type and a value. For example, if the "example.com" realm is
         overloaded for all services, the overload report would indicate a
         scope-type of "Realm" and a scope-value of "example.com".
        </t>
        <t>
         The Connection scope-type is an exception. Since an overload report
         with a Connection scope is only actionable by one of the peers
         connected via the specified connection, it makes sense to treat the
         Connection scope-type as always having a value of "this connection".
        </t>
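        <t>
         The type-and-value form, together with the Connection exception,
         could be modeled roughly as in the following Python sketch. The
         Scope class, the resolve function, and the "conn-7" connection
         identifier are hypothetical names chosen for illustration; neither
         MDOC nor DOCA defines them.
        </t>
        <figure>
          <artwork><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    scope_type: str
    value: Optional[str] = None


def resolve(scope, receiving_connection):
    """Fill in the implied value of a Connection scope.

    receiving_connection is a hypothetical local identifier for the
    transport connection on which the report arrived.
    """
    if scope.scope_type == "Connection":
        if scope.value is not None:
            raise ValueError("Connection scope carries no value")
        return Scope("Connection", receiving_connection)
    if scope.value is None:
        raise ValueError(scope.scope_type + " scope needs a value")
    return scope


realm = resolve(Scope("Realm", "example.com"), "conn-7")
conn = resolve(Scope("Connection"), "conn-7")
```
]]></artwork>
        </figure>
        <t>
         In this sketch the receiver, not the sender, supplies the Connection
         value, which reflects the fact that only a peer on that connection
         can act on the report.
        </t>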
      </section>
      <section title="Combining Scopes">
        <t>
         Diameter nodes will commonly need to construct overload reports that
         apply to a combination of scopes. For example, if a given realm is
         overloaded for a subset of the applications it supports, it might
         indicate both a realm scope and one or more Diameter application
         scopes.
        </t>
        <t>
         Logically, combining multiple scopes of different types reduces the
         overall set of requests to which the overload report would apply.
         Combining multiple scopes of the same type increases the applicable
         set. A function that determines the requests affected by an overload
         report could model this as a logical "and" or "intersection" operator
         for combining scopes of different types, and a logical "or" or
         "union" operator for combining scopes of the same type.
        </t>
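        <t>
         A minimal Python sketch of such a function follows. It assumes that
         a report's scopes arrive as (scope-type, value) pairs and that a
         request is summarized as a mapping from scope-type name to the value
         observed in that request. The function names, the type names used as
         keys, and the "app-1" style values are hypothetical.
        </t>
        <figure>
          <artwork><![CDATA[
```python
from collections import defaultdict


def group_by_type(scopes):
    """Group (type, value) pairs by scope-type."""
    grouped = defaultdict(set)
    for scope_type, value in scopes:
        grouped[scope_type].add(value)
    return grouped


def report_applies(scopes, request):
    """Return True if the request falls within the report's scopes.

    A request that lacks an attribute cannot match a scope of that
    type. An empty scope list matches every request here; that is
    an artifact of the sketch, not a proposed default scope.
    """
    grouped = group_by_type(scopes)
    # "and" across different types, "or" within a single type
    return all(
        request.get(scope_type) in values
        for scope_type, values in grouped.items()
    )


report = [
    ("Realm", "example.com"),
    ("Application", "app-1"),
    ("Application", "app-2"),
]
matching = {"Realm": "example.com", "Application": "app-1"}
other_app = {"Realm": "example.com", "Application": "app-3"}

applies = report_applies(report, matching)
spared = not report_applies(report, other_app)
```
]]></artwork>
        </figure>
        <t>
         Here the report covers requests for either listed application, but
         only when they are also destined for the "example.com" realm;
         requests for any other application in that realm are unaffected.
        </t>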
        <t>
         The working group should study whether all possible combinations
         should be allowed. For example, it may or may not make sense to
         combine a "Connection" scope with other scopes, or to allow more than
         one "Connection" scope-value for a single overload report.
        </t>
      </section>
      <section title="Scope Extensibility">
        <t>
         <xref target="I-D.ietf-dime-overload-reqs"/> requires scope-types to
         be extensible. This requirement implies that the chosen mechanism or
         mechanisms must discuss how new scope-types can be added, how support
         for specific scope-types should be declared or negotiated, and which
         scope-types might be mandatory to support.
        </t>
      </section>
      <section title="Scope Recommendations">
        <t>
         In the author's opinion, the selected solution or solutions should
         support, at a minimum, the "Connection", "Destination-Host", "Realm"
         and "Application-ID" scope-types. The working group should consider
         also adding the "Origin-Host" scope-type.
        </t>
        <t>
         The working group should consider whether the advantages of the
         "session-group" concept and scope-type are worth the complexity. The
         group should also study whether the Peer scope-type adds sufficient
         utility over the Connection scope-type to warrant its inclusion.
        </t>
      </section>
    </section>
    <section anchor="iana-considerations" title="IANA Considerations">
      <t>
       This draft makes no requests of IANA.
      </t>
    </section>
    <section title="Security Considerations">
      <t>
       Overload reports induce Diameter nodes to reduce or reroute traffic.
       For large scopes, a single erroneous or malicious overload report could
       effectively shut down Diameter processing for an entire realm. A
       Diameter overload control solution needs mechanisms to ensure that
       overload reports are only accepted from trusted sources, and that
       nothing tampers with the reports en route.
      </t>
      <t>
       For adjacent approaches, the transport connection can be protected with
       TLS or IPsec. But this will not help for non-adjacent reporting, since
       no such transport connection exists.
      </t>
      <t>
       Although the DIME working group has work in progress in this area,
       Diameter currently has no viable mechanism for end-to-end
       authentication and integrity protection. The working group should
       consider either making
       non-adjacent overload control contingent on a generic Diameter
       end-to-end protection mechanism, or adding a specialized protection
       mechanism to any resulting non-adjacent overload control solution.
      </t>
    </section>
  </middle>
  <back>
    <references title="Normative References">
    &rfc6733;
    &draft-ietf-dime-overload-reqs;
    </references>
    <references title="Informative References">
    &draft-roach-dime-overload-ctrl;
    &draft-korhonen-dime-ovl;
      <reference anchor="Whac-a-Mole" target="http://en.wikipedia.org/wiki/Whack-a-mole#Colloquial_usage">
        <front>
          <title>Whack-a-Mole Colloquial Usage</title>
          <author/>
          <date/>
        </front>
      </reference>
    </references>
    <section title="Contributors">
      <t>
       Eric McMurry and Robert Sparks made significant contributions to the
       concepts in this draft.
      </t>
    </section>
  </back>
</rfc>
