Rate detection, or measuring the intensity of an incoming event stream in the decay model


Heavy hitter detection algorithms help solve problems such as network congestion mitigation, identification of network anomalies and attacks, and dynamic routing management. For example, the well-known web server NGINX lets you limit the rate of requests to a particular resource, and to do this, the rate must be measured quantitatively.

Heavy hitter detection problem

Problem classification

Heavy hitter detection problems are divided into several subclasses:

  • Threshold-t. The flows whose rate exceeds a given fraction t of the total incoming traffic rate are to be recognized.
  • Top-k. For a given number k, the k highest-rate flows are to be selected.
  • The flows whose rate exceeds a given absolute value are to be recognized.

Target architecture

Depending on the platform on which such a streaming algorithm is supposed to run, a number of hardware limitations arise. Sometimes the necessary functionality must be integrated into network equipment (a switch, a router, etc.). The proposed algorithm is intended to run on a general-purpose processor, but for some parameter values it could be embedded in network equipment.

Problem parameters

  • The absolute value of the flow rate, which is considered dangerous. The problem is to identify such flows.
  • The size of the key which identifies the event type. For instance, if packets with the same source IP address are declared to belong to the same flow, then the key is defined as the 4-byte value src.ip. In this implementation, as in many other algorithms, key values are stored in a table, so the key size affects memory costs.
  • A method for calculating the rate of a single flow from the timestamps of events of the same type. In essence, this is an algorithmic definition of what the rate of a flow is. In this case, an exponential moving average is calculated, which has a single parameter: the typical time τ during which the weight of an event is taken into account after its arrival.

Solution accuracy

The quality of the algorithm can be evaluated by the relative or absolute error in estimating the flow rates. As a measure of accuracy, the (ε, δ)-approximation may be used: if the error is at most ε with probability 1 − δ, then the algorithm is said to have accuracy (ε, δ).


As a rule, the main property of the algorithm is the maximum rate of incoming events that it is still capable of processing; that is, only the time of processing one event matters. In practice, however, other hardware limitations also come into play, such as the size of the memory used to store the accumulated information about the flows. Hardware limitations are especially severe when the functionality is integrated into network equipment. But even on ordinary computers, a large amount of memory does not always help, since access to DRAM can take an unacceptably long time. To achieve the required performance, you have to trade off measurement accuracy and limit the size of the auxiliary data so that it fits into the processor cache. In this implementation, with meaningful parameter values, the table fits into the L2 cache of the CPU. As a result, processing one event takes only a few dozen CPU clock cycles.

Methods for estimating the rate of the event flow

To determine the rate of a given event type, the number of events over some time interval must be counted. To do this, you need to record the time of each event and somehow save it. Typically, a counter is used for this purpose, which is incremented when the corresponding event occurs. The rate is then estimated as the ratio of the counter value to the duration of the measurement interval. More accurate methods use variations of the moving average; for example, the exponential moving average (EMA) is a weighted moving average with exponentially decreasing weights.
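To make the two estimators concrete, here is a small sketch in Python (function names and parameter values are ours, not the article's): a plain windowed counter next to a direct EMA over event timestamps.

```python
import math

def rate_by_counter(timestamps, window_end, window):
    """Plain counter: number of events in the window / window length."""
    count = sum(1 for t in timestamps if window_end - window < t <= window_end)
    return count / window

def rate_by_ema(timestamps, now, tau):
    """EMA view: each event contributes weight exp(-(now - t)/tau).
    For a uniform flow of rate r the sum approaches r * tau."""
    v = sum(math.exp(-(now - t) / tau) for t in timestamps if t <= now)
    return v / tau

# Uniform flow: one event per second for 100 s, true rate = 1.0
ts = [float(k) for k in range(1, 101)]
print(rate_by_counter(ts, 100.0, 10.0))          # 1.0
print(round(rate_by_ema(ts, 100.0, 10.0), 3))    # 1.051
```

Both estimators agree for a steady flow; the EMA additionally reacts smoothly when the rate changes, with τ controlling how much history is remembered.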

Methods of multiple flows accounting

In the heavy hitter detection problem, the difficulty lies not only in the high rate of the input traffic but also in the large number of different flows (event types) that have to be monitored. A naive implementation places a separate counter for each flow, which results in significant memory consumption. In this case, the risk is not only exhausting all available memory but also a significant slowdown due to cache misses. Therefore, instead of solving the heavy hitter detection problem exactly, it is possible to drop some of the accumulated information about the flows. Many methods are known for limiting the amount of memory used and reducing the processing time of an event; some of them are listed below:

  • Packet sampling
  • Space saving algorithm
  • HashParallel
  • HashPipe
  • Count-min sketch
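As an illustration of how such memory-bounded structures trade accuracy for space, here is a minimal Count-min sketch in Python (the class, its sizes, and the hash choice are our illustration, not the article's implementation):

```python
import hashlib

class CountMinSketch:
    """Minimal Count-min sketch: rows x cols counters shared by all keys.
    The estimate is the minimum over rows, so it may overestimate a
    flow's count due to collisions but never underestimates it."""
    def __init__(self, rows=4, cols=256):
        self.rows, self.cols = rows, cols
        self.cell = [[0] * cols for _ in range(rows)]

    def _index(self, key, row):
        # One independent hash per row; blake2b stands in for a hash family.
        h = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(h, "little") % self.cols

    def add(self, key, count=1):
        for r in range(self.rows):
            self.cell[r][self._index(key, r)] += count

    def query(self, key):
        return min(self.cell[r][self._index(key, r)] for r in range(self.rows))
```

The total memory is fixed (rows × cols counters) regardless of how many distinct flows appear, which is exactly the property these methods are chosen for.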

Problem formalization

A sequence of similar events is given in the form of a set of elementary events {event_k}, where the index k runs through a finite or infinite discrete interval I ⊂ ℕ, k ∈ I.

  • Counting the number of events: on each event the counter is incremented, s′ = s + 1.
  • Counting the number of events taking the weight into account: s′ = s + w_k, where w_k is the weight of event_k.
  • Calculation of exponential moving average (EMA) with the β parameter. Here the counter s = (v, t) stores two values: the current value of the EMA and the time of the last update.

The decay model

Here we introduce the decay model. It is described as follows: the contribution of each event decays exponentially with time, so the value accumulated by time t is v(t) = Σ_{t_k ≤ t} e^(−(t − t_k)/τ).

[Figure: “v” value for different flows]

Counter update

A non-trivial operation is updating the counter value s = (v, t) when an event occurs at time t_event. In this case, s must be replaced with a new value s′ = (v′, t′) determined by the following equation: v′ = v · e^(−(t_event − t)/τ) + 1, t′ = t_event.
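Our reading of this counter update (decay the stored value by e^(−Δt/τ) for the elapsed time, then add 1 for the new event; names are ours) can be sketched as:

```python
import math

def update(counter, t_event, tau):
    """counter is the pair s = (v, t): decay v over the elapsed time,
    add the unit weight of the new event, and record the update time."""
    v, t = counter
    v = v * math.exp(-(t_event - t) / tau) + 1.0
    return (v, t_event)

s = (0.0, 0.0)
s = update(s, 1.0, 10.0)   # (1.0, 1.0): first event, nothing to decay
s = update(s, 2.0, 10.0)   # v = exp(-0.1) + 1, about 1.905
```

Note that nothing happens between events: the decay is applied lazily, only at the moment the counter is touched.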

Rate definition

Let there be a uniform flow of events with rate r, which means that events occur with a constant period p = 1/r, i.e. t_k = k · p.
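For such a uniform flow the counter settles at the geometric sum v∞ = Σ_{j≥0} e^(−j·p/τ) = 1/(1 − e^(−p/τ)), which can be inverted to recover the rate. A quick numerical check (the parameter values are illustrative):

```python
import math

tau, p = 15.0, 2.0             # decay time and event period (true rate 0.5)
v = 0.0
for _ in range(1000):          # feed a long uniform flow into the counter
    v = v * math.exp(-p / tau) + 1.0

v_inf = 1.0 / (1.0 - math.exp(-p / tau))
assert abs(v - v_inf) < 1e-9   # simulated value matches the closed form

# Inverting v = 1/(1 - exp(-1/(r*tau))) recovers the rate:
r = -1.0 / (tau * math.log(1.0 - 1.0 / v))
print(round(r, 6))             # 0.5, i.e. 1/p
```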

[Figure: r⁻ and r⁺ function graphs for τ = 15]

Applicability limitations of the decay model

There are some limitations on the rate values that can be measured correctly with the decay model.

Multiple flows accounting algorithms

Decay model features

Using EMA as the counter value has its benefits. After the event flow stops, the accumulated value degrades quickly (exponentially in time) and becomes indistinguishable from zero. The decay model uses this fact to reset the counter automatically. The time T_vanish, after which any previously accumulated value degrades to (nearly) zero, depends on the τ parameter and the required accuracy; it is expressed as T_vanish = T_min + σ_max, where T_min is the decay time of the value accumulated as a result of a single event. The exact expression for T_min is given in the Table Implementation section.
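As a sanity check on the vanishing time, a value v₀ left to decay falls below a threshold ε after τ·ln(v₀/ε); a minimal sketch (the threshold ε and the function name are our illustration, not the article's exact T_min expression):

```python
import math

def vanish_time(v0, tau, eps):
    """Time for an accumulated value v0 to decay below eps."""
    return tau * math.log(v0 / eps)

t = vanish_time(100.0, 15.0, 0.01)               # about 138.2
assert math.isclose(100.0 * math.exp(-t / 15.0), 0.01)
```

The logarithm is what makes auto-reset cheap: even a large accumulated value vanishes within a modest multiple of τ.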

Numerical implementation of the decay-based count

Update value operation

Consider the following notation: ρ_τ(Δt) = e^(−Δt/τ) is the decay factor applied to the accumulated value after time Δt has passed since the last update.

Calculating ρ_τ: FPU

The easiest way is to use floating-point arithmetic and directly evaluate ρ_τ by definition. The execution time of this operation is about 100 CPU cycles, which is quite a lot compared to the methods suggested below.

Calculating ρ_τ: table implementation

The idea of the table implementation is to store pre-calculated values of the ρ_τ function in a table.
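A sketch of this idea (table size, quantization step, and cut-off are our assumptions): precompute ρ_τ(Δt) = e^(−Δt/τ) on a grid of Δt values, then replace the exponential call with a single lookup.

```python
import math

STEPS = 4096                           # quantization steps for dt

def build_table(tau, dt_max):
    """Precompute exp(-dt/tau) for STEPS quantized values of dt."""
    return [math.exp(-(i * dt_max / STEPS) / tau) for i in range(STEPS)]

def rho_lookup(table, dt, dt_max):
    if dt >= dt_max:
        return 0.0                     # fully decayed: the auto-reset case
    return table[int(dt * STEPS / dt_max)]

tau = 15.0
dt_max = 20.0 * tau                    # beyond this, exp(-dt/tau) < 3e-9
table = build_table(tau, dt_max)
print(rho_lookup(table, 0.0, dt_max))            # 1.0
print(round(rho_lookup(table, tau, dt_max), 3))  # 0.369, near exp(-1)
```

A 4096-entry table of doubles occupies 32 KB, which is the kind of footprint that fits comfortably in an L2 cache; in a C implementation the lookup replaces an ~100-cycle exp() call with an index computation and one load.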

[Figure: ρ function graph]

Results and Conclusion

We compared three different implementations of EMA:

  1. Naive implementation of EMA;
  2. Decay-based model with FPU calculations, i.e. with exp() and log() function calls from the standard math library;
  3. Decay-based model implemented via table.


This publication has been prepared within the pilot project of revealing the mechanisms of the Qrator Labs filtering network operation. We thank Anton Orlov, Artem Gavrichenkov, Eugene Nagradov, and Nikita Danilov for fruitful discussions and ideas.



Qrator Labs
DDoS Attacks Mitigation and Continuous Availability