Dubbo timing task time wheel (Time Wheel) algorithm detailed explanation

Dubbo timing task time wheel (Time Wheel) algorithm detailed explanation

1 Scheduled tasks

Netty, Quartz, Kafka and Linux all have timing task functions.

  • Conventional JDK tool classes such as java.util.Timer and DelayedQueue can implement simple timing tasks. The bottom layer uses a heap data structure, and the access complexity is O(nlog(n)), which cannot support massive timing tasks.
  • In scenarios with a large amount of timed tasks and high performance requirements, in order to reduce the time complexity of task access and cancellation operations to O(1), a time wheel scheme is used.

2 Time wheel model and its application

A scheduling model for efficient batch management of timed tasks. The time wheel is generally implemented as a ring structure, similar to a clock, divided into many slots, one slot represents a time interval, and each slot uses a doubly linked list to store timing tasks. The pointer jumps periodically and jumps to a slot to execute the timing task of that slot.

  • Hashed Timing Wheel structure diagram

Applicable scene

  • Recovery
  • flow control
  • Scheduling Algorithm
  • Control the life cycle of data packets in the network

Timer maintenance is expensive, if

  • The processor is interrupted at every clock tick
  • Use a fine-grained timer
  • Many unfinished timers

An efficient timer algorithm is needed to reduce the overall interrupt overhead.

The capacity and accuracy of a single-layer time wheel are limited. For scenarios where the accuracy requirements are particularly high, the time span is particularly large, or a large number of timed tasks need to be scheduled, a multi-level time wheel and a combination of persistent storage and time wheel are usually used. .

Models and performance indicators

Rules in the model

  • Client call:
    • START_TIMER (time interval, Request_ID, Expiry_Action)
    • STOP_TIMER (Request_ID)
  • Timer tick call:
    • PER_TICK_BOOKKEEPING
    • EXPIRY_PROCESSING

Performance

  • space

Memory used by data structure

  • delay

The time required to start and end any of the above routines

3 Dubbo's time wheel structure

Dubbo's time wheel implementation is located in the org.apache.dubbo.common.timer package of the dubbo-common module. Let's analyze the core interfaces and implementations involved in the time wheel.

Core interface

TimerTask

In Dubbo, all timing tasks must implement the TimerTask interface . Only one run() method is defined, and the input parameter is a Timeout interface object.

Timeout

There is a one-to-one correspondence between the Timeout object and the TimerTask object, similar to the relationship between the Future object returned by the thread pool and the task object submitted to the thread pool. Through the Timeout object, you can not only view the status of the timed task, but also operate the timed task (for example, cancel the associated timed task).

  • Methods in the Timeout interface:

The Timer interface defines the basic behavior of the timer. The core is newTimeout(): submit a timer task (TimerTask) and return the associated Timeout object, similar to submitting a task to a thread pool.

HashedWheelTimeout

HashedWheelTimeout is the only implementation of the Timeout interface and is an internal class of HashedWheelTimer. HashedWheelTimeout plays two roles:

  • The node of the doubly linked list in the time wheel, that is, the container of the timer task TimerTask in HashedWheelTimer
  • The Handle returned after TimerTask is submitted to HashedWheelTimer, used to view and control the timed task outside the time wheel

Core field

  • prev, next. The doubly linked list is used to chain timeouts (timing tasks) in HashedWheelTimerBucket. Since it only acts on WorkerThread, there is no need for synchronization/volatile.

  • task, the actually scheduled task

  • deadline, the time at which the timing task is executed. Specified when creating HashedWheelTimeout

Calculation formula: currentTime HashedWheelTimeout + delay - startTime HashedWheelTimer ns

  • state, the current state of the timing task

-The optional states are as follows:-There is also a STATE_UPDATER field for achieving the atomicity of state changes.

  • remainingRounds, the number of clock cycles remaining in the current task. The length of time that can be represented by the time wheel is limited. When the time difference between the task due time and the current moment exceeds the time length of a single lap of the time wheel, a loop will appear. This field value is required to indicate the remaining clock cycle.

Core API

  • isCancelled()

  • isExpired()

  • state()

Check the current HashedWheelTimeout status

  • cancel() method

  • expire() method

  • remove()

HashedWheelBucket

A slot in the time wheel. Time slot is actually round container for caching and management of a doubly linked list, a doubly linked list of each node is an HashedWheelTimeoutobject, which is associated with a TimerTasktimed task.

HashedWheelBucket holds the first and last nodes of the doubly linked list-head and tail, plus each HashedWheelTimeout node holds predecessor and successor references, you can traverse the entire linked list forward and backward.

Core API

  • addTimeout()

  • pollTimeout()

  • remove()

Remove the specified HashedWheelTimeout node from the doubly linked list.

  • clearTimeouts()

The pollTimeout() method is called cyclically to process the entire doubly linked list and returns all tasks that have not timed out or been cancelled.

  • expireTimeouts()

Traverse all HashedWheelTimeout nodes in the doubly linked list. When processing expired timed tasks, they will be taken out through the remove() method, and their expire() method will be called for execution; for cancelled tasks, they will be taken out through the remove() method and discarded directly; for unexpired tasks, they will be Decrease the remainingRounds field (the number of remaining clock cycles) by one.

HashedWheelTimer

Timer The realization of the interface implements a timer through the time wheel algorithm.

Function

Selected corresponding to the current time pointer wheel HashedWheelBucketslot, the iteration starts from the head of the list, is calculated for each HashedWheelTimeoutscheduled task:

  • If it belongs to the current clock cycle, take it out and run
  • If it does not belong, the remaining clock cycles will be reduced by one

Core domain

  • workerState

Time wheel current state, three optional value by AtomicIntegerFieldUpdaterwhich to achieve modification atoms.

  • startTime

The start time of the current time wheel and the deadline field value of the timed task submitted to the time wheel are calculated with this timestamp as the starting point.

  • wheel

Time round circular queue, each element is a slot. When the number of slots in the specified time wheel is n, the nearest power of n will be taken up

  • timeouts, cancelledTimeouts

HashedWheelTimer will process the data of these two queues before processing the doubly linked list of HashedWheelBucket:-timeouts queue buffers timing tasks in the external submission time wheel-cancelledTimeouts queue temporarily stores cancelled timing tasks

  • tick

At HashedWheelTimer$Workerthe pointer of the time wheel, a monotonically increasing counter with a step of 1

  • mask

Mask mask = wheel.length - 1, performing ticks & masktargets the clock corresponding to the groove

  • ticksDuration

The actual time represented by the time indicator incremented by 1 each time, in nanoseconds.

  • pendingTimeouts

The total number of timed tasks remaining in the current time round.

  • workerThread

The thread that actually executes timing tasks inside the time wheel.

  • worker

The logic to actually execute the timing task is encapsulated in this Runnable object.

newTimeout()

To submit a timed task, the start() method will be called to start the time wheel before the timed task enters the timeouts queue. The following two key steps will be completed:

  1. Determine the startTime field of the time wheel
  2. Start the workerThread thread and start executing worker tasks.

After the calculation according to the timing startTime task deadline, and finally to the timing of the task to be packaged HashedWheelTimeoutand added to timeoutsthe queue.

4 The execution process of one rotation of the time wheel hand

HashedWheelTimer$Worker.run() :

  1. The hand of the time wheel rotates and the cycle of the time wheel begins
  2. Cleaning task the user is actively cancel the timing, the timing of these tasks when the user cancels the recording into the cancelledTimeoutsqueue. Every time the hands turn, the time wheel will clear the queue
  3. Transfer the timing tasks cached in the timeouts queue to the corresponding slot in the time wheel
  4. Locate the corresponding slot according to the current pointer, and process the timing tasks in the doubly linked list of the slot
  5. Check the status of the time wheel. If the time wheel is in the running state, the above steps are executed cyclically, and the timing tasks are continuously executed. If the time wheel in a stopped state, perform the following steps to obtain the timing of the task is not performed and added to unprocessedTimeoutsthe queue: traversal time slot for each round, and call clearTimeouts() method; timeouts of the queue is not added to the wells cycle call poll()
  6. Finally, the timed tasks that the user actively canceled in the canceledTimeouts queue are cleaned up again.

5 Timing application

It is not directly used for periodic operations, but only a single timed task is submitted to the time wheel for execution. When the previous task execution is completed, the newTimeout() method is called to submit the current task again, so that the task will be executed in the next cycle. task. Even if there are GC, I/O blocking, etc. during the execution of the task, which causes the task to be delayed or stuck, there will not be the same task continuously submitted in, resulting in the accumulation of tasks.

Dubbo time wheel applications are mainly in the following areas:

  1. Failure retrying, for example, the retry operation when the Provider fails to register with the registry, or the failed retry when the Consumer subscribes to the registry, etc.
  2. Periodic timing tasks, such as sending heartbeat requests regularly, processing request timeouts, or reconnecting mechanisms after network disconnection

reference