2023-04-22

Distribute the load uniformly while minimizing rebalancing

Please tell me, is there any well-known algorithm that solves my problem, which seems quite common? Then I won't have to reinvent the wheel :) The problem is to distribute the load uniformly while minimizing rebalancing. Given:

  • N servers, each with a capacity of S.
  • A pool of processes that must be run on the servers. Each process requires P[i] server resources. The process cannot be divided among multiple servers. The process essentially runs infinitely and consumes resources indefinitely.
  • The sum of P[i] is much less than N*S (there are enough resources to run all processes). Each P[i] is less than S (servers are powerful enough that any process can be run).

The goal is to distribute the processes among the servers such that the imbalance is as small as possible: the ratio of (maximum load)/(minimum load) approaches unity.

This is the initial distribution. However, P[i] may change (either increase or decrease), leading to an imbalance. New processes may appear, and old ones may disappear, which also disrupts the balance.

An algorithm is needed to rebalance the load, maintaining uniform distribution while minimizing the number of process movements between servers. Rebalancing can occur, for example, once an hour. That is, there is no need to rebalance in real-time. Between rebalancing, new processes can simply be added to the least loaded server. The system can be left untouched if the imbalance ratio is less than some threshold value, such as 1.2. The imbalance ratio can be determined as (two different approaches):

  • The ratio of the maximum and minimum loads.
  • The maximum of the ratios (maximum load)/(average load) and (average load)/(minimum load).

The current imbalance ratio is checked once an hour, and rebalancing is initiated if necessary.

I certainly have some ideas on how to approach the solution. But then my algorithm would require proof of correctness. For example, that it does not lead to the constant migration of any process between servers. And I have a strong feeling that this problem is quite common and there are well-known algorithms for its solution.

Example:

Let's say the resource is CPU.

Three servers: S1, S2, S3. The capacity is the same and equals 20 CPUs.

Four processes P1=3(requeires 3 CPUs), P2=2, P3=5, P4=6

The desired initial distribution:

  • S1: P1, P2 Total: 5
  • S2: P3 Total: 5
  • S3: P4 Total: 6

One hour later: P4 disappeared, P1 requires 7 CPUs now, new P5=3: (P1=7, P2=2, P3=5, P5=3)

Rebalancing: P2 -> S3. New P5 -> S3

New state:

  • S1: P1 Total: 7
  • S2: P3 Total: 5
  • S3: P2, P5: Total: 5

One hour later: P1 requires 1 CPU, P2 requeres 10: (P1=1, P2=10, P3=5, P5=3)

Rebalancing: P5 -> S1

New state:

  • S1: P1, P5 Total: 4
  • S2: P3 Total: 5
  • S3: P2 Total: 10


No comments:

Post a Comment