2023-02-01

Understanding use case for max.in.flight.request property in Kafka

I'm building a Spring Boot consumer-producers project with Kafka as middleman between two microservices. The theme of the project is a basketball game. Here is a small state machine diagram, in which events are displayed. There will be many more different events, this is just a snippet.

State diagram

Start event:

{
  "id" : 5,
  "actualStartTime" : "someStartTime"
}

Point event:

{
   "game": 5,
   "type": "POINT",
    "payload": {
          "playerId": 44,
          "value": 3
    }
}

Assist event:

{
  "game": 4,
  "type": "ASSIST",
  "payload": {
    "playerId": 278,
    "value": 1
  }
}

Jump event:

 {
   "game": 2,
   "type": "JUMP",
   "payload": {
     "playerId": 55,
     "value": 1
   }
 }

End event:

{
    "id" : 5,
    "endTime" : "someStartTime"
}

Main thing to note here is that if there was an Assist event it must be followed with Point event.

Since I'm new to Kafka, I'll keep things simple and have one broker with one topic and one partition. For my use case I need to maintain ordering of each of these events as they actually happen live on the court (I have a json file with 7000 lines and bunch of these and other events).

So, let's say that from the Admin UI someone is sending these events (for instance via WebSockets) to the producers app. Producer app will be doing some simple validation or whatever it needs to do. Now, we can also image that we have two instances of producer app, one is at ip:8080 (prd1) and other one at ip:8081 (prd2).

In reality sequence of these three events happend: Assist -> Point -> Jump. The operator on the court send those three events in that order.

Assist event was sent on prd1 and Point was sent on prd2. Let's now imagine that there was a network glitch in communication between prd1 and Kafka cluster. Since we are using Kafka latest Kafka at the time of this writing, we already have enabled.idempotence=true and Assist event will not be sent twice.

During retry of Assist event on prd1 (towards Kafka), Point event on prd2 passed successfully. Then Assist event passed and after it Jump event (at any producer) also ended up in Kafka.

Now in queue we have: Point -> Assist -> Jump. This is not allowed.

My question is whether these types of problems should be handle by application's business logic (for example Spring State Machine) or this ordering can be handled by Kafka?

In case of latter, is property max.in.flight.request=1 responsible for ordering? Are there any other properties which might preserve ordering?

On the side note, is it a good tactic to use single partition for single match and multiple consumers for any of the partitions? Most probably I would be streaming different types of matches (basketball, soccer, golf, across different leagues and nations) and most of them will require some sort of ordering.

This maybe can be done with KStreams but I'm still on Kafka's steep learning curve.

Update 1 (after Jessica Vasey's comments):

Hi, thanks for very through comments. Unfortunately I didn't quite get all the pieces of the puzzle. What confuses me the most is some terminology you use and order of things happening. Not saying it's not correct, just I didn't understand.

I'll have two microservices, so two Producers. I got be be able to understand Kafka in microservices world, since I'm Java Spring developer and its all about microservices and multiple instances.

So let's say that on prd1 few dto events came along [Start -> Point -> Assist] and they are sent as a ProducerRequest (https://kafka.apache.org/documentation/#recordbatch), they are placed in RECORDS field. On the prd2 we got [Point -> Jump] also as a ProducerRequest. They are, in my understanding, two independent in-flight requests (out of 5 possible?)? Their ordering is based on a timestamp?

So when joining to the cluster, Kafka assigns id to producer let's say '0' for prd1 and '1' for prd2 (I guess it also depends on topic-partition they have been assigned). I don't understand whether each RecordBatch has its monotonically increasing sequence number id or each Kafka message within RecordBatch has its own monotonically increasing sequence number or both? Also the part 'time to recover' is bugging me. Like, if I got OutofOrderSequenceException, does it mean that [Point -> Jump] batch (with possibly other in-flight requsets and other batches in producer's buffer) will sit on Kafka until either delivery.timeout.ms expirees or when it finally successfully [Start -> Point -> Assist] is sent?



No comments:

Post a Comment