KAFKA-18757: Create full-function SimpleAssignor to match KIP-932 description #18864

Open

adixitconfluent wants to merge 20 commits into trunk

Conversation

adixitconfluent (Contributor) commented Feb 11, 2025

About

The current SimpleAssignor in AK assigns all subscribed topic partitions to all share group members, which does not match the description given in KIP-932. Here are the rules from the KIP by which the assignment should happen; we have changed the step 3 implementation for the reasons described below (a sketch of steps 1 and 2 follows the list) -

  1. The assignor hashes the member IDs and maps partitions to members based on the hash. This gives an approximately even balance.
  2. If any partitions were not assigned any members by (1) and do not have members already assigned in the current assignment, members are assigned round-robin until each partition has at least one member assigned to it.
  3. We combine the current and new assignment. (Original rule - If any partitions were assigned members by (1) and also have members in the current assignment assigned by (2), the members assigned by (2) are removed.)
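
For illustration only, here is a minimal sketch of steps 1 and 2. It is not the SimpleAssignor code from this PR: member IDs and partitions are plain strings, the hash is simply Java's hashCode, and non-empty inputs are assumed.

    import java.util.*;

    // Hypothetical sketch of steps 1 and 2; not the PR's SimpleAssignor implementation.
    public class SimpleAssignorSketch {
        static Map<String, Set<String>> assign(List<String> memberIds, List<String> partitions) {
            Map<String, Set<String>> assignment = new HashMap<>();
            memberIds.forEach(m -> assignment.put(m, new HashSet<>()));

            // Step 1: hash each member ID onto a partition, which spreads members roughly evenly.
            Set<String> covered = new HashSet<>();
            for (String member : memberIds) {
                String p = partitions.get(Math.floorMod(member.hashCode(), partitions.size()));
                assignment.get(member).add(p);
                covered.add(p);
            }

            // Step 2: round-robin the partitions that step 1 left without a member,
            // so every partition ends up with at least one member.
            int next = 0;
            for (String p : partitions) {
                if (!covered.contains(p)) {
                    assignment.get(memberIds.get(next++ % memberIds.size())).add(p);
                }
            }
            return assignment;
        }
    }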

Tests

The added code has been verified with unit tests and the existing integration tests.

github-actions bot added the triage PRs from the community label Feb 11, 2025
adixitconfluent marked this pull request as ready for review February 11, 2025 18:49
AndrewJSchofield added the KIP-932 Queues for Kafka and ci-approved labels and removed the triage PRs from the community label Feb 11, 2025
github-actions bot added the core Kafka Broker label Feb 12, 2025
adixitconfluent marked this pull request as draft February 12, 2025 15:41
…ulating hash for current assignment + unit test"

This reverts commit 86a4c6f.
adixitconfluent marked this pull request as ready for review February 12, 2025 18:15
adixitconfluent (Contributor, Author)

Hi @AndrewJSchofield @apoorvmittal10, step 3 described above is a little tricky to implement, since we can only see the current assignment, not whether it was calculated by step 1 or step 2. I have implemented a way to filter the current assignment as step 3 requires in the function filterCurrentAssignment, but it is incorrect in a few cases. Maybe step 3 needs more consideration (or a future PR); in the meantime, could you please review the PR in its current state? I could also drop step 3 for now and implement it in a new PR. Let me know your thoughts.

Copilot AI left a comment

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

Files not reviewed (1)
  • core/src/test/scala/unit/kafka/server/ShareGroupHeartbeatRequestTest.scala: Language not supported
Comments suppressed due to low confidence (1)

group-coordinator/src/main/java/org/apache/kafka/coordinator/group/assignor/SimpleAssignor.java:304

  • The 'partition' field should be declared as 'final' to make the 'TargetPartition' class immutable.
int partition;

Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.

Files not reviewed (1)
  • core/src/test/scala/unit/kafka/server/ShareGroupHeartbeatRequestTest.scala: Language not supported
adixitconfluent (Contributor, Author) commented Feb 14, 2025

I have amended the implementation of step 3 of the assignment so that we combine the new and current assignments without revoking the partitions that were assigned by step 1 in the new assignment and have members in the current assignment from step 2. This avoids complexity in both the implementation and the runtime, because at the moment we can only see the current assignment while calculating the new one; we have no way to know which step produced a particular assignment in the current assignment. I do have a way to recreate the step-wise assignment from the current assignment, but it involves sorting and unnecessary computation, so I am deferring that approach.
IMO, step 3 helps reduce the burden on certain members of the share group. The same effect can be achieved by limiting the maximum number of partitions assigned to each member (KAFKA-18788), so the potential problem of overburdening share consumers will be addressed in a future PR.
PS - We shouldn't have any problem merging this PR to trunk with the amendment I suggested, since right now we assign all topic partitions to all share group members anyway, which already burdens the share consumers.
cc- @AndrewJSchofield @apoorvmittal10
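
For reference, here is a rough, hypothetical sketch of the amended step 3 described in the comment above: a plain union of the new and current assignments, keyed by partition with member IDs as values. It is not the PR code; in particular it omits the filtering of departed members and unsubscribed topics that the real assignor has to do.

    import java.util.*;

    // Hypothetical sketch of the amended step 3: union the new and current assignments
    // without revoking anything that is currently assigned.
    public class CombineSketch {
        static Map<String, Set<String>> combine(Map<String, Set<String>> newAssignment,
                                                Map<String, Set<String>> currentAssignment) {
            Map<String, Set<String>> finalAssignment = new HashMap<>();
            newAssignment.forEach((partition, members) ->
                    finalAssignment.computeIfAbsent(partition, k -> new HashSet<>()).addAll(members));
            currentAssignment.forEach((partition, members) ->
                    finalAssignment.computeIfAbsent(partition, k -> new HashSet<>()).addAll(members));
            return finalAssignment;
        }
    }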

AndrewJSchofield (Member)

Marked failed test as flaky in #18925.

AndrewJSchofield (Member) left a comment

Thanks for the PR. Only a partial review so far, but I've left some initial comments.

apoorvmittal10 (Collaborator) left a comment

Thanks for the PR, took an initial look. Some comments.

apoorvmittal10 (Collaborator) left a comment

Some comments, though the PR seems like a good starting point and we might improve partition stickiness while revoking and assigning partitions.

apoorvmittal10 (Collaborator) left a comment

Mostly looks good to me. Can you please share the numbers with 16 partitions and 25 share consumers, with and without the PR?

adixitconfluent (Contributor, Author)

@apoorvmittal10, here are the numbers for 16 partitions and 25 share consumers -

With PR -
1 million records of size 1024 each - 17.5 seconds
5 million records of size 1024 each - 91 seconds

Without PR -
1 million records of size 1024 each - 14.1 seconds
5 million records of size 1024 each - 72.2 seconds

As mentioned above, this PR reduces the sharing of topic partitions by the assignor, so the decline in performance is expected. With future PRs, the performance should reach an optimal level.

apoorvmittal10 (Collaborator)

Just to clarify, how was the partition allocation with the current PR code? Also, if members are removed and added, there would be more sharing of partitions as per the combine logic in the PR, correct? Will it affect the performance?

adixitconfluent (Contributor, Author)

Right now, most of the members had 1-2 topic partitions allocated to them, except for 1-2 members which had a good 12-14 partitions assigned to them.
Yes, if members are removed and added, there would be more sharing of partitions as per the combine logic in the PR. Given the small size of 1-5 million records, it should improve the performance.

apoorvmittal10 (Collaborator) left a comment

LGTM, given that the code of the simple assignor will change in future PRs. One comment to address.

TaiJuWu (Contributor) left a comment

LGTM. Just two nit questions, but they are not very important.

// the burden of certain members of the share groups. This can be achieved with the help of limiting the max
// no. of partitions assignment for every member(KAFKA-18788). Hence, the potential problem of burdening
// the share consumers will be addressed in a future PR.

A Member left a comment

Doesn't the following do the job a bit better?

        newAssignment.forEach((targetPartition, members) -> members.forEach(member ->
                finalAssignment.computeIfAbsent(member, k -> new HashSet<>()).add(targetPartition)));
        currentAssignment.forEach((targetPartition, members) -> {
            if (subscribedTopicIds.contains(targetPartition.topicId())) {
                members.forEach(member -> {
                    if (groupSpec.memberIds().contains(member) && !newAssignment.containsKey(targetPartition))
                        finalAssignment.computeIfAbsent(member, k -> new HashSet<>()).add(targetPartition);
                });
            }
        });

The problem with the code as it currently exists is that it assigns all partitions to the first member, and then as other members join, it leaves all partitions with the first member in spite of assigning the partitions to the other members.

What the snippet above does is essentially give precedence to the new assignment, and only copies over information from the current assignment which augments the new assignment. It's still not perfect because the round-robin nature of the reassignment is not sophisticated enough, but I think it's probably better.

adixitconfluent (Contributor, Author)

Makes sense. This will help reduce the burden on members, though it affects the stickiness of assignments now, since we are revoking assignments from the current assignment. We'll need to think of a way to achieve optimum sharing in future PRs. I have made this change.

AndrewJSchofield (Member) left a comment

lgtm. Needs a bit more refinement, but this is a good start.

Labels: ci-approved, core Kafka Broker, KIP-932 Queues for Kafka