Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-18601: Assume a baseline of 3.3 for server protocol versions #18845

Merged
merged 33 commits into from
Feb 19, 2025

Conversation

ijuma
Copy link
Member

@ijuma ijuma commented Feb 9, 2025

3.3.0 was the first KRaft release that was deemed production-ready and also when KIP-778 (KRaft to KRaft upgrades) landed. Given that, it's reasonable for 4.x to only support upgrades from 3.3.0 or newer (the metadata version also needs to be set to "3.3" or newer before upgrading).

Noteworthy changes:

  1. AlterPartition no longer includes topic names, which makes it possible to simplify AlterParitionManager logic.
  2. Metadata versions older than IBP_3_3_IV3 have been removed and IBP_3_3_IV3 is now the minimum version.
  3. MINIMUM_BOOTSTRAP_VERSION has been removed.
  4. Removed isLeaderRecoverySupported, isNoOpsRecordSupported, isKRaftSupported, isBrokerRegistrationChangeRecordSupported and isInControlledShutdownStateSupported - these are always true now. Also removed related conditional code.
  5. Removed default metadata version or metadata version fallbacks in multiple places - we now fail-fast instead of potentially using an incorrect metadata version.
  6. Update MetadataBatchLoader.resetToImage to set hasSeenRecord based on whether image is empty - this was a previously existing issue that became more apparent after the changes in this PR.
  7. Remove ibp parameter from BootstrapDirectory
  8. A number of tests were not useful anymore and have been removed.

I will update the upgrade notes via a separate PR as there are a few things that need changing and it would be easier to do so that way.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@ijuma ijuma requested a review from Copilot February 9, 2025 17:04
@github-actions github-actions bot added core Kafka Broker tools kraft storage Pull requests that target the storage module clients labels Feb 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 61 out of 78 changed files in this pull request and generated no comments.

Files not reviewed (17)
  • clients/src/main/resources/common/message/AlterPartitionRequest.json: Language not supported
  • clients/src/main/resources/common/message/AlterPartitionResponse.json: Language not supported
  • core/src/main/scala/kafka/cluster/Partition.scala: Language not supported
  • core/src/main/scala/kafka/server/AlterPartitionManager.scala: Language not supported
  • core/src/main/scala/kafka/server/ControllerRegistrationManager.scala: Language not supported
  • core/src/main/scala/kafka/server/KafkaConfig.scala: Language not supported
  • core/src/main/scala/kafka/tools/StorageTool.scala: Language not supported
  • core/src/main/scala/kafka/tools/TestRaftServer.scala: Language not supported
  • core/src/test/scala/integration/kafka/coordinator/transaction/ProducerIntegrationTest.scala: Language not supported
  • core/src/test/scala/integration/kafka/server/KRaftClusterTest.scala: Language not supported
  • core/src/test/scala/integration/kafka/server/MetadataVersionIntegrationTest.scala: Language not supported
  • core/src/test/scala/kafka/server/RemoteLeaderEndPointTest.scala: Language not supported
  • core/src/test/scala/unit/kafka/cluster/PartitionLockTest.scala: Language not supported
  • core/src/test/scala/unit/kafka/cluster/PartitionTest.scala: Language not supported
  • core/src/test/scala/unit/kafka/server/AbstractApiVersionsRequestTest.scala: Language not supported
  • core/src/test/scala/unit/kafka/server/AlterPartitionManagerTest.scala: Language not supported
  • core/src/test/scala/unit/kafka/server/BrokerRegistrationRequestTest.scala: Language not supported
Comments suppressed due to low confidence (3)

clients/src/test/java/org/apache/kafka/common/requests/AlterPartitionRequestTest.java:46

  • The removal of the topicName field might lead to incomplete test coverage if any tests rely on it. Verify if topicName is essential for any test cases.
TopicData topicData = new TopicData().setTopicId(topicId);

clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java:1705

  • The removal of the 'topicName' field may cause unintended side effects if the 'topicName' is required elsewhere in the codebase. Please verify if 'topicName' is needed.
.setTopicId(Uuid.randomUuid())

clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java:1723

  • The removal of the 'topicName' field may cause unintended side effects if the 'topicName' is required elsewhere in the codebase. Please verify if 'topicName' is needed.
.setTopicId(Uuid.randomUuid())
@ijuma ijuma force-pushed the kafka-18601-assume-baseline-3.3-for-server branch from 3643409 to a7fbe00 Compare February 10, 2025 03:49
@ijuma ijuma force-pushed the kafka-18601-assume-baseline-3.3-for-server branch 3 times, most recently from 4b19a3d to 890c3b2 Compare February 10, 2025 05:52
* Test downgrading to a MetadataVersion that doesn't support FeatureLevelRecord.
*/
@Test
public void testPremodernVersion() {
Copy link
Member Author

@ijuma ijuma Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't support these anymore, so remove it.

* Test downgrading to a MetadataVersion that doesn't support inControlledShutdown.
*/
@Test
public void testPreControlledShutdownStateVersion() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now always have the controlled shutdown state.

@ijuma ijuma force-pushed the kafka-18601-assume-baseline-3.3-for-server branch from 890c3b2 to e654cdb Compare February 10, 2025 06:06
@ijuma ijuma force-pushed the kafka-18601-assume-baseline-3.3-for-server branch from 9622ce9 to 86d1053 Compare February 10, 2025 20:54
}

@Test
public void testKRaftVersions() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This did not seem to add much value after the changes in this PR.

Copy link
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ijuma thanks for this patch. overall LGTM. some minor comments remains.

…ssume-baseline-3.3-for-server

* apache-github/trunk:
  KAFKA-18366 Remove KafkaConfig.interBrokerProtocolVersion (apache#18820)
  KAFKA-18658 add import control for examples module (apache#18812)
  MINOR: Java version and TLS documentation improvements (apache#18822)
  KAFKA-18743 Remove leader.imbalance.per.broker.percentage as it is not supported by Kraft (apache#18821)
  KAFKA-18225 ClientQuotaCallback#updateClusterMetadata is unsupported by kraft (apache#18196)
  KAFKA-18441: Remove flaky tag on KafkaAdminClientTest#testAdminClientApisAuthenticationFailure (apache#18847)
  MINOR: fix KStream#to incorrect javadoc (apache#18838)
  KAFKA-18763: changed the assertion statement for acknowledgements to include only successful acks (apache#18846)
  MINOR: Accept specifying consumer group assignors by their short names (apache#18832)
@@ -437,18 +438,6 @@ class KRaftClusterTest {
}
}

@Test
def testCreateClusterInvalidMetadataVersion(): Unit = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's no longer possible to set the bootstrap metadata version to lower than 3.3-IV3.

@ijuma ijuma force-pushed the kafka-18601-assume-baseline-3.3-for-server branch from eb55590 to d505508 Compare February 18, 2025 14:42
@ijuma
Copy link
Member Author

ijuma commented Feb 18, 2025

@junrao @chia7712 The tests for JDK 23 were all passing for the last run (the JDK 17 build failed, but it didn't look related to this PR).

I think there's only one outstanding issue: how to handle the case if we read an old metadata version from the log. I'll update the relevant comment later today. Does the rest look good to you?

…pen frequently enough that it's not worth handling this edge case.
@ijuma
Copy link
Member Author

ijuma commented Feb 18, 2025

I resolved the outstanding item, so no known issues remain in this PR.

Copy link
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ijuma thanks for this patch. a couple of comments remain.

@@ -92,8 +87,7 @@ object AlterPartitionManager {
scheduler = scheduler,
time = time,
brokerId = config.brokerId,
brokerEpochSupplier = brokerEpochSupplier,
metadataVersionSupplier = () => metadataCache.metadataVersion()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can remove metadataCache from the apply method

}

public Builder(MetadataImage image) {
this.metadataVersion = image.features().metadataVersion();
this.metadataVersion = image.features().metadataVersionOrThrow();
this.isEligibleLeaderReplicasEnabled = image.features().isElrEnabled();
}

public Builder setMetadataVersion(MetadataVersion metadataVersion) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After removing those if-else, it seems we don't need the builder anymore. Maybe we can rewrite ImageWriterOptions by record class

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a general pattern for the metadata module, I would rather not change it as part of this PR. We can have that discussion via a separate JIRA if useful.

@@ -30,31 +30,31 @@
@Timeout(value = 40)
public class MetadataVersionChangeTest {

private static final MetadataVersionChange CHANGE_3_0_IV1_TO_3_3_IV0 =
new MetadataVersionChange(IBP_3_0_IV1, IBP_3_3_IV0);
private static final MetadataVersionChange CHANGE_MINUMUM_TO_LATEST =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: MINUMUM -> MINIMUM

@@ -114,7 +107,7 @@ class DefaultAlterPartitionManager(
// and re-created, we cannot have two entries in this Map especially if we cannot
// use an AlterPartition request version which supports topic ids in the end because
// the two updates with the same name would be merged together.
private[server] val unsentIsrUpdates: util.Map[TopicPartition, AlterPartitionItem] = new ConcurrentHashMap[TopicPartition, AlterPartitionItem]()
private[server] val unsentIsrUpdates = new ConcurrentHashMap[TopicIdPartition, AlterPartitionItem]()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please update the docs according to the change?

@Test
public void testRegisterControlWithOlderMetadataVersion() {
FeatureControlManager featureControl = new FeatureControlManager.Builder().
setMetadataVersion(MetadataVersion.IBP_3_3_IV0).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test appears necessary as it verifies MV which is older than IBP_3_7_IV0. Perhaps we could simply change IBP_3_3_IV0 to IBP_3_3_IV3 instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deleted that because we have testRegistrationWithUnsupportedMetadataVersion. Does that address your concern?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that testRegistrationWithUnsupportedMetadataVersion is designed to test broker registration, while testRegisterControlWithOlderMetadataVersion is for controller registration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I had missed that. I'll add the test back with IBP_3_6_IV2 since that's the newest IBP that still fails.

@@ -48,40 +48,33 @@
public class ImageWriterOptionsTest {
@Test
public void testDefaultLossHandler() {
ImageWriterOptions options = new ImageWriterOptions.Builder().build();
ImageWriterOptions options = new ImageWriterOptions.Builder(MetadataVersion.latestProduction()).build();
assertEquals("stuff", assertThrows(UnwritableMetadataException.class,
() -> options.handleLoss("stuff")).loss());
}

@Test
public void testSetMetadataVersion() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we rewrite ImageWriterOptions by record class, this testSetMetadataVersion is unnecessary as all are pure setters.

Copy link
Member Author

@ijuma ijuma Feb 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this test isn't useful since the special logic for the setter was removed - even without removing the builder. I removed it.

@ijuma
Copy link
Member Author

ijuma commented Feb 19, 2025

@chia7712 I addressed your comments.

Copy link
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ijuma
Copy link
Member Author

ijuma commented Feb 19, 2025

I will go ahead and merge this. If @junrao has any additional comments, I will submit a separate PR to address those.

@ijuma ijuma merged commit 3dba312 into apache:trunk Feb 19, 2025
9 checks passed
@ijuma ijuma deleted the kafka-18601-assume-baseline-3.3-for-server branch February 19, 2025 13:35
@mumrah
Copy link
Member

mumrah commented Feb 19, 2025

Thanks for pushing this over the line @ijuma. It's great to see these cleanups and removals. "IBP" is all but a memory now :)

ShivsundarR pushed a commit to ShivsundarR/kafka that referenced this pull request Feb 19, 2025
…pache#18845)

3.3.0 was the first KRaft release that was deemed production-ready and also
when KIP-778 (KRaft to KRaft upgrades) landed. Given that, it's reasonable
for 4.x to only support upgrades from 3.3.0 or newer (the metadata version also
needs to be set to "3.3" or newer before upgrading).

Noteworthy changes:
1. `AlterPartition` no longer includes topic names, which makes it possible to
simplify `AlterParitionManager` logic.
2. Metadata versions older than `IBP_3_3_IV3` have been removed and
`IBP_3_3_IV3` is now the minimum version.
3. `MINIMUM_BOOTSTRAP_VERSION` has been removed.
4. Removed `isLeaderRecoverySupported`, `isNoOpsRecordSupported`,
`isKRaftSupported`, `isBrokerRegistrationChangeRecordSupported` and
`isInControlledShutdownStateSupported` - these are always `true` now.
Also removed related conditional code.
5. Removed default metadata version or metadata version fallbacks in
multiple places - we now fail-fast instead of potentially using an incorrect
metadata version.
6. Update `MetadataBatchLoader.resetToImage` to set `hasSeenRecord`
based on whether image is empty - this was a previously existing issue that
became more apparent after the changes in this PR.
7. Remove `ibp` parameter from `BootstrapDirectory`
8. A number of tests were not useful anymore and have been removed.

I will update the upgrade notes via a separate PR as there are a few things that
need changing and it would be easier to do so that way.

Reviewers: Chia-Ping Tsai <[email protected]>, Jun Rao <[email protected]>, David Arthur <[email protected]>, Colin P. McCabe <[email protected]>, Justine Olshan <[email protected]>, Ken Huang <[email protected]>
ijuma added a commit that referenced this pull request Feb 19, 2025
…18845)

3.3.0 was the first KRaft release that was deemed production-ready and also
when KIP-778 (KRaft to KRaft upgrades) landed. Given that, it's reasonable
for 4.x to only support upgrades from 3.3.0 or newer (the metadata version also
needs to be set to "3.3" or newer before upgrading).

Noteworthy changes:
1. `AlterPartition` no longer includes topic names, which makes it possible to
simplify `AlterParitionManager` logic.
2. Metadata versions older than `IBP_3_3_IV3` have been removed and
`IBP_3_3_IV3` is now the minimum version.
3. `MINIMUM_BOOTSTRAP_VERSION` has been removed.
4. Removed `isLeaderRecoverySupported`, `isNoOpsRecordSupported`,
`isKRaftSupported`, `isBrokerRegistrationChangeRecordSupported` and
`isInControlledShutdownStateSupported` - these are always `true` now.
Also removed related conditional code.
5. Removed default metadata version or metadata version fallbacks in
multiple places - we now fail-fast instead of potentially using an incorrect
metadata version.
6. Update `MetadataBatchLoader.resetToImage` to set `hasSeenRecord`
based on whether image is empty - this was a previously existing issue that
became more apparent after the changes in this PR.
7. Remove `ibp` parameter from `BootstrapDirectory`
8. A number of tests were not useful anymore and have been removed.

I will update the upgrade notes via a separate PR as there are a few things that
need changing and it would be easier to do so that way.

Reviewers: Chia-Ping Tsai <[email protected]>, Jun Rao <[email protected]>, David Arthur <[email protected]>, Colin P. McCabe <[email protected]>, Justine Olshan <[email protected]>, Ken Huang <[email protected]>
dajac pushed a commit that referenced this pull request Feb 22, 2025
…solated_mode_upgrade (#19003)

#18845 assumed a baseline of 3.3 for server protocol versions so that
the lower version couldn't roll up to 4.0. Hence, the
`TestUpgrade#test_combined_mode_upgrad` and `test_isolated_mode_upgrade`
failed for the 3.1 and 3.2 versions.

e2e tests result with this patch on jenkins:
![Screenshot from 2025-02-22
13-22-17](https://github.com/user-attachments/assets/2de6f707-8281-4f30-b5d0-83dd4de9666d)
e2e tests result with this patch on local machine:
![Screenshot from 2025-02-22
13-28-16](https://github.com/user-attachments/assets/2e5e563a-1ac4-4894-ba30-593304697d1d)

Reviewers: David Jacot <[email protected]>
dajac pushed a commit that referenced this pull request Feb 22, 2025
…solated_mode_upgrade (#19003)

#18845 assumed a baseline of 3.3 for server protocol versions so that
the lower version couldn't roll up to 4.0. Hence, the
`TestUpgrade#test_combined_mode_upgrad` and `test_isolated_mode_upgrade`
failed for the 3.1 and 3.2 versions.

e2e tests result with this patch on jenkins:
![Screenshot from 2025-02-22
13-22-17](https://github.com/user-attachments/assets/2de6f707-8281-4f30-b5d0-83dd4de9666d)
e2e tests result with this patch on local machine:
![Screenshot from 2025-02-22
13-28-16](https://github.com/user-attachments/assets/2e5e563a-1ac4-4894-ba30-593304697d1d)

Reviewers: David Jacot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Blocker This pull request is identified as solving a blocker for a release. build Gradle build or GitHub Actions clients core Kafka Broker kraft performance storage Pull requests that target the storage module tools
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants