[AMDGPU] Skip handling of non-byte types in promote alloca. #128769
Non-byte types like i1 could be packed and supported, but for the time being these types are not promoted. The issue was found by a fuzzer.
@llvm/pr-subscribers-backend-amdgpu
Author: Sumanth Gundapaneni (sgundapa)
Full diff: https://github.com/llvm/llvm-project/pull/128769.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
index 28016b5936ccf..007f930cea4f3 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
@@ -759,6 +759,14 @@ bool AMDGPUPromoteAllocaImpl::tryPromoteAllocaToVector(AllocaInst &Alloca) {
return false;
}
+ Type *VecEltTy = VectorTy->getElementType();
+ constexpr unsigned SIZE_OF_BYTE = 8;
+ unsigned ElementSizeInBits = DL->getTypeSizeInBits(VecEltTy);
+ // FIXME: The non-byte type like i1 can be packed and be supported, but
+ // currently we do not handle them.
+ if (ElementSizeInBits % SIZE_OF_BYTE != 0)
+ return false;
+
std::map<GetElementPtrInst *, WeakTrackingVH> GEPVectorIdx;
SmallVector<Instruction *> WorkList;
SmallVector<Instruction *> UsersToRemove;
@@ -776,8 +784,7 @@ bool AMDGPUPromoteAllocaImpl::tryPromoteAllocaToVector(AllocaInst &Alloca) {
LLVM_DEBUG(dbgs() << " Attempting promotion to: " << *VectorTy << "\n");
- Type *VecEltTy = VectorTy->getElementType();
- unsigned ElementSize = DL->getTypeSizeInBits(VecEltTy) / 8;
+ unsigned ElementSize = ElementSizeInBits / SIZE_OF_BYTE;
for (auto *U : Uses) {
Instruction *Inst = cast<Instruction>(U->getUser());
diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-skip-non-byte-type.ll b/llvm/test/CodeGen/AMDGPU/promote-alloca-skip-non-byte-type.ll
new file mode 100644
index 0000000000000..3d2234f0a7ac3
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-skip-non-byte-type.ll
@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -mtriple=amdgcn-unknown-amdhsa -passes=amdgpu-promote-alloca < %s | FileCheck %s
+
+; Verify that we do not crash and not promote non-byte alloca types.
+define <8 x i1> @non_byte_alloca_type() {
+; CHECK-LABEL: define <8 x i1> @non_byte_alloca_type() {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[C:%.*]] = icmp ugt <16 x i1> zeroinitializer, zeroinitializer
+; CHECK-NEXT: [[RP:%.*]] = alloca <8 x i1>, align 1
+; CHECK-NEXT: [[TMP0:%.*]] = load <8 x i1>, ptr [[RP]], align 1
+; CHECK-NEXT: store <16 x i1> [[C]], ptr [[RP]], align 2
+; CHECK-NEXT: ret <8 x i1> [[TMP0]]
+;
+entry:
+ %C = icmp ugt <16 x i1> zeroinitializer, zeroinitializer
+ %RP = alloca <8 x i1>, align 1
+ %0 = load <8 x i1>, ptr %RP, align 1
+ store <16 x i1> %C, ptr %RP, align 2
+ ret <8 x i1> %0
+}
+
@@ -776,8 +784,7 @@ bool AMDGPUPromoteAllocaImpl::tryPromoteAllocaToVector(AllocaInst &Alloca) {
LLVM_DEBUG(dbgs() << " Attempting promotion to: " << *VectorTy << "\n");
- Type *VecEltTy = VectorTy->getElementType();
- unsigned ElementSize = DL->getTypeSizeInBits(VecEltTy) / 8;
+ unsigned ElementSize = ElementSizeInBits / SIZE_OF_BYTE;
IIUC, SIZE_OF_BYTE is defined by whatever compiler compiles LLVM rather than for AMDGPU.
You mean using something like this to derive the value from the data layout: "DL.getTypeSizeInBits(Type::getInt8Ty(M->getContext()))"?
I defined it as "constexpr unsigned SIZE_OF_BYTE = 8" on line 763. Should I pick a different name?
Oh, I missed that part. Hardcoding 8 is probably fine for now and for the near future, but the proper approach is definitely to query DL.
Looking at the actual code, I don't see why this doesn't just work for this case. Is the assert wrong?
+; CHECK-NEXT: ret <8 x i1> [[TMP0]]
+;
+entry:
+ %C = icmp ugt <16 x i1> zeroinitializer, zeroinitializer
Use something that can't be folded away.
+;
+entry:
+ %C = icmp ugt <16 x i1> zeroinitializer, zeroinitializer
+ %RP = alloca <8 x i1>, align 1
Use the correct alloca address space. Also, this issue isn't about the UB under-alignment, so correct that too.
+ unsigned ElementSizeInBits = DL->getTypeSizeInBits(VecEltTy);
+ // FIXME: The non-byte type like i1 can be packed and be supported, but
+ // currently we do not handle them.
+ if (ElementSizeInBits % SIZE_OF_BYTE != 0)
Best to replicate typeSizeEqualsStoreSize
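For context on this suggestion: LLVM's DataLayout::typeSizeEqualsStoreSize compares a type's size in bits with its store size in bits. The following is only a standalone sketch of that comparison, assuming the store size is the bit width rounded up to whole 8-bit bytes. The helpers storeSizeInBits, sizeEqualsStoreSize and canPromoteElement, and the constant kBitsPerByte, are hypothetical names made up for illustration; they are not LLVM APIs and not the code in this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical standalone model; none of these helpers are LLVM APIs.
// Assumption: a byte is 8 bits, as hardcoded in the patch under review.
constexpr uint64_t kBitsPerByte = 8;

// Bits occupied in memory when a value of the given width is stored:
// the width rounded up to a whole number of bytes.
constexpr uint64_t storeSizeInBits(uint64_t TypeSizeInBits) {
  return (TypeSizeInBits + kBitsPerByte - 1) / kBitsPerByte * kBitsPerByte;
}

// Models the comparison: the type has no padding bits when stored.
constexpr bool sizeEqualsStoreSize(uint64_t TypeSizeInBits) {
  return TypeSizeInBits == storeSizeInBits(TypeSizeInBits);
}

// Sketch of the bail-out condition: only elements whose size matches
// their store size are considered for promotion.
constexpr bool canPromoteElement(uint64_t ElementSizeInBits) {
  return sizeEqualsStoreSize(ElementSizeInBits);
}

static_assert(!canPromoteElement(1), "i1 has padding bits when stored");
static_assert(canPromoteElement(8), "i8 is exactly one byte");
static_assert(canPromoteElement(32), "i32 is exactly four bytes");
```

In this model the check gives the same answer as "ElementSizeInBits % SIZE_OF_BYTE != 0" for every width; expressing it as a size versus store-size comparison mainly states the intent (no padding bits) more directly.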
+ store <16 x i1> %C, ptr %RP, align 2
+ ret <8 x i1> %0
+}
Can you add some tests for the scalar case? Only the subvector extract was a problem?