Updated Project: A language empowering everyone to build reliable and efficient software.
Small aggregate arguments passed via PassMode::Cast in the Rust ABI (e.g. [u32; 2] cast to i64) are missing noundef in the emitted LLVM IR, even when the type contains no uninit bytes:
#[no_mangle]
pub fn f(v: [u32; 2]) -> u32 { v[0] }
; expected: define i32 @f(i64 noundef %0)
; actual: define i32 @f(i64 %0) ← noundef missing
This blocks LLVM from applying optimizations that require value-defined semantics on function arguments.
adjust_for_rust_abi calls arg.cast_to(Reg::Integer), which internally creates a CastTarget with ArgAttributes::new() — always empty. Any validity attribute that was present before the cast is silently dropped.
This affects all PassMode::Cast arguments and return values in the Rust ABI: plain arrays, newtype wrappers, and any BackendRepr::Memory type small enough to fit in a register.
A prior attempt (rust-lang/rust#127210) used Ty/repr attributes to detect padding.
After adjust_for_rust_abi, iterate all PassMode::Cast args and the return value. For each, call layout_is_noundef on the original layout; if it returns true, set NoUndef on the CastTarget’s attrs.
layout_is_noundef uses only the computed layout — BackendRepr, FieldsShape, Variants, Scalar::is_uninit_valid() — and never touches Ty or repr attributes. Anything it cannot prove returns false.
Covered cases:
- Scalar / ScalarPair (both halves initialized, fields contiguous)
- FieldsShape::Array (element type recursively uninit-free)
- FieldsShape::Arbitrary with Variants::Single (fields cover 0..size with no gaps, each recursively uninit-free): handles newtype wrappers, multi-field structs, single-variant enums, repr(transparent), and repr(C) wrappers
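The recursive check described above can be sketched on a toy layout model. Everything here is hypothetical: `ToyLayout` and `toy_layout_is_noundef` are illustrative stand-ins and not rustc's `BackendRepr`, `FieldsShape`, `Variants`, or the PR's actual `layout_is_noundef`. The sketch only shows the shape of the rule: every byte must be covered by a scalar that does not admit uninit, and anything unrecognised is rejected.

```rust
// Toy model only: these types are hypothetical stand-ins, not rustc's layout types.
#[derive(Debug, Clone)]
pub enum ToyLayout {
    /// One scalar of `size` bytes; `uninit_valid` models `Scalar::is_uninit_valid()`.
    Scalar { size: u64, uninit_valid: bool },
    /// `count` elements of the same layout, stored back to back.
    Array { elem: Box<ToyLayout>, count: u64 },
    /// Single-variant aggregate: `(offset, field layout)` pairs inside `size` bytes.
    Struct { size: u64, fields: Vec<(u64, ToyLayout)> },
    /// Anything the check does not understand (unions, multi-variant enums, ...).
    Opaque { size: u64 },
}

impl ToyLayout {
    pub fn size(&self) -> u64 {
        match self {
            ToyLayout::Scalar { size, .. } => *size,
            ToyLayout::Array { elem, count } => elem.size() * count,
            ToyLayout::Struct { size, .. } => *size,
            ToyLayout::Opaque { size } => *size,
        }
    }
}

/// Returns true only when every byte is provably initialized; otherwise false.
pub fn toy_layout_is_noundef(layout: &ToyLayout) -> bool {
    match layout {
        ToyLayout::Scalar { uninit_valid, .. } => !*uninit_valid,
        ToyLayout::Array { elem, .. } => toy_layout_is_noundef(elem),
        ToyLayout::Struct { size, fields } => {
            // Walk the fields in offset order and track how many bytes are covered.
            let mut sorted: Vec<&(u64, ToyLayout)> = fields.iter().collect();
            sorted.sort_by_key(|f| f.0);
            let mut covered = 0u64;
            for (offset, field) in sorted {
                if *offset > covered {
                    return false; // a gap between fields is padding
                }
                if !toy_layout_is_noundef(field) {
                    return false;
                }
                covered = covered.max(*offset + field.size());
            }
            covered == *size // trailing padding also disqualifies
        }
        // Conservative default: what cannot be proven is not noundef.
        ToyLayout::Opaque { .. } => false,
    }
}
```

In this model, `[u32; 2]` is an `Array` of two 4-byte scalars and passes, while a struct with a `u8` at offset 0 and a `u32` at offset 4 fails because bytes 1..4 are padding.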
size_of_val == 0 for DSTs with Non-zero-sized Prefix via NUW and Assume
size_of_val(p) == 0 fails to optimize away for DST types that have a statically known non-zero-sized prefix:
pub struct Foo<T: ?Sized>(pub [u32; 3], pub T);
pub fn demo(p: &Foo<dyn std::fmt::Debug>) -> bool {
std::mem::size_of_val(p) == 0 // always false, but LLVM can't prove it
}
Foo has a 12-byte prefix, so its total size is always ≥ 12. Yet the comparison persists as a runtime computation in LLVM IR. This matters because Box<dyn T> drop emits this exact check to guard the deallocation call — for types with a guaranteed non-zero prefix, the branch should vanish but doesn’t.
The slice tail variant Foo<[i32]> already optimized correctly; Foo<dyn Trait> and Foo<[u8]> did not.
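To make the claim concrete, the following runnable sketch measures `Foo` with both kinds of unsized tail. `Foo` and `demo` are taken from the snippet above; `dyn_tail_size` and `slice_tail_size` are hypothetical helper names added for illustration. It only observes runtime sizes and says nothing about what LLVM manages to fold.

```rust
use std::fmt::Debug;
use std::mem::size_of_val;

pub struct Foo<T: ?Sized>(pub [u32; 3], pub T);

/// The comparison from the report: false for every possible tail.
pub fn demo(p: &Foo<dyn Debug>) -> bool {
    size_of_val(p) == 0
}

/// Size of a `Foo` whose tail is a trait object (tail size and alignment come from the vtable).
pub fn dyn_tail_size(p: &Foo<dyn Debug>) -> usize {
    size_of_val(p)
}

/// Size of a `Foo` whose tail is a byte slice (tail size comes from the slice length).
pub fn slice_tail_size(p: &Foo<[u8]>) -> usize {
    size_of_val(p)
}
```

Passing `&Foo([1, 2, 3], 7u8)` to `dyn_tail_size` relies on the usual unsizing coercion from `&Foo<u8>` to `&Foo<dyn Debug>`. Even with an empty slice tail, `slice_tail_size` reports the 12-byte prefix, so the result never reaches zero.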
In size_and_align_of_dst (the ADT/Tuple branch), the size computation is:
full_size = (offset + unsized_size + (align-1)) & -align
LLVM cannot prove full_size > 0 because:
- offset + unsized_size used a plain add with no NUW flag, so LLVM cannot conclude the result is ≥ offset.
- (x + addend) & -align: LLVM has no information that alignment rounding never reduces the value below x.
Additionally, the vtable alignment range metadata was [1, u64::MAX] (only non-zero), although the actual bound is [1, 1 << (ptr_width - 1)] (all alignments are powers of two, which gives a tighter upper bound).
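The two missing facts can be checked on plain integers. This is a sketch of the arithmetic only, assuming 64-bit sizes; `dst_full_size` is a hypothetical name, and the code is not the codegen in size_and_align_of_dst. Overflow is modelled with `checked_add`, which corresponds to the case a NUW flag promises never happens.

```rust
/// Models `full_size = (offset + unsized_size + (align - 1)) & -align` on u64.
/// Returns `None` if an intermediate addition would overflow.
pub fn dst_full_size(offset: u64, unsized_size: u64, align: u64) -> Option<u64> {
    assert!(align.is_power_of_two(), "alignments are always powers of two");
    // Without overflow, this sum is >= offset (the first missing fact).
    let unrounded = offset.checked_add(unsized_size)?;
    // Rounding up to a multiple of `align` never goes below `unrounded`
    // (the second missing fact): add `align - 1`, then clear the low bits.
    let bumped = unrounded.checked_add(align - 1)?;
    Some(bumped & align.wrapping_neg()) // `wrapping_neg` yields `-align` as a bit mask
}
```

For the `Foo` example, `dst_full_size(12, 1, 4)` yields `Some(16)` and `dst_full_size(12, 0, 4)` yields `Some(12)`: with a 12-byte prefix the result is at least 12 whatever the tail contributes.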
def_kind in codegen_fn_attrs: Const
Coercing a #[target_feature] const fn to a function pointer inside a const body triggers an ICE (debug builds only):
#[target_feature(enable = "sse2")]
const fn with_target_feature() {}
const X: () = unsafe {
let _: unsafe fn() = with_target_feature; // ICE
};
assertion failed: def_kind.has_codegen_attrs()
unexpected `def_kind` in `codegen_fn_attrs`: Const
Introduced in rust-lang/rust#135504 (2025-01-14, commit 8fee6a77394). adjust_target_feature_sig unconditionally calls codegen_fn_attrs(caller) to get the caller’s target features. codegen_fn_attrs requires that the DefId satisfies has_codegen_attrs(). DefKind::Const, AssocConst, and InlineConst do not — they have no codegen attributes by design. The debug assertion fires.
In release builds the call “worked” accidentally: codegen_fn_attrs on a const would reach the query machinery and happen to return empty attributes, producing a correct (but unguaranteed) result. The bug was latent until debug builds exposed it.
Replace codegen_fn_attrs(caller) with body_codegen_attrs(caller). body_codegen_attrs exists precisely for this case: it delegates to codegen_fn_attrs for function-like DefKinds and returns CodegenFnAttrs::EMPTY for const items. A const body has no target features, so returning empty is semantically correct.
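The delegation described above can be illustrated with a toy model. All names prefixed with `Toy` or `toy_` are hypothetical, the set of definition kinds is deliberately reduced, and the real queries take a `TyCtxt` and a `DefId` rather than these arguments; the sketch only shows why routing through the body-level query avoids the assertion.

```rust
// Toy model only: a reduced, hypothetical stand-in for rustc's DefKind.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToyDefKind {
    Fn,
    AssocFn,
    Closure,
    Const,
    AssocConst,
    InlineConst,
}

impl ToyDefKind {
    /// Const-like items carry no codegen attributes by design.
    pub fn has_codegen_attrs(self) -> bool {
        matches!(self, ToyDefKind::Fn | ToyDefKind::AssocFn | ToyDefKind::Closure)
    }
}

#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct ToyAttrs {
    pub target_features: Vec<&'static str>,
}

/// Models the strict query: in debug builds it asserts on the definition kind.
pub fn toy_codegen_fn_attrs(kind: ToyDefKind, declared: &[&'static str]) -> ToyAttrs {
    debug_assert!(
        kind.has_codegen_attrs(),
        "unexpected `def_kind` in `codegen_fn_attrs`: {kind:?}"
    );
    ToyAttrs { target_features: declared.to_vec() }
}

/// Models the body-level query: delegate for function-like kinds, empty otherwise.
pub fn toy_body_codegen_attrs(kind: ToyDefKind, declared: &[&'static str]) -> ToyAttrs {
    if kind.has_codegen_attrs() {
        toy_codegen_fn_attrs(kind, declared)
    } else {
        ToyAttrs::default()
    }
}
```

Calling `toy_body_codegen_attrs(ToyDefKind::Const, &[])` returns empty attributes without reaching the assertion, which mirrors the behaviour the fix relies on for const bodies.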
Also rename the pre-existing variable callee_features to caller_features (the variable holds the caller's features, not the callee's).