Is Int64 aligned on 4 bytes or 8 bytes on a 32-bit architecture?

toyiye 2024-08-22 02:18

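As the standard that .NET is built on, the CLI Spec (ECMA-335) describes the alignment rules for primitive types as quoted below. Our reading of it is that 8-byte data types (int64, unsigned int64, and float64) are aligned on either 4 or 8 bytes depending on the machine's instruction architecture. More specifically, their alignment on x86 and x64 machines should be 4 bytes and 8 bytes, respectively.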

Built-in data types shall be properly aligned, which is defined as follows:

  • 1-byte, 2-byte, and 4-byte data is properly aligned when it is stored at a 1-byte, 2-byte, or 4-byte boundary, respectively.

  • 8-byte data is properly aligned when it is stored on the same boundary required by the underlying hardware for atomic access to a native int.

Thus, int16 and unsigned int16 start on even address; int32, unsigned int32, and float32 start on an address divisible by 4; and int64, unsigned int64, and float64 start on an address divisible by 4 or 8, depending upon the target architecture. The native size types (native int, native unsigned int, and &) are always naturally aligned (4 bytes or 8 bytes, depending on the architecture). When generated externally, these should also be aligned to their natural size, although portable code can use 8-byte alignment to guarantee architecture independence. It is strongly recommended that float64 be aligned on an 8-byte boundary, even when the size of native int is 32 bits.

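We verify this claim with a simple console program. To simulate a 32-bit platform on a 64-bit machine, we modify the .csproj file as follows, setting the PlatformTarget property to x86 (the default is Any CPU).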

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net7.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <AllowUnsafeBlocks>True</AllowUnsafeBlocks>
    <PlatformTarget>x86</PlatformTarget>
  </PropertyGroup>
</Project>

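In the demo program, we define a record struct named Foobar with two members, of type byte and ulong (unsigned int64). We set them to byte.MaxValue (FF) and ulong.MaxValue (FF-FF-FF-FF-FF-FF-FF-FF) and print the struct's in-memory bytes. To further confirm that the environment matches what the CLI Spec describes, we also print Environment.Is64BitProcess (whether the current process is 64-bit), the size of ulong in bytes (to confirm that it is "8-byte data"), and IntPtr.Size (to confirm that the alignment boundary of native int is 4 bytes).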

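using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

unsafe
{
    // Copy the raw in-memory bytes of a Foobar value into a managed array.
    var bytes = new byte[sizeof(Foobar)];
    var foobar = new Foobar(byte.MaxValue, ulong.MaxValue);
    Marshal.Copy(new nint(Unsafe.AsPointer(ref foobar)), bytes, 0, bytes.Length);
    Console.WriteLine(BitConverter.ToString(bytes));

    // Confirm the process bitness, the size of ulong, and the size of native int.
    Console.WriteLine($"Environment.Is64BitProcess = {Environment.Is64BitProcess}");
    Console.WriteLine($"sizeof(ulong) = {sizeof(ulong)}");
    Console.WriteLine($"IntPtr.Size = {IntPtr.Size}");
}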

public record struct Foobar(byte Foo, ulong Bar);

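The output shows that the environment matches the 32-bit architecture described by the CLI Spec, yet the ulong field Bar is aligned on 8 bytes rather than 4. With 4-byte alignment, the bytes would be FF-00-00-00-FF-FF-FF-FF-FF-FF-FF-FF; if Foobar itself were additionally kept 8-byte aligned, they would be FF-00-00-00-FF-FF-FF-FF-FF-FF-FF-FF-00-00-00-00.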
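The offset of Bar can also be measured directly instead of being read off the byte dump. The following is a minimal sketch, not the author's program: FoobarFields and LayoutProbe are hypothetical names, and the sketch assumes that a plain struct with public fields is laid out the same way as the positional record struct Foobar. It should be built with the same PlatformTarget as the main experiment.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for Foobar: public fields let us take a ref to Bar,
// which is not possible with the properties of a positional record struct.
public struct FoobarFields
{
    public byte Foo;
    public ulong Bar;
}

public static class LayoutProbe
{
    // Byte distance from the start of the struct to its Bar field.
    public static int OffsetOfBar()
    {
        var value = new FoobarFields { Foo = byte.MaxValue, Bar = ulong.MaxValue };
        return (int)Unsafe.ByteOffset(
            ref Unsafe.As<FoobarFields, byte>(ref value),
            ref Unsafe.As<ulong, byte>(ref value.Bar));
    }

    public static void Main()
    {
        Console.WriteLine($"offset of Bar = {OffsetOfBar()}");
        Console.WriteLine($"size of struct = {Unsafe.SizeOf<FoobarFields>()}");
    }
}
```

An offset of 4 would match the 4-byte reading of the spec, while an offset of 8 would match the 8-byte alignment observed in the byte dump.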

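We have not yet found an authoritative answer to this question. Am I misreading the CLI Spec, or is the verification program flawed? Pointers from anyone familiar with the topic would be much appreciated. So far, Googling has turned up the following related discussions: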

Memory alignment on a 32-bit Intel processor

The usual rule of thumb (straight from Intel's and AMD's optimization manuals) is that every data type should be aligned by its own size. An int32 should be aligned on a 32-bit boundary, an int64 on a 64-bit boundary, and so on. A char will fit just fine anywhere.

Another rule of thumb is, of course "the compiler has been told about alignment requirements". You don't need to worry about it because the compiler knows to add the right padding and offsets to allow efficient access to data.

Why is the default alignment for `int64_t` 8 byte on 32 bit x86 architecture?

Interesting point: If you only ever load it as two halves into 32bit GP registers, then 4B alignment means those operations will happen with their natural alignment.

However, it's probably best if both halves of the variable are in the same cache line, since almost all accesses will read / write both halves. Aligning to the natural alignment of the whole thing takes care of that, even ignoring the other reasons below.

32bit x86 can load 64bit integers in a single 64bit-load using MMX or SSE2 movq. Handling 64bit add/sub/shift/ and bitwise booleans using vector instructions is more efficient (single instruction), as long as you don't need immediate constants or mul or div. The vector instructions with 64b elements are still available in 32b mode.

Atomic 64bit compare-and-exchange is also available in 32bit mode (lock CMPXCHG8B m64 works just like 64bit mode's lock CMPXCHG16B m128, using two implicit registers (edx:eax)). IDK what kind of penalty it has for crossing a cache-line boundary.

Modern x86 CPUs have essentially no penalty for misaligned loads/stores unless they cross cache-line boundaries, which is why I'm only saying that, and not saying that misaligned 64b would be bad in general. See the links in the x86 wiki, esp. Agner Fog's guides.

Why is the "alignment" the same on 32-bit and 64-bit systems?

MSVC targeting 32-bit x86 gives __int64 a minimum alignment of 4, but its default struct-packing rules align types within structs to min(8, sizeof(T)) relative to the start of the struct. (For non-aggregate types only). That's not a direct quote, that's my paraphrase of the MSVC docs link from @P.W's answer, based on what MSVC seems to actually do. (I suspect the "whichever is less" in the text is supposed to be outside the parens, but maybe they're making a different point about the interaction on the pragma and the command-line option?)

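I ran the following supplementary experiments, which show that the alignment of a standalone 8-byte integer (a long local in the code below) is indeed consistent with the CLI Spec. Could it be that an 8-byte data type follows different alignment rules on its own than when it is a field of a composite type (struct/class)?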

x64: the following assertion always holds.

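using System.Diagnostics;
using System.Runtime.CompilerServices;

var random = new Random();
unsafe
{
    long v = random.NextInt64();
    // The address of the local is a multiple of 8.
    Debug.Assert(new IntPtr(Unsafe.AsPointer(ref v)).ToInt64() % 8 == 0);
}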

x86: the following assertion also always holds.

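using System.Diagnostics;
using System.Runtime.CompilerServices;

var random = new Random();
unsafe
{
    long v = random.NextInt64();
    // The address of the local is a multiple of 4.
    Debug.Assert(new IntPtr(Unsafe.AsPointer(ref v)).ToInt32() % 4 == 0);
}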

x86: the following assertion is not guaranteed to hold.

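using System.Diagnostics;
using System.Runtime.CompilerServices;

var random = new Random();
unsafe
{
    long v = random.NextInt64();
    // The address of the local is not necessarily a multiple of 8.
    Debug.Assert(new IntPtr(Unsafe.AsPointer(ref v)).ToInt32() % 8 == 0);
}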
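The three experiments above can be folded into one program that reports the remainders instead of asserting on them, which makes it easier to compare x86 and x64 builds side by side. This is only a sketch: AlignmentProbe and LocalAddressModulo are hypothetical names, and it requires AllowUnsafeBlocks just like the project above.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class AlignmentProbe
{
    // Returns the address of a long local modulo the given boundary.
    public static unsafe ulong LocalAddressModulo(uint boundary)
    {
        long v = 42;
        // Go through nuint so that a 32-bit address is never sign-extended.
        ulong address = (nuint)Unsafe.AsPointer(ref v);
        return address % boundary;
    }

    public static void Main()
    {
        Console.WriteLine($"Environment.Is64BitProcess = {Environment.Is64BitProcess}");
        Console.WriteLine($"address % 4 = {LocalAddressModulo(4)}");
        Console.WriteLine($"address % 8 = {LocalAddressModulo(8)}");
    }
}
```

Going by the experiments above, the remainder modulo 4 should be 0 on both targets, while the remainder modulo 8 may come out as either 0 or 4 on x86.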
