Likely incorrect 64bit code generation in FPC v3.0.4

Forum > FPC development

(1/8) > >>

440bx:
Hello,

In 64bit, when using a negative integer index to index into an array, the index isn't loaded into a 64bit register, instead it is loaded into a 32bit register which is then later used as a 64bit register in a load effective address instruction. This causes the integer index to always be used as if it were a positive displacement.

for the line:
--- Code: Pascal [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} --- LineIdxPtr^[RowIdx][I + 1] := LineAreaPtr; in 32bit, FPC generates :
--- Code: Pascal [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---// (from GDB) - 32bit code generated (this works) SignExtend.lpr:132 LineIdxPtr^[RowIdx][I + 1] := LineAreaPtr;0040198F 8b1520804100 mov 0x418020,%edx00401995 a130804100 mov 0x418030,%eax0040199A 8b0d40804100 mov 0x418040,%ecx004019A0 8d0482 lea (%edx,%eax,4),%eax004019A3 8b1510804100 mov 0x418010,%edx004019A9 891488 mov %edx,(%eax,%ecx,4)
which works correctly because there is no need to sign extend the integer index RowIdx.

in 64bit, FPC generates:
--- Code: Pascal [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---// (from GDB) - 64bit code generated (this does NOT work) SignExtend.lpr:132 LineIdxPtr^[RowIdx][I + 1] := LineAreaPtr;0000000100001B02 488b0517e50100 mov 0x1e517(%rip),%rax # 0x1000200200000000100001B09 8b1521e50100 mov 0x1e521(%rip),%edx # 0x1000200300000000100001B0F 48630d2ae50100 movslq 0x1e52a(%rip),%rcx # 0x1000200400000000100001B16 488d04d0 lea (%rax,%rdx,8),%rax0000000100001B1A 488b15efe40100 mov 0x1e4ef(%rip),%rdx # 0x1000200100000000100001B21 488914c8 mov %rdx,(%rax,%rcx,8) in that sequence of code, RowIdx is being loaded into edx, instead of rdx with sign extension. This causes the instruction
--- Code: Pascal [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---0000000100001B16 488d04d0 lea (%rax,%rdx,8),%raxto compute an address which is incorrect.

The attached program works when compiled for 32bit but causes an access violation when compiled for 64bit.

Changing the line
--- Code: Pascal [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} --- LineIdxPtr^[RowIdx][I + 1] := LineAreaPtr; to include a typecast to ptrint or int64:
--- Code: Pascal [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} --- LineIdxPtr^[ptrint(RowIdx)][I + 1] := LineAreaPtr; causes the correct code to be generated.

Just in case, I thought I'd wait for comments before reporting this on Mantis.

Jonas Maebe:
The reason the compiler uses a load without sign extension is that your arrays are declared as having only valid unsigned index types (0..0 and 1..1). I know this is a common way to declare variable sized arrays, but it's still hack and these statements are all invalid (they cause range errors, which means the result is undefined).

The correct way to declare such arrays is to either declare them as array[low(ptrint) div sizeof(elementsize)..high(ptrint) div sizeof(elementsize)], or to use a pointer type to the element size (you can index pointer types like arrays both in FPC and in Delphi).

440bx:

--- Quote from: Jonas Maebe on March 13, 2019, 09:00:40 am ---The reason the compiler uses a load without sign extension is that your arrays are declared as having only valid unsigned index types (0..0 and 1..1). I know this is a common way to declare variable sized arrays, but it's still hack and these statements are all invalid (they cause range errors, which means the result is undefined).

The correct way to declare such arrays is to either declare them as array[low(ptrint) div sizeof(elementsize)..high(ptrint) div sizeof(elementsize)], or to use a pointer type to the element size (you can index pointer types like arrays both in FPC and in Delphi).

--- End quote ---
I don't buy that explanation but, leaving that aside. The fact that the code behaves differently when compiled for 32bit than in 64bit is clearly a problem. The result should _not_ depend on the bitness. Either there is a bug when compiling for 64bit or there is a bug when compiling for 32bit. You choose.

PascalDragon:
No, it's not a problem, because as Jonas said you've entered the realm of undefined behaviour and it's nowhere written that this undefined behaviour needs to be the same on all platforms.

440bx:

--- Quote from: PascalDragon on March 13, 2019, 10:05:04 am ---No, it's not a problem, because as Jonas said you've entered the realm of undefined behaviour and it's nowhere written that this undefined behaviour needs to be the same on all platforms.

--- End quote ---
RowIdx is an integer. Specifically, a 32bit integer. If it is going to be moved into a 64bit register then it _must_ be sign extended. Anything else is incorrect and cannot be justified.

It is not reasonable to claim that is some kind of "undefined behavior".

That is a bug, a nasty one at that, because the result depends on the target bitness.

Navigation

[0] Message Index

[#] Next page