Lazarus

Title: gdb "pause" broken?
Post by: HHick123 on December 26, 2018, 04:36:09 pm
Hi, on a fresh install of Lazarus 1.8.4.

When I compile the standard (default) application, run it and press "pause", the "assembler window" appears. When I then hit F8 to step,
the debugger crashes with "The debugger encountered an error when trying to run/step the application: Cannot find bounds of current function."

Any ideas?

Regards, Helmut
Title: Re: gdb "pause" broken?
Post by: Martin_fr on December 26, 2018, 09:29:13 pm
What OS?
What gdb version?

This is a problem in gdb itself. It can happen in any debugged app, but may not always happen.

Pause can interrupt your app anywhere, including at some code that does not have line info. (kernel or RTL)
At such a location, stepping (except asm instruction stepping) does not work. That is expected: it is impossible for it to work in that case.

GDB should not crash, but it does in your case. Nothing that can be done, except maybe trying a different version of gdb.

If you stop at such a location you can:
- try and set a breakpoint (if the stackwindow works, which it might not)
- do assembler stepping (either from the buttons in the asm dialog, or by assigning keys for this)
- run and pause again, until you get lucky and break at a location in your source.

You can also compile your own fpc with line info. Then at least a pause in the RTL will have line info. The kernel still will not have it (and neither will 3rd party DLLs).

---------------------------
If you pause while your app is IDLE, then you will always be in the kernel, and there is no stepping at all.

If your app is IDLE then it is interrupted by the OS until it gets input (or a timer, or paint event). Your OS will hold it at some code in the kernel where there is no stack, no caller, no stepping, ....
Title: Re: gdb "pause" broken?
Post by: HHick123 on December 31, 2018, 12:11:27 pm
> What OS?
Windows 10

> What gdb version?
I think it is
8.2.0 for codetyphon
7.3.50 for lazarus

Regards, Helmut
Title: Re: gdb "pause" broken?
Post by: Thaddy on December 31, 2018, 12:42:25 pm
Can you try just the GDB from the Lazarus version? I will not debug Codetyphon issues, and I do not have GDB 8.2.0, nor am I going to install it now. (Although I see the stable version is 8.2.1.)
Title: Re: gdb "pause" broken?
Post by: Martin_fr on December 31, 2018, 12:53:28 pm
Lazarus can run with either GDB version.
Of course each gdb version has its own set of bugs, and those will be present depending on which one you use.

In any case, from the description so far, the best you can hope for is that gdb does not crash on the error.
Even then it would still not be able to "step out", and would fail with the exact same error. It also may (or may not) fail to show a stacktrace...

That is not really a bug, but depends on the code in which you got paused.

Maybe you can share more details (though that may NOT mean that more answers are available...)
- What is the function name you get paused in? (if it starts with ntdll, then it is in the kernel, and there may not be a way out, or it may need hundreds of steps)
- Is your app a GUI app?
- When you hit pause, is your app busy (does your computer show cpu usage for your app)?

------------
And definitely try changing the current thread.
You may just get to see the wrong thread.
Title: Re: gdb "pause" broken?
Post by: HHick123 on December 31, 2018, 05:51:05 pm
> Maybe you can share more details

For example, when I start this simple test code (a TTimer component at Form1):

Code:
procedure TForm1.Timer1Timer(Sender: TObject);
var
  i: Integer;
begin
  Timer1.Enabled := False;
  i := 0;
  // Intentional endless loop, used only to test "pause" in the debugger.
  repeat
    i := i + 1;
    if i > 10 then i := 0;
  until False;
end;

Of course, it does not return from the loop. So imagine it did not return because of a bug and I wanted to see "where my program runs"... I would hit "pause". But then the assembler window appears, and when I try to continue via F7 or F8 it shows the above-mentioned error... Interestingly, it always shows the same assembler instruction after pause (it does not seem to be random)...

Edit: I just found out: it does not crash, as I thought before, but recovers from the error! When I press ok, I really can step via F8  :-)
btw: The German translation of the buttons is a bit misleading... The caption of the button "Weiter" (which actually means "continue") should be replaced by something like "Details...", and "Halt" should be "Stop", as mentioned in the text above the buttons....

Title: Re: gdb "pause" broken?
Post by: Martin_fr on December 31, 2018, 06:21:48 pm
The address in your asm shows that you are in the kernel, or maybe some dll.

Since Microsoft does not ship debug info for that (wonder why :) ), stepping (out) is not (always) possible.

There are 2 possibilities:

- Pause pauses in the wrong thread (there are threads, even if you do not create them. Windows does create them if a debugger requests a pause...)
=> switch the thread (in the threads window of the debugger)

- your code makes calls to the OS, or library. (eg changing properties of visible components)
=> if you are lucky and there is info in the stack => good / otherwise: run and pause again until lucky / or set breakpoints in your code and hope one gets hit.
Title: Re: gdb "pause" broken?
Post by: Martin_fr on December 31, 2018, 06:28:10 pm
Strange, the error window should only have 2 buttons.

Feel free to report this as a bug. (I will then check if the English window has 3 buttons too, but probably it has)

Note: there are 2 issues.

1) 3 buttons, only 2 needed

2) whatever the German translation does or does not obscure. Separate issue. Will not be reviewed as part of the 3 buttons issue, as a different person needs to take care of it.
Title: Re: gdb "pause" broken?
Post by: HHick123 on December 31, 2018, 10:10:04 pm
Quote
Pause pauses in the wrong thread (there are threads, even if you do not create them. Windows does create them if a debugger requests a pause...)
=> switch the thread (in the threads window of the debugger)

Ah, yes, you are right!

By switching the thread (thread window) I can avoid the error message "Cannot find bounds of current function" and stepping via F8, etc. works!

But isn't it a bug? I think it would be better to stop in or switch to the right thread automatically. At least in cases, when it stopped in a thread that was not created by the user code.
Title: Re: gdb "pause" broken?
Post by: Martin_fr on January 01, 2019, 01:09:35 am
Quote
By switching the thread (thread window) I can avoid the error message "Cannot find bounds of current function" and stepping via F8, etc. works!

But isn't it a bug? I think it would be better to stop in or switch to the right thread automatically. At least in cases, when it stopped in a thread that was not created by the user code.
Ideally yes. I am not sure if it is always possible for the IDE to tell which thread is correct.

You can report it; it will need some investigation. And it will probably be rather low priority right now.
Title: Re: gdb "pause" broken?
Post by: HHick123 on January 10, 2019, 10:16:56 pm
Quote
I am not sure if it is always possible for the IDE to tell which thread is correct.
Hmm, when I compare the thread window with the thread output of the Sysinternals Process Explorer, it looks as follows (attached jpg).

I did not test much with different gdb versions, but the number of threads shown in the thread window always seems to be equal (sometimes 2, sometimes 3, etc.) to the number of threads shown by Process Explorer. I can easily identify the correct thread via Process Explorer (I marked it yellow), but I cannot always find it (at least it's not obvious) in the thread window (above). But Process Explorer can show it, so it must be possible.

But is this functionality from gdb or from lazarus? gdb, I guess? maybe a gdb bug?
If I would have more time, I would look into this topic more deeply. Maybe someday.
Title: Re: gdb "pause" broken?
Post by: Martin_fr on January 11, 2019, 12:19:39 am
The thread info comes directly from gdb.
IIRC I have seen gdb sometimes show the correct line and source file, similar to the Process Explorer (or even better, as it has the debug info).

The ID (1,7,13) is from gdb, and usually 1 is your main thread.
The id you see in Process Explorer is in the 2nd column: 10688 = 29C0 (704.29C0)
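
To make the conversion above concrete: Process Explorer lists the thread id in decimal (10688), while the thread window shows it in hex (29C0). A minimal sketch of the arithmetic in Free Pascal follows; the helper names `TidToHex` and `HexToTid` are made up for this illustration and are not part of Lazarus or gdb.

```pascal
program ThreadIdHex;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: decimal thread id -> hex text, e.g. for comparing
// a Process Explorer id with the hex id shown in the thread window.
function TidToHex(Tid: Int64): string;
begin
  Result := IntToHex(Tid, 1); // 1 = minimum number of digits
end;

// Hypothetical helper: hex text -> decimal thread id.
function HexToTid(const Hex: string): Int64;
begin
  Result := StrToInt64('$' + Hex); // a '$' prefix marks a hex literal
end;

begin
  WriteLn(TidToHex(10688));  // prints 29C0
  WriteLn(HexToTid('29C0')); // prints 10688
end.
```

With both ids in the same base, the Process Explorer row and the thread window row can be matched by eye.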

Process Explorer says the instruction pointer (execution pos) is 0x2cb0 bytes into the code segment of your Tax.exe.
GDB says it is at 7FF9D01.... => I would say that looks like an address in the kernel.

So I do not know who has the correct address (or if they are the same, and your code (compiled pascal) ended up at that address).
If they are not the same, I do not know who is right. Or why gdb (if indeed it is gdb) fails to get the correct address.

If gdb had an address inside your code, it would most likely show the source file and line.


I just did a test. (using your endless repeat loop)


Try this gdb https://sourceforge.net/projects/lazarus/files/Lazarus%20Windows%2064%20bits/Alternative%20GDB/GDB%208.2/



It would probably be a good idea if the IDE remembered the last selected thread id (1, 7 or 13) and re-selected it if available. (On pause / not on reaching a breakpoint).
Feel free to add a feature request.
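
The selection rule proposed above could be sketched roughly as follows. This is only an illustration of the idea, not code from the Lazarus IDE; `TStopReason`, `PickThread` and all parameter names are hypothetical.

```pascal
program PickThreadDemo;
{$mode objfpc}{$H+}

type
  // Hypothetical: why the debugger stopped.
  TStopReason = (srPause, srBreakpoint);

// Returns the thread id to select after a stop.
// On a user pause, prefer the previously selected thread if it still exists;
// in every other case keep the thread reported by the debugger.
function PickThread(const Available: array of Integer;
  LastSelected, Reported: Integer; Reason: TStopReason): Integer;
var
  Id: Integer;
begin
  Result := Reported;
  if Reason <> srPause then
    Exit;
  for Id in Available do
    if Id = LastSelected then
      Exit(LastSelected);
end;

begin
  // Paused while gdb reports thread 13; thread 1 was selected last time.
  WriteLn(PickThread([1, 7, 13], 1, 13, srPause));      // prints 1
  // A breakpoint stop keeps the reported thread.
  WriteLn(PickThread([1, 7, 13], 1, 13, srBreakpoint)); // prints 13
  // The remembered thread no longer exists: fall back to the reported one.
  WriteLn(PickThread([7, 13], 1, 13, srPause));         // prints 13
end.
```

The breakpoint case is kept separate because there the reported thread is the one that actually hit the breakpoint.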
Title: Re: gdb "pause" broken?
Post by: HHick123 on February 11, 2019, 12:39:02 am
Yes, that sounds very interesting.
One more thought:

When comparing the call stack of "Process Explorer" with the call stack of Lazarus, I found that in many cases the call stack of Lazarus is shorter (see e.g. the attached screenshot). I think I also saw cases where "project1.exe" could not be found in the Lazarus call stack at all, because the display ended before reaching "project1.exe".

Is the short call stack displayed in Lazarus due to a gdb/Lazarus issue, or is it intended this way and no problem at all?

Regards, Helmut
Title: Re: gdb "pause" broken?
Post by: Martin_fr on February 11, 2019, 02:28:32 am
Quote
Is the short call stack displayed in Lazarus due to a gdb/lazarus issue, or is intended this way and no problem at all?
If it does not expand when pressing the "more" button, or increasing the limit, then it is a gdb issue.
Title: Re: gdb "pause" broken?
Post by: HHick123 on February 13, 2019, 10:25:13 pm
P.S.: I tried to attach "Code::Blocks" to a running Lazarus-exe.
Interestingly, it also shows the "short" stacktraces with ??, just like Lazarus.
When trying to step it shows similar behaviour "cannot find bounds of current function"... (screenshot)

Edit:
The gdb stops the backtrace with the following error:
"Backtrace stopped: Previous Frame inner to this frame (corrupt stack?)"

I googled this error and found an interesting thread on the gdb mailing list: it seems that "Go" once had a similar problem with gdb (I think they have their own debugger now), and they discussed whether an option to disable the frame checking should be implemented in gdb:

UNWIND_INNER_ID:
https://sourceware.org/ml/gdb/2012-10/msg00009.html
Title: Re: gdb "pause" broken?
Post by: Martin_fr on February 16, 2019, 09:52:41 pm
Code::Blocks uses gdb too. So the same result is to be expected (on the same executable).

If I understand the external post (go) correctly, then without the check you are highly likely to get wrong frames in the trace. That would not be much more helpful.

I do not know all the tricks that gdb may (or may not) use to get a trace.

A normal stack follows certain rules, involving 2 registers SP (esp or rsp) and BP (ebp or rbp). So long as all the code uses them correctly, the stack should be ok.
With one exception already: if you are on the very first line, or the very last line (in most cases a handful of asm statements), then it may not work (it also may work), because the asm is just building the frames for the stack. Those parts of the code are called prologue and epilogue.

In addition to that a compiler can write dwarf info helping the debugger. I do not know if fpc does, or if gdb uses this (if coming from fpc).
But since the issue is with code in the kernel or a 3rd party lib, it would not help.

Furthermore, maybe (no idea) gdb reads the surrounding asm, and tries to make guesses about what to expect.
If so, then gdb would know best the asm generated by gcc.

Now as for the correct usage of those 2 registers.
On certain optimization levels, and under very restrictive conditions fpc itself omits this. (for speed).
The rtl as compiled in the installer, has a few such cases.
Those cases always lead to the next higher frame being hidden. That is: one frame from the stack is simply not shown.

I have never bothered to check, but it is plausible that the kernel (and 3rd party libs) also omit this.
If they do that would also lead to skipped frames.

But combined with other optimizations, this can lead to the stack being entirely unreadable.
The stack contains a mixture of local vars, temp data, and the address of calling functions.
Only if you know, where in this mix the address of the calling function is, can you read the next frame.
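
The frame rules described above can be illustrated with a toy model. The sketch below is not how gdb unwinds; it only assumes a simplified layout in which slot `Stack[BP]` holds the caller's BP and `Stack[BP + 1]` holds the return address, and it stops when the saved BP does not point to a higher slot, loosely mirroring the "previous frame inner to this frame" check. All names (`TStack`, `GoodStack`, `CountFrames`) are invented for the example.

```pascal
program FrameWalk;
{$mode objfpc}{$H+}

const
  StackSize = 16;

type
  TStack = array[0..StackSize - 1] of Integer;

const
  // Three frames with BP at slots 2, 6 and 10. A saved BP of 0 marks the
  // outermost frame. The remaining slots stand in for locals and temp data.
  GoodStack: TStack = (0, 0, 6, 100, 0, 0, 10, 200, 0, 0, 0, 300, 0, 0, 0, 0);

// Follows the saved-BP chain and returns the number of return addresses found.
function CountFrames(const Stack: TStack; BP: Integer): Integer;
var
  NextBP: Integer;
begin
  Result := 0;
  while (BP > 0) and (BP < StackSize - 1) do
  begin
    WriteLn('return address: ', Stack[BP + 1]);
    Inc(Result);
    NextBP := Stack[BP];
    // A caller's frame must sit at a higher slot than the current frame.
    if (NextBP <> 0) and (NextBP <= BP) then
    begin
      WriteLn('backtrace stopped: previous frame inner to this frame');
      Break;
    end;
    BP := NextBP;
  end;
end;

var
  Bad: TStack;
begin
  WriteLn('frames: ', CountFrames(GoodStack, 2)); // finds 3 frames
  Bad := GoodStack;
  Bad[6] := 3; // the saved-BP slot now holds unrelated data
  WriteLn('frames: ', CountFrames(Bad, 2));       // stops after 2 frames
end.
```

In the second call the walker cannot tell where the next return address is, so the trace ends early, much like the short call stacks discussed in this thread.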

It is possible that Sysinternals (being specific to Windows) has access to further debug info for the kernel. If such info exists, it would be in a Windows-specific format (and not dwarf), or even hard-coded into Sysinternals products. GDB can only read dwarf (and stabs).

Anyway, there is nothing that can be done. (You can report it to gdb, but that is a long shot.)