Author Topic: String Processing CPU Speed and Large Data Sets  (Read 4173 times)

JLWest

  • Hero Member
  • *****
  • Posts: 1293
String Processing CPU Speed and Large Data Sets
« on: December 13, 2018, 01:14:27 am »
"String Processing CPU Speed and Large Data Sets" almost sounds like a Country Western Song.

Oh well:

I have a program under way and I'm having a little trouble with speed.

I think I have too much speed. Is that possible?

The program has 5 list boxes loaded with the following:
1. ICAO Operators     5597 items
2. Airports           4723 items
3. Airports           4723 items
4. Aircraft Objects   8631 items
5. AircraftTypes      4129 items


So I select a file from a drop down box and process the records.

Files have from a few hundred up to 5,000 records.
Each record has 8 or 9 fields (one is optional).
Each field is processed against the four listboxes.
 
So basically read a record:
    Check if the first field is in the ICAO Operators listbox.
    Second field in the Airports listbox.
    Third field in the Airports listbox.
    Fourth field in the AircraftObjects listbox.

The AircraftTypes listbox isn't being used right now due to the speed problem encountered.
And the other three fields are validated with code i.e. Zulu time between 0000 and 2400.

Now here is the problem:

The program was going to Not Responding: it would show in the title of the main form, the screen would dim, and Task Manager would show Not Responding. I would end the task and try to figure out the problem.

I thought it was a data problem, so I put in a throttle switch (a MessageBox) after every 200 records, hoping to find the bad data.

It went through a 3,000-record file, 200 records at a time, just fine. As long as I process large files with the throttle, it works fine.

Any suggestions?



 




FPC 3.2.0, Lazarus IDE v2.0.4
 Windows 10 Pro 32-GB
 Intel i7 770K CPU 4.2GHz 32702MB Ram
GeForce GTX 1080 Graphics - 8 Gig
4.1 TB

440bx

  • Hero Member
  • *****
  • Posts: 3946
Re: String Processing CPU Speed and Large Data Sets
« Reply #1 on: December 13, 2018, 02:04:47 am »
Quote from: JLWest
"It went through a 3,000-record file, 200 records at a time, just fine. As long as I process large files with the throttle, it works fine.

Any suggestions?"

From the description you give, I'd _guess_ that the problem, whatever it may be, happens after record 200.

Is it possible for you to post the project's source and the files you are using ?  That would allow someone to figure out what is happening in your program.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

JLWest

« Reply #2 on: December 13, 2018, 02:15:38 am »
@440

I press OK on the messagebox, processing 200 records at a time on a 3000 record file, 60 times, and it finishes the file just fine.

As for posting the project, it's too big. The program is small, but the data files needed to reproduce this are very large. I would have to do it in 5 or 6 posts.

Plus the AircraftObjects file is 948,020 KB zipped.

I suspect it gets in a tight loop searching that and falls out of bed.
« Last Edit: December 13, 2018, 02:32:11 am by JLWest »

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: String Processing CPU Speed and Large Data Sets
« Reply #3 on: December 13, 2018, 02:50:04 am »
"Not Responding" because the main thread (GUI thread) is busy. Use another thread for long processing.

To check, instead of MessageBox, try using a label or the caption of the form to track the number of records, and call Application.ProcessMessages to allow the GUI thread to respond:
Code: Pascal
  Caption := i.ToString;
  Application.ProcessMessages;

Edit:
I suspect you are misusing the five list boxes. Probably you need a few TStringList instances.
« Last Edit: December 13, 2018, 02:53:55 am by engkin »

JLWest

« Reply #4 on: December 13, 2018, 03:00:19 am »
Quote from: engkin
"Not Responding" because the main thread (GUI thread) is busy. Use another thread for long processing.

You're probably right, but what you are proposing is above my grade level as a Pascal programmer.
I would have to research thread processing, and I can't manage that at present.

Thanks.

JLWest

« Reply #5 on: December 13, 2018, 03:22:23 am »
Wow

I put Application.ProcessMessages; inside the loop that processes the records.

Problem solved.

Thanks.

JLWest

« Reply #6 on: December 13, 2018, 03:29:46 am »
Not so sure now.

Processed a large file. It got through it just fine, but I couldn't exit the program.

engkin

« Reply #7 on: December 13, 2018, 04:06:45 am »
Application.ProcessMessages was not a solution. And this also is not a solution:
Add a boolean variable to your form:
Code: Pascal
  TForm1 = class(TForm)
  ...
  private
    NeedToExit: boolean;  // set to true when the user asks to close the form
  ...

Use OnCloseQuery event:
Code: Pascal
procedure TForm1.FormCloseQuery(Sender: TObject; var CanClose: boolean);
begin
  NeedToExit := true;
end;

Here is how to exit the loop:
Code: Pascal
...
    if i mod 100 = 0 then
    begin
      Caption := i.ToString;
      Application.ProcessMessages;
      if NeedToExit then
        exit;
...
« Last Edit: December 13, 2018, 04:08:44 am by engkin »

JLWest

« Reply #8 on: December 13, 2018, 04:56:48 am »
Coded and running now.

Don't understand how it works, which is dangerous. Changed the title on the Main form to a 0.

Seems to be working fine on a large data set, which is the Alaska Airlines 2018 worldwide flight schedules.
One record for each scheduled flight.


engkin

« Reply #9 on: December 14, 2018, 02:20:23 am »
Quote from: JLWest
"Don't understand how it works which is dangerous."
Unlike the main application loop, ProcessMessages does not handle the message that ends the application.

Quote from: JLWest
"Changed the title on the Main form to a 0."
Caption := i.ToString is not needed. It is there to give a relative location in your code for ProcessMessages. Also, you do not need to call ProcessMessages for every record. Once every 100th record should be enough, I guess. So i is meant to be a record counter in your code. Along these lines:
Code: Pascal
var
  i: integer;
begin
  for i := 1 to TotalNumberOfRecord do
  begin
    // Your code to process a record goes here
    // ...
    // End of your code

    // Let the GUI handle pending messages once every 100 records
    if i mod 100 = 0 then
    begin
      Application.ProcessMessages;
      if NeedToExit then
        exit;
    end;
  end;
end;

This should speed up processing a file with 5000 records by reducing the number of calls to ProcessMessages to 50.

CCRDude

  • Hero Member
  • *****
  • Posts: 596
Re: String Processing CPU Speed and Large Data Sets
« Reply #10 on: December 14, 2018, 09:37:22 am »
Let me add my two cents here as well ;)

Threading is the only guarantee against a freeze. Such a freeze will happen if something expects the program to react within a certain time, but it can't, because it's busy. So you need to find a trade-off between speed (Application.ProcessMessages only every nth entry, with a huge n), and reaction time on the slowest machine for the slowest entry (small n). You'll always make sacrifices.

Some kind of visual feedback is helpful (e.g. TProgressBar): if the user in front of the program knows it's working, he's less likely to click and cause a freeze. But updating the UI slows things down massively, since Application.ProcessMessages will take longer.

Since UI updating is an issue even with threaded operations, I often take a different approach: I have an external "progress" object (for threading, I use a descendant of TMultiReadExclusiveWriteSynchronizer), and the main UI has a temporary timer that simply refreshes the progress every 40 to 100 ms (1/25 to 1/10 second), taking input from that object as needed. This means that only those steps that are "visually important" to the user are pushed to the UI. So even with many Application.ProcessMessages calls, the number of UI updating messages is kept low.
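
A rough console sketch of the worker-plus-polling idea described above, not the actual code behind this post. It swaps the TMultiReadExclusiveWriteSynchronizer-based progress object for a single counter updated with InterlockedIncrement, and the polling loop stands in for a TTimer.OnTimer handler. TRecordWorker, its Done property, the record count of 300 and the Sleep(1) placeholder are all made up for illustration.

```pascal
program WorkerProgress;
{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}cthreads,{$ENDIF}
  Classes, SysUtils;

type
  // Hypothetical worker: stands in for the record-validation loop.
  TRecordWorker = class(TThread)
  private
    FTotal: Integer;
    FDone: LongInt;  // written by the worker, read by the main thread
  protected
    procedure Execute; override;
  public
    constructor Create(ATotal: Integer);
    property Done: LongInt read FDone;
  end;

constructor TRecordWorker.Create(ATotal: Integer);
begin
  FTotal := ATotal;          // set before the thread can start running
  inherited Create(False);   // False = not suspended
end;

procedure TRecordWorker.Execute;
var
  i: Integer;
begin
  for i := 1 to FTotal do
  begin
    if Terminated then
      Exit;
    Sleep(1);                     // placeholder for validating one record
    InterlockedIncrement(FDone);  // publish progress without touching the UI
  end;
end;

var
  Worker: TRecordWorker;
begin
  Worker := TRecordWorker.Create(300);
  try
    // In a GUI program this polling would live in a TTimer.OnTimer handler
    // (Interval around 40..100 ms) that updates a label or progress bar.
    while not Worker.Finished do
    begin
      WriteLn('processed ', Worker.Done, ' records');
      Sleep(50);
    end;
    Worker.WaitFor;
    WriteLn('done: ', Worker.Done);
  finally
    Worker.Free;
  end;
end.
```

The point of the design is that the worker never touches a control; the UI side decides how often to look at the counter, so the refresh rate is independent of how fast records are processed.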

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: String Processing CPU Speed and Large Data Sets
« Reply #11 on: December 14, 2018, 10:14:27 am »
I'm no GUI expert, but doesn't the listbox have BeginUpdate etc. methods to delay updating?

A BeginUpdate before and an EndUpdate after might speed things along.

furious programming

  • Hero Member
  • *****
  • Posts: 853
Re: String Processing CPU Speed and Large Data Sets
« Reply #12 on: December 14, 2018, 03:30:22 pm »
@marcov: not so much TListBox as TStrings. It is strongly recommended to call Items.BeginUpdate and Items.EndUpdate before and after modifying the content.

In general, it does not matter which component it is: if these methods are available, we should use them.
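
A minimal sketch of the BeginUpdate/EndUpdate pattern, assuming the goal is to fill a list without one repaint per item. FillItems is a hypothetical helper; it takes any TStrings, so in a form it could be called as FillItems(ListBox1.Items, 5000), where ListBox1 is an assumed control name. A TStringList is used here only so the example runs as a console program.

```pascal
program BatchFill;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Fills any TStrings inside a BeginUpdate/EndUpdate pair, so that a
// control attached to the list is notified once instead of once per Add.
procedure FillItems(Items: TStrings; Count: Integer);
var
  i: Integer;
begin
  Items.BeginUpdate;
  try
    Items.Clear;
    for i := 1 to Count do
      Items.Add('Item ' + IntToStr(i));
  finally
    Items.EndUpdate;  // always runs, even if Add raises an exception
  end;
end;

var
  L: TStringList;
begin
  L := TStringList.Create;
  try
    FillItems(L, 5000);
    WriteLn('items loaded: ', L.Count);
  finally
    L.Free;
  end;
end.
```

This only helps while the list is being modified; it does not speed up lookups in a list that is already loaded.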
Lazarus 3.2 with FPC 3.2.2, Windows 10 — all 64-bit

Working solo on an arcade, action/adventure game in retro style (pixelart), programming the engine and shell from scratch, using Free Pascal and SDL. Release planned in 2026.

engkin

« Reply #13 on: December 14, 2018, 07:35:58 pm »
I suspect JLWest is using TListBox.Items.IndexOf which does not benefit from sorted items. The speed could be improved using TStringList.Find on a sorted list.
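
A minimal sketch of the sorted-list lookup suggested above, written as a console program so it runs without a form. The list name Operators and the three sample codes are made up for illustration; in the real program the list could presumably be filled with Operators.Assign(ListBox1.Items) or LoadFromFile, which is an assumption about how the data is loaded.

```pascal
program SortedLookup;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

var
  Operators: TStringList;  // hypothetical stand-in for the ICAO Operators list
  Idx: Integer;
begin
  Operators := TStringList.Create;
  try
    // Sorted := True keeps the list ordered on every Add, which is what
    // allows Find to use a binary search instead of a linear scan.
    Operators.Sorted := True;
    Operators.Duplicates := dupIgnore;  // silently skip repeated codes

    Operators.Add('UAL');
    Operators.Add('ASA');
    Operators.Add('DAL');

    if Operators.Find('ASA', Idx) then
      WriteLn('ASA found at index ', Idx)
    else
      WriteLn('ASA not found');

    if not Operators.Find('XXX', Idx) then
      WriteLn('XXX not found');
  finally
    Operators.Free;
  end;
end.
```

With lists of a few thousand items checked for every field of every record, a binary search needs roughly a dozen comparisons per lookup where a linear scan may need thousands.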

lucamar

  • Hero Member
  • *****
  • Posts: 4219
Re: String Processing CPU Speed and Large Data Sets
« Reply #14 on: December 14, 2018, 07:52:46 pm »
Quote from: engkin
Here is how to exit the loop:
Code: Pascal
...
    if i mod 100 = 0 then
    begin
      Caption := i.ToString;
      Application.ProcessMessages;
      if NeedToExit then
        exit;
...

You don't need that NeedToExit field; you can use this:

Code: Pascal
    if i mod 100 = 0 then
    begin
      Caption := i.ToString;
      Application.ProcessMessages;
      // Terminated becomes true once the application has been asked to quit
      if Application.Terminated then
        exit;
    end;

That will take care of stopping if the user tries to close the application.
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus/FPC 2.0.8/3.0.4 & 2.0.12/3.2.0 - 32/64 bits on:
(K|L|X)Ubuntu 12..18, Windows XP, 7, 10 and various DOSes.

 
