
Author Topic: Scan subdirectories and count files in each.  (Read 6581 times)

TomTom

  • Full Member
  • ***
  • Posts: 170
Scan subdirectories and count files in each.
« on: February 16, 2018, 11:02:04 pm »
Hello :)
My name is Tomek. I'm new here and I'm a Lazarus newbie.
I'm writing a program to make my work easier. It was going quite well, but now I'm stuck.

One part of my program searches for files with a specified extension (I'm using FindAllFiles) and puts the results in a StringGrid. That part works. But I also need to know how many files are in each subdirectory and store that information somewhere for later use. How can I achieve this?

My example directory tree looks like this:
Code:
c:\Folder1
          |-----Folder2
                        |-----Folder3
                                    |-----Folder4
                                              |-----Folder5
                                              |            |----------- File_01.txt
                                              |            |----------- File_02.txt
                                              |            |----------- File_03.txt
                                              |            |----------- File_04.txt
                                              |
                                              |-----Folder6
                                              |            |----------- File_01.txt
                                              |            |----------- ...
                                              |            |----------- ...
                                              |            |----------- File_N.txt
                                              |
                                              |---- ...
                                              |
                                              |---- FolderN

 

Bart

  • Hero Member
  • *****
  • Posts: 5275
    • Bart en Mariska's Webstek
Re: Scan subdirectories and count files in each.
« Reply #1 on: February 16, 2018, 11:13:31 pm »
Use a TShellTreeView perhaps?
(May be a bit overkill)

You could also sort the results of FindAllFiles and then process them.
Iterate through the strings and keep track of when the path changes.

Bart

TomTom

  • Full Member
  • ***
  • Posts: 170
Re: Scan subdirectories and count files in each.
« Reply #2 on: February 16, 2018, 11:24:01 pm »
I was thinking about your second suggestion, and at the moment I'm reading up on how to use regex in Lazarus ;).
And I think the same as you about using a TreeView.

But thank you for the reply :).

PatBayford

  • Full Member
  • ***
  • Posts: 125
Re: Scan subdirectories and count files in each.
« Reply #3 on: February 17, 2018, 12:25:54 am »
The standard way to do this is to use the SysUtils functions FindFirst, FindNext, and FindClose, which iterate over a selected directory and return information on each file found. It is relatively simple to use these recursively to visit every directory in a prescribed path.
Check out the SysUtils unit for exact details.
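
A minimal sketch of that recursive approach, using only SysUtils. The names CountFilesIn and ReportTree and the '*.txt' mask are made up for illustration, and the code does not guard against symlink cycles, so treat it as a starting point rather than a finished solution:

```pascal
program CountFilesPerDir;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Returns the number of non-directory entries directly inside Dir
// that match Mask. (Hypothetical helper, not part of SysUtils.)
function CountFilesIn(const Dir, Mask: string): Integer;
var
  SR: TSearchRec;
begin
  Result := 0;
  if FindFirst(IncludeTrailingPathDelimiter(Dir) + Mask, faAnyFile, SR) = 0 then
  try
    repeat
      if (SR.Attr and faDirectory) = 0 then
        Inc(Result);
    until FindNext(SR) <> 0;
  finally
    FindClose(SR);
  end;
end;

// Prints "folder = count" for Dir, then recurses into its subdirectories.
procedure ReportTree(const Dir, Mask: string);
var
  SR: TSearchRec;
  Base: string;
begin
  Base := IncludeTrailingPathDelimiter(Dir);
  WriteLn(Base, ' = ', CountFilesIn(Base, Mask));
  if FindFirst(Base + '*', faAnyFile, SR) = 0 then
  try
    repeat
      // Skip the '.' and '..' pseudo-entries to avoid endless recursion.
      if ((SR.Attr and faDirectory) <> 0) and
         (SR.Name <> '.') and (SR.Name <> '..') then
        ReportTree(Base + SR.Name, Mask);
    until FindNext(SR) <> 0;
  finally
    FindClose(SR);
  end;
end;

begin
  ReportTree(GetCurrentDir, '*.txt');
end.
```

Counting and recursing are kept in two separate passes so that the file mask does not hide subdirectories whose names do not match it.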
Lazarus 1.8.0 FPC 3.0.2 SVN 56594 Windows 10 64bit (i386-win32-win32/win64)

Bart

  • Hero Member
  • *****
  • Posts: 5275
    • Bart en Mariska's Webstek
Re: Scan subdirectories and count files in each.
« Reply #4 on: February 17, 2018, 12:26:50 am »
Not sure why you would need a regex here.

Bart

Bart

  • Hero Member
  • *****
  • Posts: 5275
    • Bart en Mariska's Webstek
Re: Scan subdirectories and count files in each.
« Reply #5 on: February 17, 2018, 12:33:26 am »
Quote from: PatBayford
The standard way to do this is to use the SysUtils functions FindFirst, FindNext, FindClose which iterate a selected directory, returning information on each file found. It is relatively simple to use these recursively to visit every directory in a prescribed path

This is what FindAllFiles does.

You can use the TFileSearcher with appropriate events to do the counting as well.
Or use my enumdirs unit for that.

Or indeed back to basics with FindFirst, FindNext, that'll give you all the flexibility you need.

Bart

TomTom

  • Full Member
  • ***
  • Posts: 170
Re: Scan subdirectories and count files in each.
« Reply #6 on: February 17, 2018, 09:12:59 am »
I was thinking about sorting the file list, extracting the directory paths using regex, and removing duplicate entries so I end up with a list of unique directory paths. Then I could check how many files I have in those directories.
But maybe I should take a deeper look into TFileSearcher.

Quote from: Bart
Not sure why you would need a regex here.

molly

  • Hero Member
  • *****
  • Posts: 2330
Re: Scan subdirectories and count files in each.
« Reply #7 on: February 17, 2018, 01:30:43 pm »
Quote from: TomTom
I was thinking about sorting files list, extracting directories paths using regex ,removing duplicated entries so I end up with list of unique dir paths. Then I could check how many files I have in those directories.
But maybe I should take a deeper look into tfilesearcher.

There is no need for regular expressions as long as you make use of the filename related functions.
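
For example, here is a small sketch of what the SysUtils filename functions return for one path. The path itself is just a made-up example:

```pascal
program PathParts;
{$mode objfpc}{$H+}

uses
  SysUtils;

const
  FullName = '/home/tomtom/apath/afile.txt';  // hypothetical example path

begin
  WriteLn(ExtractFilePath(FullName));  // /home/tomtom/apath/
  WriteLn(ExtractFileDir(FullName));   // /home/tomtom/apath
  WriteLn(ExtractFileName(FullName));  // afile.txt
  WriteLn(ExtractFileExt(FullName));   // .txt
end.
```

ExtractFilePath keeps the trailing separator and ExtractFileDir drops it, so pick one and use it consistently when comparing folders.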

In case you do the file search yourself, you can make use of fpmask's MatchesMask.
« Last Edit: February 17, 2018, 01:36:43 pm by molly »

Bart

  • Hero Member
  • *****
  • Posts: 5275
    • Bart en Mariska's Webstek
Re: Scan subdirectories and count files in each.
« Reply #8 on: February 17, 2018, 02:14:07 pm »
There cannot be duplicates (as in: same filename and same folder path).
You can simply sort the list that you get back from FindAllFiles: just set Sorted to True.

Your list will then look like:
Code:
/home/tomtom/apath/afile
/home/tomtom/apath/anotherfile
/home/tomtom/apath/thirdfile
/home/tomtom/apath/sub/afile
/home/tomtom/secondpath/afile
/home/tomtom/secondpath/secondfile
{etc}

Now you can extract the path to the file with ExtractFilePath() for each entry.
Pseudo code
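Code:
  // Assumes List is sorted so that the files of one folder are adjacent.
  LastPath := '';
  Counter := 0;
  for i := 0 to List.Count - 1 do
  begin
    CurPath := ExtractFilePath(List[i]);
    if CurPath <> LastPath then
    begin
      // Path changed: report the folder just finished, then start a new count.
      if Counter > 0 then
        WriteLn(LastPath, ' = ', Counter);  // or store it somewhere
      LastPath := CurPath;
      Counter := 1;
    end
    else
      Inc(Counter);
  end;
  // Report the last folder in the list.
  if Counter > 0 then
    WriteLn(LastPath, ' = ', Counter);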

Bart

balazsszekely

  • Guest
Re: Scan subdirectories and count files in each.
« Reply #9 on: February 17, 2018, 04:13:32 pm »
1. Download VTV from here: https://github.com/blikblum/VirtualTreeView-Lazarus/releases/tag/lazarus-5.5.3-R1 . First you need to install lclextensions.
2. Test the attached project.

PS: VTV is only needed for visual feedback.
PS1: I did not test the program extensively, so it may contain bugs.
PS2: You can greatly improve it: add icons, move the bulk of the work to a thread, etc...

TomTom

  • Full Member
  • ***
  • Posts: 170
Re: Scan subdirectories and count files in each.
« Reply #10 on: February 17, 2018, 04:33:59 pm »
There will be duplicates when I extract (using a regular expression) the string from the first character of the line to the last '/' character. As in your example, when I extract the path part of each string I get these:
Code:
/home/tomtom/apath/
/home/tomtom/apath/ <-- duplicate of above
/home/tomtom/apath/ <-- duplicate of above
/home/tomtom/apath/sub/
/home/tomtom/secondpath/
/home/tomtom/secondpath/ <-- duplicate of above
{etc}

Can I use ExtractFilePath with a string from a StringGrid cell? <-- of course I can :P... so there is no need to use regex in that case. That's a shame, because I wanted to use it in order to learn it :P. I thought it wasn't possible to use ExtractFilePath with a string :P. I'm still learning. I had some experience with Delphi 10 years ago but... nothing is left :P.
OK, so I can go further now :P...
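
A sketch of how the duplicate paths could be collapsed without regex, assuming the file names are already in a TStrings (for example copied out of the grid). UniqueDirsOf is a made-up helper name, and the sample paths are the ones from the listing above:

```pascal
program UniqueDirs;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Returns a new sorted list that holds each folder of Files exactly once.
// The caller owns the result and must free it. (Hypothetical helper.)
function UniqueDirsOf(Files: TStrings): TStringList;
var
  i: Integer;
begin
  Result := TStringList.Create;
  Result.CaseSensitive := True;    // 'Dir/' and 'dir/' can differ on Unix
  Result.Sorted := True;
  Result.Duplicates := dupIgnore;  // silently skip paths already present
  for i := 0 to Files.Count - 1 do
    Result.Add(ExtractFilePath(Files[i]));
end;

var
  Files, Dirs: TStringList;
begin
  Files := TStringList.Create;
  try
    Files.Add('/home/tomtom/apath/afile');
    Files.Add('/home/tomtom/apath/anotherfile');
    Files.Add('/home/tomtom/apath/thirdfile');
    Files.Add('/home/tomtom/apath/sub/afile');
    Files.Add('/home/tomtom/secondpath/afile');
    Files.Add('/home/tomtom/secondpath/secondfile');
    Dirs := UniqueDirsOf(Files);
    try
      WriteLn(Dirs.Count);  // prints 3
      Write(Dirs.Text);
    finally
      Dirs.Free;
    end;
  finally
    Files.Free;
  end;
end.
```

dupIgnore only takes effect on a sorted list, which is why Sorted is set before anything is added.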

« Last Edit: February 17, 2018, 04:55:48 pm by TomTom »

Bart

  • Hero Member
  • *****
  • Posts: 5275
    • Bart en Mariska's Webstek
Re: Scan subdirectories and count files in each.
« Reply #11 on: February 17, 2018, 06:12:50 pm »
You confuse iterating through the list with altering the contents of the list.
Initially the list will not contain duplicates (if it does, your filesystem is f*cked up).

None of this requires the use of regular expressions, which are confusing and can lead to easily overlooked mistakes.

To extract the path of a filename just use the standard functions fpc has for that, like I showed you. They are well tested and will not fail.

You need something to store the number of files associated with each folder.
For that you will need another datatype.

You yourself must decide how that datatype must be declared, because only you know how it is supposed to be used in your program.
This could be yet another TStringList, which you build up like:
Code:
/home/tomtom/folder1=32
/home/tomtom/folder1/sub=12
{etc}

You can then use built-in functions to retrieve the number that is associated with e.g. /home/tomtom/folder1/sub (which would be 12 in this case).
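
As a rough sketch, the Values property of TStrings can do both the storing and the retrieving. SetCount and GetCount are made-up helper names, and a folder name that itself contains '=' would confuse this scheme:

```pascal
program NameValueCounts;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Stores Count under Folder, replacing any earlier value. (Hypothetical helper.)
procedure SetCount(Counts: TStrings; const Folder: string; Count: Integer);
begin
  Counts.Values[Folder] := IntToStr(Count);
end;

// Returns the stored count, or 0 when Folder has no entry. (Hypothetical helper.)
function GetCount(Counts: TStrings; const Folder: string): Integer;
begin
  Result := StrToIntDef(Counts.Values[Folder], 0);
end;

var
  Counts: TStringList;
begin
  Counts := TStringList.Create;
  try
    Counts.CaseSensitive := True;  // folder names can differ only by case on Unix
    SetCount(Counts, '/home/tomtom/folder1', 32);
    SetCount(Counts, '/home/tomtom/folder1/sub', 12);
    WriteLn(GetCount(Counts, '/home/tomtom/folder1/sub'));  // prints 12
    WriteLn(GetCount(Counts, '/home/tomtom/missing'));      // prints 0
  finally
    Counts.Free;
  end;
end.
```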

Bart

taazz

  • Hero Member
  • *****
  • Posts: 5368
Re: Scan subdirectories and count files in each.
« Reply #12 on: February 17, 2018, 06:25:07 pm »
Quote from: Bart
You confuse iterating through the list with altering the contents of the list.
Initially the list will not contain duplicates (if it does, your filesystem is f*cked up).

You assume that the iteration is over directories and not files. While the files might not be duplicates, the directory is the same for all files in it. Think of it this way: ExtractFileDir will produce duplicates for any files in the same directory.
Good judgement is the result of experience … Experience is the result of bad judgement.

OS : Windows 7 64 bit
Laz: Lazarus 1.4.4 FPC 2.6.4 i386-win32-win32/win64

Bart

  • Hero Member
  • *****
  • Posts: 5275
    • Bart en Mariska's Webstek
Re: Scan subdirectories and count files in each.
« Reply #13 on: February 17, 2018, 06:57:07 pm »
Quote from: taazz
You assume that the iteration is over directories and not files. While the files might not be duplicates, the directory is the same for all files in it. Think of it this way: ExtractFileDir will produce duplicates for any files in the same directory.

That makes no sense to me at all.
If List (TStringList) is the first parameter of FindAllFiles(), then there will be NO duplicates in List ever.

You should not add every directory you get with ExtractFilePath to yet another TStringList, since it may already be in there (unfortunately List will not be sorted in such a way that all files in a folder are listed before the next folder is processed).

I attached a demo program that counts the files.
It uses a second stringlist for the counting (not very efficient, I know), and when the file path changes (from one entry to the next) it first checks whether we have already counted some files from that folder before it continues.
It will display all found files (in the current directory) in the left memo, and the result of counting in the right one.
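
This is not the attached demo, only a minimal sketch of the same idea: look up each folder in a second list before counting, so it does not matter that a folder's files are not adjacent in the sorted list. CountPerFolder is a made-up name and the sample paths are taken from the earlier example:

```pascal
program FolderCounts;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Adds or updates one "folder=count" line in Counts for every entry in Files.
// (Hypothetical helper; Counts plays the role of the second stringlist.)
procedure CountPerFolder(Files, Counts: TStrings);
var
  i, N: Integer;
  Dir: string;
begin
  for i := 0 to Files.Count - 1 do
  begin
    Dir := ExtractFilePath(Files[i]);
    N := StrToIntDef(Counts.Values[Dir], 0);  // 0 if the folder is new
    Counts.Values[Dir] := IntToStr(N + 1);
  end;
end;

var
  Files, Counts: TStringList;
begin
  Files := TStringList.Create;
  Counts := TStringList.Create;
  try
    Counts.CaseSensitive := True;
    Files.Add('/home/tomtom/apath/afile');
    Files.Add('/home/tomtom/apath/anotherfile');
    Files.Add('/home/tomtom/apath/thirdfile');
    Files.Add('/home/tomtom/apath/sub/afile');
    Files.Add('/home/tomtom/secondpath/afile');
    Files.Add('/home/tomtom/secondpath/secondfile');
    // After sorting, 'apath/sub/afile' lands before 'apath/thirdfile',
    // so the files of 'apath/' are no longer adjacent.
    Files.Sort;
    CountPerFolder(Files, Counts);
    Write(Counts.Text);
    // prints:
    //   /home/tomtom/apath/=3
    //   /home/tomtom/apath/sub/=1
    //   /home/tomtom/secondpath/=2
  finally
    Counts.Free;
    Files.Free;
  end;
end.
```

The lookup through Values is a linear search, so for very large trees a hash-based container would likely be faster.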

Bart


taazz

  • Hero Member
  • *****
  • Posts: 5368
Re: Scan subdirectories and count files in each.
« Reply #14 on: February 17, 2018, 08:48:07 pm »
Quote from: Bart
That makes no sense to me at all.

True, it makes even less sense to me, but the TS is using regex to "extract" them, which by itself is a can of worms, so...

Quote from: Bart
I attached a demo program that counts the files.

Nice! I hope it's useful to the TS.

 
