In my application I am copying large amounts of floats, some 50 million values or more, from one array to another, applying a simple filter as I go; for instance, values outside a certain range are discarded. The target array will therefore be smaller than the original. I started by growing the target array by one element for each value copied across, but that became very noticeably slower as the target array grew. Switching to blockwise allocation sped it up significantly.
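The blockwise approach described above might look like this (the original language isn't stated, so this is an illustrative C++ sketch; the block size and function name are my own choices):

```cpp
#include <cstddef>
#include <vector>

// Filter-copy with blockwise allocation: instead of resizing the target
// by one element per value kept, grow it in fixed-size blocks and trim
// the unused tail at the end.
std::vector<float> filter_range(const std::vector<float>& src,
                                float lo, float hi) {
    const std::size_t block = 1 << 16;   // grow in 65,536-element blocks
    std::vector<float> dst;
    std::size_t count = 0;               // number of values actually kept
    for (float v : src) {
        if (v < lo || v > hi) continue;  // discard values outside [lo, hi]
        if (count == dst.size())
            dst.resize(dst.size() + block);  // blockwise allocation
        dst[count++] = v;
    }
    dst.resize(count);                   // trim the partially used last block
    return dst;
}
```

Note that `std::vector::push_back` already amortizes growth geometrically; manual blockwise growth matters most in languages or APIs where each resize reallocates and copies the whole array.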
I tested with primes up to 50,000,000 and finally saw a measurable difference in performance between pre-allocating the array at the maximum likely size and growing it on demand.
By the time we reach 50 million (which puts us well into the millions of primes), I saw a difference of 5 seconds between starting the array at its full size and starting at 25 elements and growing it as needed.
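The slowdown is easy to quantify: if each append reallocates and copies the whole array, the total work is quadratic in the element count, while a pre-allocated array needs only one write per element. A small arithmetic sketch (my own illustration, not the original benchmark):

```cpp
// If appending element i requires copying the i existing elements into a
// new, one-larger array, the total copies for n appends are
// 0 + 1 + ... + (n-1) = n(n-1)/2 -- quadratic in n.
unsigned long long copies_grow_by_one(unsigned long long n) {
    return n * (n - 1) / 2;
}

// With the array pre-allocated at full size, each element is written once.
unsigned long long copies_preallocated(unsigned long long n) {
    return n;
}
```

At n = 50 million that is roughly 1.25 * 10^15 element copies versus 5 * 10^7 writes, which is why even occasional blockwise growth closes most of the gap.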