Allocating on the stack: re- vs. pre-

A place to discuss the implementation and style of computer programs.

Moderators: phlip, Moderators General, Prelates

User avatar
Qaanol
The Cheshirest Catamount
Posts: 3069
Joined: Sat May 09, 2009 11:55 pm UTC

Allocating on the stack: re- vs. pre-

Postby Qaanol » Sat Jul 11, 2015 3:12 am UTC

In a C program, is there any advantage to putting “int temp[n];” inside a loop or outside?

Say the code looks like this:

Code: Select all

int f(const int n, const int reps) {
  int result = 0;
  int temp[n];           // <---- this line
  for (int i = 0; i < reps; ++i) {
                         // <---- could go here
    getValues(temp, n, i);
    // Update result with the data now in temp
  }
  return result;
}


We may assume n < 20 and reps > 1000 (possibly much greater). Although temp is only needed within the loop, the function will exit soon after the loop anyway.

Is there likely to be a perceptible speed difference between pre-allocating temp before the loop, versus re-allocating it on each pass?
wee free kings

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Allocating on the stack: re- vs. pre-

Postby EvanED » Sat Jul 11, 2015 4:11 am UTC

Ordinarily I would say: With no initializer on the array, I would not be surprised if the code generated for each case is identical.

The fact that it's a VLA though... I'm not so sure. Unfortunately, I suspect it will allocates and deallocates the extra stack space each loop iteration. n probably doesn't matter except that if the compiler can establish that it is definitely less than 4096 or thereabouts it can probably do a more efficient allocation, but it can't do that without looking at the calling context for your example.

Basically: compile it and see. You can stop the compiler before it assembles (-S to GCC, /FAs to MSVC). You'll probably see a call alloca, call _alloca, or something similar in your function.Actually, I don't think VLAs are likely to use alloca. You'll probably see a call to a function with "chkstk" in its name or something similar. Maybe MSVC calls it something with probe in the name, if it even supports VLAs? If it's inside the loop and reps is very large and getValues is short, you might see a noticeable relative slowdown. Probably it would need to be much larger than 1000 for the difference to be measurable though; alloca should be quite fast compared to heap allocs and deallocs.

User avatar
Thesh
Made to Fuck Dinosaurs
Posts: 6580
Joined: Tue Jan 12, 2010 1:55 am UTC
Location: Colorado

Re: Allocating on the stack: re- vs. pre-

Postby Thesh » Sat Jul 11, 2015 4:25 am UTC

EvanED wrote:The fact that it's a VLA though... I'm not so sure. Unfortunately, I suspect it will allocates and deallocates the extra stack space each loop iteration. n probably doesn't matter except that if the compiler can establish that it is definitely less than 4096 or thereabouts it can probably do a more efficient allocation, but it can't do that without looking at the calling context for your example.


Because n is a const parameter, I don't see any reason why it would. The compiler should optimize them both to be identical.

EDIT: I tested in GCC, and it appears you were right. I guess it's just one of those things that the compiler didn't optimize for.
Summum ius, summa iniuria.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Allocating on the stack: re- vs. pre-

Postby EvanED » Sat Jul 11, 2015 6:19 am UTC

Thesh wrote:
EvanED wrote:The fact that it's a VLA though... I'm not so sure. Unfortunately, I suspect it will allocates and deallocates the extra stack space each loop iteration. n probably doesn't matter except that if the compiler can establish that it is definitely less than 4096 or thereabouts it can probably do a more efficient allocation, but it can't do that without looking at the calling context for your example.


Because n is a const parameter, I don't see any reason why it would. The compiler should optimize them both to be identical.
It probably could, but I'm not sure I'd bet on that case specifically being handled. There's also an edge case where it might make a difference, which is that weird things can happen if you use both alloca and VLAs in the same function. At least per the GCC docs, these functions are highly likely to behave differently:

Code: Select all

int foo_1(const int n)
{
    void *p1, *p2;
    if (1) {
        int vla[n];
        p1 = alloca(4);
    }
    p2 = alloca(4);
    return p1 == p2; // will likely return 1
}

int foo_2(const int n)
{
    void *p1, *p2;
    int vla[n];
    if (1) {
        p1 = alloca(4);
    }
    p2 = alloca(4);
    return p1 == p2; // will definitely return 0
}
Things become even more different if you change the if to a loop, because foo_1 will allocate a constant amount of stack space and foo_2 will allocate an amount proportional to the number of loop iterations.

Now, that's not an absolute exclusion either; compilers already detect alloca-using functions and disable certain behaviors. (For example, the compiler has to disable frame-pointer optimization in these functions.) But I wouldn't be surprised if they decided that "I have a function with a VLA declared in a loop with a size that is const and there is no alloca in that same function" was edge-casey enough to worry about speeding up a small amount. And I'd guess that behavior wouldn't just fall out of other stuff the compiler does anyway and would have to be explicitly handled.

On the other hand, compilers are really good. I could be surprised. :-) I'll try it out tomorrow, but I'm too lazy right now.I guess I don't need to. :-)

User avatar
Qaanol
The Cheshirest Catamount
Posts: 3069
Joined: Sat May 09, 2009 11:55 pm UTC

Re: Allocating on the stack: re- vs. pre-

Postby Qaanol » Sun Jul 12, 2015 3:38 am UTC

Thanks! I’ll be sure to preallocate my VLA’s.
wee free kings

commodorejohn
Posts: 1182
Joined: Thu Dec 10, 2009 6:21 pm UTC
Location: Placerville, CA
Contact:

Re: Allocating on the stack: re- vs. pre-

Postby commodorejohn » Wed Jul 15, 2015 3:39 pm UTC

General rule: unambiguously specifying a more optimal design is always better than hoping the compiler will do it for you, especially when it's such a trivial change to the code.
"'Legacy code' often differs from its suggested alternative by actually working and scaling."
- Bjarne Stroustrup
www.commodorejohn.com - in case you were wondering, which you probably weren't.

User avatar
Sizik
Posts: 1256
Joined: Wed Aug 27, 2008 3:48 am UTC

Re: Allocating on the stack: re- vs. pre-

Postby Sizik » Wed Jul 15, 2015 3:51 pm UTC

General-er rule: Always test to see if your optimizations are actually optimal, since compiler-optimized straightforward code could be faster than user-"optimized" clever code that the compiler has no idea how to optimize.
she/they
gmalivuk wrote:
King Author wrote:If space (rather, distance) is an illusion, it'd be possible for one meta-me to experience both body's sensory inputs.
Yes. And if wishes were horses, wishing wells would fill up very quickly with drowned horses.

commodorejohn
Posts: 1182
Joined: Thu Dec 10, 2009 6:21 pm UTC
Location: Placerville, CA
Contact:

Re: Allocating on the stack: re- vs. pre-

Postby commodorejohn » Wed Jul 15, 2015 10:40 pm UTC

General-er-er rule: avoiding basic, straightforward algorithm cleanup because on a hypothetical level you can't be 100% certain that it won't confuse some lunatic "optimizer" that is designed to turn bad code into good output into generating suboptimal output from good code is just silly.

Specific rule: moving constant code out of a loop isn't "clever" anyway.
"'Legacy code' often differs from its suggested alternative by actually working and scaling."
- Bjarne Stroustrup
www.commodorejohn.com - in case you were wondering, which you probably weren't.


Return to “Coding”

Who is online

Users browsing this forum: No registered users and 9 guests