Monday, April 25, 2011

C++ Optimization on negative integers

Lets say we have a negative integer say int a;

is there a faster implementation of -a?

Do I have to do some bitwise operation on this?

From stackoverflow
  • Are you seeing a performance issue with negating numbers? I have a hard time thinking that most compilers would do a bitwise op against integers to negate them.

    1800 INFORMATION : Maybe he is writing code to track the amount of money in AIG's bank account and the hardware can't keep up with the number of subtractions required
    paxdiablo : There's an urban myth that the IRS has a special computer just for Bill Gates' tax returns (and presumably Buffet's as well).
  • There's almost certainly nothing faster than the machine code NEG instruction that your compiler will most likely turn this into.

    If there was, I'm sure the compiler would use it.

    For a twos-complement number, you could NOT it and add 1 but that's almost certainly going to be slower. But I'm not entirely certain that the C/C++ standards mandate the use of twos-complement (they may, I haven't checked).

    I think this question belongs with those that attempt to rewrite strcpy() et al to get more speed. Those people naively assume that the C library strcpy() isn't already heavily optimized by using special machine code instructions (rather than a simplistic loop that would be most people's first attempt).

    Have you run performance tests which seem to indicate that your negations are taking an overly long time?

    <subtle-humor-or-what-my-wife-calls-unfunny>

      A NEG on a 486 (state of the art the last time I had to worry about clock cycles) takes 3 clock cycles (memory version, register only takes 1) - I'm assuming the later chips will be similar. On a 3Ghz CPU, that means you can do 1 billion of these every second. Is that not fast enough?

    </subtle-humor-or-what-my-wife-calls-unfunny>

    rlbond : A quick comment: Accessing memory for a read takes orders of magnitude more than 1 cycle, so it's likely more like 100 or 1000 cycles (due to memory refreshing). This makes the question's optimization even more pointless.
    Adam Rosenfield : Also, it takes a variable number of cycles for a memory access, depending on whether or not the cache hits. Cache misses are VERY expensive (relatively speaking).
    dreamlax : You wouldn't be able to do 1 billion in one second unless the CPU ran nothing but NEG instructions for one second, which it probably wouldn't due to the fact that other applications would be running, let alone the operating system providing you the CPU resources.
    Michael Burr : FWIW, the C/C++ standards do not mandate twos complement.
    paxdiablo : Good points, @rlbond, @Adam and @dream. Adjusted answer - it seems you understand my subtle sense of humor about as much as my wife (but now it's 4 against 1, so you bods may actually be right :-).
    dreamlax : @Pax, I knew you were probably having a punt but I thought I would clarify nonetheless :)
  • Have you ever heard the phrase "premature optimization"? If you've optimized all of your code, and this is the only thing left, fine. If not, you're wasting your time.

    paxdiablo : Even if this IS the only thing left, it's still probably a waste of time. Just wait for the next generation of CPUs (hopefully faster).
    Ken White : Exactly. Trying to optimize a basic increment/decrement operation is a waste of every single millisecond you spend on it. That's a basic CPU-level operation, and the folks at the CPU plants are a heck of a lot smarter than I am. :-)
  • To clarify Pax's statement,

    C++ compilers are not mandated to use two's compliment, except in 1 case. When you convert a signed type to an unsigned type, if the number is negative, the result of the conversion must be the 2's compliment representation of the integer.

    In short, there is not a faster way than -a; even if there were, it would not be portable. Keep in mind as well that premature optimization is evil. Profile your code first and then work on the bottlenecks.

    See The C++ Programming Language, 3rd Ed., section C.6.2.1.

    Michael Burr : interesting fact about the signed to unsigned conversion - I wasn't aware of that. Of course, the standard doesn't describe it in quite the nice, easy to understand way: "if the new type is unsigned, the value is converted by repeatedly adding or subtracting..." etc, etc.
  • Perhaps you should think about optimizing your algorithms more-so than little things like this. If this is the last thing to optimize, your code is as fast as it's going to get.

  • I can negate a number twice in zero cycles.

    Jon Skeet can do it six times in the same amount of time.

  • Negating a number is a very simple operation in terms of CPU hardware. I'm not aware of a processor that takes any longer to do negation than to do any bitwise operation - and that includes some 30 year old processors.

    Just curious, what led you to ask this question? It certainly wasn't because you detected a bottleneck.

  • All good answers.

    If (-a) makes a difference, you've already done some really aggressive performance tuning.

    Performance tuning a program is like getting water out of a wet sponge. As a program is first written, it is pretty wet. With a little effort, you can wring some time out of it. With more effort you can dry it out some more.

    If you're really persistent you can get it down to where you have to put it in the hot sun to get the last few molecules of time out of it.

    That's the level at which (-a) might make a difference.

    j_random_hacker : +1 for a cool analogy :)

0 comments:

Post a Comment