12 votes

Analyzing the simplest C++ program

3 comments

  1. joplin
    Link
    This is really cool! I rarely need to think about this type of stuff, but when I suddenly do, it's always interesting to see where it leads. I (vaguely) remember the first time I was debugging...

    This is really cool! I rarely need to think about this type of stuff, but when I suddenly do, it's always interesting to see where it leads. I (vaguely) remember the first time I was debugging something and someone said, "Oh yeah, that happens before main() is called." I was like, "What? I thought nothing in my programs happened before main() is called." It was quite the trip!

    4 votes
  2. dblohm7
    Link
    Examining and manipulating the AST is really powerful. When you have compilers that expose this stuff to plug-ins, you can build some really cool tooling.

    Examining and manipulating the AST is really powerful. When you have compilers that expose this stuff to plug-ins, you can build some really cool tooling.

    3 votes
  3. wirelyre
    Link
    Here's some simple but fun bit manipulation! What does this code do? # Forcibly grow the stack a bit. 40102d:48 83 e4 f0 and rsp,0xfffffffffffffff0 Well, it "and"s the value in rsp with some...

    Here's some simple but fun bit manipulation! What does this code do?

    # Forcibly grow the stack a bit.
      40102d:	48 83 e4 f0          	and    rsp,0xfffffffffffffff0
    

    Well, it "and"s the value in rsp with some constant number, let's call it c, then stores that new value. So we can think of it as

    out = in & 0xfffffffffffffff0;
        = in & c;
    

    If you look at the bits in c, you'll see that they're a bunch of ones followed by a bunch of zeros,

    0xf...f0 == 0b1111...11110000
    

    So the logical "and" is just zeroing out the last four bits of in, effectively rounding down to a multiple of 16. Since the "sp" in "rsp" stands for "stack pointer", that is the same as aligning a pointer down to a 16 byte boundary.


    How can we round up to a multiple of 16? Consider a chunk of addresses 16n, …, 16n + 15. If we "and" them with c, only the first one is correctly rounded up. If we first add 16, making 16n + 16, …, 16n + 31 or equivalently 16(n + 1), …, 16(n + 1) + 15, then "and"ing with c rounds the first one incorrectly, but all the others correctly!

    Instead we can add 15 first. The sequence 16n + 15, 16(n + 1) + 0, …, 16(n + 1) + 14, when rounded down to the nearest multiple of 16, produces 16n, 16(n + 1), …, 16(n + 1), exactly like we wanted!


    What is the relationship between 16 and 0xfffffffffffffff0? In binary:

        c  = 1111...11110000
        16 = 0000...00010000
    

    So c is just 16, with the highest set bit copied over to the left.

    Can we make c from 16? First let's invert all the bits:

        0000...00010000   => invert
        1111...11101111
    

    Very close! Now add one, which carries from the right until the zero bit:

        0000...00010000   => invert
        1111...11101111   => add 1
        1111...11110000   == c
    

    There's c! c = ~16 + 1. To round x down to a multiple of 16, compute x & (~16 + 1).

    One last trick. What if we add c + 16?

        c = 1111...11110000
     + 16 = 0000...00010000
      ----------------------
           10000...00000000
    

    So, modulo 264 like the computer runs it, c + 16 == 0! In a sense, c = -16. (You also might have figured that out already if you recognized that ~16 + 1 == -16 in two's complement.)

    In fact this works for any power of 2! So to round x down to a multiple of 2n, compute x & (- 2^n).

    And indeed in most disassemblers,

    uint64_t round_down(uint64_t in)
    {
        return in & 0xfffffffffffffff0;
    }
    
    round_down:
            mov     rax, rdi
            and     rax, -16
            ret
    

    Bonus content! If you encode 0xfffffffffffffff0 directly it takes 64 bits, 8 whole bytes. But the instruction and rax, -16 encodes as 48 83 e0 f0, only 4 bytes for the whole thing including the constant. How does that work?

    x86 has a few ways of encoding constants, depending on how long they are. On x86-64 and register, a_constant can take either an 8-bit or a 32-bit constant. (Longer constants need to be put into a register first, e.g. mov rcx, long_constant / and rax, rcx.)

    Whenever those constants are used, though, they are treated as signed integers! That means that they are sign-extended, or that the highest bit in the constant is copied into all the new higher bits.

    That is, in the instruction and rax, -16, -16 is literally being encoded as a signed 8-bit -16!

    3 votes