This is really cool! I rarely need to think about this type of stuff, but when I suddenly do, it's always interesting to see where it leads. I (vaguely) remember the first time I was debugging...
This is really cool! I rarely need to think about this type of stuff, but when I suddenly do, it's always interesting to see where it leads. I (vaguely) remember the first time I was debugging something and someone said, "Oh yeah, that happens before main() is called." I was like, "What? I thought nothing in my programs happened before main() is called." It was quite the trip!
Examining and manipulating the AST is really powerful. When you have compilers that expose this stuff to plug-ins, you can build some really cool tooling.
Examining and manipulating the AST is really powerful. When you have compilers that expose this stuff to plug-ins, you can build some really cool tooling.
Here's some simple but fun bit manipulation! What does this code do? # Forcibly grow the stack a bit. 40102d:48 83 e4 f0 and rsp,0xfffffffffffffff0 Well, it "and"s the value in rsp with some...
Here's some simple but fun bit manipulation! What does this code do?
# Forcibly grow the stack a bit.40102d:4883e4f0andrsp,0xfffffffffffffff0
Well, it "and"s the value in rsp with some constant number, let's call it c, then stores that new value. So we can think of it as
out=in&0xfffffffffffffff0;=in&c;
If you look at the bits in c, you'll see that they're a bunch of ones followed by a bunch of zeros,
0xf...f0 == 0b1111...11110000
So the logical "and" is just zeroing out the last four bits of in, effectively rounding down to a multiple of 16. Since the "sp" in "rsp" stands for "stack pointer", that is the same as aligning a pointer down to a 16 byte boundary.
How can we round up to a multiple of 16? Consider a chunk of addresses 16n, …, 16n + 15. If we "and" them with c, only the first one is correctly rounded up. If we first add 16, making 16n + 16, …, 16n + 31 or equivalently 16(n + 1), …, 16(n + 1) + 15, then "and"ing with c rounds the first one incorrectly, but all the others correctly!
Instead we can add 15 first. The sequence 16n + 15, 16(n + 1) + 0, …, 16(n + 1) + 14, when rounded down to the nearest multiple of 16, produces 16n, 16(n + 1), …, 16(n + 1), exactly like we wanted!
What is the relationship between 16 and 0xfffffffffffffff0? In binary:
c = 1111...11110000
16 = 0000...00010000
So c is just 16, with the highest set bit copied over to the left.
Can we make c from 16? First let's invert all the bits:
0000...00010000 => invert
1111...11101111
Very close! Now add one, which carries from the right until the zero bit:
0000...00010000 => invert
1111...11101111 => add 1
1111...11110000 == c
There's c! c = ~16 + 1. To round x down to a multiple of 16, compute x & (~16 + 1).
One last trick. What if we add c + 16?
c = 1111...11110000
+ 16 = 0000...00010000
----------------------
10000...00000000
So, modulo 264 like the computer runs it, c + 16 == 0! In a sense, c = -16. (You also might have figured that out already if you recognized that ~16 + 1 == -16 in two's complement.)
In fact this works for any power of 2! So to round x down to a multiple of 2n, compute x & (- 2^n).
Bonus content! If you encode 0xfffffffffffffff0 directly it takes 64 bits, 8 whole bytes. But the instruction and rax, -16 encodes as 48 83 e0 f0, only 4 bytes for the whole thing including the constant. How does that work?
x86 has a few ways of encoding constants, depending on how long they are. On x86-64 and register, a_constant can take either an 8-bit or a 32-bit constant. (Longer constants need to be put into a register first, e.g.mov rcx, long_constant / and rax, rcx.)
Whenever those constants are used, though, they are treated as signed integers! That means that they are sign-extended, or that the highest bit in the constant is copied into all the new higher bits.
That is, in the instruction and rax, -16, -16 is literally being encoded as a signed 8-bit -16!
This is really cool! I rarely need to think about this type of stuff, but when I suddenly do, it's always interesting to see where it leads. I (vaguely) remember the first time I was debugging something and someone said, "Oh yeah, that happens before main() is called." I was like, "What? I thought nothing in my programs happened before main() is called." It was quite the trip!
Examining and manipulating the AST is really powerful. When you have compilers that expose this stuff to plug-ins, you can build some really cool tooling.
Here's some simple but fun bit manipulation! What does this code do?
Well, it "and"s the value in
rsp
with some constant number, let's call itc
, then stores that new value. So we can think of it asIf you look at the bits in
c
, you'll see that they're a bunch of ones followed by a bunch of zeros,So the logical "and" is just zeroing out the last four bits of
in
, effectively rounding down to a multiple of 16. Since the "sp" in "rsp" stands for "stack pointer", that is the same as aligning a pointer down to a 16 byte boundary.How can we round up to a multiple of 16? Consider a chunk of addresses 16n, …, 16n + 15. If we "and" them with
c
, only the first one is correctly rounded up. If we first add 16, making 16n + 16, …, 16n + 31 or equivalently 16(n + 1), …, 16(n + 1) + 15, then "and"ing withc
rounds the first one incorrectly, but all the others correctly!Instead we can add 15 first. The sequence 16n + 15, 16(n + 1) + 0, …, 16(n + 1) + 14, when rounded down to the nearest multiple of 16, produces 16n, 16(n + 1), …, 16(n + 1), exactly like we wanted!
What is the relationship between 16 and 0xfffffffffffffff0? In binary:
So
c
is just 16, with the highest set bit copied over to the left.Can we make
c
from 16? First let's invert all the bits:Very close! Now add one, which carries from the right until the zero bit:
There's
c
!c = ~16 + 1
. To roundx
down to a multiple of 16, computex & (~16 + 1)
.One last trick. What if we add
c + 16
?So, modulo 264 like the computer runs it,
c + 16 == 0
! In a sense,c = -16
. (You also might have figured that out already if you recognized that~16 + 1 == -16
in two's complement.)In fact this works for any power of 2! So to round
x
down to a multiple of 2n, computex & (- 2^n)
.And indeed in most disassemblers,
Bonus content! If you encode
0xfffffffffffffff0
directly it takes 64 bits, 8 whole bytes. But the instructionand rax, -16
encodes as48 83 e0 f0
, only 4 bytes for the whole thing including the constant. How does that work?x86 has a few ways of encoding constants, depending on how long they are. On x86-64
and register, a_constant
can take either an 8-bit or a 32-bit constant. (Longer constants need to be put into a register first, e.g.mov rcx, long_constant / and rax, rcx
.)Whenever those constants are used, though, they are treated as signed integers! That means that they are sign-extended, or that the highest bit in the constant is copied into all the new higher bits.
That is, in the instruction
and rax, -16
, -16 is literally being encoded as a signed 8-bit -16!