UMBC | CMSC 313 -- Addressing
    int age;
    int junk;
    int *pAge;

    age = 21;       /* assigns the value twenty-one to the memory location you named age */
    junk = age;     /* the memory location junk gets the value stored at location age, or twenty-one */
    pAge = &age;    /* pAge now holds the address of the memory location you named age */
As you remember, we can write something like this:
    A   DB  78h
    B   DW  1234h
    C   DD  0FEDCBA89h

Now when we want to get the contents or the address, we do:

    mov al,  [ A ]   ; moves the contents of A to al
    mov eax, A       ; moves the address of A to eax
    mov bx,  [ B ]   ; moves the contents of B to bx
    mov ebx, B       ; moves the address of B to ebx
    mov ecx, [ C ]   ; moves the contents of C to ecx
    mov ecx, C       ; moves the address of C to ecx

Notice that when we move the address of a variable, we are dealing with a 32-bit address and must select a 32-bit container to hold that address. The value stored in the variable does not matter: C could hold the value 1 and its address would still be 32 bits wide!
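The same point can be checked from C. This is only a sketch: the variables `A`, `B` and `C` below are hypothetical C counterparts of the assembly definitions, and the helper function names are made up for illustration. Note that on a 64-bit machine the addresses will be 64 bits wide rather than 32, but they are still all the same size no matter how big the variable is.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C counterparts of the assembly variables. */
static uint8_t  A = 0x78;          /* A DB 78h        */
static uint16_t B = 0x1234;        /* B DW 1234h      */
static uint32_t C = 0xFEDCBA89u;   /* C DD 0FEDCBA89h */

/* Returns 1 when the addresses of A, B and C all need the same size
 * container, even though the variables themselves are 1, 2 and 4 bytes. */
int addresses_share_one_size(void)
{
    return sizeof(&A) == sizeof(&B) && sizeof(&B) == sizeof(&C);
}

/* Stores a new value in C and reports whether its address stayed the same. */
int address_survives_store(uint32_t new_value)
{
    const uint32_t *before = &C;
    C = new_value;
    assert(C == new_value);    /* the contents changed ...     */
    return before == &C;       /* ... but the address did not  */
}
```

Calling `address_survives_store(1)` is the C version of "C could hold the value 1": the contents are different afterwards, the address is not.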
char name[10];
    A   DB  0Ah, 1Ah, 2Ah, 3Ah, 4Ah, 5Ah, 6Ah, 7Ah, 8Ah, 9Ah

Or

    ages    DB  0Ah
            DB  1Ah
            DB  2Ah
            DB  3Ah
            DB  4Ah
            DB  5Ah
            DB  6Ah
            DB  7Ah
            DB  8Ah
            DB  9Ah

Now the location [ A ] holds 0Ah, etc. Notice that the other locations do not have a name as such, but we can get to them with address arithmetic. The location [ A + 3 ] refers to the byte containing 3Ah.
    mov al, [ ages ]        ; moves the contents of ages
                            ; to al (which will hold 0Ah)
    mov bl, [ ages + 3 ]    ; moves the contents of ages plus 3
                            ; to bl (which will hold 3Ah!)

Additionally, if pAges is a pointer variable in C, we can say that [pAges] is the equivalent of *pAges.

If you want to get the address of the byte holding the value 3Ah, you would use:

    mov eax, A + 3          ; moves the address of the byte
                            ; holding 3Ah into eax

Once again, remember the address is a 32-bit number.
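Here is a small C sketch of the same two operations, fetching the contents at an offset versus computing the address at an offset. The array `ages` mirrors the assembly definition; the function names `contents_at` and `address_at` are made up for this example, and the `assert` bounds checks are an addition that the assembly version does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors: ages DB 0Ah, 1Ah, 2Ah, 3Ah, 4Ah, 5Ah, 6Ah, 7Ah, 8Ah, 9Ah */
static const uint8_t ages[10] = {
    0x0A, 0x1A, 0x2A, 0x3A, 0x4A, 0x5A, 0x6A, 0x7A, 0x8A, 0x9A
};

/* Like mov bl, [ ages + n ]: fetches the byte stored n bytes past ages. */
uint8_t contents_at(size_t n)
{
    assert(n < sizeof ages);
    return *(ages + n);
}

/* Like mov eax, ages + n: computes the address only, no memory is read. */
const uint8_t *address_at(size_t n)
{
    assert(n < sizeof ages);
    return ages + n;
}
```

So `contents_at(3)` gives 3Ah, while `address_at(3)` gives the location where that 3Ah lives, which is the same thing as `&ages[3]`.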
Suppose we have the following definitions of arrays:
    AA  DB  0Ah, 1Ah, 2Ah, 3Ah, 4Ah, 5Ah, 6Ah, 7Ah
    BB  DB  0Bh, 1Bh, 2Bh, 3Bh
    CC  DB  0Ch, 1Ch, 2Ch, 3Ch, 4Ch, 5Ch

In memory we would have:

Label | Contents | Offset from AA | Offset from BB | Offset from CC |
AA: | 0A | AA | BB-8 | CC-12 |
 | 1A | AA+1 | BB-7 | CC-11 |
 | 2A | AA+2 | BB-6 | CC-10 |
 | 3A | AA+3 | BB-5 | CC-9 |
 | 4A | AA+4 | BB-4 | CC-8 |
 | 5A | AA+5 | BB-3 | CC-7 |
 | 6A | AA+6 | BB-2 | CC-6 |
 | 7A | AA+7 | BB-1 | CC-5 |
BB: | 0B | AA+8 | BB | CC-4 |
 | 1B | AA+9 | BB+1 | CC-3 |
 | 2B | AA+10 | BB+2 | CC-2 |
 | 3B | AA+11 | BB+3 | CC-1 |
CC: | 0C | AA+12 | BB+4 | CC |
 | 1C | AA+13 | BB+5 | CC+1 |
 | 2C | AA+14 | BB+6 | CC+2 |
 | 3C | AA+15 | BB+7 | CC+3 |
 | 4C | AA+16 | BB+8 | CC+4 |
 | 5C | AA+17 | BB+9 | CC+5 |
It is important to notice that the offset can be a positive or a negative number, and nothing prevents you from using either. Array-bounds checking happens only in high-level languages, which do it by adding extra code that you normally don't see! This is why in C, if you exceed the boundary of an array, you don't get an error message unless you attempt to use memory that is not allocated to your process.
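To try the offsets out in C, the sketch below lays the three arrays out in one buffer, because C (unlike our assembly source) does not promise that separately declared arrays end up next to each other. The names `memory` and `byte_at` are made up for this example, and the `assert` is exactly the kind of "extra code" that bounds checking needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous buffer standing in for the memory picture of AA, BB, CC. */
static const uint8_t memory[18] = {
    0x0A, 0x1A, 0x2A, 0x3A, 0x4A, 0x5A, 0x6A, 0x7A,   /* AA */
    0x0B, 0x1B, 0x2B, 0x3B,                           /* BB */
    0x0C, 0x1C, 0x2C, 0x3C, 0x4C, 0x5C                /* CC */
};

/* The labels are just addresses inside that buffer. */
static const uint8_t *const AA = memory;
static const uint8_t *const BB = memory + 8;
static const uint8_t *const CC = memory + 12;

/* Like mov al, [ label + offset ], with a bounds check added by hand.
 * The offset may be negative. */
uint8_t byte_at(const uint8_t *label, int offset)
{
    ptrdiff_t index = (label - memory) + offset;
    assert(index >= 0 && index < (ptrdiff_t)sizeof memory);
    return memory[index];
}
```

For instance `byte_at(BB, -1)` reaches back into AA and returns 7Ah, and `byte_at(AA, 12)` runs past the end of AA and BB and returns 0Ch, the first byte of CC. Take the `assert` out and nothing stops an offset from walking off either end of the buffer.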
    section .data
    wArray  DW  1234h, 2345h, 3456h, 4567h, 5678h, 6789h, 789Ah, 89ABh, 9ABCh, 0ABCDh

    section .bss

    section .text
        global main             ;must be declared for linker (ld)

    main:                       ;tell linker entry point
        nop
        mov eax, 0
        mov ebx, 0
        mov esi, wArray
        mov ax, [ wArray ]

        ;; What will we get when we use wArray + 3?
        mov bx, [ wArray + 3 ]

        mov ebx,0               ;successful termination of program
        mov eax,1               ;system call number (sys_exit)
        int 0x80                ;call kernel
    (gdb) break *main
    Breakpoint 1 at 0x8048300: file addr.asm, line 10.
    (gdb) run
    Starting program: /home/burt/courses/umbc/CMSC313/spring04/lectures/Lect08/addr/addr

    Breakpoint 1, main () at addr.asm:10
    (gdb) step
    (gdb) step
    (gdb) step
    (gdb) step
    (gdb) step
    (gdb) step
    (gdb) info registers
    eax            0x1234       4660
    ecx            0x42015554   1107383636
    edx            0x40016bc8   1073834952
    ebx            0x5623       22051
    esp            0xbffff32c   0xbffff32c
    ebp            0xbffff348   0xbffff348
    esi            0x80493e8    134517736
    edi            0x804835c    134513500
    eip            0x804831d    0x804831d
    eflags         0x346        838
    cs             0x23         35
    ss             0x2b         43
    ds             0x2b         43
    es             0x2b         43
    fs             0x0          0
    gs             0x33         51
    (gdb)

Something does not look right:

    eax            0x1234       4660
    ebx            0x5623       22051

There is no 5623h in the definition of wArray! How did that happen? Let's dump the word array wArray and see for ourselves!
    (gdb) x/10hx &wArray
    0x80493e8 :     0x1234  0x2345  0x3456  0x4567  0x5678  0x6789  0x789a  0x89ab
    0x80493f8 :     0x9abc  0xabcd

That did not help. Let's look at it in byte order:
    (gdb) x/20bx &wArray
    0x80493e8 :     0x34  0x12  0x45  0x23  0x56  0x34  0x67  0x45
    0x80493f0 :     0x78  0x56  0x89  0x67  0x9a  0x78  0xab  0x89
    0x80493f8 :     0xbc  0x9a  0xcd  0xab
    (gdb)

What we see is that wArray + 3 is not the word at index 3, it is the byte at offset 3! Remember that the bytes of each word are stored in memory in little-endian order, so when we move three bytes down from the start, the two bytes we find there (23h, then 56h) load into the register as 5623h. The assembler does not help us out like the compiler did!
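The byte dump can be replayed in C to see where 5623h comes from. This is a sketch only: `wArray_bytes` is a hand-typed copy of the x/20bx output above, and the two function names are made up. The word is assembled from its bytes explicitly, so the result does not depend on the byte order of the machine you run it on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 20 bytes of wArray, exactly as the x/20bx dump shows them. */
static const uint8_t wArray_bytes[20] = {
    0x34, 0x12, 0x45, 0x23, 0x56, 0x34, 0x67, 0x45,
    0x78, 0x56, 0x89, 0x67, 0x9A, 0x78, 0xAB, 0x89,
    0xBC, 0x9A, 0xCD, 0xAB
};

/* Like mov bx, [ wArray + offset ]: the offset counts BYTES.
 * The byte at the lower address becomes the low half of the word. */
uint16_t word_at_byte_offset(size_t offset)
{
    assert(offset + 2 <= sizeof wArray_bytes);
    return (uint16_t)(wArray_bytes[offset] | (wArray_bytes[offset + 1] << 8));
}

/* What a C compiler does for wArray[index]: it scales the index by the
 * element size (2 bytes) before adding it to the address. */
uint16_t word_at_index(size_t index)
{
    return word_at_byte_offset(index * 2);
}
```

`word_at_byte_offset(3)` picks up the bytes 23h and 56h and returns 5623h, just like the program did. `word_at_index(3)` returns 4567h, which is what we would have written in assembly as [ wArray + 3*2 ].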
    (gdb) set disassembly-flavor intel
    (gdb) disassemble &main
    Dump of assembler code for function main:
    0x08048300:     nop
    0x08048301 :    mov    eax,0x0
    0x08048306 :    mov    ebx,0x0
    0x0804830b :    mov    esi,0x80493e8
    0x08048310 :    mov    ax,ds:0x80493e8
    0x08048316 :    mov    bx,ds:0x80493eb
    0x0804831d :    mov    ebx,0x0
    0x08048322 :    mov    eax,0x1
    0x08048327 :    int    0x80
    0x08048329 :    nop
    0x0804832a :    nop
    0x0804832b :    nop
    End of assembler dump.
    (gdb)
The most important difference between an address and an ordinary number is that when a program is loaded into a different location in memory, the addresses change but the numbers do not!
If A and B are addresses and n is an ordinary number, then we can legally do:
Some examples are:
A + 14 | address |
B - A | number |
CW - AW | number |
AW + ( B - A ) | address (remember B - A is a number, which puts this into the form A + n) |
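C follows the same rules, which makes it a handy place to check them. In this sketch the buffer `memory` and the positions chosen for `A`, `B` and `AW` are arbitrary assumptions, and the function names are made up. Byte pointers are used on purpose so that, as in assembly, the numbers count bytes. Adding two addresses together is not on the legal list, and a C compiler will reject `A + B` for two pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: three labels at arbitrary spots in one buffer. */
static uint8_t memory[32];
static uint8_t *const A  = memory + 2;
static uint8_t *const B  = memory + 7;
static uint8_t *const AW = memory + 10;

/* address - address gives an ordinary number (a distance in bytes). */
ptrdiff_t address_minus_address(const uint8_t *b, const uint8_t *a)
{
    return b - a;
}

/* address + number gives another address. */
uint8_t *address_plus_number(uint8_t *a, ptrdiff_t n)
{
    assert(a + n >= memory && a + n <= memory + sizeof memory);
    return a + n;
}
```

With this layout `address_minus_address(B, A)` is the number 5, wherever the program happens to be loaded, and `address_plus_number(AW, 5)` is the address five bytes past AW.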
    msg:    db  "Hello World",10    ; the string to print, 10 = LF (newline)
    len:    equ $-msg               ; "$" means "here"

NOTE: This is not '$'; the quotes would make it a character. The assembler symbol $ represents the next location that code will be assembled into, so $-msg is the number of bytes between msg and here.
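For comparison, here is roughly how the same length falls out in C. This is a sketch with made-up names (`MSG_LEN`, `msg_len_by_subtraction`); the one real difference is that C quietly appends a terminating zero byte to a string literal, which the db line does not, so one is subtracted from `sizeof`.

```c
#include <assert.h>
#include <stddef.h>

/* "Hello World" followed by byte 10, the line feed ('\n'). */
static const char msg[] = "Hello World\n";

/* sizeof msg also counts the terminating zero byte that C appends. */
enum { MSG_LEN = sizeof msg - 1 };

/* The same calculation done the assembler's way: here - msg. */
size_t msg_len_by_subtraction(void)
{
    const char *start = msg;
    const char *here  = msg + MSG_LEN;   /* plays the role of "$" */
    assert(here >= start);               /* "here" never precedes an earlier label */
    return (size_t)(here - start);       /* address - address is a number */
}
```

Both ways give 12: eleven characters of text plus the line feed.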
Source | AWord DW 1234h |
Memory (lowest address first) | 34h 12h |
Words in assembly language source code files
and words in registers have bytes in their normal order. Words in
memory have their bytes swapped!
Moving a word to or from memory automatically swaps bytes.
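The swap can be spelled out by hand in C. In this sketch `store_word` and `load_word` are made-up helpers that do with shifts what the processor does automatically when it moves a 16-bit register to or from memory; the `size` parameter and the `assert` are additions for safety.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Like mov [ memory + offset ], ax: the low byte goes to the lower address. */
void store_word(uint8_t *memory, size_t size, size_t offset, uint16_t value)
{
    assert(offset + 2 <= size);
    memory[offset]     = (uint8_t)(value & 0xFF);   /* 34h for 1234h */
    memory[offset + 1] = (uint8_t)(value >> 8);     /* 12h for 1234h */
}

/* Like mov ax, [ memory + offset ]: the bytes are swapped back. */
uint16_t load_word(const uint8_t *memory, size_t size, size_t offset)
{
    assert(offset + 2 <= size);
    return (uint16_t)(memory[offset] | (memory[offset + 1] << 8));
}
```

Storing 1234h into a two-byte buffer leaves 34h in the first byte and 12h in the second, and loading it back gives 1234h again.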