Writing Subprograms
Higher-level languages use subprograms all the time. Some
languages have two types of subprograms, procedures (or
subroutines) and functions, while
others, like C, only have functions. Normally, the difference
between the two is that functions return a value (only one!).
Assembly language also uses "subprograms", which
typically return a value in the AX register. In C, we invoke a
function by using its name with parentheses. In assembly
language, we use the instruction:
call SubProg
In C, the end of the function is the optional return
statement. In assembly, there is the required ret
instruction.
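As a minimal sketch of the call/ret pairing described above, assuming NASM syntax and 32-bit code (so the return value sits in EAX, whose low half is AX). The labels Caller and AddOne are hypothetical names chosen for illustration.

```nasm
; Sketch only: Caller and AddOne are hypothetical labels.
section .text

Caller:
        mov     eax, 41         ; value to work on
        call    AddOne          ; pushes the return address, then jumps to AddOne
        ; execution resumes here with EAX = 42
        ret

AddOne:
        inc     eax             ; compute the result in EAX
        ret                     ; pops the return address and jumps back to it
```

Leaving out the ret in AddOne would let execution run on into whatever bytes follow it in memory, which is why the instruction is required rather than optional.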
Rules
- Subprograms can be included in the same file or stored in a
separate file.
- If they are in the same file, the ordering and naming are
immaterial.
- There is no "main()", unless you are using gcc to link the
program.
- There can be more than one data and bss section.
- Data defined in the data and bss sections are global.
- Local variables defined in C are really on the stack.
That is why they don't exist after returning from the
subprogram.
To understand how subprograms actually work, we first need
to understand the stack in detail.
The Stack
The hardware stack and the stack pointer, sp, are necessary to
get subprograms to work properly. Early CPUs did not
have them and, as a result, were more limited than what
we have today. For instance, recursion was not easily possible.
A stack is a Last-In, First-Out (LIFO) data structure:
data can only be added or removed at one end.
There are special instructions built into the CPU to work with the
stack. The sp register points to the newest
16-bit value on the stack (or 32-bit
double word when using the extended registers in the later models),
which is the next item to be removed.
The stack starts at the highest
address and grows down as items are put on it.
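The downward growth can be observed directly. The following is a small sketch, assuming 32-bit NASM code (so the stack pointer is ESP and each push of a 32-bit register moves it by 4); the label Demo is hypothetical.

```nasm
; Sketch only: measures how far ESP moves for one push.
section .text

Demo:
        mov     ebx, esp        ; remember where the stack pointer started
        push    eax             ; ESP is now 4 lower than EBX
        mov     ecx, ebx
        sub     ecx, esp        ; ECX = 4: the stack grew toward lower addresses
        pop     eax             ; ESP is back at its starting value
        ret
```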
The instructions are:
- push (16-bit or 32-bit, depending on the register
specified. If it is a memory location or constant, you
will have to specify whether it is a WORD or DWORD.)
- pusha (16-bit, pushes AX, CX, DX, BX, SP, BP, SI,
DI)
- pushad (32-bit, pushes EAX, ECX, EDX, EBX, ESP, EBP,
ESI, EDI)
- pushf (16-bit, pushes the flag register)
- pushfd (32-bit)
- pop (16- or 32-bit, based on the register specified;
do not use the sp register as the destination! If it is a
memory location, you will have to specify whether it is a WORD
or DWORD.)
- popa (16-bit, pops in reverse order of pusha)
- popad (32-bit)
- popf (16-bit)
- popfd (32-bit)
For push, either a memory location (word or double word only, as
appropriate), a constant, or a register can be specified. For pop,
the operand must be a memory location or a register, since a
constant cannot receive a value.
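To illustrate where the size has to be written out, here is a sketch assuming NASM syntax and 32-bit code. Count, Total, and Demo are hypothetical names. Every push is matched by a pop in reverse order so that the stack is balanced before ret.

```nasm
; Sketch only: Count, Total and Demo are hypothetical names.
section .data
Count   dw      5               ; 16-bit variable
Total   dd      100             ; 32-bit variable

section .text
Demo:
        push    eax             ; register: size implied (32 bits)
        push    bx              ; register: size implied (16 bits)
        push    word [Count]    ; memory: size stated explicitly
        push    dword [Total]   ; memory: size stated explicitly
        push    dword 7         ; constant: size stated explicitly

        pop     ecx             ; ECX = 7
        pop     dword [Total]   ; Total gets its own value back
        pop     word [Count]    ; Count gets its own value back
        pop     bx
        pop     eax
        ret
```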
If we have set up the data segment as:
X    DW    1111
Y    DW    2222
Z    DW    ?
We can have a stack that looks like:
In this case, the SP register really points to what is above the
first location. If we execute:
We now have a stack that looks like:
If we then execute:
We now have a stack that looks like:
If we finally execute:
We now have a stack that looks like:
The location for the variable Z is set to 2222 and the SP register
points to the previous entry. Note that the stack location holding
2222 is considered unused and will be overwritten by the next
push instruction. (You cannot count
on items popped from the stack remaining in unused stack
memory because the operating system also uses the stack.)
If you do push X followed by pop X, nothing is
changed.
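One instruction sequence consistent with the walk-through above is push X, push Y, pop Z. This is an assumption reconstructed from the description, not the original listing. The sketch below uses NASM syntax in 32-bit code, where the uninitialized Z is reserved with resw in the bss section; the label Demo is hypothetical, and the final pop is added only so the stack is balanced before ret.

```nasm
; Sketch only: an assumed reconstruction of the push/pop walk-through.
section .data
X       dw      1111
Y       dw      2222

section .bss
Z       resw    1               ; uninitialized word

section .text
Demo:
        push    word [X]        ; stack pointer drops by 2; top of stack = 1111
        push    word [Y]        ; stack pointer drops by 2; top of stack = 2222
        pop     word [Z]        ; Z = 2222; stack pointer rises by 2; top = 1111
        pop     word [X]        ; X = 1111 again; stack is back where it started
        ret
```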
Saving and restoring registers is particularly important when
using subprograms. You are responsible for saving
important data in the registers before you call a
subprogram and for restoring those
registers afterwards.
Well-behaved subprograms should save and restore any registers
that they use, unless they are returning values in certain
registers! However, not all subprograms written by others
are well-behaved. This means you have to write the instructions
to do it yourself in order to make sure the program works correctly.
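Both sides of that responsibility can be sketched as follows, assuming NASM syntax and 32-bit code. Caller and Helper are hypothetical labels; Helper stands in for a well-behaved subprogram that returns its result in EAX.

```nasm
; Sketch only: Caller and Helper are hypothetical labels.
section .text

Caller:
        mov     ecx, 10         ; value the caller still needs after the call
        push    ecx             ; caller saves it in case the subprogram is not well-behaved
        call    Helper
        pop     ecx             ; caller restores it; ECX = 10 again
        ret

Helper:
        push    ebx             ; save the registers it is about to use
        push    ecx
        mov     ebx, 3
        mov     ecx, 4
        mov     eax, ebx
        add     eax, ecx        ; result goes in EAX, so EAX is not restored
        pop     ecx             ; restore in reverse order of the pushes
        pop     ebx
        ret
```

Restoring in the reverse order of the pushes matters: swapping the two pops in Helper would exchange the contents of EBX and ECX.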
Separately Translating Subprograms
Putting subprograms into separate files lets you do things better
and faster. Better, because you can use the subprograms in more
than one program -- software reuse is
good! Faster, because you only have to assemble those
files that have changed. Makefiles come in handy here.
Rules
- If you call a subprogram that is in another file, you must
have the EXTERN statement.
- Data that is defined in one file and used in another
must have the EXTERN/GLOBAL pair, but notice that when
you do that, you must provide the size (BYTE or WORD).
Normally, this is not a good way to do things because
it creates a global variable.
Use the stack instead if possible.
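The EXTERN/GLOBAL pairing might look like the following two-file sketch. It assumes NASM syntax, 32-bit Linux, and linking with ld (hence _start rather than main). The file names main.asm and square.asm, and the names Square and Limit, are hypothetical.

```nasm
; main.asm (hypothetical file name)
extern  Square                  ; code defined in square.asm
extern  Limit                   ; data defined in square.asm

section .text
global  _start                  ; entry point that ld looks for
_start:
        movzx   eax, word [Limit]   ; size stated explicitly for the external data
        call    Square              ; EAX = EAX * EAX
        mov     ebx, eax            ; use the result as the exit status
        mov     eax, 1              ; Linux 32-bit sys_exit
        int     0x80
```

```nasm
; square.asm (hypothetical file name)
global  Square                  ; make both names visible to the linker
global  Limit

section .data
Limit   dw      5

section .text
Square:
        imul    eax, eax        ; return EAX squared in EAX
        ret
```

Under these assumptions, each file would be assembled on its own with nasm -f elf32 and the two object files combined with ld -m elf_i386. Only the file that changed needs to be reassembled before relinking.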
How the Linker Works
We have used the linker (ld and gcc [as a linker]) and have
used library procedures, printf and scanf.
These and many others have been assembled separately and stored
in a library created by compiler developers. How does the
linker handle that?
When the assembler translates a source file into object code,
it creates a symbol table of all the names and attributes of
symbols defined in the file. When it is done, that symbol table
is thrown away. Since GLOBAL symbols may be defined in
one file and referenced in another file(s), the assembler saves
two tables in the .o file -- a table of EXTERN symbols
and a list of places where each symbol is referenced, and a table
of GLOBAL symbols and the unique place where each is
defined.
The unresolved external references must be resolved by the
linker. Once the linker knows where one of the GLOBAL
symbols is stored, it goes back and modifies the locations of the
EXTERN references with the now known address.
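A small program shows both kinds of table entries at once. This is a sketch assuming NASM syntax, 32-bit Linux, the C library's printf, and gcc used as the linker (which is why the entry point is main). The file name hello.asm and the label fmt are hypothetical.

```nasm
; hello.asm (hypothetical file name)
extern  printf                  ; EXTERN: referenced here, defined in the C library
global  main                    ; GLOBAL: defined here, referenced by the C startup code

section .data
fmt     db      "value = %d", 10, 0

section .text
main:
        push    dword 42        ; C arguments are pushed right to left
        push    dword fmt
        call    printf          ; the target address is filled in by the linker
        add     esp, 8          ; caller removes the two arguments
        xor     eax, eax        ; return 0 from main
        ret
```

Under these assumptions, the file would be assembled with nasm -f elf32 hello.asm and linked with gcc -m32 -o hello hello.o; newer toolchains may also need -no-pie. Running nm on hello.o before linking should list printf as undefined (U) and main as a defined text symbol (T), which corresponds to the two tables described above.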
©2004, Gary L. Burt