NASM Overview
Introduction
First Step
OK, the assignment will be to write an assembly language program.
What does that mean? Time to panic?
Well, no. First, consider what you already know...that is why
there are prerequisites for this class. There are many things
done when writing assembly language that are identical to writing
a C program. We need to create a source file using a text editor.
It does not matter what editor you use, it only matters that the
source be saved as ASCII text. You've done that in C/C++, so
no problem! The filename can be anything you like, but it is customary to
use a file extension of .asm or .S.
Requirement
In this class, the rules for naming files are:
- The last four digits of your SSN
- prj -- just those three letters!
- proj number
- .asm
If your SSN is 123-45-6789 and you are doing project 0, your
filename is 6789prj0.asm
Note
When you write Java code, you can compile it and run it anywhere.
It is not that way when you write assembly language!
When you write C code, using ANSI C, you can run the program
anywhere, as long as you have a compiler for that system, probably
without any change to the source code. It is not that way when
you write assembly language!
In assembly language, your code must match:
- the CPU (sometimes the exact CPU model!)
- the Operating System (usually, only a certain subset of
versions of that operating system!)
- the assembler (different assemblers require different
formats for the source code!)
- the code libraries (not all libraries are backward
compatible!)
When things don't match, you have to modify the source to make
it match!
We are using Intel 486/Pentium CPUs, Linux (kernel 2.4.x) which
has the correct libraries, and NASM (which uses the Intel
format).
The latest version is
0.98.38, which can be downloaded from either the Official NASM
site at Sourceforge, or the web page for this course. This
software is licensed under the Lesser General Public License,
so you are free to copy it. You can not charge anyone
for the software under the LGPL!
Do you need to have Linux on your computer with NASM installed?
No, but you will be installing Linux in CMSC421, so you can
get a jump on the process if you do it now. Otherwise, you
must use the GL system for your assignments. (Personally,
I think every Computer Science major needs to have two or three
operating systems installed on his/her computer, but they don't
let me make the rules.) Assistance is available from the Linux
Users Group here at UMBC. They are an excellent source of
information!
One issue to point out is that the standard assembler for Linux
is as, which uses the AT&T format, and the two formats are
not compatible!
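To make the difference concrete, here is a minimal sketch of a few NASM (Intel format) lines with what should be the equivalent AT&T form for as shown in the comments. The label total is a made-up name for illustration, and the program assumes 32-bit Linux with the int 0x80 system call interface.

```nasm
; Intel format (NASM): destination first, no % or $ prefixes.
; AT&T format (as): source first, % on registers, $ on constants.
        section .bss
total:  resd    1               ; hypothetical 4-byte variable

        section .text
        global  _start
_start:
        mov     eax, 5          ; as:  movl $5, %eax
        add     eax, 2          ; as:  addl $2, %eax
        mov     [total], eax    ; as:  movl %eax, total
        mov     ebx, 0          ; as:  movl $0, %ebx   (exit status)
        mov     eax, 1          ; as:  movl $1, %eax   (system call 1 = exit)
        int     0x80            ; as:  int  $0x80
```

Feeding either format to the other assembler produces syntax errors, which is why source written for as can not simply be handed to NASM.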
Step Two
Put something into the source file! Sounds easy....what?
Well, I am lazy. So I make a source file template, called
template.asm (creative, aren't I?) So what is in that file?
Check it out.
Well, there are three sections in an assembly program:
- .data -- for initialized data -- read/write
- .bss -- for uninitialized variables -- read/write
- .text -- for code -- read-only
The read-only attribute is a constraint imposed on you by the
operating system.
NOTE: The addresses inside each section start at zero!
That means the addresses are not physical addresses, but
relative addresses. The operating system loads each section
at some address (unknown until the program is actually
loaded into memory by the loader). To get the physical address,
you must add the relative address to the start address of
the section. (Of course, there is some magic done here,
but don't worry about it until you get to CMSC421!)
The template is actually a complete program. The only thing
it does is successfully terminate! Well, you should do one
thing and do it well? Not quite that extreme.
I copy the template to the appropriate filename and then I
write the program, building on the template...saving some
work!
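For readers without the template at hand, a skeleton along these lines would do the job. This is only a sketch and not the course's actual template.asm: it assumes 32-bit Linux, the int 0x80 system call interface, and linking with ld (which looks for the _start label by default).

```nasm
; template.asm -- hypothetical skeleton, not the course's actual file
        section .data           ; initialized data goes here

        section .bss            ; reserved (uninitialized) storage goes here

        section .text           ; code goes here
        global  _start          ; make the entry point visible to the linker
_start:
        mov     eax, 1          ; Linux system call number 1 = exit
        mov     ebx, 0          ; exit status 0 = success
        int     0x80            ; ask the kernel to do it
```

If you link with gcc instead of ld, the entry label would normally be main rather than _start, because the C startup code supplies its own _start.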
NASM is case sensitive, just like Linux!
Assembly language instructions (not macros) each result in one
machine instruction. You will be putting one instruction
on a line. This is where you get the metric Lines of Code.
The phrase has not had any meaning since the first high-level
language, however! Instructions have up to four parts:
label: instruction operands ; comment
All four are optional; however, you can not have an operand
without an instruction. In the old days, things had to be in
certain columns or it was an error. Today, there is no such
constraint.
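The following sketch shows several legal combinations of the four parts. The labels count and again are invented for illustration, and the program again assumes 32-bit Linux linked with ld.

```nasm
        section .data
count:  dd      3               ; label, pseudo-instruction, operand, comment

        section .text
        global  _start
_start:                         ; a label on a line by itself
        mov     ecx, [count]    ; instruction with two operands
again:  dec     ecx             ; label, instruction, one operand
        jnz     again           ; repeat until ecx reaches zero
        nop                     ; instruction with no operands
; a comment on a line by itself
        mov     eax, 1          ; system call 1 = exit
        mov     ebx, 0          ; exit status 0
        int     0x80
```

The columns are lined up only for readability; NASM does not care where on the line each part starts.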
Labels are used to implement program control and data structures.
Valid characters in labels are letters, numbers, _, $, #, @, ~,
., and ?. In this class, the first character must be a letter.
Instructions tell the computer to do something, and the operands are
what it is done to or with. You have the same thing in
C/C++:
int age;
age = 21;
In the first line you told the computer to reserve some memory
for a variable. In the second line you told the computer to
set the variable to the value 21. C/C++ is a little bit
forgiving when you are not mindful of the data types involved in
an instruction and will try to "fix" things for you using implied
data conversions. The assembler is not that nice! You can not
put a floating point value into a character; it does not fit
and the assembler will not force it. You can not put a long
int into a short int. More precisely, you can not put 4 bytes
into 2 bytes! With each instruction, you must make sure
that the source and destination are exactly the same size.
In addition to "real" machine instructions, NASM also supports
a number of pseudo-instructions, such as those that reserve
memory locations.
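As a rough illustration of size matching, the sketch below defines data with the db, dw and dd pseudo-instructions, reserves storage with resb and resd, and moves each item through a register of the same size. All label names are made up for the example, and the line that NASM would reject is left commented out.

```nasm
        section .data
letter: db      'A'             ; define 1 byte
small:  dw      1234            ; define 2 bytes (a word)
big:    dd      123456          ; define 4 bytes (a doubleword)

        section .bss
buffer: resb    64              ; reserve 64 bytes
copy:   resd    1               ; reserve one 4-byte doubleword

        section .text
        global  _start
_start:
        mov     al, [letter]    ; 1 byte into a 1-byte register
        mov     bx, [small]     ; 2 bytes into a 2-byte register
        mov     ecx, [big]      ; 4 bytes into a 4-byte register
        mov     [copy], ecx     ; 4 bytes back out to memory
        mov     byte [buffer], 0 ; neither operand implies a size, so state it
;       mov     ax, ecx         ; rejected by NASM: 4 bytes do not fit in 2
        mov     eax, 1          ; system call 1 = exit
        mov     ebx, 0          ; exit status 0
        int     0x80
```

Note that NASM does not remember that letter was declared with db; the size of each move comes from the register (or from an explicit keyword such as byte), not from the label.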
Operands are the source and/or destination for the instruction.
When there are two operands, the first one is the destination
and the second one is the source.
Operands can be registers, addresses, constants or expressions.
Along the way, there are also a couple of additional constraints.
You can not have both the source and destination as addresses;
the CPU is not built that way. Also, the destination can not be
a constant! The destination must be what in C/C++ is called
an "lvalue" (an effective memory address or a register). Remember that an operand is always used with an instruction!
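The operand rules can be sketched with the mov instruction. The labels src and dst are hypothetical, the two disallowed forms are shown commented out, and the usual workaround for a memory-to-memory copy (going through a register) is the first pair of instructions.

```nasm
        section .data
src:    dd      21              ; a 4-byte value in memory

        section .bss
dst:    resd    1               ; 4 bytes to copy it into

        section .text
        global  _start
_start:
        mov     eax, [src]      ; register <- memory
        mov     [dst], eax      ; memory   <- register
        mov     ebx, src        ; register <- constant (the address of src)
        mov     ecx, [ebx]      ; register <- memory, address taken from ebx
        mov     edx, 4*5+1      ; register <- expression, computed by NASM
;       mov     [dst], [src]    ; not allowed: both operands are addresses
;       mov     21, eax         ; not allowed: destination is a constant
        mov     eax, 1          ; system call 1 = exit
        mov     ebx, 0          ; exit status 0
        int     0x80
```

In each two-operand line the destination comes first, so mov [dst], eax reads as "dst = eax".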
Comments in assembly language are exactly the same thing as
they are in C/C++. In this class, if you fail to comment,
you are planning to fail! You must submit a required
set of comments as a bare minimum, and you can supplement it with
any extra comments you wish. Your grade depends on a
well-commented program.
Assembly
In C/C++, when you compile a program, you are actually running
both the compiler and the linker. With NASM, it is two separate
steps.
Running NASM
There is one important option for the NASM program that you must
supply, and that is what format to use for the output file.
You must specify "-f elf" and then give the name of the
file(s) to assemble.
nasm -f elf hello.asm
That will produce a file hello.o. This file is an unlinked
object file that can not be executed. It must be linked and
turned into an executable.
There is another option of interest, that is the -l or listing
option.
nasm -f elf -l hello.lst hello.asm
Running The Linker
OK, so far, so good, but you still can not execute. Of course,
that assumes that there were no assembly errors. If there were,
you must go back, fix them and re-assemble....feels just
like C/C++, doesn't it!
There are two ways to link. ld gives you a simple executable; it
works, but no freebies!
ld hello.o
This results in a file named a.out, and where have you heard
of that before?
The good stuff comes when you use gcc instead.
gcc -o hello hello.o
First of all, the -o option lets us rename the output file
from a.out to hello. Linking with gcc will also allow
you to use the C library and not have to rewrite all of the
basic stuff, like printf. Unless you are told otherwise,
in this class you can use the C library.
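As a sketch of what using the C library looks like, the program below calls printf. It assumes the 32-bit C calling convention (arguments pushed on the stack, caller cleans up) and that the entry label is main, since the C startup code linked in by gcc calls main. On a 64-bit host you would likely need gcc -m32, and possibly -no-pie, in place of the plain gcc command shown above; treat those flags as assumptions about your system.

```nasm
; hello.asm -- assemble with: nasm -f elf hello.asm
;              link with:     gcc -o hello hello.o
        extern  printf          ; provided by the C library
        global  main            ; called by the C startup code

        section .data
msg:    db      "Hello, world!", 10, 0  ; text, newline, terminating zero

        section .text
main:
        push    msg             ; argument: address of the format string
        call    printf
        add     esp, 4          ; remove the argument from the stack
        mov     eax, 0          ; return value of main
        ret                     ; return to the C startup code
```

Because main simply returns, there is no need for the exit system call here; the C library takes care of terminating the program.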
Makefile
It is often convenient to automate the process. While there is
not a lot to be gained by using a
Makefile when you only have
one source file, it can still be useful. I use them because
I like to use xemacs and can simply click on the "Compile"
button and get a new executable! (Make sure you name it
Makefile!) Then all you have to do is run the command
make
It figures out which files have to be compiled and then relinks
everything. This saves a little typing when you only have
one file, but when you start having multiple files, it
really saves time and effort!