Assembler

Introduction

BBC BASIC includes a powerful inline assembler allowing you to include Z80 machine code directly in you BBC BASIC programs. '[' enters assembler mode and ']' exits assembler mode. You can then call assembled Z80 code from BASIC with the CALL or USR keywords.

Statements

An assembly language statement consists of three elements; an optional label, an instruction and an operand. A comment may follow the operand field. The instruction following a label must be separated from it by at least one space. Similarly, the operand must also be separated from the instruction by a space.

Statements are terminated by a colon (:) or end of line.

Labels

Any BBC BASIC (Z80) numeric variable may be used as a label. These (external) labels are defined by an assignment (count=23 for instance). Internal labels are defined by preceding them with a full stop. When the assembler encounters such a label, a BASIC variable is created containing the current value of the Program Counter (P%). (See The Program Counter for more information).

In the example shown later under the heading The Assembly Process, two internal labels are defined and used. Labels have the same rules as standard BBC BASIC (Z80) variable names; they should start with a letter and not start with a keyword.

Comments

You can insert comments into assembly language programs by preceding them with a semi-colon (;) or a back-slash (\). In assembly language, a comment ends at the end of the statement. Thus, the following example will work (but it's a bit untidy).

[;start assembly language program
etc
LD A,C ;In-line comment:POP BC ;start add
JR NZ,loop ;Go back if not finished:RET ;Return
etc
;end assembly language program:]

Language Syntax

All standard Zilog mnemonics are accepted: ADD, ADC and SBC must be followed by A or HL. For example, ADD A,C is accepted but ADD C is not. However, the brackets around the port number in IN and OUT are optional. Thus both OUT (5),A and OUT 5,A are accepted. The instruction IN F,(C) is not accepted, but the equivalent code is produced from IN (HL),C.

Constants

You can store constants within your assembly language program using the define byte (DEFB), define word (DEFW) and define string (DEFM) pseudo-operation commands. These will create 1 byte, 2 byte or string items respectively.

Define Byte - DEFB

DEFB can be used to set one byte of memory to a particular value. For example,

.data DEFB 15
      DEFB 9

will set two consecutive bytes of memory to 15 and 9 (decimal). The address of the first byte will be stored in the variable 'data'.

Define Word - DEFW

DEFW can be used to set two bytes of memory to a particular value. The first byte is set to the least significant byte of the number and the second to the most significant byte. For example,

.data DW &90F

will have the same result as the Define Byte - DEFB example.

Define String - DEFM

DEFM can be used to load a string of ASCII characters into memory. For example,

JP continue; jump round the data
.string DEFM "This is a test message"
DEFB &D
.continue; and continue the process

will load the string 'This is a test message' followed by a carriage-return into memory. The address of the start of the message is loaded into the variable 'string'. This is equivalent to the following program segment:

JP continue;	jump round the data
.string;	leave assembly and load the string
]
$P%="This is a test message" REM starting at P%
P%=P%+LEN($P%)+1 REM adjust P% to next free byte
[
OPT opt; reset OPT
.continue;	and continue the program

Reserving Memory

The Program Counter

Machine code instructions are assembled as if they were going to be placed in memory at the addresses specified by the program counter, P%. Their actual location in memory may be determined by O% depending on the OPTion specified. You must make sure that P% (or O%) is pointing to a free area of memory before your program begins assembly. In addition, you need to reserve the area of memory that your machine code program will use so that it is not overwritten at run time. You can reserve memory by using a special version of the DIM statement or by changing HIMEM or LOMEM.

Using DIM to Reserve Memory

Using the special version of the DIM statement to reserve an area of memory is the simplest way for short programs which do not have to be located at a particular memory address. (See the keyword DIM for more details.) For example,

DIM code 20: REM Note the absence of brackets

will reserve 21 bytes of code (byte 0 to byte 20) and load the variable 'code' with the start address of the reserved area. You can then set P% (or O%) to the start of that area. The example below reserves an area of memory 100 bytes long and sets P% to the first byte of the reserved area.

DIM sort% 99
P%=sort%

Moving HIMEM to Reserve Memory

If you are going to use a machine code program in a number of your BBC BASIC (Z80) programs, the simplest way is to assemble it once, save it using *SAVE and load it from each of your programs using *LOAD. In order for this to work, the machine code program must be loaded into the same address each time. The most convenient way to arrange this is to move HIMEM down by the length of the program and load the machine code program in to this protected area. Theoretically, you could raise LOMEM to provide a similar protected area below your BBC BASIC (Z80) program. However, altering LOMEM destroys all your dynamic variables and is more risky.

Length of Reserved Memory

You must reserve an area of memory which is sufficiently large for your machine code program before you assemble it, but you may have no real idea how long the program will be until after it is assembled. How then can you know how much memory to reserve? Unfortunately, the answer is that you can't. However, you can add to your program to find the length used and then change the memory reserved by the DIM statement to the correct amount.

In the example below, a large amount of memory is initially reserved. To begin with, a single pass is made through the assembly code and the length needed for the code is calculated (lines 100 to 120). After a CLEAR, the correct amount of memory is reserved (line 140) and a further two passes of the assembly code are performed as usual. Your program should not, of course, subsequently try to use variables set before the clear statement. If you use a similar structure to the example and place the program lines which initiate the assembly function at the start of your program, you can place your assembly code anywhere you like and still avoid this problem.

  100 DIM free -1, code HIMEM-free-1000
  110 PROC_ass(0)
  120 L%=P%-code
  130 CLEAR
  140 DIM code L%
  150 PROC_ass(0)
  160 PROC_ass(2)
      
      - - -
      Put the rest of your program here.
      - - -
      
 1000 DEF PROC_ass(opt)
10010 P%=code
10020 [OPT opt
      
      - - -
      Assembler code program.
      - - -
      
11000 ]
11010 ENDPROC

Initial Setting of the Program Counter

The program counters, P%, and O% are initialised to zero. Using the assembler without first setting P% (and O%) is liable to corrupt BBC BASIC (Z80)'s workspace and possibly result in a crash (see the appendix entitled Format of Program and Variables in Memory).

The Assembly Process

OPT

The only assembly directive is OPT. It controls the way the assembler works, whether a listing is displayed and whether errors are reported. OPT should be followed by a number in the range 0 to 7. The way the assembler functions is controlled by the three bits of this number in the following manner.

Bit 0 - LSB

Bit 0 controls the listing. If it is set, a listing is displayed.

Bit 1

Bit 1 controls the error reporting. If it is set, errors are reported.

Bit 2

Bit 2 controls where the assembled code is placed. If bit 2 is set, code is placed in memory starting at the address specified by O%. However, the program counter (P%) is still used by the assembler for calculating the instruction addresses.

Assembly at a Different Address

In general, machine code will only run properly if it is in memory at the addresses for which it was assembled. Thus, at first glance, the option of assembling it in a different area of memory is of little use. However, using this facility, it is possible to build up a library of machine code utilities for use by a number of programs. The machine code can be assembled for a particular address by one program without any constraints as to its actual location in memory and saved using *SAVE. This code can then be loaded into its working location from a number of different programs using *LOAD.

OPT Summary

Code Assembled Starting at P%

The code is assembled using the program counter (P%) to calculate the instruction addresses and the code is also placed in memory at the address specified by the program counter.

Code Assembled Starting at O%

The code is assembled using the program counter (P%) to calculate the instruction addresses. However, the assembled code is placed in memory at the address specified by O%.

How the Assembler Works

The assembler works line by line through the machine code. When it finds a label declared it generates a BBC BASIC variable with that name and loads it with the current value of the program counter (P%). This is fine all the while labels are declared before they are used. However, labels are often used for forward jumps and no variable with that name would exist when it was first encountered. When this happens, a 'No such variable' error occurs. If error reporting has not been disabled, this error is reported and BBC BASIC (Z80) returns to the direct mode in the normal way. If error reporting has been disabled (OPT 0, 1, 4 or 5), the current value of the program counter is used in place of the address which would have been found in the variable, and assembly continues. By the end of the assembly process the variable will exist (assuming the code is correct), but this is of little use since the assembler cannot 'back track' and correct the errors. However, if a second pass is made through the assembly code, all the labels will exist as variables and errors will not occur.

The example below shows the result of two passes through a (completely futile) demonstration program. Ten bytes of memory are reserved for the program. (If the program was run, it would 'doom-loop' from line 50 to 70 and back again). The program disables error reporting by using OPT 1.

10 DIM code 9
20 FOR opt=1 TO 3 STEP 2
30 P%=code
40 [OPT opt
50 .jim JP fred
60 DEFW &2345
70 .fred JP jim
80 ]
90 NEXT

This is the first pass through the assembly process (note that the 'JP fred' instruction jumps to itself):

C07A              OPT opt
C07A C3 7A C0     .jim JP fred
C07D 45 23        DEFW &2345
C07F C3 7A C0     .fred JP jim

This is the second pass through the assembly process (note that the 'JP fred' instruction now jumps to the correct address):

C07A              OPT opt
C07A C3 7F C0     .jim JP fred
C07D 45 23        DEFW &2345
C07F C3 7A C0     .fred JP jim

Generally, if labels have been used, you must make two passes through the assembly language code to resolve forward references. This can be done using a FOR...NEXT loop. Normally, the first pass should be with OPT 0 (or OPT 4) and the second pass with OPT 2 (OPT 6). If you want a listing, use OPT 3 (OPT 7) for the second pass. During the first pass, a table of variables giving the address of the labels is built. Labels which have not yet been included in the table (forward references) will generate the address of the current op-code. The correct address will be generated during the second pass.

Conditional Assembly and Macros

Introduction

Most machine code assemblers provide conditional assembly and macro facilities. The assembler does not directly offer these facilities, but it is possible to implement them by using other features of BBC BASIC (Z80).

Conditional Assembly

You may wish to write a program which makes use of special facilities and which will be run on different types of computer. The majority of the assembly code will be the same, but some of it will be different. In the example below, different output routines are assembled depending on the value of 'flag'.

DIM code 200
FOR pass=0 TO 3 STEP 3
  [OPT pass
  .start     - - -
             - - - code - - -
                        - - - :]
  :
  IF flag  [OPT  pass: - code for routine 1 -:]
  IF NOT flag [OPT pass: - code for routine 2 - :]
  :
  [OPT pass
  .more_code - - -
             - - - code - - -
                        - - -:]
NEXT

Macros

Within any machine code program it is often necessary to repeat a section of code a number of times and this can become quite tedious. You can avoid this repetition by defining a macro which you use every time you want to include the code. The example below uses a macro to expand a value from 8 bits to 16. Conditional assembly is used within the macro to select a different register depending on the value of op_flag.

It is possible to suppress the listing of the code in a macro by forcing bit 0 of OPT to zero for the duration of the macro code. This can most easily be done by ANDing the value passed to OPT with 6. This is illustrated in PROC_extend_hl and PROC_extend_de in the example below.

DIM code 200
op_flag=TRUE
FOR pass=0 TO 3 STEP 3
  [OPT pass
  .start   - - -
           - - - code - - -
                      - - -
: 
  OPT FN_select(op_flag); Include code depending on op_flag
:
           - - -
           - - - code - - -
                      - - -:]
NEXT
END
:
:
REM Include code depending on value of op_flag
:
DEF FN_select(op_flag)
IF op_flag PROC_extend_hl ELSE PROC_extend_de
=pass
REM Return original value of OPT.  This is a
REM bit artificial, but necessary to insert
REM some BBC BASIC (Z80) code in the assembly code.
:
DEF PROC_extend_hl
[OPT pass AND 6
LD L,A
ADD A,A
SBC A,A
LD H,A:]
ENDPROC
:
DEF PROC_extend_de
[OPT pass AND 6
LD E,A
ADD A,A
SBC A,A
LD D,A:]
ENDPROC

The use of a function call to incorporate the code provides a neat way of incorporating the macro within the program and allows parameters to be passed to it. The function should return the original value of OPT.