Format of Program and Variables in Memory

Memory Map

Workspace RAM

BBC BASIC (Z80) uses 768 (&300) bytes of RAM as a workspace. It contains the values of resident interpreter variables, LOMEM, HIMEM, etc. The full list is as follows.

Address Description
&9E00~&9EFF The string accumulator.
&9F00~&9FFF The INPUT buffer.
&A000~&A06B The static variables. 108 bytes holding the values of the 27 integer variables @% to Z% inclusive. The variable values are stored as described later in this section and they each occupy 4 bytes.
&A06C~&A0D7 54 2-byte values (108 bytes in all) which point to the first item in the linked lists of dynamic variables starting with the characters A to Z (26) and _ to z (28). If no variables starting with the given initial character exist, the pointer contains zero.
&A0D8~&A0D9 A 2-byte pointer to the linked list of function names. If no function is currently active, this contains zero.
&A0DA~&A0DB A 2-byte pointer to the linked list of procedure names. If no procedure is currently active, this contains zero.
&A0DC~&A0DD The 2-byte value of PAGE.
&A0DE~&A0DF The 2-byte value of TOP (TOP > PAGE).
&A0E0~&A0E1 The 2-byte value of LOMEM.
&A0E2~&A0E3 A 2-byte pointer to the first free location after the heap.
&A0E4~&A0E5 The 2-byte value of HIMEM (LOMEM <= FREE < HIMEM, where FREE is the free-space pointer held at &A0E2).
&A0E8~&A0E9 A 2-byte value holding the current TRACE status. TRACE OFF sets it to zero, TRACE ON to &FFFF and TRACE nnn sets it to nnn (line numbers less than nnn are traced).
&A0EA~&A0EB A 2-byte value holding the next AUTO line number. If zero, AUTO is not active.
&A0EC~&A0ED A 2-byte pointer to the tail of the ON ERROR statement in the user's program. If zero, no ON ERROR statement is active (ON ERROR OFF).
&A0EE~&A0EF A 2-byte pointer to the last error string (used by REPORT).
&A0F0~&A0F1 A 2-byte pointer to the current DATA item in the user's program. Initialised to point to the first data item (if any) when the program is RUN.
&A0F2~&A0F3 The 2-byte value of ERL (the line number at which the last error occurred).
&A0F4~&A0F5 BBC BASIC copies the program text pointer to this location every so often (at the beginning of each line, for example). When an error occurs, it is used to determine ERL.
&A0F6~&A0FA A 33 bit pseudo random number updated by RND. Five bytes are used to hold this number, the fifth byte containing only the 33rd bit.
&A0FB The 1-byte value of COUNT (the number of printed characters output since the last new line).
&A0FC The 1-byte value of WIDTH. Zero signifies that BBC BASIC (Z80) inserts no automatic new-lines.
&A0FD The 1-byte value of ERR (the number of the last error).
&A0FE A byte containing the LISTO value.
&A0FF A 1-byte value containing the increment for the AUTO command.
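
The address arithmetic implied by the table above can be sketched in Python. This is only an illustration: it assumes the static variables are stored in the order @%, A% ... Z%, and that the 26 index entries for A to Z come before the 28 entries for _ to z. The function names are hypothetical and are not part of BBC BASIC.

```python
STATIC_BASE = 0xA000   # @% to Z%, 4 bytes each (from the table above)
INDEX_BASE = 0xA06C    # 54 two-byte chain pointers (from the table above)

def static_var_address(name: str) -> int:
    """Address of a static integer variable such as 'A%' or '@%'.

    Assumes the variables are stored in character-code order.
    """
    letter = name[0]
    if not "@" <= letter <= "Z":
        raise ValueError("static variables are @% to Z%")
    return STATIC_BASE + 4 * (ord(letter) - ord("@"))

def index_entry_address(first_char: str) -> int:
    """Address of the chain pointer for names starting with first_char.

    Assumes the A-Z entries come first, followed by '_' to 'z'.
    """
    if "A" <= first_char <= "Z":
        slot = ord(first_char) - ord("A")
    elif "_" <= first_char <= "z":
        slot = 26 + ord(first_char) - ord("_")
    else:
        raise ValueError("not a valid initial character")
    return INDEX_BASE + 2 * slot

print(hex(static_var_address("@%")))   # 0xa000
print(hex(static_var_address("Z%")))   # 0xa068
print(hex(index_entry_address("A")))   # 0xa06c
print(hex(index_entry_address("z")))   # 0xa0d6
```

Under these assumptions the last static variable ends at &A06B and the last index entry ends at &A0D7, which matches the ranges given in the table.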

Memory Management

There is little you can do to control the growth of the stack. However, with care, you can control the growth of the heap. You can do this by limiting the number of variables you use and by good string variable management.

Limiting the Number of Variables

Each new variable occupies room on the heap. Restricting the length of the names of variables and limiting the number of variables used will limit the size of the heap. However, of the techniques available to you, this is the least rewarding. In addition, it leads to incomprehensible programs because your variable names become meaningless. You should keep this technique in the back of your mind whilst you are programming, but only apply it rigorously if you are really stuck for space.

String Management

Garbage Generation

Unlike numeric variables, string variables do not have a fixed length. When you create a string variable it is added to the heap and sufficient memory is allocated for the initial value of the string. If you subsequently assign a longer string to the variable there will be insufficient room for it in its original position and the string will have to be relocated with its new value at the top of the heap. The initial area will then become 'dead' and the heap will have grown by the new length of the string. The areas of 'dead' memory are called garbage. As more and more re-assignments take place, the heap grows and eventually there is no more room. Thus, it is possible to run out of room for variables even though there should be enough space.

Memory Allocation for String Variables

You can overcome the problem of 'garbage' by reserving enough memory for the longest string you will ever put into a variable before you use it. You do this simply by assigning a string of spaces to the variable. If your program needs to find an empty string the first time it is used, you can subsequently assign a null string to it. The same technique can be used for string arrays. The example below sets up a single dimensional string array with room for 20 characters in each entry, and then empties it.

10 DIM name$(10)
20 FOR i=0 TO 10
30   name$(i)=STRING$(20," ")
40 NEXT
50 stop$=""
60 FOR i=0 TO 10
70   name$(i)=""
80 NEXT

Assigning a null string to stop$ prevents the space for the last entry in the array being recovered when it is emptied.
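
The effect of reserving space in advance can be illustrated with a toy model in Python. It is a simplified sketch, not the interpreter's allocator: it counts only the bytes used by string characters, ignores the per-variable overhead, and assumes a string moves to the top of the heap whenever its new value is longer than the space it was given. The class and function names are hypothetical.

```python
class Heap:
    """Toy model of string storage: space is only ever taken from the top."""

    def __init__(self):
        self.top = 0          # bytes of string space used so far
        self.strings = {}     # name -> [current_length, maximum_length]

    def assign(self, name: str, value: str) -> None:
        entry = self.strings.get(name)
        if entry is not None and len(value) <= entry[1]:
            entry[0] = len(value)        # fits in place, heap does not grow
            return
        # New variable, or too long for its slot: take fresh space at the
        # top of the heap. Any old slot is left behind as garbage.
        self.top += len(value)
        self.strings[name] = [len(value), len(value)]

def grow(preallocate: bool) -> int:
    """Assign strings of length 1 to 20 to a$ and return the heap size."""
    heap = Heap()
    if preallocate:
        heap.assign("a$", " " * 20)      # reserve room for the longest value
    for n in range(1, 21):
        heap.assign("a$", "x" * n)
    return heap.top

print(grow(preallocate=False))  # 210
print(grow(preallocate=True))   # 20
```

Without the initial assignment every longer value abandons the previous slot, so the model uses 1+2+...+20 = 210 bytes. With it, the same 20 bytes are reused throughout.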

Program Storage in Memory

The program is stored in memory in the format shown below. The first program line commences at PAGE.

Length | LSB MSB     | ...                     | &0D
       | Line Number | Program Line Statements | CR

Line Length

The line length includes the line length byte. The address of the start of the next line is found by adding the line length to the address of the start of the current line. The end of the program is indicated by a line length of zero and a line number of &FFFF.

Line Number

The line number is stored in two bytes, LSB first. The end of the program is indicated by a line number of &FFFF and a line length of zero.

Statements

With the exception of the symbols '*', '=' and '[' and the optional reserved word LET, each statement in the line commences with the appropriate reserved word token. Reserved words are tokenised wherever they occur. A token is indicated by bit 7 of the byte being set. Statements within a line are separated by colons.

Line Terminator

Each program line (except the last) is terminated by a carriage-return (&0D).
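
As an illustration of this format, the following Python sketch builds a small program image and then walks it line by line. It models memory as a byte string rather than reading a real machine. The token value &F1 is a hypothetical placeholder standing for any byte with bit 7 set, and the function names are assumptions made for the example.

```python
def build_line(number: int, body: bytes) -> bytes:
    """Encode one line: length, line number (LSB, MSB), body, CR."""
    length = 1 + 2 + len(body) + 1          # the length byte counts itself
    return bytes([length, number & 0xFF, number >> 8]) + body + b"\x0D"

def walk_program(memory: bytes, page: int = 0):
    """Yield (line_number, body) for each line, starting at PAGE."""
    addr = page
    while memory[addr] != 0:                 # a zero length marks the end
        length = memory[addr]
        number = memory[addr + 1] | (memory[addr + 2] << 8)
        body = memory[addr + 3 : addr + length - 1]   # drop the trailing CR
        yield number, body
        addr += length                       # start of the next line

TOKEN = 0xF1  # hypothetical token value; any byte with bit 7 set
image = (
    build_line(10, bytes([TOKEN]) + b' "HI"')
    + build_line(20, bytes([TOKEN]) + b" A%:" + bytes([TOKEN]) + b" B%")
    + b"\x00\xFF\xFF"                        # zero length, line number &FFFF
)
for number, body in walk_program(image):
    print(number, body)
```

The walker never needs to understand the statements: adding the length byte to the current address is enough to reach the next line.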

Variable Storage in Memory

Variables are held within memory as linked lists (chains). The first variable in each chain is accessed via an index which is maintained by BBC BASIC (Z80). There is an entry in the index for each of the characters permitted as the first letter of a variable name. Each entry in the index has a word (two bytes) address field which points to the first variable in the linked list with a name starting with its associated character. If there are no variables with this character as the first character in the name, the pointer word is zero. The first word of all variables holds the address of the next variable in the chain. The address word of the last variable in the chain is zero. All addresses are held in the standard Z80 format - LSB first.

The first variable created for each starting character is accessed via the index and subsequently created variables are accessed via the index and the chain. Consequently, there is some speed advantage to be gained by arranging for all your variables to start with a different character. Unfortunately, this can lead to some pretty unreadable names and programs.
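
Following a chain can be sketched in Python against a simulated block of memory. The addresses, the two example variables and the helper names below are all hypothetical. The only facts taken from the text are that each variable starts with a 2-byte link (LSB first), that a zero link ends the chain, and that the rest of the name is followed by a zero byte.

```python
def read_word(memory, addr: int) -> int:
    """Read a 2-byte address stored LSB first."""
    return memory[addr] | (memory[addr + 1] << 8)

def chain_names(memory, index_entry: int, first_char: str):
    """Follow one chain, returning the full name of each variable."""
    names = []
    addr = read_word(memory, index_entry)
    while addr != 0:                       # a zero link ends the chain
        end = memory.index(0, addr + 2)    # the name runs up to the &00 byte
        rest = bytes(memory[addr + 2 : end]).decode("ascii")
        names.append(first_char + rest)    # first character comes from the index
        addr = read_word(memory, addr)     # first word links to the next variable
    return names

memory = bytearray(64)
memory[0x00:0x02] = bytes([0x10, 0x00])    # index entry for 'n' points at &10
memory[0x10:0x1D] = bytes([0x20, 0x00]) + b"umber%\x00" + bytes(4)
memory[0x20:0x28] = bytes([0x00, 0x00]) + b"%\x00" + bytes(4)
print(chain_names(memory, 0x00, "n"))      # ['number%', 'n%']
```

A lookup for the second variable has to pass through the first, which is why a chain holding many names is slower to search than one holding a single name.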

Integer Variables

Integers are held in two's complement format. They occupy 4 bytes, with the LSB first. Bit 7 of the MSB is the sign bit. To make up the complete variable, the address word, the name and a separator (zero) byte are added to the number. The format of the memory occupied by an integer variable called NUMBER% is shown below. Note that since the first character of the name is found via the index, it is not stored with the variable.

LSB MSB                  | U M B E R %  | &00 | LSB ... MSB
Address of next variable | Rest of Name |     | Value
starting with the same   |              |     |
letter                   |              |     |

The smallest amount of space is taken up by a variable with a single letter name. The static integer variables, which are not included in the variable chains, use the names A% to Z%. Thus, the only single character names available for dynamic integer variables are a% to z% plus _% and £%. As shown below, integer variables with these names will occupy 8 bytes.

LSB MSB                  | %            | &00 | LSB ... MSB
Address of next variable | Rest of Name |     | Value
starting with the same   |              |     |
letter                   |              |     |
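
The byte counts above can be checked with a short Python sketch that assembles an integer variable record from its parts. The function name is hypothetical. The field order (link word, rest of name, zero separator, 4-byte two's complement value with the LSB first) is taken from the description in this section.

```python
import struct

def integer_record(name: str, value: int, next_addr: int = 0) -> bytes:
    """Bytes of a dynamic integer variable, in memory order."""
    rest = name[1:].encode("ascii")          # the first character is not stored
    return (
        struct.pack("<H", next_addr)         # link to next variable, LSB first
        + rest                               # rest of name, including '%'
        + b"\x00"                            # separator
        + struct.pack("<i", value)           # 4-byte two's complement value
    )

print(len(integer_record("a%", 1)))          # 8
print(len(integer_record("NUMBER%", 1)))     # 13
print(integer_record("a%", -2).hex(" "))     # 00 00 25 00 fe ff ff ff
```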

Real Variables

Real numbers are held in binary floating point format. The mantissa is held as a 4 byte binary fraction in sign and magnitude format. Bit 7 of the MSB of the mantissa is the sign bit. When working out the value of the mantissa, this bit is assumed to be 1 (a decimal value of 0.5). The exponent is held as a single byte in 'excess 127' format. In other words, if the actual exponent is zero, the value stored in the exponent byte is 127. To make up the complete variable, the address word, the name and a separator (zero) byte are added to the number. The format of the memory occupied by a real variable called NUMBER is shown below.

LSB MSB                  | U M B E R    | &00 | LSB ... MSB | Exp
Address of next variable | Rest of Name |     | Mantissa    | Exponent
starting with the same   |              |     |             |
letter                   |              |     |             |

As with integer variables, variables with single character names occupy the least memory. (However, the names A to Z are available for dynamic real variables). Whilst a real variable requires an extra byte to store the number, the '%' character is not needed in the name. Thus, integer and real variables with the same name occupy the same amount of memory. However, this does not hold for arrays, since the name is only stored once.

In the following examples, the bytes are shown in the more human-readable manner with the MSB on the left.

The value 5.5 would be stored as shown below.

Mantissa Exponent
0011 0000 0000 0000 0000 0000 0000 0000 1000 0010
&30 &00 &00 &00 &82

Because the sign bit (the leftmost bit of the mantissa) is assumed to be 1 when the value is worked out, this would become:

Mantissa Exponent
1011 0000 0000 0000 0000 0000 0000 0000 1000 0010
&B0 &00 &00 &00 &82

The equivalent in decimal is:

    (0.5+0.125+0.0625) * 2^(130-127)
=   0.6875 * 2^3
=   0.6875 * 8
=   5.5

BBC BASIC (Z80) stores integer values in real variables in a special way which allows the faster integer arithmetic routines to be used if appropriate. The presence of an integer value in a real variable is indicated by the stored exponent being zero. Thus, if the stored exponent is zero, the real variable is being used to hold an integer and the 4 byte mantissa holds the number in normal integer format.

Depending on how it is put there, an integer value can be stored in a real variable in one of two ways. For example,

number=5

will set the exponent to zero and store the integer &00 00 00 05 in the mantissa. On the other hand,

number=5.0

will set the exponent to &82 and the mantissa to &20 00 00 00.
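
The two storage forms can be decoded with a small Python sketch. It takes the five bytes in memory order (mantissa LSB first, then the exponent) and applies the rules described above. One assumption is labelled in the code: a sign bit of 1 is treated as negative, the usual convention for sign and magnitude. The function name is hypothetical.

```python
def decode_real(raw: bytes):
    """Decode 5 bytes in memory order: mantissa LSB..MSB, then exponent."""
    mantissa = int.from_bytes(raw[:4], "little")
    exponent = raw[4]
    if exponent == 0:
        # Integer held in a real variable: plain two's complement.
        return int.from_bytes(raw[:4], "little", signed=True)
    negative = bool(mantissa & 0x80000000)       # assumption: 1 means negative
    fraction = (mantissa | 0x80000000) / 2**32   # sign bit counts as 0.5
    value = fraction * 2.0 ** (exponent - 127)   # 'excess 127' exponent
    return -value if negative else value

print(decode_real(bytes([0x00, 0x00, 0x00, 0x30, 0x82])))  # 5.5
print(decode_real(bytes([0x00, 0x00, 0x00, 0x20, 0x82])))  # 5.0
print(decode_real(bytes([0x05, 0x00, 0x00, 0x00, 0x00])))  # 5
```

The first call reproduces the 5.5 worked example, the second is the number=5.0 case and the third is the number=5 case with a zero exponent.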

If all this seems a little complicated, try using the following program to accept a number from the keyboard and display the way it is stored in memory. The program displays the 4 bytes of the mantissa in 'human readable order' followed by the exponent byte. Look at what happens when you input first 5 and then 5.0 and you will see how this corresponds to the explanation given above. Then try -5 and -5.0 and then some other numbers. (The program is an example of the use of the byte indirection operator. See the Indirection section for details).

 10 NUMBER=0
 20 DIM A% -1
 30 REPEAT
 40   INPUT"NUMBER PLEASE "NUMBER
 50   PRINT "& ";
 60   :
 70   REM Step through mantissa from MSB to LSB
 80   FOR I%=2 TO 5
 90     REM Look at value at address A%-I%
100     NUM$=STR$~(A%?-I%)
110     IF LEN(NUM$)=1 NUM$="0"+NUM$
120     PRINT NUM$;" ";
130   NEXT
140   :
150   REM Look at exponent at address A%-1
160   N%=A%?-1
170   NUM$=STR$~(N%)
180   IF LEN(NUM$)=1 NUM$="0"+NUM$
190   PRINT " & "+NUM$''
200 UNTIL NUMBER=0

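The layout of the variable NUMBER in memory, and the addresses used by the program, are shown below.

Address | Contents
        | LSB MSB (address of next variable starting with the same letter)
        | U M B E R (rest of name)
        | &00
A%-5    | Mantissa LSB
A%-4    | Mantissa
A%-3    | Mantissa
A%-2    | Mantissa MSB
A%-1    | Exponent
A%      | Byte following the variable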

String Variables

String variables are stored as the string of characters. Since the current length of the string is stored in memory, an explicit terminator for the string is unnecessary. As with numeric variables, the first word of the complete variable is the address of the next variable starting with the same character. However, since BBC BASIC (Z80) needs information about the length of the string and the address in memory where it starts, the overheads for a string are more than for a numeric. The format of a string variable called NAME$ is shown below.

LSB MSB                  | A M E $      | &00 | Length  | Max     | LSB MSB
Address of next variable | Rest of Name |     | Current | Maximum | String Start
starting with the same   |              |     | Length  | Length  | Address
letter                   |              |     |         |         |

When a string variable is first created in memory, the characters of the string follow immediately after the two bytes containing the start address of the string and the current and maximum lengths are the same. While the current length of the string does not exceed its length when created, the characters of the string will follow the address bytes. When the string variable is set to a string which is longer than its original length, there will be insufficient room in the original position for the characters of the string. When this happens, the string will be placed on the top of the heap and the new start address will be loaded into the two address bytes. The original string space will remain, but it will be unusable. This unusable string space is called 'garbage'. See the Variables section for ways to avoid creating garbage.

Because the maximum (original) length and the current length of the string are each stored in a single byte in memory, the maximum length of a string held in a string variable is 255 characters.
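
Reading a string variable record can be sketched in Python against simulated memory. The addresses and the example contents are hypothetical, as is the function name. The field order (link, rest of name, zero byte, current length, maximum length, start address) follows the layout described above.

```python
def read_string_variable(memory, addr: int, first_char: str):
    """Return (name, current_length, maximum_length, text) for one record."""
    end = memory.index(0, addr + 2)              # &00 ends the stored name
    name = first_char + bytes(memory[addr + 2 : end]).decode("ascii")
    current, maximum = memory[end + 1], memory[end + 2]
    start = memory[end + 3] | (memory[end + 4] << 8)   # LSB first
    text = bytes(memory[start : start + current]).decode("ascii")
    return name, current, maximum, text

memory = bytearray(64)
# NAME$ at &10: link, "AME$", &00, length 3, maximum 5, start address &1B
memory[0x10:0x1B] = bytes([0, 0]) + b"AME$\x00" + bytes([3, 5, 0x1B, 0x00])
memory[0x1B:0x20] = b"BOB  "                     # 5 bytes reserved, 3 in use
print(read_string_variable(memory, 0x10, "N"))   # ('NAME$', 3, 5, 'BOB')
```

In this example the characters follow the address bytes directly, as they do for a string that has never outgrown its original length.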

Fixed Strings

You can place a string starting at a given location in memory using the indirection operator '$'. For example,

$&8000="This is a string"

would place &54 (T) at address &8000, &68 (h) at address &8001, etc. Because the string is placed at a predetermined location in memory it is called a 'fixed' string. Fixed strings are not included in the variable chains and they do not have the overheads associated with a string variable. However, since the length of the string is not stored, an explicit terminator (&0D) is used. Consequently, in the above example, byte &8010 would be set to &0D.
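
The same layout can be reproduced in a Python sketch that treats a bytearray as the 64K address space. The helper names are hypothetical. The only behaviour modelled is the one described above: the characters are written from the given address and followed by a &0D terminator.

```python
def put_fixed_string(memory: bytearray, addr: int, text: str) -> None:
    """Write text at addr, followed by the &0D terminator."""
    data = text.encode("ascii") + b"\x0D"
    memory[addr : addr + len(data)] = data

def get_fixed_string(memory, addr: int) -> str:
    """Read characters from addr up to (not including) the &0D terminator."""
    end = memory.index(0x0D, addr)
    return bytes(memory[addr:end]).decode("ascii")

memory = bytearray(0x10000)
put_fixed_string(memory, 0x8000, "This is a string")
print(hex(memory[0x8000]), hex(memory[0x8001]), hex(memory[0x8010]))  # 0x54 0x68 0xd
print(get_fixed_string(memory, 0x8000))          # This is a string
```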

Array Variables

Information about the array is stored along with its name, including the number of dimensions and number of elements in each dimension.

LSB MSB                  | A M E (      | &00 | Dimensions | LSB MSB   | LSB MSB   | ...
Address of next variable | Rest of Name |     | Number of  | Number of | Number of | Array Data
starting with the same   |              |     | dimensions | elements  | elements  |
letter                   |              |     |            |           |           |

The 'Number of elements' field is repeated as necessary.

The number of elements stored for each dimension is the total number of elements, not the upper bound; that is, an array dimensioned as array%(10) would have 11 elements.

For example, the array

DIM array%(4,9,2)

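would be stored as follows (header fields only, derived from the layout above):

LSB MSB                  | r r a y % (  | &00 | &03        | &05 &00  | &0A &00  | &03 &00  | ...
Address of next variable | Rest of Name |     | Number of  | 5        | 10       | 3        | Array Data
starting with the same   |              |     | dimensions | elements | elements | elements |
letter                   |              |     |            |          |          |          |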

Following this information is the array data itself. Integers and real numbers are stored directly as four or five byte values respectively. Strings are referenced via a four byte block detailing the current length, maximum length and address of the first character in the same way that regular string variables are referenced.

Array elements are stored sequentially, with the last subscript varying fastest. For example, with the array%(4,9,2) example above the order of elements is (0,0,0), (0,0,1), (0,0,2), (0,1,0), (0,1,1), (0,1,2), (0,2,0) etc.
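
The position of an element within the array data can be worked out as in the following Python sketch. It assumes 4-byte integer elements, as in the array%(4,9,2) example, and measures the offset from the start of the array data rather than from the start of the variable. The function name is hypothetical.

```python
def element_offset(indices, upper_bounds, element_size=4):
    """Byte offset of an element from the start of the array data."""
    offset = 0
    for index, bound in zip(indices, upper_bounds):
        count = bound + 1                    # DIM a(10) gives 11 elements
        if not 0 <= index < count:
            raise IndexError("subscript out of range")
        offset = offset * count + index      # last subscript varies fastest
    return offset * element_size

bounds = (4, 9, 2)                           # DIM array%(4,9,2)
print(element_offset((0, 0, 1), bounds))     # 4
print(element_offset((0, 1, 0), bounds))     # 12
print(element_offset((1, 0, 0), bounds))     # 120
print(element_offset((4, 9, 2), bounds))     # 596
```

The last element sits at offset 596, so the data for this array occupies 5 * 10 * 3 * 4 = 600 bytes.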