Adding 11KB of RAM to a CP/M 3 system with a single NAND gate chip
Wednesday, 30th August 2023
It's been quite a while since I posted about my Z80 Computer project. This is a home-made Z80 computer I built back in 2010 that features a 10MHz Z80 CPU with 64KB RAM that runs CP/M 3. It can drive an internal LCD, TV or VGA monitor at 320x240 (monochrome only) and unfortunately is a project I was never too happy with due to several compromises I had to make in its design – though at the time I was happy enough I got it to work at all! The video output was limited by both my choice to use an internal graphical LCD and the limitations of the dsPIC33F I chose to use to drive it and the software was all a bit half-baked. I could run the generic CP/M version of BBC BASIC on it, but this lacks graphics and sound support, for example.
More recently my work on adapting BBC BASIC to the Sega Master System had reignited my interest in 8-bit programming, though that too was imperfect due to the limitations of the Master System's VDP. I was further encouraged by coming third in the "Retro not Vintage" competition on /r/retrobattlestations, though I'm not sure I was quite worthy of a podium finish.
With this in mind I started work on improving the computer. I replaced the existing dsPIC33F VDP with a new one based around a dsPIC33E. This newer microcontroller has 32KB of RAM and can run at up to 70 MIPS, a big upgrade from the previous 16KB RAM and 40 MIPS. This provides me with enough video RAM to store the largest BBC Micro screen mode frame buffer (20KB) as well as the necessary CPU grunt to look up pixel data from colour palettes and output it to the screen. I've implemented all eight of the standard BBC Micro screen modes, from the high-res 640x256 (in two colours) MODE 0 to the low-res 160x256 (in sixteen colours) MODE 2 along with the Teletext-compatible MODE 7. This is all controlled via a BBC Micro-compatible VDU driver and the results all seem quite faithful with no real compromises.
There was even enough CPU power left over on the microcontroller to implement BBC Micro-compatible SOUND and ENVELOPE, and with the source code for the CP/M version of BBC BASIC having been released since I last worked on the project it made it much easier to add all of the graphics and sound routines into the version of BBC BASIC specific to my computer.
To get an idea of what the computer is like to use, I recorded a little demo video here. However, this is not really what I wanted to write about in this post – I wanted to cover an easy way to free up some RAM by implementing banked CP/M 3.
Non-banked versus banked CP/M
I chose CP/M 3 as the OS for my computer instead of CP/M 2 as I'm using an SD card for storage and CP/M 3 has native support for disk sector sizes that do not directly match the file record size and it will handle the blocking/unblocking for you (CP/M's file records are 128 bytes long, SD card sectors are 512 bytes long). One other nice feature of CP/M 3 is the existence of a "banked" version which allows it to run on systems with more than 64KB of RAM. As far as user programs are concerned they still run in a flat 64KB memory space, however the OS can move certain parts of itself as well as disk and directory buffers into a separate memory bank where they are only accessed when needed, freeing up space in the "transient program area" (TPA). As well as more memory for user programs the banked version provides a much improved line editor when typing at the command-line, password protection of files and more descriptive error messages.
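To make the blocking arithmetic concrete, here is a small Python sketch of how a 128-byte record number maps onto a 512-byte sector. It is only an illustration of the sizes mentioned above, not how CP/M 3 implements its buffering internally, and the function name is made up for the example.

```python
RECORD_SIZE = 128                                 # CP/M file record size in bytes
SECTOR_SIZE = 512                                 # SD card sector size in bytes
RECORDS_PER_SECTOR = SECTOR_SIZE // RECORD_SIZE   # four records fit in one sector


def locate_record(record_number):
    """Return (sector index, byte offset inside that sector) for a logical record."""
    sector, slot = divmod(record_number, RECORDS_PER_SECTOR)
    return sector, slot * RECORD_SIZE


print(locate_record(5))  # (1, 128)
```

Reading one record therefore means fetching a whole sector and picking out a quarter of it, which is the "unblocking" that CP/M 3 does on the BIOS's behalf.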
Naturally, when I read about this I thought it would be an obvious choice for my computer. As it is, I'm using a 128KB RAM chip but have tied A16 low as I didn't have any kind of MMU or bank-switching hardware setup (32KB and 128KB RAM chips are available in abundance, 64KB ones less so, and using a 128KB chip with the address line tied low involved a lot less soldering than two separate 32KB RAM chips). I did have an emulator where I could try to prototype the hardware changes to support a banked CP/M 3, however I was not able to get a banked version of the OS built and working so gave up – after all, I had a 49KB TPA, which seemed like it would be good enough.
With the other improvements to the computer recently I thought it worth reinvestigating. I did a bit of hunting to see if I could find any recommendations for a simple setup but most of what I could find ended up being a lot more complicated than what I was really looking for. After a bit more experimentation I was able to end up with a banked version of CP/M running on my computer and all I needed was a single NAND gate chip.
Memory requirements for banked CP/M
The memory layout of banked CP/M is actually quite a bit simpler than a lot of the threads I could find online seemed to make out. All you really need is a shared common area at the top of memory that will always be accessible regardless of the current state of the selected bank, and memory below that which can be switched between multiple banks. When booting the computer bank 0 will be selected, so both the common (resident) and banked parts can be copied to memory, and then bank 1 will be swapped in to provide the large TPA.

In my case, as I'm using a 128KB RAM chip, I will use A16 as the bank selection bit. When low this will provide access to the lower 64KB RAM on the chip, when high it will provide access to the upper 64KB RAM. To implement the common area at the top of memory, you then just need to check to see if the address is above the boundary between banked and common memory and if so to force A16 either high or low (it doesn't matter which, as long as it's consistent) so that when the address is in the common area the same bank will be accessed, regardless of the state of the bank selection bit.
Bank switching with simple logic
A simple way to implement a common area in upper memory is with AND (to detect the high address) and OR (to force the A16 high if it's a high address) logic, like this:

Here we use a 4-input AND gate to detect any memory address in the top 4KB of the chip (address lines A12 to A15 will go high at %1111000000000000 which gives a common region of $F000 to $FFFF). If that's the case, then the output of the 4-input AND gate will be high, which when ORed with the bank selection bit will force A16 high whenever we're in the common memory area. If we're below the common memory area then the value of the bank selection bit will pass through directly to A16, allowing us to bank switch the lower area of memory. Or, to summarise in a truth table:
A12 (in) | A13 (in) | A14 (in) | A15 (in) | BANK (in) | A16 (out) |
---|---|---|---|---|---|
1 | 1 | 1 | 1 | x | 1 |
0 | x | x | x | 0 | 0 |
0 | x | x | x | 1 | 1 |
x | 0 | x | x | 0 | 0 |
x | 0 | x | x | 1 | 1 |
x | x | 0 | x | 0 | 0 |
x | x | 0 | x | 1 | 1 |
x | x | x | 0 | 0 | 0 |
x | x | x | 0 | 1 | 1 |
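The same logic can be modelled in a few lines of Python, which is a handy way to check which physical RAM address a given Z80 address ends up at. This is just a sketch of the AND/OR arrangement described above; the function names are invented for the example.

```python
def a16_and_or(address, bank):
    """A16 as driven by the 4-input AND plus OR: high in the common area, else BANK."""
    in_common = (address & 0xF000) == 0xF000   # A12 to A15 all high
    return 1 if (in_common or bank) else 0


def physical_address(address, bank):
    """17-bit address presented to the 128KB RAM chip."""
    return (a16_and_or(address, bank) << 16) | (address & 0xFFFF)


for bank in (0, 1):
    print(bank, hex(physical_address(0x1234, bank)), hex(physical_address(0xF000, bank)))
```

Addresses below $F000 land in different halves of the chip depending on the bank bit, while $F000 and above always land in the upper half.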
However, it would be easier if we could implement this on a single chip. A 4x 2-input NAND gate chip (such as the SN74ALS00AN) should do the job when wired up as follows:

The truth table is a little different this time around:
A13 (in) | A14 (in) | A15 (in) | BANK (in) | A16 (out) |
---|---|---|---|---|
1 | 1 | 1 | x | 1 |
0 | x | x | 0 | 1 |
0 | x | x | 1 | 0 |
x | 0 | x | 0 | 1 |
x | 0 | x | 1 | 0 |
x | x | 0 | 0 | 1 |
x | x | 0 | 1 | 0 |
When accessing the banked region of memory A16 is the inverse of the bank selection bit. This doesn't matter, though: as long as there's a consistent mapping between logical addresses and the physical RAM addresses it will work even if it's "backwards". There's also one fewer address line, which means that the common area now runs from %1110000000000000 = $E000 to $FFFF, providing a common area of 8KB. In practice I didn't find this made a difference to the amount of memory available in the TPA; whether the common area was 4KB, 8KB or 16KB I was able to bring the TPA up to 60KB (from 49KB in the non-banked system), though a larger common area does eat into the amount of memory available in bank 0 for disk and directory buffers. As I'm loading from an SD card (which is much faster than the floppy discs of yore) the reduced buffer space is less of a concern to me.
Fortunately there was enough space inside the computer (and a single remaining pin on the I/O controller to act as the bank selection bit) to add the NAND chip and drive A16. At last I have access to 120KB of my 128KB RAM chip... but what about the software?
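One arrangement of four 2-input NAND gates that reproduces this truth table can be checked with a short Python model. The gate assignment below is an assumption made for illustration (one gate is used as an inverter to build a 3-input AND), so it may not match the wiring diagram pin for pin, but it does produce the same outputs.

```python
from itertools import product


def nand(a, b):
    return 0 if (a and b) else 1


def a16_nand(a13, a14, a15, bank):
    g1 = nand(a14, a15)    # low only when A14 and A15 are both high
    g2 = nand(g1, g1)      # NAND used as an inverter: A14 AND A15
    g3 = nand(a13, g2)     # low only in the common area ($E000 to $FFFF)
    return nand(g3, bank)  # common area gives 1, otherwise NOT BANK


for a13, a14, a15, bank in product((0, 1), repeat=4):
    expected = 1 if (a13 and a14 and a15) else 1 - bank
    assert a16_nand(a13, a14, a15, bank) == expected
print("all 16 input combinations match the truth table")
```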
Building a banked version of CP/M
I will start with the assumption that you have been able to build a non-banked version of CP/M 3 and got that running on your computer, as there is a lot less that can go wrong when doing so. Once you've got that working there's not too much to add to your BIOS to make it support banking, however I did run into a few issues with missing files and some misinterpretation of how things should work until I was able to get it working.
I used the "Developers Build Directory for CP/M 3" from The Unofficial CP/M Web site as my source for CP/M 3. This contains the GENCPM tool that will be used to generate the CPM3.SYS that will need to be loaded into memory by your boot loader. In my case I get my I/O controller to copy CP/M from the SD card into memory at boot – if you've already got the non-banked version of CP/M 3 booting then you'll be familiar with this, but do pay attention to table D-1 in the CP/M 3 system guide which points out the two parts of CP/M to load – the "resident" and "banked" portions. Both parts need to be loaded on a banked system, and both need to be loaded while bank 0 is selected.
To get that far you will need to have relocatable copies of your banked BIOS (BNKBIOS3.SPR) and the BDOS (RESBDOS3.SPR and BNKBDOS3.SPR) ready to be used by GENCPM. I couldn't find a ready-made copy of these BDOS modules, but you can build them using RMAC and LINK as shown below:

    RMAC RESBDOS
    LINK RESBDOS3=RESBDOS[OS]

    PIP BNKBDOS3.ASM=CPMBDOS2.ASM,CONBDOS.ASM,BDOS30.ASM
    RMAC BNKBDOS3
    LINK BNKBDOS3=BNKBDOS3[OS]

The banked BDOS source code is split between three different source files which need to be combined with PIP first, then can be built. For the sake of completeness, if you wanted to build the non-banked BDOS3.SPR you'd use a very similar set of commands, just with CPMBDOS1.ASM instead of CPMBDOS2.ASM:

    PIP BDOS3.ASM=CPMBDOS1.ASM,CONBDOS.ASM,BDOS30.ASM
    RMAC BDOS3
    LINK BDOS3=BDOS3[OS]

The other important ingredient is your banked BIOS, BNKBIOS3.SPR. I don't get on with 8080 syntax so I assemble my BIOS3.MAC with Microsoft's M80 in Z80 mode (instead of RMAC).

    RMAC SCB
    RMAC BIOSKRNL
    M80 =BIOS3
    LINK BNKBIOS3[B]=BIOSKRNL,BIOS3,SCB

If you had previously edited BIOSKRNL.ASM to state banked equ false remember to change it to banked equ true as well!
The only additions you should need in your BIOS are implementations of ?xmove and ?bank. ?bank is an easy one, and just switches to the memory bank requested in the A register. In my case I handle that just by outputting A to the I/O port that handles bank switching:

    ; Select Memory Bank
    ; Entry Parameters: A=Memory Bank
    ; Returned Values:  None
    ; You must preserve or restore all registers other than the
    ; accumulator, A, upon exit.
    ?bank:
        if banked
        out (bank$select),a ; change this for what your hardware requires
        endif
        ret

(To retain compatibility with my old non-banked BIOS I wrap the changes in an if banked condition – banked equ true appears earlier in the file).
?xmove is a little more complicated – this states that the subsequent ?move operation (which copies BC bytes from DE to HL) should transfer data from one memory bank to another. Note that this only affects the next ?move operation; if ?move is called again afterwards without ?xmove then it should perform a copy within the same bank as before.
Fortunately the inter-bank copy is limited to 128 bytes so you can simply implement this by temporarily copying the data from one bank into a 128 byte buffer in common memory, then copying the data back to the destination bank. It's not exactly efficient, but it keeps the hardware simple.

    ; Memory-to-Memory Block Move
    ; Entry Parameters: HL=Destination address
    ;                   DE=Source address
    ;                   BC=Count
    ; Returned Values:  HL and DE must point to
    ;                   next bytes following move operation
    ?move:
        ex de,hl
        ldir
        ex de,hl
        ret

    ; Set Banks for Following MOVE
    ; Entry Parameters: B=destination bank
    ;                   C=source bank
    ; Returned Values:  None
    ?xmove:
        if banked

        ; Store the source/destination bank numbers
        ld (mov$src$b),bc

        ; Make sure that the next call to move (via ?mov vector) uses the banked move routine.
        ld bc,banked$move
        ld (?mov+1),bc
        ret

    banked$move:
        ; Select source bank
        ld a,(mov$src$b)
        call ?bank

        ; Swap registers from CP/M to Z80 conventions
        ex de,hl

        ; Preserve destination and length
        push de
        push bc

        ; Copy from source to buffer
        ld de,mov$buf
        ldir

        ; Recover length and destination, preserve source
        pop bc
        pop de
        push hl

        ; Select destination bank
        ld a,(mov$dst$b)
        call ?bank

        ; Copy from buffer to destination
        ld hl,mov$buf
        ldir

        ; Recover source
        pop hl

        ; Swap registers from Z80 to CP/M conventions
        ex de,hl

        ; Make sure that the next call to move (via ?mov vector) uses the regular move routine.
        ld bc,?move
        ld (?mov+1),bc
        ret

    mov$src$b: db 0
    mov$dst$b: db 0
    mov$buf:   ds 128

        else ; Unbanked
        ret
        endif

This implementation works by changing the ?mov vector in the BIOSKRNL to point at our banked$move routine after a request to ?xmove. Once we've carried out the banked move, the original ?move routine is restored to the ?mov vector.
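If the register shuffling makes the flow hard to follow, the same idea can be sketched as a toy Python model: a one-shot request set by xmove, and a bounce buffer standing in for mov$buf. The class and method names here are invented for the illustration, and the $E000 common boundary is assumed from the NAND circuit described earlier.

```python
class BankedMemory:
    """Toy model: switchable banks below COMMON_BASE, shared memory above it."""
    COMMON_BASE = 0xE000  # assumed boundary between banked and common memory

    def __init__(self, banks=2):
        self.banked = [bytearray(self.COMMON_BASE) for _ in range(banks)]
        self.common = bytearray(0x10000 - self.COMMON_BASE)
        self.bank = 0
        self.pending = None  # (source bank, destination bank) armed by xmove

    def _cell(self, address):
        if address >= self.COMMON_BASE:
            return self.common, address - self.COMMON_BASE
        return self.banked[self.bank], address

    def peek(self, address):
        store, offset = self._cell(address)
        return store[offset]

    def poke(self, address, value):
        store, offset = self._cell(address)
        store[offset] = value

    def xmove(self, dest_bank, source_bank):
        """Arm the next move() so that it copies between banks."""
        self.pending = (source_bank, dest_bank)

    def move(self, source, dest, count):
        if self.pending is None:
            for i in range(count):           # plain copy within the current bank
                self.poke(dest + i, self.peek(source + i))
        else:
            if count > 128:
                raise ValueError("inter-bank moves are limited to 128 bytes")
            source_bank, dest_bank = self.pending
            self.pending = None              # only the next move is affected
            self.bank = source_bank
            bounce = [self.peek(source + i) for i in range(count)]  # role of mov$buf
            self.bank = dest_bank
            for i, value in enumerate(bounce):
                self.poke(dest + i, value)
        return source + count, dest + count


memory = BankedMemory()
for i, value in enumerate(b"CP/M"):
    memory.poke(0x1000 + i, value)           # written into bank 0
memory.xmove(dest_bank=1, source_bank=0)
memory.move(0x1000, 0x2000, 4)
print(bytes(memory.banked[1][0x2000:0x2004]))  # b'CP/M'
```

As in the assembly version, the destination bank is left selected after an inter-bank move and the request is cleared, so a following move without xmove is an ordinary copy.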
Once you have assembled and linked your BNKBIOS3.SPR, RESBDOS3.SPR and BNKBDOS3.SPR you can use GENCPM to create your new CPM3.SYS. You'll need to answer some questions differently to support the banked system:
- Bank switched memory? Y.
- Common memory base page? E0 (if using the NAND gate circuit above – our common area starts at $E000).
- Number of memory segments? 1 – we have three in total (bank 0, bank 1 and common) however bank 1 and common are not included in the segment table so should be ignored here.
- Memory segment table base, size, bank: 01, 90, 00 (we want to keep CP/M out of the "zero page" so start the segment from $0100, CP/M 3 starts at $9100 so we have $9100-$0100=$9000 as our size, the bank number is 0).
Before being prompted for the memory segment table GENCPM will display where CP/M 3 itself is using memory so you can use that to figure out how much free space you have on your bank zero for your segment definition. However, if you enter a value that is too large GENCPM will automatically reduce the size for you.
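The values GENCPM asks for are in units of 256-byte pages, so the numbers above are just the high bytes of the addresses. A quick Python sketch of the arithmetic, using the addresses from my system as the example (the helper name is made up):

```python
def segment_entry(start, end, bank):
    """Return (base page, size in pages, bank) for a memory segment table entry."""
    if start % 256 or end % 256:
        raise ValueError("segments are specified in whole 256-byte pages")
    return start >> 8, (end - start) >> 8, bank


base, size, bank = segment_entry(0x0100, 0x9100, 0)
print(f"{base:02X}, {size:02X}, {bank:02X}")        # 01, 90, 00
print(f"common base page: {0xE000 >> 8:02X}")       # common base page: E0
```

If GENCPM reports a different start address for CP/M 3 on your build, substitute that for $9100.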
After this you will be prompted to create disk and directory buffers for each of your disk definitions – pay attention to available space to get an idea of how many buffers you can create, but if in doubt just allocate a single buffer for each disk/directory as prompted as that will at least get you booted, then you can experiment with larger buffers later.
I did intentionally start my segment from $0100 instead of $0000 and this is to avoid problems with interrupts and to keep the zero page free. My computer design uses interrupts to signal to the Z80 that keys are available (for example) instead of requiring it to constantly poll the I/O controller. However, I did find that if I interrupted the CPU (e.g. by pressing a key) when it had switched over to bank 0 it would hang the computer as the ISR vector had been switched out from underneath it. My ISR is in common memory and I just make sure that when the computer boots it installs its interrupt vectors in every memory bank so that it doesn't matter which is currently swapped in, it'll always find its way to the common ISR.
After making these changes I was greeted with a 60KB TPA instead of the previous 49KB TPA – 11KB of extra memory is well worth it, and the improved line editor in banked CP/M 3 is another nice bonus. I did think that implementing this was going to be a nightmare, but in the end I only needed one extra NAND gate chip and a few easy changes to the software.
Addendum (31st August 2023): One other change you will need to implement is to support disk operations reading from or writing to specific memory banks. I forgot to mention this earlier as it's handled by the setbnk routine inside BIOSKRNL, and that routine stores the selected DMA bank number in the @dbnk variable. When your BIOS performs a disk read or write operation it will need to preserve the current bank number, switch to the bank number in @dbnk, carry out the read or write operation, then restore the previous bank number.
In my case, disk I/O is handled by the AVR I/O controller where operations are set up by sending over the DMA address, sector and track numbers, drive index and then performing a read from either the "read" or "write" ports to initiate the I/O operation and retrieve the status. The only change required was to make sure that the bank number is also sent over before initiating the I/O request so the AVR knows which bank it should be accessing:

    fd$copy$ptrs:
        ld hl,(@dma)
        ld a,l
        out (disk$dma$l),a
        ld a,h
        out (disk$dma$h),a

        ld hl,(@sect)
        ld a,l
        out (disk$sector$l),a
        ld a,h
        out (disk$sector$h),a

        ld hl,(@trk)
        ld a,l
        out (disk$track$l),a
        ld a,h
        out (disk$track$h),a

        ld a,(@adrv)
        out (disk$drive),a

        if banked
        ld a,(@dbnk)
        out (disk$dma$bank),a
        endif

        ret

    fd$write:
        call fd$copy$ptrs
        in a,(disk$write)
        ret

    fd$read:
        call fd$copy$ptrs
        in a,(disk$read)
        ret

I'm pretty sure I didn't forget anything else!
Reverse engineering Z-Tape for the Cambridge Z88
Saturday, 10th June 2023
When reading about the Cambridge Z88 computer and its available software I bumped into the occasional mention of Z-Tape by Wordmongers, a system that allowed you to back up files from your Z88 to a cassette recorder. I had wondered how this worked, assuming there was some sort of external hardware to connect the cassette recorder to the Z88 (likely via its serial port). I'd done some work on tape loading and saving myself for the Sega Master System and had come up with a somewhat hacky but minimal solution that relies on abuse of a hex inverter. Surely a commercially-released product would have a better way of doing things, or at least so I thought!
More recently I noticed someone had uploaded a copy of the application and some accompanying documentation to the Cambridge Z88 page on SourceForge, so I downloaded it to take a look and was very surprised at what I found:

That can't work, surely? The output for recording seems sensible enough, using the 1Ω resistor to ground on the output to reduce the level down to something that could be fed into a sensitive microphone input, but just running the earphone output directly into the RS-232 port's CTS line doesn't seem like it would do the job. There's only one way to find out, though, and that's to build a cable and try it out and to my surprise it does indeed work!

I have had a few issues with this, however. The program appears to require that the phase of the data played back into the Z88 matches the phase with which it was recorded. Both of my cassette recorders reverse the phase when playing back the recordings. Fortunately one of them does have a phase reversal switch, and two wrongs in this case do make a right: setting the phase switch to "reverse" allows Z-Tape to load back the recorded data.
The overall loader is not particularly reliable, though. It relies on a very strong output from the cassette recorder to successfully register a signal on the Z88's serial port, and I find I have to rewind to try again quite often. That it works at all with such a simple cable is certainly impressive, though.
In my testing I wrote a little BASIC program that crudely checks the signal level on the RS-232 input. You can use this to test the strength of your cassette recorder's output: it will display a rolling progress bar with the approximate signal strength. With my cassette recorder I can get over 80% when playing back a block, but I can't register anywhere near that when connecting the Z88 to my PC's audio output or my phone's headphone socket and consequently can't load back recordings from those devices.

    10 *NAME Tape Level Test
    20 REPEAT
    30 S%=0
    40 FORI%=0TO99:S%=S%+(GET(&E5)AND1):NEXT
    50 S%=50-ABS(50-S%)
    60 PRINT'S%*2;"% ";CHR$1;"R";CHR$1;"3N";CHR$(32+S%);" ";CHR$1;"G";CHR$1;"3N";CHR$(32+(50-S%));" ";CHR$1;"3-RG ";
    70 UNTIL INKEY(0)<>-1
    80 PRINT

Once I'd experimented with Z-Tape and a cassette recorder I thought it would be interesting to see how it worked and whether I could reverse-engineer the format used. I connected the Z88 to my PC, made some recordings, and then set to work.
Bit-level format
The first thing to do is to establish the base frequency. After taking a recording from the Z88, I checked it in Audacity's frequency analyser and found a strong peak at 1590Hz:

There is also a strong peak at 3195Hz, which is very close to twice the other peak's frequency (halving it gives us 1597.5Hz, close to 1590Hz). Based on these measurements it would seem that the base frequency is around 1600Hz, and likely that the tape format is a combination of 1600Hz and 3200Hz tones. Zooming into the recorded waveform shows the two different tones:

A common way to record data on tape is to use one full cycle of the base frequency to represent a "0" bit and two full cycles of twice the base frequency to represent a "1" bit. This means that the data is the same length regardless of how many "0"s or "1"s appear in the data, and looking at the length of data blocks in the recording they were all the same length, so it seems this is a possible candidate.
The phase of the signal is also important. If we represent the signal as a sine wave, a phase of 0° would start at zero, increase in the positive direction for the first quarter of the wave, head down in the negative direction for the next half of the wave, before returning to zero in a positive direction in the last quarter of the wave. Conversely a phase of 180° would start from zero but go negative in the first half of the wave before going positive in the second half of the wave. The phase can be determined by looking at the start of the signal after a period of silence:

As the signal goes positive first after a period of silence we can confirm the signal has a phase of 0°. In summary, the bit-level format required by Z-Tape is as follows:
- Base frequency of 1600Hz.
- Phase of 0°.
- "0" bits encoded as one full cycle at base frequency (1600Hz).
- "1" bits encoded as two full cycles at twice the base frequency (3200Hz).
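As a rough illustration of that encoding, the following Python sketch generates sine-wave samples for a sequence of bits. The 48kHz sample rate is an assumption chosen because it divides evenly by 1600Hz, and the function names are invented for the example; it is not taken from Z-Tape itself.

```python
import math


def bit_samples(bit, sample_rate=48000, base_hz=1600):
    """One bit period: one sine cycle at base_hz for "0", two cycles at 2*base_hz for "1"."""
    n = sample_rate // base_hz             # samples per bit period (30 at 48kHz)
    cycles = 2 if bit else 1
    # Phase of 0 degrees: start at zero and head positive first
    return [math.sin(2 * math.pi * cycles * i / n) for i in range(n)]


def encode_bits(bits, sample_rate=48000, base_hz=1600):
    """Concatenate the waveform for each bit in turn."""
    samples = []
    for bit in bits:
        samples.extend(bit_samples(bit, sample_rate, base_hz))
    return samples


wave = encode_bits([0, 1, 1, 0])
print(len(wave))  # 120
```

Both kinds of bit occupy the same number of samples, which is why every block on the tape is the same length regardless of its contents.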
Block-level format
Now that we have a stream of bits, we can group them into blocks of data on the tape. Each block starts with a leader or pilot tone, which is effectively a long stream of "1" bits (3200Hz). This lasts 1.25 seconds, after which there is a very brief silence (around two full waves in length) followed by the stream of bits that make up the actual block data.

I created some files on the Z88 that followed certain obvious patterns, for example a file that alternated $00 bytes and $FF bytes so you'd expect to see eight consecutive "0" bits in the recording followed by eight consecutive "1" bits. This would help check to see if there were any start, stop or parity bits in the data (or if it was just eight plain bits of data). I also had a file that contained all of the numbers from $00 to $FF consecutively, so you'd be able to see a clear pattern of byte values counting up and use this to check whether the data was sent least-significant or most-significant bit first.
Using these files I quickly found that the data in each block always starts with two zero bits (immediately after the leader or pilot tone) and is then sent in plain 8-bit bytes (no start, stop or parity bits) with the least significant bit sent first. Each block always contains 1031 bytes of raw data, no matter the size of the file being transmitted. I knew that there was a checksum as Z-Tape would occasionally grumble when loading about a checksum error and I could see that after transferring small files there'd be data at the start of the block, a gap filled with zeroes, followed by a final non-zero data byte. I assumed this was the checksum, and found that by adding up all 1031 bytes in the block the result always came to zero (keeping only the low eight bits of the sum). The checksum can therefore be calculated by setting an 8-bit counter to zero, subtracting the value of every byte in the 1030 data bytes of the block, and then appending the counter value as the 1031st byte of the block.
In summary, the block-level format is as follows:
- 1.25 seconds of 3200Hz leader or pilot tone (stream of "1" bits).
- Silence for the duration of two full cycles.
- Two "0" bits, sent as two full cycles of the 1600Hz tone.
- 1030 data bytes, each sent as eight plain bits, least significant bit first.
  - "0" bits sent as one full cycle of 1600Hz tone.
  - "1" bits sent as two full cycles of 3200Hz tone.
- Checksum data byte, sent in same manner as other data bytes, but calculated such that adding up all 1031 data bytes in the block results in 0.
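The checksum and bit ordering can be sketched in Python as follows. This only covers turning 1030 bytes of block data into the sequence of bits that follows the pilot tone and gap; the function names are made up for the example.

```python
def checksum(data):
    """Byte that makes the sum of every byte in the block equal 0 modulo 256."""
    return (-sum(data)) & 0xFF


def block_bits(data):
    """Bits for one block: two leading "0" bits, then each byte LSB first, checksum last."""
    if len(data) != 1030:
        raise ValueError("a block always carries exactly 1030 data bytes")
    payload = bytes(data) + bytes([checksum(data)])
    bits = [0, 0]
    for byte in payload:
        bits.extend((byte >> i) & 1 for i in range(8))
    return bits


bits = block_bits(bytes([0x01]) + bytes(1029))
print(len(bits))    # 8250
print(bits[2:10])   # [1, 0, 0, 0, 0, 0, 0, 0]
```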
There is approximately half a second of silence between data blocks, though the actual amount of time depends on how much work the Z-Tape application has to do to prepare each block. When building the catalogue before sending a large number of files I've seen gaps over 24 seconds long!
Block contents
Each block always contains 1030 bytes of data plus a checksum byte, and for the sake of simplicity I'll ignore the checksum in the discussion below.
The first byte of each block's data determines what sort of block it is. I've identified six different block types.
The next two bytes are the size of the data included in the block, least significant byte first, though sometimes this value is incorrect or missing depending on the particular type of block.
After that are two bytes that record the block number, least significant byte first. The first block has a block number of 0 and this counts up one for every block on the tape.
After this you'll find the actual data associated with the block, normally up to 1024 bytes, with the rest of the block padded with zeroes.
Blocks $04 and $05: Catalogue blocks
When storing a selection of files on tape Z-Tape writes a catalogue first containing a list of the files. The final block in the catalogue is sent with a block type of $05; if more than one block is required to represent the catalogue then the preceding partial catalogue blocks use a type of $04.
Catalogue blocks always have a reported size field of zero.
Each file entry is stored in a record 28 bytes long. As each block can store up to 1025 bytes of user data this allows for up to 36 files to be described in each catalogue block.
The records always start from offset 5 into the block (one byte block ID, two byte size = 0, two byte block number) and each takes the following format:
- Bytes 0~15: Filename.
- Byte 16: 0 (NUL terminator for filename).
- Bytes 17~21: File size as floating-point number (four byte mantissa, MSB first, followed by exponent).
- Bytes 22~24: Three byte time (centiseconds since start of day, LSB first).
- Bytes 25~27: Three byte Julian date (number of days since Monday 23rd November 4713 BC, LSB first).
Filenames can be up to sixteen characters long (12 filename characters, a dot, three extension characters). They can be mixed case.
The file size being a floating-point number took me a while to figure out! This is the numeric format used by BBC BASIC (Z80) and is also internally used by the Z88 OS for its FPP routines. The format for this number can be found in the BBC BASIC documentation, though for the sake of simplicity if you're creating your own catalogue in Z-Tape format note that it does accept the "special case" real number where the exponent is set to 0 and the mantissa is a regular integer. If you're decoding tapes created by Z-Tape you'll need to decode the floating-point number yourself, though.
The date and time are in the format used internally by the Z88 OS. The only challenge here is the Julian day is outside the range that can be represented by some programming language date and time functions which can complicate matters. Here's a snippet of C# that works if you're trying to convert a catalogue date and catalogue time to a .NET DateTime object:

    var catalogueDate = DateTime.FromOADate(catalogueDateNum - 2415019);
    catalogueDate = catalogueDate.AddMilliseconds(catalogueTimeNum * 10);

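Putting the record layout together, here is a hedged Python sketch of decoding a single 28-byte catalogue record. It only handles the "special case" file size where the exponent byte is 0 and the mantissa is a plain integer; genuine floating-point sizes are left out rather than guessed at. The function name is invented, and the constant 1721425 is the difference between a Julian day number and Python's own day ordinal.

```python
from datetime import date, datetime, time, timedelta

JDN_TO_ORDINAL = 1721425  # Julian day number minus Python's date ordinal


def parse_catalogue_record(record):
    """Return (filename, size, timestamp) from one 28-byte catalogue record."""
    if len(record) != 28:
        raise ValueError("catalogue records are 28 bytes long")
    name = record[:16].split(b"\0", 1)[0].decode("ascii")
    mantissa = int.from_bytes(record[17:21], "big")    # MSB first
    exponent = record[21]
    if exponent != 0:
        raise NotImplementedError("floating-point file sizes are not decoded in this sketch")
    centiseconds = int.from_bytes(record[22:25], "little")
    julian_day = int.from_bytes(record[25:28], "little")
    day = date.fromordinal(julian_day - JDN_TO_ORDINAL)
    stamp = datetime.combine(day, time()) + timedelta(milliseconds=centiseconds * 10)
    return name, mantissa, stamp
```

Python's date type cannot represent years before 1 AD, so as with the C# snippet this only works for sensible modern dates.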
Blocks $01 and $06: File start blocks
These blocks appear at the start of a file. Block type $06 is used if the whole file data can fit in a single block, $01 if additional blocks containing the rest of the file will follow.
The block size is used here to determine how many bytes of data are present. This will be the size of the whole file if the block type is $06, $03E0 (992 bytes) if the block type is $01.
Block bytes from 5 to 31 contain a copy of the filename, padded with zeroes. This must be in UPPERCASE, regardless of how the file was listed in the catalogue, otherwise the Z-Tape loader will be unable to recognise the file by name (this one took a while to puzzle out!)
After this comes the file data. If this is a block type $06 that's the end of it, but if it's block $01 more file data will follow...
Block $02 and $03: File data blocks
These blocks contain raw file data from offset 5 (unlike blocks $01 and $06, there is no filename field) and appear in the middle or at the end of files. If the block type is $02 then this block appears in the middle of the file and always contains 1024 bytes of data; the header will report that it contains $03E0 (992 bytes), but that value should be ignored. If the block type is $03 then this is the last block of the file, and the data length in the header should be used.
Block types summary
The following table documents the block types. All multi-byte numeric values are transmitted least significant byte first with the exception of the floating-point numbers representing the file sizes in the catalogue described earlier.
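Offset | Catalogue: partial block | Catalogue: final block | File: fits in single block | File: first block | File: middle block | File: last block |
---|---|---|---|---|---|---|
0 | $04 | $05 | $06 | $01 | $02 | $03 |
1~2 | $0000 | $0000 | Data length | $03E0 | $03E0 | Data length |
3~4 | Block number (from 0) | Block number (from 0) | Block number (from 0) | Block number (from 0) | Block number (from 0) | Block number (from 0) |
5 | Up to 36 28-byte records listing the files about to follow. | Up to 36 28-byte records listing the files about to follow. | The UPPERCASE name of the file, zero-padded to 27 bytes in length. | The UPPERCASE name of the file, zero-padded to 27 bytes in length. | 1024 bytes of file data. | Data length bytes of file data. |
32 | (records continue) | (records continue) | Data length bytes of file data. | Data length bytes of file data. | (file data continues) | (file data continues) |
1030 | Checksum | Checksum | Checksum | Checksum | Checksum | Checksum |

The block number starts from 0 for the first block on the tape, and the checksum is calculated so that adding up all 1031 bytes results in 0.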
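Going the other way, a decoder only needs the block type and the length field to stitch files back together. The following Python sketch assumes the blocks have already been recovered from the audio as 1030-byte sequences with the checksum verified and removed; the function name is invented for the example.

```python
def reassemble_files(blocks):
    """Rebuild {filename: data} from a sequence of 1030-byte Z-Tape blocks."""
    files = {}
    name = None
    for block in blocks:
        kind = block[0]
        length = int.from_bytes(block[1:3], "little")
        if kind in (0x04, 0x05):
            continue                                   # catalogue blocks carry no file data
        if kind in (0x01, 0x06):
            # File start: 27-byte zero-padded name, then data from offset 32
            name = bytes(block[5:32]).split(b"\0", 1)[0].decode("ascii")
            files[name] = bytearray(block[32:32 + length])
        elif kind in (0x02, 0x03):
            if name is None:
                raise ValueError("file data block seen before any file start block")
            # Middle blocks always hold 1024 bytes; their length field is unreliable
            count = 1024 if kind == 0x02 else length
            files[name] += block[5:5 + count]
        else:
            raise ValueError(f"unknown block type {kind:#04x}")
    return {key: bytes(value) for key, value in files.items()}
```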
Creating Z-Tape audio on a PC
This is all well and good, but what's the point of it? The information above may be useful if someone has an old tape that they needed to recover data from but no longer had a Z88, though that seems like a fairly remote possibility. Another possibility could be to create Z-Tape data from files on PC and then play it back to transfer data from the PC to the Z88. Alternatively, a selection of programs could be stored on a CD and loaded onto the Z88 from a portable CD player when out and about.
Maybe not the most useful ideas, but here's a C# function that will take an array of filenames and generate a series of data blocks in the Z-Tape format, including a catalogue:

    // requires: using System; using System.Collections.Generic; using System.IO; using System.Text;
    static byte[][] CreateBlocksFromFiles(string[] files) {

        List<byte[]> blocks = new List<byte[]>();

        // generate the catalogue
        for (int firstFileInBlock = 0; firstFileInBlock < files.Length; firstFileInBlock += 36) {

            // which is the last file in the block (+1) that we will write to the file?
            var lastFileInBlock = Math.Min(files.Length, firstFileInBlock + 36);

            // catalogue block data is 1030 bytes, same as all other blocks
            var catalogue = new byte[1030];

            // if this is the last block in the catalogue, block type is 0x05, otherwise it's 0x04
            catalogue[0] = (byte)((lastFileInBlock == files.Length) ? 0x05 : 0x04);

            // current block number
            catalogue[3] = (byte)(blocks.Count >> 0);
            catalogue[4] = (byte)(blocks.Count >> 8);

            // write each file for this block to the catalogue
            var catalogueOffset = 5;
            for (int fileInBlock = firstFileInBlock; fileInBlock < lastFileInBlock; ++fileInBlock) {

                var file = new FileInfo(files[fileInBlock]);

                // file name (can be mixed case)
                Array.Copy(Encoding.ASCII.GetBytes(file.Name.PadRight(16, '\0')[..16]), 0, catalogue, catalogueOffset, 16);

                // file size (Z-Tape normally uses floating-point values)
                catalogue[catalogueOffset + 17] = (byte)(file.Length >> 24);
                catalogue[catalogueOffset + 18] = (byte)(file.Length >> 16);
                catalogue[catalogueOffset + 19] = (byte)(file.Length >> 8);
                catalogue[catalogueOffset + 20] = (byte)(file.Length >> 0);

                // file date/time
                var writeTime = file.LastWriteTime;

                // time is centiseconds since midnight
                var fileTime = (int)(writeTime.TimeOfDay.TotalMilliseconds / 10);
                catalogue[catalogueOffset + 22] = (byte)(fileTime >> 0);
                catalogue[catalogueOffset + 23] = (byte)(fileTime >> 8);
                catalogue[catalogueOffset + 24] = (byte)(fileTime >> 16);

                // date is Julian day number
                var fileDate = (int)(writeTime.ToOADate() + 2415019);
                catalogue[catalogueOffset + 25] = (byte)(fileDate >> 0);
                catalogue[catalogueOffset + 26] = (byte)(fileDate >> 8);
                catalogue[catalogueOffset + 27] = (byte)(fileDate >> 16);

                catalogueOffset += 28;
            }

            blocks.Add(catalogue);
        }

        // write each file to the tape
        foreach (var filePath in files) {

            var file = new FileInfo(filePath);

            using (var fileData = file.OpenRead()) {
                do {
                    var fileBlock = new byte[1030];

                    // current block number
                    fileBlock[3] = (byte)(blocks.Count >> 0);
                    fileBlock[4] = (byte)(blocks.Count >> 8);

                    // how much data can we store in the block?
                    var maxBlockData = 1024;
                    var blockDataOffset = 5;

                    if (fileData.Position == 0) {

                        // if it's the first block for the file, store the filename (must be UPPERCASE)
                        Array.Copy(Encoding.ASCII.GetBytes(file.Name.ToUpperInvariant().PadRight(16, '\0')[..16]), 0, fileBlock, blockDataOffset, 16);
                        blockDataOffset = 0x20;

                        // can't store as much in the first block due to all the header info we just wrote
                        maxBlockData = 992;

                        // what sort of block is it?
                        if (file.Length > maxBlockData) {
                            fileBlock[0] = 0x01; // first block in a multi-block file
                        } else {
                            fileBlock[0] = 0x06; // single block for the whole file
                        }

                    } else {

                        // what sort of block is it?
                        if (file.Length > fileData.Position + maxBlockData) {
                            fileBlock[0] = 0x02; // continued data block in a multi-block file
                        } else {
                            fileBlock[0] = 0x03; // last data block in a multi-block file
                        }
                    }

                    // how much data can we actually copy?
                    var actualBlockData = Math.Min(maxBlockData, (int)(file.Length - fileData.Position));

                    // read the data
                    if (fileData.Read(fileBlock, blockDataOffset, actualBlockData) != actualBlockData) {
                        throw new InvalidDataException();
                    }

                    // store the data size
                    fileBlock[1] = (byte)(actualBlockData >> 0);
                    fileBlock[2] = (byte)(actualBlockData >> 8);

                    blocks.Add(fileBlock);

                } while (fileData.Position < fileData.Length);
            }
        }

        return blocks.ToArray();
    }

Once the blocks have been generated, we can convert them to a tape format like UEF:
static void WriteUef(string filename, IEnumerable<byte[]> blocks, ushort baudRate = 1600, bool reversePhase = false) {

    using (var uefFile = File.Create(filename))
    using (var uefWriter = new BinaryWriter(uefFile)) {

        // Header
        uefWriter.Write(Encoding.ASCII.GetBytes("UEF File!\0"));
        uefWriter.Write((byte)0x0A); // minor version
        uefWriter.Write((byte)0x00); // major version

        // Chunk &0113 - change of base frequency
        uefWriter.Write((ushort)0x0113);
        uefWriter.Write((uint)4);
        uefWriter.Write((float)baudRate);

        // Chunk &0115 - change of phase
        uefWriter.Write((ushort)0x0115);
        uefWriter.Write((uint)2);
        uefWriter.Write((ushort)(reversePhase ? 180 : 0));

        // Write each block to the UEF
        foreach (var block in blocks) {

            // Calculate the checksum
            byte checksum = 0;
            foreach (var b in block) {
                checksum -= b;
            }

            // Chunk &0110 - carrier tone
            uefWriter.Write((ushort)0x0110);
            uefWriter.Write((uint)2);
            uefWriter.Write((ushort)(baudRate * 5 / 4));

            // Chunk &0112 - integer gap
            uefWriter.Write((ushort)0x0112);
            uefWriter.Write((uint)2);
            uefWriter.Write((ushort)2);

            // Chunk &0102 - explicit tape data block
            uefWriter.Write((ushort)0x0102);
            uefWriter.Write((uint)2);
            uefWriter.Write((byte)14); // bit count = (chunk length * 8) - 14 = 2 bits
            uefWriter.Write((byte)0); // 2 zero bits

            // Chunk &0102 - explicit tape data block
            uefWriter.Write((ushort)0x0102);
            uefWriter.Write((uint)(2 + block.Length));
            uefWriter.Write((byte)8); // bit count = (chunk length) * 8 - 8
            uefWriter.Write(block);
            uefWriter.Write(checksum);

            // Chunk &0112 - integer gap
            uefWriter.Write((ushort)0x0112);
            uefWriter.Write((uint)2);
            uefWriter.Write((ushort)(baudRate / 2));
        }
    }
}
A .wav file is probably an easier format to work with, however!
static void WriteWav(string filename, IEnumerable<byte[]> blocks, int baudRate = 1600, bool reversePhase = false, uint sampleRate = 48000, uint channelCount = 1, ushort bitsPerSample = 16) {

    // generate cycles
    var cycleSampleCount = sampleRate / baudRate;

    var bits = new byte[3][]; // good old ternary logic - true, false, and file_not_found.

    for (int b = 0; b < 3; ++b) {
        bits[b] = new byte[cycleSampleCount * bitsPerSample / 8 * channelCount];
    }

    for (int c = 0; c < cycleSampleCount * channelCount; ++c) {
        double a = ((c / channelCount) * Math.PI * 2.0d) / cycleSampleCount;
        for (int b = 0; b < 3; ++b) {
            double v = b == 2 ? 0 : Math.Sin(a * (1.0d + b));
            if (reversePhase) v = -v;
            switch (bitsPerSample) {
                case 8:
                    bits[b][c] = (byte)Math.Round(Math.Max(byte.MinValue, Math.Min(byte.MaxValue, 127.5d + 127.5d * v)));
                    break;
                case 16:
                    short vs = (short)Math.Round(Math.Max(short.MinValue, Math.Min(short.MaxValue, (short.MaxValue + 0.5d) * v)));
                    bits[b][c * 2 + 0] = (byte)(vs >> 0);
                    bits[b][c * 2 + 1] = (byte)(vs >> 8);
                    break;
            }
        }
    }

    using (var wavFile = File.Create(filename))
    using (var wavWriter = new BinaryWriter(wavFile)) {

        // RIFF header
        wavWriter.Write(Encoding.ASCII.GetBytes("RIFF")); // chunk ID
        var riffDataSizePtr = wavFile.Position;
        wavWriter.Write((uint)0); // file size (we'll write this later)
        wavWriter.Write(Encoding.ASCII.GetBytes("WAVE")); // RIFF type ID

        // chunk 1 (format)
        wavWriter.Write(Encoding.ASCII.GetBytes("fmt ")); // chunk ID
        wavWriter.Write((uint)16); // chunk 1 size
        wavWriter.Write((ushort)1); // format tag
        wavWriter.Write((ushort)channelCount); // channel count
        wavWriter.Write((uint)sampleRate); // sample rate
        wavWriter.Write((uint)(sampleRate * channelCount * bitsPerSample / 8)); // byte rate
        wavWriter.Write((ushort)(channelCount * bitsPerSample / 8)); // block align
        wavWriter.Write((ushort)bitsPerSample); // bits per sample

        // chunk 2 (data)
        wavWriter.Write(Encoding.ASCII.GetBytes("data")); // chunk ID
        var waveDataSizePtr = wavFile.Position;
        wavWriter.Write((uint)0); // wave size (we'll write this later)
        var waveDataStartPtr = wavFile.Position;

        // write half a second of silence
        for (int i = 0; i < baudRate / 2; ++i) {
            wavWriter.Write(bits[2]);
        }

        // Write each block to the WAV
        foreach (var block in blocks) {

            // write 1.25 seconds of carrier tone
            for (int i = 0; i < baudRate * 5 / 4; ++i) {
                wavWriter.Write(bits[1]);
            }

            // write gap
            wavWriter.Write(bits[2]);
            wavWriter.Write(bits[2]);

            // write two 0 bits
            wavWriter.Write(bits[0]);
            wavWriter.Write(bits[0]);

            // calculate the checksum as we go
            byte checksum = 0;

            // write all of the bytes in the block
            for (var i = 0; i < block.Length + 1; ++i) {

                // fetch the byte to write
                byte b;
                if (i < block.Length) {
                    // use data from the block and update the checksum
                    b = block[i];
                    checksum -= b;
                } else {
                    // write the checksum
                    b = checksum;
                }

                // write each bit, LSB first
                for (int bit = 0; bit < 8; ++bit) {
                    wavWriter.Write(bits[b & 1]);
                    b >>= 1;
                }
            }

            // write half a second of silence
            for (int i = 0; i < baudRate / 2; ++i) {
                wavWriter.Write(bits[2]);
            }
        }

        // update wave size
        var waveDataEndPtr = wavFile.Position;
        wavFile.Seek(waveDataSizePtr, SeekOrigin.Begin);
        wavWriter.Write((uint)(waveDataEndPtr - waveDataStartPtr));

        // update RIFF size
        wavFile.Seek(riffDataSizePtr, SeekOrigin.Begin);
        wavWriter.Write((uint)(waveDataEndPtr - 8));
    }
}
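As a rough guide to how long the resulting recording will be, the bit counts that WriteWav emits for each block can simply be added up. This is only a sketch: EstimateWavSeconds is a hypothetical helper that is not part of the original program, and it assumes the sample rate divides evenly by the baud rate (as the 48000Hz and 1600 baud defaults do) so that every bit, gap or silence buffer lasts exactly one bit period.

```csharp
using System;

// Hypothetical helper: estimates the length of the audio produced by WriteWav.
// Every bit, gap or silence buffer lasts one bit period (1 / baudRate seconds).
static double EstimateWavSeconds(int blockCount, int blockLength = 1030, int baudRate = 1600) {
    long leadInSilence = baudRate / 2;       // half a second before the first block
    long carrier = baudRate * 5 / 4;         // 1.25 seconds of carrier tone
    long gap = 2;                            // two silent bit periods
    long startBits = 2;                      // two 0 bits
    long dataBits = (blockLength + 1) * 8L;  // block bytes plus the checksum byte
    long trailingSilence = baudRate / 2;     // half a second after each block
    long perBlock = carrier + gap + startBits + dataBits + trailingSilence;
    return (double)(leadInSilence + blockCount * perBlock) / baudRate;
}

Console.WriteLine(EstimateWavSeconds(1));
```

With the defaults a single 1030-byte block comes to 11,852 bit periods, or about 7.4 seconds, and each further block adds roughly 6.9 seconds.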
But, I hear you say, didn't you earlier mention how a PC's audio output was now powerful enough to drive the Z88's serial port? I did indeed, and that's why I've also put together this little circuit:

This is based on the tape interface circuit I devised for the Sega Master System and uses an SN74LS04N hex inverter chip as an amplifier to drive the Z88's CTS line. It's designed to be powered from the Z88's serial port, which provides 5V at 1mA on the DTR pin. This current limit does seem awfully low and I have seen it reported as 10mA in some places, but I'm not sure if that's a typo or not; the user manual states 1mA. In my testing this circuit consumes between 2mA and 3mA, which is much more than 1mA, but it does still work. However, I would strongly recommend doing your own testing before hooking anything up to your Z88's serial port. The other hex inverter chips I tried all consumed over 20mA in this use, which is far too much for the Z88! There was a noticeable difference in current consumption depending on whether unused inputs were tied high or tied low, so please do your own testing.
The presence of a phase switch does allow this circuit to be used with recorders that reverse the phase when recording but don't provide a phase reversal switch of their own to fix this on playback.
All in all I'm very impressed that the Z-Tape software works as well as it does considering the simplicity of the hardware, and it's been a lot of fun digging into how it works.
Updated TI-83 Plus BootExec with support for TI's "Silver Link" driver
Wednesday, 7th June 2023
The previous release of the TI-83 Plus BootExec program relied on temporarily replacing TI's Silver Link driver with WinUSB if you wanted to use the Silver Link USB cable. I've updated the program so it will try to use TI's driver if it's available, or WinUSB if not. This should help people who can't (or don't want to) temporarily replace TI's driver.
The updated application can be downloaded, as before, from the same link: ti83p-bootexec.zip.
Updated TI-83 Plus BootExec with USB "Silver Link" support
Sunday, 4th June 2023
This is a quick update to the TI-83 Plus BootExec program described in a previous journal entry. The program now supports the USB "Silver Link" cable (as well as the serial "Black Link" it previously supported) though to access the USB device you do need to temporarily replace TI's supplied driver with a generic WinUSB one which can be organised with Zadig.
The updated application can be downloaded from the same link as before: ti83p-bootexec.zip.
Unbricking a TI-83 Plus calculator with a link buffer overflow
Friday, 2nd June 2023
A few years ago I started running into problems with my TI-83 Plus graphical calculator. I was unable to install applications – it would keep locking up when "defragmenting". In the end I attempted to reinstall the operating system to see if that would cure matters, but that failed too and in the process left the calculator in a state where it wouldn't boot at all. Switching it on you'd be presented with a screen prompting you to reinstall the OS:
Waiting... Please install calculator software now.
If you tried to install the OS over the link port it would switch to a progress screen but then get permanently stuck at the 0% mark until you pulled a battery out.
I eventually found a program called Overflow by Brandon Wilson which described similar symptoms and a possible cause – a corrupt certificate page. Considering the problems I'd been having with the flash ROM before attempting the OS reinstallation it seemed possible that my certificate page might have become corrupt and that was preventing me from reinstalling the OS.
The Overflow program describes a technique whereby it can transfer a user-supplied program to the target calculator by sending a very large variable packet and taking advantage of a lack of bounds checking in the calculator's boot code. Unfortunately, I was unable to get it to work on my TI-83 Plus, in spite of many repeated attempts. I eventually bought a replacement calculator, though being a newer model and built to a much cheaper standard I was always a bit disappointed that my original calculator was lingering, bricked, in a drawer.

Photo of the repaired calculator (right) next to its temporary replacement (left) – note the missing ID on the repaired calculator.
More recently I decided to revisit the problem, got a better understanding of just how the Overflow program worked and found a way to get it to work on my original TI-83 Plus. The photo above shows the two working calculators I now have, though as I ended up having to erase the certificate page on the one on the right it now lacks an ID.
How Overflow works
The basic technique exploited here is that the TI-83 Plus boot code does not bounds-check the length of the link packet we're sending it, so by sending a very large packet we can overflow the intended buffer right up to user memory, send over a program we wish to execute, and then overwrite the Z80 stack with the address of our program so that when the link routines return it executes our program rather than returning to the boot code.
Overflow achieves this by filling up the memory as described above, then sending some correcting data so that the checksum for the oversized packet is equal to zero, and then sending a constant stream of zeroes until the transfer fails. The last two bytes of a transfer are the checksum, so once the packet's checksum has been corrected to zero, whichever two zero bytes end up being read as the checksum will match and the packet will be seen as valid.
At this point the transmitting calculator detects the link error and tries to read back the acknowledgement from the receiving calculator, and all should be well.
Unfortunately, the TI-83 Plus seems to be more fussy about how it handles linking errors and once the attempt to send too many zero bytes has failed it just displays an error message and switches off, rather than letting us receive the acknowledgement before executing our payload.
Looking at the documentation for Overflow it seems to have been intended more for the TI-84 Plus series calculators, so it could be that they are more forgiving of the linking errors.
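The checksum correction described above can be sketched in a few lines. This is an illustration rather than Overflow's actual code: it assumes the link checksum is the low 16 bits of the sum of every data byte, and Checksum and CorrectingBytes are hypothetical names introduced here. The payload bytes are arbitrary stand-in data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed checksum: the low 16 bits of the sum of every data byte.
static int Checksum(IEnumerable<byte> data) {
    int sum = 0;
    foreach (var b in data) {
        sum = (sum + b) & 0xFFFF;
    }
    return sum;
}

// Returns extra bytes that bring the running checksum back round to zero.
static byte[] CorrectingBytes(IEnumerable<byte> data) {
    int remaining = (0x10000 - Checksum(data)) & 0xFFFF;
    var correction = new List<byte>();
    while (remaining > 0) {
        int next = Math.Min(remaining, 0xFF);
        correction.Add((byte)next);
        remaining -= next;
    }
    return correction.ToArray();
}

var payload = new byte[] { 0x12, 0x34, 0xFF }; // stand-in data, not a real exploit payload
var padded = payload.Concat(CorrectingBytes(payload)).ToArray();
Console.WriteLine(Checksum(padded)); // prints 0
```

Once the running sum is zero, any number of zero bytes can follow without changing it, which is why the stream of zeroes can be cut off at any point and still end in a matching checksum.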
Trial-and-error with zero padding
If the problem is that we're sending too many zero bytes, one option is to count how many zero bytes we can send successfully. Once the attempt has failed, we can then make sure that on our next attempt we only send just the right number of zero bytes (based on our previous count) and no more, then check for the acknowledgement from the receiving calculator. To my delight this strategy works well, and is provided by the application's -zeropad option.
Unfortunately, as over 30,000 zeroes need to be sent each time, the exploit packet takes a long time to transmit, and as we now need to do it twice this can really slow things down! Once a safe number is known this can be specified with -zeropad=<count> but it's still a time-consuming process.
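The two-pass idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the program's real code: the link is reduced to a hypothetical "try to send one byte" delegate, FakeLink is a stand-in for the calculator that simply stops accepting bytes after a fixed number, and the figure of 30000 is an arbitrary example value.

```csharp
using System;

// Pass 1: send zeroes until the (hypothetical) link reports a failure, counting the successes.
static int CountAcceptedZeroes(Func<byte, bool> trySendByte, int limit) {
    int accepted = 0;
    while (accepted < limit && trySendByte(0)) {
        ++accepted;
    }
    return accepted;
}

// Pass 2: send exactly the known-safe number of zeroes and report whether they all went through.
static bool SendZeroPadding(Func<byte, bool> trySendByte, int count) {
    for (int i = 0; i < count; ++i) {
        if (!trySendByte(0)) return false;
    }
    return true;
}

// Stand-in for the calculator: accepts a fixed number of bytes, then fails.
static Func<byte, bool> FakeLink(int capacity) {
    int sent = 0;
    return b => sent++ < capacity;
}

int safeCount = CountAcceptedZeroes(FakeLink(30000), 65535);
bool secondPassOk = SendZeroPadding(FakeLink(30000), safeCount);
Console.WriteLine($"{safeCount} zeroes accepted, second pass succeeded: {secondPassOk}");
```

In the real transfer each pass would also have to resend the whole exploit packet before the padding, which is where the time goes.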
Fixed-size packets for quicker transmission
The problem here is not knowing the size of the packet we're transmitting. The packet does start with a length parameter, however as the "number of bytes left to receive" counter is stored on the calculator's stack by the receiving routine we end up overwriting that with our exploit payload and the total number of bytes left to receive will end up depending on the particular stack level at the time.
In my testing the variable ends up being stored on the stack at the same address ($FFC1 for normal transfers, $FFBF for ones where the flash was previously unlocked). Knowing this means that as we trample over the stack deploying our exploit we can at least make sure that we leave that value in the state it should be for the current point in the packet transfer.
This is implemented in the program with the -fixed parameter, which executes much more quickly than the -zeropad one and only needs to run through once. It is however reliant on knowing exactly where on the stack the "number of bytes left to receive" variable is stored; if it's different from the two presets baked into the program it can be changed with -fixed=<hex addr>.
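To illustrate why keeping that stack value intact matters, here is a toy model. It is a sketch under loud assumptions rather than a description of the boot code: the receiver is modelled as saving a 16-bit little-endian counter at a fixed address while it stores each byte and reloading it afterwards, the buffer start address of $8000 and the packet length are made up, and PreserveCounter and ReceivesWholePacket are hypothetical names. Only the $FFC1 address comes from the text above.

```csharp
using System;

// Patches the two bytes that land on the 16-bit counter so that the receiver still
// sees the correct "bytes left" value. bufferStart is a made-up parameter here.
static byte[] PreserveCounter(byte[] data, int bufferStart, int counterAddress) {
    var patched = (byte[])data.Clone();
    int lowIndex = counterAddress - bufferStart; // byte that overwrites the counter's low byte
    int highIndex = lowIndex + 1;                // byte that overwrites the counter's high byte
    if (lowIndex < 0 || highIndex >= patched.Length) {
        throw new ArgumentOutOfRangeException(nameof(counterAddress));
    }
    // bytes still to come at the moment each of the two bytes is received
    patched[lowIndex] = (byte)((patched.Length - lowIndex) & 0xFF);
    patched[highIndex] = (byte)(((patched.Length - highIndex) >> 8) & 0xFF);
    return patched;
}

// Toy receiver (an assumption, not the real boot code): keeps its counter in memory at
// counterAddress while each byte is stored, then reloads it and counts down.
static bool ReceivesWholePacket(byte[] data, int bufferStart, int counterAddress) {
    var memory = new byte[0x10000];
    int counter = data.Length, received = 0;
    while (counter != 0 && received < data.Length) {
        memory[counterAddress] = (byte)(counter & 0xFF);
        memory[(counterAddress + 1) & 0xFFFF] = (byte)((counter >> 8) & 0xFF);
        memory[(bufferStart + received) & 0xFFFF] = data[received];
        ++received;
        counter = memory[counterAddress] | (memory[(counterAddress + 1) & 0xFFFF] << 8);
        counter = (counter - 1) & 0xFFFF;
    }
    return counter == 0 && received == data.Length;
}

var demoData = new byte[0x9000];  // all zeroes, standing in for the exploit data
int demoStart = 0x8000;           // hypothetical buffer start address
int demoCounter = 0xFFC1;         // counter address quoted in the text
Console.WriteLine(ReceivesWholePacket(demoData, demoStart, demoCounter)); // False
Console.WriteLine(ReceivesWholePacket(PreserveCounter(demoData, demoStart, demoCounter), demoStart, demoCounter)); // True
```

In this model the unpatched data corrupts the counter as it passes over it, so the receiver stops early, while the patched data writes back exactly the value the counter should hold at that point and the transfer ends on the expected byte.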
The program itself
In case it helps anyone else out, the program can be downloaded from this link. It's a .NET application and requires a computer with a serial port and a "black link" compatible serial cable (I use a home-made cable), which I appreciate is not exactly the most modern solution but is what I have access to.
It will allow you to transfer a standard "noshell" TI-83 Plus assembly program to the target calculator, with or without flash unlocked. As this is a potentially risky operation (especially with flash unlocked, which would allow you to completely brick the calculator by damaging the boot code) any such programs are left as an exercise to the user to be used at their own risk. The original Overflow program contains much more useful information, including a sample program that can erase the certificate page, though be warned that as written it is not designed for the TI-83 Plus and will erase the wrong page, so it will need to be modified before use. This is only recommended as a last chance for calculators that are otherwise bricked and unusable!
Update 4th June 2023: The program now supports the USB "Silver Link" cable, though you will need to temporarily replace TI's driver with a generic WinUSB driver using Zadig. The download link is the same as before.
Update 7th June 2023: The program will now try to use TI's driver for the "Silver Link" USB cable, if available. This avoids the need to temporarily replace it with the WinUSB driver.