LocoScript 1 file format

This is my best guess at the LocoScript 1 file format, based on examining a variety of documents.

LocoScript 1 saves its files on CP/M-formatted discs, and therefore uses CP/M conventions such as 8.3 filenames and 128-byte records.

A byte is 8 bits. A word is 16 bits, in little-endian format.
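
As a small illustration of the word convention, the following Python sketch reads a 16-bit little-endian value from a byte string. The helper name `read_word` is hypothetical and used only for this example.

```python
import struct

def read_word(data: bytes, offset: int) -> int:
    """Read a 16-bit little-endian word starting at the given offset."""
    (value,) = struct.unpack_from("<H", data, offset)
    return value

# The low byte is stored first, so the bytes 0x34 0x12 form the word 0x1234.
sample = b"\x34\x12"
print(hex(read_word(sample, 0)))
```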

Header

Nearly all LocoScript files start with a 128-byte header, and the LocoScript 1 document is no exception.

Offset  Size  Description
0x00    3     Magic number: 'JOY'
0x03    word  File format version number. For LocoScript 1 this is 0x101, i.e. 1.1
0x05    90    File identity. Three lines of 30 bytes each, PCW extended ASCII.
0x5F    byte  Maximum number of layouts.
0x60    byte  Maximum number of tabs per layout.
0x61    byte  Decimal point character (usually . or , )
0x62    byte  Zero character (0x30 for slashed zero, 0x7F for unslashed)
0x63    byte  0xFF if widows and orphans are allowed, 0 if not.
0x64    byte  0xFF if paragraphs can be broken across a page, 0 if not.
0x65    byte  Bitmask of which word-processing codes to show.
              Bit 0: Show codes
              Bit 1: Show rulers
              Bit 2: Show blanks
              Bit 3: Show spaces
              Bit 4: Do not show effectors (paragraph / tab symbols)
0x66    3     Unknown
0x69    word  Page length, in half-lines
0x6B    word  Number of first page in the file.
0x6D    word  Number of last page in the file.
0x6F    byte  Page numbering scheme:
              0xFF: All the same
              0x00: Odd / even pages differ
              0x01: First page differs
              0x02: Last page differs
0x70    byte  Header height, in half-lines
0x71    byte  Header start row, in half-lines
0x72    byte  Header omit flags:
              Bit 7 set: omit first page header
              Bit 0 set: omit last page header
0x73    4     Unknown
0x77    byte  Footer height, in half-lines
0x78    byte  Footer start row, in half-lines
0x79    byte  Footer omit flags:
              Bit 7 set: omit first page footer
              Bit 0 set: omit last page footer
0x7A    4     Unknown
0x7E    byte  Number of the 128-byte record containing the first chunk (see below).
0x7F    byte
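
The header fields can be pulled out with a few lines of Python. This is only a sketch under stated assumptions, not a tested reader: the function name `parse_header` and the dictionary keys are made up for this example, the identity lines are left as raw bytes because the PCW character set is not a standard Python codec, and the footer start row and omit flags are assumed to occupy 0x78 and 0x79, the two bytes between the footer height at 0x77 and the unknown block at 0x7A.

```python
import struct

def parse_header(data: bytes) -> dict:
    """Decode the documented fields of the 128-byte LocoScript 1 header."""
    if len(data) < 128:
        raise ValueError("need at least 128 bytes for the header")
    if data[0:3] != b"JOY":
        raise ValueError("magic number 'JOY' not found")
    (version,) = struct.unpack_from("<H", data, 0x03)
    # Three consecutive little-endian words: page length, first page, last page.
    page_length, first_page, last_page = struct.unpack_from("<HHH", data, 0x69)
    show = data[0x65]
    return {
        "version": version,
        # Three 30-byte identity lines, kept as raw PCW-encoded bytes.
        "identity": [data[0x05 + 30 * i:0x05 + 30 * (i + 1)] for i in range(3)],
        "max_layouts": data[0x5F],
        "max_tabs": data[0x60],
        "decimal_point": data[0x61],
        "zero_char": data[0x62],
        "widows_orphans_allowed": data[0x63] == 0xFF,
        "paragraphs_may_break": data[0x64] == 0xFF,
        "show_codes": bool(show & 0x01),
        "show_rulers": bool(show & 0x02),
        "show_blanks": bool(show & 0x04),
        "show_spaces": bool(show & 0x08),
        "hide_effectors": bool(show & 0x10),
        "page_length_half_lines": page_length,
        "first_page": first_page,
        "last_page": last_page,
        "numbering_scheme": data[0x6F],
        "header": {"height": data[0x70], "start_row": data[0x71],
                   "omit_flags": data[0x72]},
        # Assumption: footer start row and omit flags sit at 0x78 and 0x79.
        "footer": {"height": data[0x77], "start_row": data[0x78],
                   "omit_flags": data[0x79]},
        "first_chunk_record": data[0x7E],
    }
```

Typical use would be `parse_header(open(path, "rb").read())`, where `path` names a LocoScript 1 document copied off a CP/M disc.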

Layouts

The layouts are stored immediately following the header (i.e. at offset 128). The length of each layout is 10 bytes plus the number of tab stops (header byte 0x60).

The format of a layout is:

Offset  Size      Description
0x00    byte      Character pitch. Bits 0-5 give character width in 240ths of an inch:
                  0x18 => 10cpi
                  0x14 => 12cpi
                  0x10 => 15cpi
                  0x0E => 17cpi
                  0x00 => proportional
                  Bit 6 set for double width.
0x01    byte      Line pitch, in 432ths of an inch, so 0x48 for 6lpi, 0x36 for 8lpi.
0x02    byte      Line spacing, in half-lines; so 1 for half-spaced, 2 for normal-spaced, etc.
0x03    byte      Default character style:
                  Bit 1: Word underline (underline words but not spaces)
                  Bit 2: Underline
                  Bit 3: Reverse video
                  Bit 4: Doublestrike
                  Bit 5: Italic
                  Bit 6: ?
                  Bit 7: Justified?
0x04    byte      Left margin position.
0x05    byte      Right margin position.
0x06    byte      Count of left tabs.
0x07    byte      Count of right tabs.
0x08    byte      Count of centre tabs.
0x09    byte      Count of decimal tabs.
0x0A    variable  Tab stops. First the positions of all the left tabs, then all the right tabs, then centre, then decimal.
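
A layout record can be decoded along the following lines. This is a sketch under assumptions rather than a verified reader: the names `parse_layout` and `parse_layouts` are invented for this example, every layout slot counted by header byte 0x5F is assumed to be present in the file, and the character width is treated as 240ths of an inch, which is what the listed values imply (0x18 = 24, and 240 / 24 = 10cpi).

```python
def parse_layout(raw: bytes) -> dict:
    """Decode one layout record (10 fixed bytes followed by tab stops)."""
    width = raw[0] & 0x3F          # bits 0-5: character width
    line_pitch = raw[1]
    counts = raw[6:10]             # left, right, centre, decimal tab counts
    tabs = {}
    pos = 0x0A
    for name, count in zip(("left", "right", "centre", "decimal"), counts):
        tabs[name] = list(raw[pos:pos + count])
        pos += count
    return {
        # Assumption: width is in 240ths of an inch; 0 means proportional.
        "cpi": 240 / width if width else None,
        "double_width": bool(raw[0] & 0x40),
        "lpi": 432 / line_pitch if line_pitch else None,
        "line_spacing_half_lines": raw[2],
        "style_bits": raw[3],
        "left_margin": raw[4],
        "right_margin": raw[5],
        "tabs": tabs,
    }

def parse_layouts(data: bytes, max_layouts: int, max_tabs: int) -> list:
    """Decode all layout slots, which start right after the 128-byte header."""
    size = 10 + max_tabs           # fixed part plus room for every tab stop
    return [parse_layout(data[128 + i * size:128 + (i + 1) * size])
            for i in range(max_layouts)]
```

Here `max_layouts` and `max_tabs` would come from header bytes 0x5F and 0x60 respectively.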

The Document content

The document text starts at file offset 128 times the value of header byte 0x7E. It consists of a number of chunks, each of which is a multiple of 128 bytes long. The format of a chunk is:

Offset  Size      Description
0x00    byte      Length, in 128-byte records.
0x01    byte      0 (high byte of length?)
0x02    byte      Flags:
                  Bit 0 set => Last chunk of this page.
                  Bit 7 set => First chunk of this page.
0x03    4         Unknown
0x07    byte      Number of currently-selected layout (0-based)
0x08    byte      Line alignment (details not known)
0x09    byte      Current character pitch (cf byte 0x00 of the layout). Bit 7 set if this is the default pitch.
0x0A    byte      Current line pitch (cf byte 0x01 of the layout). Bit 7 set if this is the default pitch.
0x0B    byte      Current line spacing (cf byte 0x02 of the layout). Bit 7 set if this is the default spacing.
0x0C    byte      Current character style (cf byte 0x03 of the layout).
0x0D    variable  Text and markup
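
Walking the chunks might look like the sketch below. It rests on two assumptions that the description above does not confirm: that chunks follow one another without gaps up to the end of the file, and that a length byte of zero marks trailing padding rather than a real chunk. The generator name `iter_chunks` and the dictionary keys are hypothetical.

```python
def iter_chunks(data: bytes):
    """Yield the chunk headers and bodies of a LocoScript 1 document."""
    offset = 128 * data[0x7E]              # record number of the first chunk
    while offset + 0x0D <= len(data):
        length = data[offset]              # chunk length in 128-byte records
        if length == 0:
            break                          # assumption: zero length means padding
        chunk = data[offset:offset + 128 * length]
        flags = chunk[0x02]
        yield {
            "offset": offset,
            "records": length,
            "last_of_page": bool(flags & 0x01),
            "first_of_page": bool(flags & 0x80),
            "layout": chunk[0x07],
            "alignment": chunk[0x08],
            "char_pitch": chunk[0x09],
            "line_pitch": chunk[0x0A],
            "line_spacing": chunk[0x0B],
            "style": chunk[0x0C],
            "body": chunk[0x0D:],          # text and markup
        }
        offset += 128 * length             # assumption: chunks are contiguous
```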

In the text, characters 0x00-0x7F and 0xA0-0xFF are printable, using the PCW character set. This is the same character set used by CP/M on the Spectrum +3. Characters 0x80-0x9F are markup codes:

Code            Description
0x80            End of chunk.
0x81            Space (0x20 is used for hard spaces, i.e. those which do not permit the line to be broken).
0x82 0x00       (LastLine) code - break page after this line.
0x82 0x01       Form feed - break page now.
0x82 0x02       Hyphen (0x2D is used for hard hyphens).
0x82 0x03       Soft space.
0x82 0x04       Insert current page number.
0x82 0x05       Insert last page number.
0x82 0x06       Appears to be a combined carriage return and (-ReV). Does not appear in live documents, but is decoded as that if inserted using a binary editor.
0x82 0x07       Appears to be a combined carriage return and (+ReV). Does not appear in live documents, but is decoded as that if inserted using a binary editor.
0x82 0x08       (SiC) - the word containing this code is spelt correctly. Added some time between LocoScript 1.20 and 1.40.
0x83 0x00       (+Bold) Bold on
0x83 0x01       (+Wordul) Word underline on
0x83 0x02       (+UL) Underline on
0x83 0x03       (+ReV) Reverse video on
0x83 0x04       (+Double) Doublestrike on
0x83 0x05       (+Italic) Italic on
0x83 0x06       (+SupeR) Superscript on
0x83 0x07       (+SuB) Subscript on
0x83 0x08       (+Mail) Begin LocoMail macro. Added some time between LocoScript 1.20 and 1.40.
0x84 0x00       (-Bold) Bold off
0x84 0x01       (-Wordul) Word underline off
0x84 0x02       (-UL) Underline off
0x84 0x03       (-ReV) Reverse video off
0x84 0x04       (-Double) Doublestrike off
0x84 0x05       (-Italic) Italic off
0x84 0x06       (-SupeR) Superscript off
0x84 0x07       (-SuB) Subscript off
0x84 0x08       (-Mail) End LocoMail macro.
0x85 0x00 0x00  Soft hyphen.
0x85 0x00 0xnn  (0xnn > 0) Soft linebreak.
0x85 0x02 0xnn  (+LayouT) Select layout 0xnn
0x85 0x03 0xnn  (+LPitch) Set line pitch 0xnn. As in the layout, this is specified in 432ths of an inch. If bit 7 of the pitch is set, this is a (-LPitch) command; the value being selected is the default.
0x85 0x04 0xnn  (+LSpace) Set line spacing 0xnn. As in the layout, this is specified in half-lines. If bit 7 of the value is set, this is a (-LSpace) command; the value being selected is the default.
0x85 0x05 0xnn  (+Pitch) Set character pitch 0xnn, encoded as in byte 0x00 of the layout. If bit 7 of the value is set, this is a (-Pitch) command; the value being selected is the default.
0x85 0x06 0xnn  (+Keep) Keep the following 0xnn lines together.
0x85 0x07 0xnn  (-Keep) Keep the previous 0xnn lines together.
0x86 0xnn 0xmm  Start of line. The two following bytes are a little-endian Z80 word, giving the distance of the right-hand end of the line from the right margin, in 240ths (?) of an inch.
0x88 0xnn       End of line. nn is:
                0x01 => soft end-of-line (normal wrapping)
                0x02 => hard end-of-line (paragraph)
                0x03 => Unit
                0x04 => Tab
                0x05 => Indent tab
                0x06 => Centre
                0x07 => Right align
                0x08 => Carriage return? Seems to be used at the start of some pages.
                Values 0x04 to 0x07 are not found in live documents, but LS1 interprets them thus if they are inserted manually.
0x89 0xaa 0xbb 0xcc 0xdd
                Change horizontal position. aa is the code that caused realignment:
                0x00 => Layout change causes left margin to move
                0x01 => soft end-of-line (normal wrapping)
                0x02 => hard end-of-line (paragraph)
                0x03 => Unit
                0x04 => Tab
                0x05 => Indent tab
                0x06 => Centre
                0x07 => Right align
                bb is the new X position in characters. cc and dd give the new position in 240ths of an inch (not sure where it's measured from).
0x8A 0xaa 0xbb 0xcc 0xdd 0xee
                Justified text. The first byte is:
                0x01 => End-of-line (caused by text wrapping). If there are no packing spaces to be inserted, the standard 0x88 0x01 end-of-line is used.
                0x02 => Start justified paragraph.
                0x08 => A justified version of the 0x88 0x08 carriage return code?
                Other bytes unknown.
0x8B 0xaa 0xbb 0xcc 0xdd 0xee 0xff 0xgg
                Justified version of the 0x89 code. The first 5 bytes are the same as in that code. Other bytes unknown.
0x8C-0x9F       Do not appear to be used.
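
The code lengths listed above are enough to split a chunk body into runs of text and markup codes. The sketch below does that in Python, with assumptions stated plainly: the names `PARAM_COUNT` and `tokenize` are invented for this example, the eight-byte justified form of the 0x89 code is assumed to be introduced by 0x8B, and any code whose length is not listed (0x87 and 0x8C-0x9F) is treated as an error rather than guessed at.

```python
# Number of parameter bytes that follow each markup code byte.
PARAM_COUNT = {
    0x80: 0, 0x81: 0,
    0x82: 1, 0x83: 1, 0x84: 1,
    0x85: 2, 0x86: 2,
    0x88: 1,
    0x89: 4,
    0x8A: 5,
    0x8B: 7,   # assumption: justified variant of 0x89
}

def tokenize(body: bytes) -> list:
    """Split chunk text into ("text", bytes) and ("code", code, params) items."""
    tokens = []
    text = bytearray()
    i = 0
    while i < len(body):
        byte = body[i]
        if 0x80 <= byte <= 0x9F:
            if text:                       # flush any pending printable run
                tokens.append(("text", bytes(text)))
                text.clear()
            if byte not in PARAM_COUNT:
                raise ValueError(f"markup code 0x{byte:02X} has no known length")
            count = PARAM_COUNT[byte]
            params = body[i + 1:i + 1 + count]
            if len(params) < count:
                raise ValueError("markup code truncated at end of data")
            tokens.append(("code", byte, bytes(params)))
            i += 1 + count
            if byte == 0x80:               # end of chunk: ignore what follows
                break
        else:
            text.append(byte)              # printable PCW character
            i += 1
    if text:
        tokens.append(("text", bytes(text)))
    return tokens
```

Text runs are left as raw bytes; mapping them to Unicode would need a PCW character table, which is outside the scope of this sketch.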