Bug 45966

GemBuilder for Smalltalk/VW

8.5, 8.4, 8.3, 8.2, 8.1, 8.0, 7.6.1, 7.6, older versions

GS64 3.2.x and later

GS-File in of code filed out from topaz in 8-bit format corrupts Characters in the ambiguous range

As of GemStone/S 64 Bit 3.2.x and 3.3.x, server code fileout from topaz or via server methods may include the fileformat command, which may specify utf8 (for UTF-8 encoded fileouts) or 8bit (for traditional GemStone format), depending on the mode of the GemStone image and the specifics of the fileout.

Filein using GS-Filein does not error on a fileformat command in 7.6.1 and later, but the argument (utf8 or 8bit) is not used. GBS relies on VW's file management, so the file is interpreted as UTF-8.

UTF-8 and 8-bit encodings are the same for the ASCII range, and Characters with code points over 255 cannot be filed out in 8-bit. But Characters with code points in the range 128..255 are ambiguous; the bytes in the file will produce different results depending on whether they are interpreted as UTF-8 or 8bit.
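The ambiguity can be illustrated outside of GemStone. The following is a minimal Python sketch, not GBS or topaz code. It assumes that Python's latin-1 codec (one byte per code point 0..255) is a reasonable stand-in for the traditional 8-bit format. How VW's file management treats an invalid UTF-8 byte sequence is not shown here; the strict decoding error below is Python's behavior only.

```python
# Assumption: latin-1 (one byte per code point 0..255) stands in for
# the traditional 8-bit fileout format.

# ASCII text produces identical bytes in both encodings.
ascii_text = "abc"
same_in_ascii = ascii_text.encode("latin-1") == ascii_text.encode("utf-8")

# Code point 233 lies in the ambiguous range 128..255.
ch = "\u00e9"
eight_bit = ch.encode("latin-1")  # one byte: 0xE9
utf8 = ch.encode("utf-8")         # two bytes: 0xC3 0xA9

# A lone 0xE9 byte is not a valid UTF-8 sequence under strict decoding.
try:
    eight_bit.decode("utf-8")
    eight_bit_is_valid_utf8 = True
except UnicodeDecodeError:
    eight_bit_is_valid_utf8 = False

# Some 8-bit byte sequences are valid UTF-8 and decode without error
# to different text: two 8-bit characters (195, 169) become one.
two_chars = "\u00c3\u00a9"
misread = two_chars.encode("latin-1").decode("utf-8")
```

The last case is the more troublesome one: no error is raised, but two Characters in the 8-bit source are read as a single different Character.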

If the topaz fileout has fileformat 8bit, and the contents include Characters in the range 128..255, then filing in using GBS GS-filein will cause these characters to be incorrectly read as UTF-8, and the resulting methods will be corrupt.

Workaround

Avoid using GBS to file in code that was filed out from topaz, unless your image is configured to use Unicode Comparison Mode or you have otherwise ensured that fileouts are always UTF-8.
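Since the two encodings agree for the ASCII range, a fileout that contains no bytes of 128 or above reads the same either way. The following Python sketch is a hypothetical pre-check script, not part of GBS or topaz, that reports where such bytes occur in a file. The function names are illustrative. The check is conservative: a genuine UTF-8 fileout with non-ASCII Characters is also flagged, even though GBS would read it correctly.

```python
import sys


def non_ascii_offsets(data: bytes, limit: int = 10) -> list:
    """Return the offsets of up to `limit` bytes with value 128 or above."""
    offsets = []
    for i, b in enumerate(data):
        if b >= 0x80:
            offsets.append(i)
            if len(offsets) >= limit:
                break
    return offsets


def is_pure_ascii(data: bytes) -> bool:
    """True if every byte is below 128, so UTF-8 and 8-bit readings agree."""
    return not non_ascii_offsets(data, limit=1)


if __name__ == "__main__":
    # Usage: python check_fileout.py file1.gs file2.gs ...
    for name in sys.argv[1:]:
        with open(name, "rb") as f:
            data = f.read()
        if is_pure_ascii(data):
            print(f"{name}: ASCII only")
        else:
            print(f"{name}: non-ASCII bytes at offsets {non_ascii_offsets(data)}")
```

A file reported as ASCII only is unaffected by this bug; any other file needs its encoding confirmed before it is filed in using GBS.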

It is reliable to use GS-File in for code that was filed out using GBS, and to use topaz to file in code that was filed out using topaz.


Last updated: 12/7/16