Unicode is a worldwide standard which encodes characters as integers, known as 'code points'. It covers virtually all of the international and special characters
used in modern computer applications. The APL characters, including all those used in APLX, are defined in the
Unicode standard, although the mapping of a few of them is ambiguous. This makes it possible to
exchange character data between APLX and other applications (including other
APL interpreters which support Unicode) without running into problems with
character translation and 'code tables', provided that the text being
transferred can be represented in the APLX character set.
The system function ⎕UCS translates Unicode values to the equivalent characters in the APLX character set (where there is one), and vice versa. It takes a right argument, which must be a simple character or integer array.
If the argument is a character array, ⎕UCS returns an integer array of the same shape, containing the
Unicode code point of each character. Each value lies in the range 0 to 65535, because every character supported in APLX falls within Unicode's Basic Multilingual Plane.
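The character-to-integer direction described above can be modelled outside APLX, which may help when checking data received by another application. The following is a minimal Python sketch, not APLX code: it assumes a nested Python list stands in for an APL array, and the helper name `code_points` is hypothetical. It relies only on Python's built-in `ord`.

```python
def code_points(chars):
    """Return the code point of each character, keeping the nesting (shape) intact.

    `chars` is either a single character or a (possibly nested) list of
    single characters. This is an illustrative model only.
    """
    if isinstance(chars, str):
        if len(chars) != 1:
            raise ValueError("each element must be a single character")
        return ord(chars)
    return [code_points(item) for item in chars]


# A vector of five characters maps to a vector of five integers.
vector = code_points(list("X←⍳10"))
print(vector)  # [88, 8592, 9075, 49, 48]

# A 2-by-2 character matrix maps to a 2-by-2 integer matrix.
matrix = code_points([list("AB"), list("CD")])
print(matrix)  # [[65, 66], [67, 68]]

# Every value here fits in the range 0 to 65535.
print(all(0 <= v <= 65535 for v in vector))  # True
```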
If the argument is an integer array, ⎕UCS returns a character array of the same shape, containing the APLX character corresponding to each Unicode value
provided. Unicode values which have no equivalent in the APLX character set are converted to the current value of ⎕MC (Missing Character). By default, this is a question mark.
For example:
      ⎕UCS 'X←⍳10'
88 8592 9075 49 48
      ⎕UCS 88 8592 9075 49 48
X←⍳10
      ⎕UCS 937 8364 223
?€ß
In the last example, the Unicode value 937 (hex 03A9,
representing the Greek capital omega character) was
translated to the 'missing character' value (question mark) because it has no equivalent in the APLX
character set.
A different 'missing character' can be set using ⎕MC:
      ⎕MC←'$'
      ⎕UCS 937 8364 223
$€ß
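The integer-to-character direction, including the substitution of a 'missing character', can be modelled in the same way. The Python sketch below is an illustration under stated assumptions, not APLX code: `SUPPORTED` is a small hypothetical stand-in for the APLX character set (the real mapping is given in the APLX Character Set section), and the helper name `characters` and its `missing` parameter are likewise hypothetical.

```python
# Hypothetical stand-in for the target character set: printable ASCII
# plus a handful of extra characters. This is NOT the real APLX table.
SUPPORTED = {chr(n) for n in range(32, 127)} | set("←⍳€ß")


def characters(values, supported, missing="?"):
    """Return the character for each code point, keeping the nesting (shape).

    Code points whose character is not in `supported` are replaced by
    `missing`, which defaults to a question mark.
    """
    if isinstance(values, int):
        ch = chr(values)
        return ch if ch in supported else missing
    return [characters(v, supported, missing) for v in values]


# 937 is Greek capital omega, which is absent from the stand-in set.
default_result = "".join(characters([937, 8364, 223], SUPPORTED))
# default_result is '?€ß'

# Choosing a different missing character changes only the substituted element.
dollar_result = "".join(characters([937, 8364, 223], SUPPORTED, missing="$"))
# dollar_result is '$€ß'
```

Passing the replacement as a parameter keeps the sketch free of global state; in APLX itself the setting is a system variable that applies to subsequent conversions.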
See the section on the APLX Character Set for details of the mapping between APLX characters and Unicode.