Motivation
I have been working on the Dyalog APL quicksort implementation. The following programming puzzle arose in the process of doing the QA for this work.
⍵
is a simple array. Write a function sorted
, without using ⍋
or ⍒
, such that sorted ⍵
is 1 if ⍵
is sorted in ascending order and 0 otherwise.The point about not using grade is that this is supposed to be an independent check that grade is correct (remains correct) after the new work.
Real Vectors
The simplest case is when ⍵
is a numeric vector. If furthermore ⍵
are not complex numbers (a case addressed later), then
∧/ 2≤/⍵
each item being less than or equal to the next one, checks that ⍵
is sorted. Since ⍋
uses exact comparisons, here we must set ⎕ct←⎕dct←0
. Morever, in order that decimal floating-point numbers (DECFs) be compared correctly, here ⎕fr←1287
.
Real Arrays
More generally, when ⍵
is a non-complex numeric matrix, we must check that each row precedes or is equal to the next row. If c
and d
are consecutive rows, then corresponding items are compared and at the first item where they differ, c[i]
must be less than d[i]
.
~ 0 ∊ (2>⌿⍪⍵) ⍲ <\ 2≠⌿⍪⍵
The expression incorporates two refinements:
- If
⍵
is not a matrix, first apply⍪⍵
. - Instead of checking
c[i]
is less thand[i]
, check thatc[i]
is not greater thand[i]
. This finesses the case wherec≡d
and there is no first item where they differ; that is, the case where<\2≠⌿⍪⍵
is all 0s for that row.
<\
on a boolean vector has 0s after the first 1, (and is all 0 if there are no 1s). Therefore, <\2≠⌿⍪⍵
finds the first item (if any) where one cell differs from the next cell, and that item must not be greater than the corresponding item in the next cell.
For example:
x←?97 3⍴10
{~ 0 ∊ (2>⌿⍪⍵) ⍲ <\ 2≠⌿⍪⍵} x
0
{~ 0 ∊ (2>⌿⍪⍵) ⍲ <\ 2≠⌿⍪⍵} x[⍋x;]
1
(Muse: since x
above are random numbers, there is a possibility that it is sorted and the first test above can be 1. But: if each elementary particle in the visible universe were a computer and every nanosecond each of them creates a random matrix and tests it for sortedness as above, running from the beginning of the time to the end of our lives, it is still a very safe bet that no 1 would result.)
For integer arrays, there is an alternative of using the signs of the arithmetic difference between successive cells:
{~ 0 ∊ 1≠t×<\0≠t← × 2-⌿⍪⍵} x[⍋x;]
1
(The sign where consecutive cells first differ must not be 1.) However, computing the difference on floating point numbers can founder on overflow:
⊢ x←¯1 1×⌊/⍬
¯1.79769E308 1.79769E308
{~ 0 ∊ 1≠t×<\0≠t← × 2-⌿⍪⍵} x
DOMAIN ERROR
{~0∊1≠t×<\0≠t←×2-⌿⍪⍵}x
∧
Complex Numbers
Two complex numbers are ordered first by the real parts and then by the imaginary parts. (This is part of the TAO extension implemented in Dyalog APL version 17.0.) Therefore, a complex array can be tested for sortedness by testing an equivalent real array with each number replaced by their real and imaginary parts, thus:
(¯1⌽⍳1+⍴⍴⍵) ⍉ 9 11∘.○⍵
↑9 11∘○¨⍵
9 11○⍤1 0⊢⍵
Although the second expression is the shortest, it is less efficient in time, space, and number of getspace
calls. The last expression is favored for its brevity and performance.
The number of getspace
is a worthwhile measure. Part of the QA process is a rather stringent procedure called the “Shuffle QA”. The entire Shuffle QA takes several weeks to run and its running time is directly related to the number of getspace
.
Character Arrays
None of the functions < ≤ ≥ > - ×
are permitted on characters. This is solved by application of ⎕ucs
, converting characters to integers while preserving the ordering.
Putting It All Together
sorted←{
⎕ct←⎕dct←0
⎕fr←1287
d←10|⎕dr ⍵
d∊0 2: ∇ ⎕ucs ⍵
d=9: ∇ 9 11○⍤1 0⊢⍵
~ 0 ∊ (2>⌿⍪⍵) ⍲ <\ 2≠⌿⍪⍵
}
Other Considerations
That ⍵⌷⍨⊂⍋⍵
is sorted is a necessary but not sufficient condition that ⍋⍵
is correct. For example, an “adversary” can supply the following results for ⍋⍵
so that ⍵⌷⍨⊂⍋⍵
is sorted:
?≢⍵
(≢⍵)⍴?≢⍵
¯1↓⍋⍵
∊ i {⊂⍵[?⍨≢⍵]}⌸⍨ ⍵⌷⍨⊂i←⍋⍵
The last expression randomly permutes the grade indices of equal cells, a result which violates the requirement that grade indices of equal cells are in ascending order. That is, grade must be stable.
In Dyalog APL version 17.0, grade has been extended to work on non-simple arrays, the much discussed TAO, total array ordering. Checking that a non-simple array is sorted without using grade requires facilities discussed in the paper TAO Axioms and is beyond the scope of this note.
“None of the functions – × are permitted on characters.”
Those anybody know the rationale behind this design decision ?
In the original APL, extending the comparatives to characters was problematic on two accounts:
– There was no standard on how characters compare to each other applicable across systems.
– Ideally you’d want to compare character _vectors_ and not just individual characters, for example so that ‘first’ ≤ ‘last’ would be 1. But the comparatives are already specified as _scalar functions_ and ‘first’≤’last’ would signal LENGTH ERROR.