In ⍺ f⌸ ⍵
, major cells of ⍺
specify keys for the corresponding major cells of ⍵
, and f
applies to each unique major cell of ⍺
and the major cells of ⍵
having that key. The monadic case f⌸⍵
is equivalent to ⍵ f⌸ ⍳≢⍵
. Key is similar to the GROUP BY clause in SQL.
For example:
⎕io←0
(⍳11) ,[¯0.5] 'Mississippi'
0 1 2 3 4 5 6 7 8 9 10
M i s s i s s i p p i
{⍺}⌸'Mississippi' {⍺⍵}⌸'Mississippi'
Misp ┌─┬────────┐
│M│0 │
{⊂⍵}⌸'Mississippi' ├─┼────────┤
┌─┬────────┬───────┬───┐ │i│1 4 7 10│
│0│1 4 7 10│2 3 5 6│8 9│ ├─┼────────┤
└─┴────────┴───────┴───┘ │s│2 3 5 6 │
├─┼────────┤
{≢⍵}⌸'Mississippi' │p│8 9 │
1 4 4 2 └─┴────────┘
We illustrate key by using it to model some APL primitives and to motivate and solve a programming puzzle.
Index-Of and Friends
Since key is defined in terms of unique and indices, we expect to be able to effect members of the index-of family. And so we can.
The specifications say that f
applies to each unique major cell of ⍺
. Therefore, {⍺}⌸
and equivalently ⊣⌸
computes unique. Since the specifications are for unique major cells, these are what ∪⍵
would be if it were extended to non-vector arguments, analagous to the way ⍳
was extended to non-vector left arguments.
x←↑'Aspen' 'John' 'Susan' 'Roger' 'Opal' 'John' 'Aspen'
x
Aspen
John
Susan
Roger
Opal
John
Aspen
{⍺}⌸x ⊣⌸x
Aspen Aspen
John John
Susan Susan
Roger Roger
Opal Opal
A little surprisingly, key can also be used to model extended index-of and extended membership. For reasons which will become clear, we model ∍
(AKA ∊⍨
) instead of ∊
. Both models use the merge operator, whimsically denoted here by →
.
ix ← {(≢⍺) ↓ (0⍴⍨(≢⍺)+≢⍵) ((∊i)→)⍨ (≢¨i)/(≢⍺)⌊⊃¨i←{⊂⍵}⌸⍺⍪⍵}
has ← {(≢⍺) ↓ (0⍴⍨(≢⍺)+≢⍵) ((∊i)→)⍨ (≢¨i)/(≢⍺)>⊃¨i←{⊂⍵}⌸⍺⍪⍵}
y←↑'China' 'Susan' 'John' 'John' 'Anne' 'Roger'
y
China
Susan
John
John
Anne
Roger
x ix y
7 2 1 1 7 3
x has y
0 1 1 1 0 1
To focus on essentials, ix
and has
do not directly handle the case where the right argument has higher rank than the left argument (the principal argument). Such higher rank arguments can be handled by pre-application of {↑,⊂⍤(¯1+⍴⍴⍺)⊢⍵}
and post-application of ((1-⍴⍴⍺)↓⍴⍵)⍴
.
Partition
Since key can effect non-contiguous partitions, we expect to be able to use it to effect contiguous partitions, a special case of the former. And so we can:
part ← {(+\b/⍺){⊂⍵}⌸b⌿⍵⊣b←∨\⍺}
x
Aspen
John
Susan
Roger
Opal
John
Aspen
0 1 0 1 0 0 1 part x
┌─────┬─────┬─────┐
│John │Roger│Aspen│
│Susan│Opal │ │
│ │John │ │
└─────┴─────┴─────┘
0 1 0 1 0 0 1 (⊂[0] ≡ part) x
1
(7⍴0) (⊂[0] ≡ part) x
1
Sort and Grade
sort ← {⍺⌿⍨¯1+{≢⍵}⌸⍺⍪⍵}
grade ← {(≢⍺)-⍨∊1↓¨{⊂⍵}⌸⍺⍪⍵}
If ⍵
is an array from a small “alphabet” ⍺
, then ⍺ sort ⍵
sorts ⍵
and ⍺ grade ⍵
grades it. For example:
x←?1e6⍴256 a←⎕a[?1e6⍴26]
(⍋x) ≡ (⍳256) grade x (⍋a) ≡ ⎕a grade a
1 1
x[⍋x] ≡ (⍳256) sort x a[⍋a] ≡ ⎕a sort a
1 1
The Maximum of a Vector
In Dyalog ’13 session D11, I included a slide giving relative timings on the then new key operator:
x←?1e6⍴1000
a. ¯1+{≢⍵}⌸(⍳1000),x 1.00
b. ¯1+{≢⍵}⌸(⍳1+⌈/x),x 1.62
c. ...
The timings show that key with tally takes less than a factor of 2 over finding the maximum. This is fast; unbelievably fast even. The following puzzle arose from investigations into this timing: Find the maximum of a vector of unsigned 1-byte integers without using multicore, vector instructions, loop unrolling, etc. Can you do it faster in C than the following?
max=*x++;
for(i=1;i<n;++i){if(max<*x)max=*x; ++x;}
I posed the puzzle to some expert C programmers, and they failed to solve it. An “obvious” solution obtains by considering an obvious implementation of {≢⍵}⌸x
in C:
M=256; memset(c,0,M*4); // initialize table of counts for(i=0;i<n;++i)c[*x++]++; // analoguous to c[x]+←1 in APL
Therefore, one way to find the maximum of x is to have an M-element (byte) Boolean table b, and:
memset(b,0,M); for(i=0;i<n;++i)b[*x++]=1; // analoguous to b[x]←1 in APL c=b+M; // scan from the back for(i=0;i<M;++i)if(*--c){max=c-b; break;}
Timing show that the table method is faster than the comparison method by a factor of 1.5 for 1-byte integers and 1.3 for 2-byte integers. The table method requires more code, but the main n
-loop, the time for which dominates the overall time, is simpler for the table method.
Jay Foad countered that finding the maximum (and the minimum) is much faster still with the vector instructions available in modern CPUs. Dyalog has now (version 14.1) implemented such finding, with performance benefits for key, the index-of family, grade, and sort.