Dyalog ’18 Videos, Week 2

Each week until early January, we will be releasing a selection of recordings of presentations from Dyalog’18, which was held in Belfast at the end of October 2018. Last week we kicked off with the opening keynote talks and the prize ceremony and acceptance speech by the winner of our annual problem-solving competition.

Just under half of the presentations at Dyalog User Meetings are by users who have volunteered – or sometimes been commandeered – to share stories about how they have used APL for fun or profit. These user stories provide significant motivation to the Dyalog team for future direction.

Aaron Hsu’s talk on “High Performance Tree Wrangling, the APL Way” is a pearl. Back in 2015 I gave a talk at Google on APL. One of the Google engineers asked about working with trees in APL and I was unable to give him a useful answer. Aaron is working on a compiler for APL, and trees that represent the code that is being compiled are his most important type of data structure.

In this talk Aaron demonstrates that APL is an elegant – and highly efficient – notation for working with trees, if you just pick the right representation!


Most of the talks at Dyalog User Meetings are fairly technical. The subject at the core of Ilaria Piccirilli’s talk – the fair pricing of financial instruments and subsequent evaluation of portfolios – is no exception. Mercifully, Ilaria spares us the details of the calculations – as she dryly notes, there is no “Fair Pricing for Dummies”. Instead, she offers humorous insights into the way her team used APL to deal with the explosion of computations required by regular additions to legislation requiring health checks – and the day that negative interest pulled the rug out from under most standard pricing calculations.

The other, slightly larger half of the talks at Dyalog Users meetings are by members of the Dyalog Team, talking about work that has recently been done on our products or presenting designs for future enhancements.


Adám Brudzewsky’s talk, titled Array Notation Mk III, is about a potential future extension to the APL language, which will make it possible to easily and clearly describe arrays of high rank, or with deeply nested structure, without using APL primitives to “construct” them, as is common practice today. In addition to making application code easier to read and write, a literal notation for data structures will make it easy to use text files to describe data structures which are essentially part of the source code of an application, and should be managed by a source code management system. As the name suggests, this work has been ongoing for some time, with the initial inspiration coming from a user presentation by Phil Last, back at Dyalog ’15 in Sicily. Watch the presentation and give us feedback on whether you think this idea is now sufficiently baked to become part of Dyalog APL, or we’ll need a “Mk IV” talk next year!


With the growth in usage of Dyalog APL under macOS and Linux – especially in server or cloud environments – the Dyalog Remote Integrated Development Environment is becoming a “mainstream” tool, rather than the curiosity that it was during the first few years of development. Our partners at Optima Systems are developing RIDE on Dyalog’s behalf, and Gilgamesh Athoraya is now the lead developer. In his talk on “RIDE 4.1 and Next Generation Integrations”, Gil talks first about significant new features and performance improvements to RIDE in 4.1 – and then continues to talk about how components of the RIDE technology may be re-purposed to provide APL add-ins for popular development frameworks like the new Microsoft VS Code.

Summary of this week’s videos:

Diane’s Lasagne Problem

Making Lasagne

Participants in the SA2 Performance Tuning workshop at the Dyalog ’18 User Meeting were encouraged to bring their own problems for the group to work on. Diane Hymas of ExxonMobil brought a good one. The one-liner computation is as follows:

lasagne0 ← {groups {+⌿⍵}⌸ amts ×[⎕io] spices[inds;]}

where

   n ← 8e5
   spices ← ?6000 44⍴0
   groups ← +\(16↑1 2)[?n⍴16]
   inds   ← ?n⍴≢spices
   amts   ← ?n⍴0

Applying lasagne0 to this dataset:

   ⍴ lasagne0 ⍬
100015 44
   ≢ ∪ groups
100015

   )copy dfns wsreq cmpx

   wsreq 'lasagne0 ⍬'
844799820
   cmpx  'lasagne0 ⍬'
2.12E¯1

The problem with lasagne0 is space rather than time. The 845 MB required for this dataset may be acceptable, but we can be called upon to cook up large batches of lasagne in a smallish workspace, on a machine with limited RAM. (Large n and large ≢∪groups.)

All benchmarks in this document were run in Dyalog APL version 17.0, in a 2 GB workspace, on a machine with generous RAM.

Solutions

Marshall Lochbaum solved the problem. The alternative solutions are as follows:

lasagne0 ← {groups {+⌿⍵}⌸ amts ×[⎕io] spices[inds;]}
lasagne1 ← {↑ (groups{⊂⍵}⌸amts) {+⌿⍺×[⎕io]spices[⍵;]}¨ groups{⊂⍵}⌸inds}
lasagne2 ← {↑ (groups{⊂⍵}⌸amts)      {⍺+.×spices[⍵;]}¨ groups{⊂⍵}⌸inds}
lasagne3 ← {↑ {amts[⍵]+.×spices[inds[⍵];]}¨ {⊂⍵}⌸groups}

lasagne0 is the original expression; lasagne1 and lasagne2 were derived by Marshall during the workshop; lasagne3 was suggested by a participant in the workshop. The four functions produce matching results. Comparing the space and time:

space (MB)           time
lasagne0 845 2.29e¯1 ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
lasagne1 74 3.60e¯1 ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
lasagne2 74 2.39e¯1 ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
lasagne3 74 2.93e¯1 ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

lasagne0 v  lasagne1

Nearly all of the space required to evaluate lasagne0 is accounted for by the space for computing the right argument to key:

   wsreq 'lasagne0 ⍬'
844799820

   wsreq 'amts ×[⎕io] spices[inds;]'
844799548

In fact, the array spices[inds;], all by itself, is already very large. It has shape (⍴inds),1↓⍴spices (≡ 8e5 44), each item requiring 8 bytes.

   wsreq 'spices[inds;]'
281599556

   ⍴ spices[inds;]
800000 44

   8 × ×/ ⍴ spices[inds;]
281600000

   qsize←{⎕size '⍵'}    ⍝ # bytes for array ⍵
   qsize spices[inds;]
281600040

lasagne1 avoids creating these large intermediate results, by first partitioning the arguments (groups{⊂⍵}⌸amts and groups{⊂⍵}⌸inds) and then applying a computation to each partition. In that computation, the operand function {+⌿⍺×[⎕io]spices[⍵;]}, is a partition of amts and is the corresponding partition of inds.

lasagne1, lasagne2 and lasagne3 require the same amount of space to run, so the comparison among them is on time.

lasagne1 v  lasagne2

The only change is from +⌿⍺×[⎕io]spices[⍵;] to ⍺+.×spices[⍵;], which are equivalent when is a vector. The interpreter can compute +.× in one go rather than doing +⌿ separately after doing ×[⎕io]; in such computation the interpreter can and does exploit the additional information afforded by +.× and is faster by a factor of 1.5 (= 2.39 ÷⍨ 3.60).

lasagne2 v  lasagne3

The idea in lasagne3 is doing one key operation rather than the two in lasagne2. Therefore, the changes between lasagne2 v lasagne3 are:

lasagne2 groups{⊂⍵}⌸amts
groups{⊂⍵}⌸inds
spices[⍵;]
lasagne3 {⊂⍵}⌸groups amts[⍵] spices[inds[⍵];]

All three key operations involve {⊂⍵}⌸ with groups as the key, and are roughly equally fast, each taking up no more than 7% of the total time.

   cmpx 'groups{⊂⍵}⌸amts' 'groups{⊂⍵}⌸inds' '{⊂⍵}⌸groups'
  groups{⊂⍵}⌸amts → 1.69E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
* groups{⊂⍵}⌸inds → 1.39E¯2 | -18% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
* {⊂⍵}⌸groups     → 1.36E¯2 | -20% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

lasagne3 is doing one less key operation than lasagne2 in exchange for doing, for each of the ≢∪groups (= 100015) executions of the operand function, amt[⍵] v and spices[ind[⍵];] v spices[⍵;]. Indexing is by no means slow, but it’s not as fast as references to and . Therefore, lasagne2 is faster.

The trade-off may differ for different values of groups. In this case groups are small-range integers so operations using it as the key value are fast.

Welcome to the Dyalog ’18 Videos!

Three weeks have gone by since we waved goodbye to the last Dyalog ’18 delegates in Belfast. We’ve had time to catch up on sleep, half of us have had colds and recovered from them. Jason Rivers and Richard Park have started mixing and improving the audio and video recordings, and we are ready to release the first group of processed videos.


Our plan is to release batches of 3-5 videos, with enough variety for everyone to find at least one topic of interest each week. We have not reviewed all of the material yet; there are always one or two where something went wrong and we are unable to publish the recordings (or the presenter asks that we refrain from making the talk public), but we do expect to be able to make the vast majority of the talks available over the next couple of months.

Each week, I’ll be doing my best to introduce each set with a blog entry: The first batch contains cleaned-up versions of the presentations that were streamed live from Belfast. The audio and video quality is significantly enhanced compared to the live stream, and the most confusing gaffes in my own live demo have been removed 😊.


As usual, the user meeting opened with the traditional trio of keynotes by Dyalog’s CEO Gitte Christensen, the CXO Morten Kromberg (that’s me) and CTO Jay Foad. Gitte introduces a couple of new faces at Dyalog, and the contest winners, so everyone can plan to buy the winners drinks during the week. Gitte then discusses high level direction – announcing our intention to make the Linux version available to download, and included in public Docker containers and Cloud VM images, with no questions asked.


My own session mostly consists of a live demo of the potential consequences of making Linux licences really easy to get hold of. In an imaginary conversation with a data scientist, I demonstrate the use of Dyalog APL to implement an (admittedly silly) analytical function, and subsequently make it available as a web service and via a web site, finally deplying it to the cloud using a set of Public Docker containers, without once installing Dyalog APL itself.


Jay Foad rounded Monday’s live stream off with a review of the features of the recently released version 17.0, before moving on to talk about the work that the development team is planning for versions 17.1 and 18.0, scheduled for the spring of 2019 and 2020, respectively.


In accordance with tradition, we also streamed the Prize Ceremony for the International Problem-Solving Competition and – often the most interesting talk of the year – the acceptance speech where this year’s winner talked about his code, and the experience of learning APL. This year’s winner did not let us down; it is amazing how quickly you can learn to write really, really good APL code!

Summary of this week’s videos:

Tolerated Comparison, Part 1

Warning: The formula for “tolerated” comparison in this post is not correct for actual floating-point numbers. Do not use it! The follow-up post discusses how to compute the required value precisely in the presence of floating-point error.

Tolerant comparison is an important feature in making APL usable in the real world. But it’s also a lot slower than exact comparison. We think this tradeoff is well worth it (and of course, if you don’t want to make it, just set ⎕CT←0). But once in a while we can completely get rid of the slowdown! Added in Dyalog 17.0, along with vectorised comparison code so fast you might not even notice this change, is a cool technique that I call “tolerated comparison”. It works by turning a tolerant comparison with one argument fixed into an intolerant comparison by adjusting that fixed argument. So if you’re comparing one floating-point number to many of them, your code will run 30-40% faster than it would without this improvement.

Why tolerance?

Let’s take a trip to a slightly different APL.

      ⎕CT←0

A world in which tolerant comparison hasn’t been invented. Two things are equal when they’re the same.

      0.1 = 0.3-0.2
0
      {(0.1×⍵) = ⍵÷10} ⍳8
1 1 0 1 1 0 0 1

Okay, that’s a scary place. What happened?

The floating-point format can’t possibly represent every real number. Instead, there’s a carefully chosen set containing just shy of 2*64 numbers which can be expressed. For other numbers, we pick the closest available floating-point number. Usually this is a pretty accurate approximation. But once we perform arithmetic with a few of these values, we might go from “closest possible approximation” to “second-closest approximation” or even further away. The differences are still tiny, but when the same value—mathematically speaking—is computed in two different ways the results might be different numbers. Just barely different, but in a world without tolerant comparison that means they won’t be equal any more.

Tolerant comparison is an attempt to fix this. It’s not perfect, but it’s better in a lot of real-world situations. It makes = a useful function. The rule is that two values are equal when their difference is at most ⎕CT times the larger of their magnitudes. This is often called a “relative epsilon comparison” outside of the APL world.

Notation: To avoid confusion in the following text, I have used the APL symbols <≤=≥>≠ for intolerant comparisons, and the non-APL symbols ⋦≲≂≳⋧≉ to refer to APL’s tolerant comparisons which depend on ⎕CT. Watch out for a few quirks due to the scarcity of symbols to use: the actual APL primitives are denoted by non-APL characters, and the simpler symbol < “less-than” corresponds to the more complicated “less-than and not approximately equal to”.

Tolerant comparisons

The famous formula for tolerant equality a≂b in terms of intolerant inequality (introduced here) is

(|a-b) ≤ ⎕CT×(|a)⌈(|b)

This allows for a little variation in the arguments, provided that neither one is zero. And it has some other nice properties, assuming ⎕CT≤1:

  • It’s reflexive: a≂a for any number a.
  • It’s symmetric in a and b: reversing them makes no difference in the result. This is because and (|-) are symmetric functions.
  • It scales linearly: multiplying both a and b by the same nonzero value (even a negative number) does not change the result.
  • It’s convex in each variable: if a≂b and a≂d for some a, and c is between b and d, then a≂c.

In fact, tolerant equality as in APL is the only solution (or family of solutions, taking the ⎕CT parameter into account) to those constraints, other than the trivial one in which any two numbers are equal.

The above figure illustrates tolerant comparison of two real numbers. They are exactly equal only on the grey diagonal line, and tolerantly equal within the narrow strip around it. The diagram uses a much larger value for ⎕CT than Dyalog’s default of 1e¯14, or even its maximum allowed value of 2*¯32, either of which would make that strip invisible.

One way to define tolerant inequalities in terms of tolerant equality is to define, for instance, (a≲b) ≡ (a<b)∨(a≂b) using intolerant less-than and tolerant equality. But we’ll find it more useful to define and directly in terms of intolerant comparisons as with . The formula for is:

(a-b) ≤ ⎕CT × 0⌈a⌈-b

and for we can use either (a≳b) ≡ (b≲a) or (a≳b) ≡ (-a)≲(-b) to obtain

(b-a) ≤ ⎕CT × 0⌈b⌈-a

or, negating both sides and swapping the inequality,

(a-b) ≥ ⎕CT × 0⌊a⌊-b

The 0⌈ term isn’t strictly necessary (as we’ll show later), but it helps avoid some confusion when a and b have different signs. Sometimes it’s used to speed up comparison by checking if a≤b intolerantly before performing any multiplication—if it is, the multiplication isn’t needed.

Above we can see how the equation for tolerant splits into two symmetrical half-formulas, each of which is just a little off from the exact inequality a≤b. The tolerant inequality accepts either of these conditions.

The region of tolerant equality is now the intersection of the regions for tolerant and . That is:

(a≂b) ≡ (a≲b)∧(a≳b)

We’ll find this decomposition handy later on since tolerant inequalities are so much easier to work with.

Tolerating an inequality

What happens if we fix one side of the inequality? Here’s a diagram for a≲b with a dashed line drawn where b is equal to some fixed B and a varies.

There’s only one place where the black line bounding the region where a≲b intersects the line b=B. The value of a at this intersection has an interesting property: when a is (exactly) less than or equal to that value, then it is tolerantly less than or equal to B. If we want to compare B to many other floating-point values, and we know the a coordinate of the intersection above, then we can skip all of the tolerant comparison arithmetic and just use exact comparison with that value!

Once we have a way to do this for all other inequalities will follow, since (a≳B) ≡ (-a)≲(-B) and (a⋧B) ≡ ~(a≲B).

Maybe you can figure out the value from the graphs above. But it’s also easy to find the value (and prove it works!) using a fact about the maximum function: for real numbers x, y, and z, x≤y⌈z is the same as (x≤y) ∨ (x≤z). One of the advantages of treating inequalities as functions like any other is that formulas like these are easy to manipulate. Here we go:

Start with the formula for tolerant inequality, writing q in place of ⎕CT.

(a-B) ≤ q × 0⌈a⌈-B

Distributing over , we have

(a-B) ≤ 0 ⌈ (q×a) ⌈ (-q×B)

and using (x≤y⌈z) ≡ ((x≤y) ∨ (x≤z)) twice (just like in the previous step, this can be thought of as distributing x≤ over three arguments of an associative function all at once), we can expand this formula to

((a-B) ≤ 0) ∨ ((a-B) ≤ q×a) ∨ ((a-B) ≤ -q×B)

which simplifies to

(a ≤ B) ∨ ((a×1-q) ≤ B) ∨ (a ≤ B×1-q)

or (here we need q<1, so that 1-q is positive; otherwise we’d have to reverse the middle inequality)

(a ≤ B) ∨ (a ≤ B÷1-q) ∨ (a ≤ B×1-q)

Recombining, we get

a  ≤  B ⌈ (B÷1-q) ⌈ (B×1-q)

So that’s our formula: now the tolerant equality is an intolerant equality, where the right side depends only on B!

Reading the signs

So we have a formula for a “tolerated” value b = B ⌈ (B÷1-q) ⌈ (B×1-q). When do the different possible maxima come into play?

  • If B=0, then everything on the right is zero, and b=0.
  • If B>0, then B×1-q is smaller than B, since q>0. But that means ÷1-q is larger than 1, and B÷1-q is larger than B. So b=B÷1-q.
  • If B<0, then the all the inequalities that we got with B>0 are reversed. This means b=B×1-q.

Note that none of these cases depend on the branch b=B, even though the first one touches it. That means the definition of tolerant inequality would be the same even if we remove that branch. By working backwards through the above derivation we find that the initial 0⌈ has no effect, and we can remove it from the definition.

Equality

Tolerant equality obviously can’t be reduced to intolerant equality, since many values can be tolerantly equal to a number but only one is intolerantly equal to it. But recalling that (a≂b) ≡ (a≲b)∧(a≳b), we can express it as two intolerant inequalities: tolerate b for as above. Then tolerate it for (this always yields a value at least as large as the first). a is tolerantly equal to b if it lies between the two tolerated values.

This case is particularly important for the set functions ⍳∊∩∪~, which by definition use tolerant equality to match values in one argument to values in the other. If the non-principal argument (the one containing values to be looked up, like the right argument to ) is short enough, then each element in it is looked up using a simple linear search through the principal argument. This means that one number at a time is compared to the entire principal argument (up to the first match found). So for floats we can get a substantial improvement by tolerating each element on the right before comparison.

Caution!

That was the easy part.

In this post we’ve been assuming that floating-point numbers behave like mathematical real numbers. Given that the whole reason tolerant comparison was introduced is that they don’t, maybe that wasn’t the best assumption.

The formulas B÷1-q and B×1-q for tolerated comparison are flawed: they produce values that differ slightly from the actual number that should be used in an intolerant equivalent to tolerant comparison. This is very dangerous: many programs could break if two values compare equal in one comparison but not in another! In the next post we’ll dive into the world of floating-point error analysis and find out how to get the right values. And we’ll discover that imperfect computation isn’t all bad when we find a faster alternative to a very slow floating-point division.

Progressive Index-Of

⎕io=0 is assumed throughout.

A recent Forum post motivated investigations into the progressive index-of functions in the FinnAPL Idiom Library:

pix  ← {((⍴⍺)⍴⍋⍋⍺⍳⍺,⍵) ⍳ ((⍴⍵)⍴⍋⍋⍺⍳⍵,⍺)}   ⍝ FinnAPL Idiom 1
pixa ← {((⍋⍺⍳⍺,⍵)⍳⍳⍴⍺) ⍳ ((⍋⍺⍳⍵,⍺)⍳⍳⍴⍵)}   ⍝ FinnAPL Idiom 5

In this note, we:

  • explain what is progressive index-of
  • explain why the two functions work
  • investigate the performance of the two functions
  • provide a more general solution

Progressive Index-Of

Progressive index-of is like index-of () except that each find “uses up” the target of that find. There are no duplicates in the result with the possible exception of ≢⍺ (for “not found”). Thus:

      x←'mississippi'
      y←'dismiss'

      x pix y
11 1 2 0 4 3 5

The following chart illustrates a step-by-step derivation of each progressive index:

0 1 2 3 4 5 6 7 8 9 10

m i s s i s s i p p  i      d i s m i s s
                           11
m i s s i s s i p p  i      d i s m i s s
                           11 1
m i s s i s s i p p  i      d i s m i s s
                           11 1 2
m i s s i s s i p p  i      d i s m i s s
                           11 1 2 0
m i s s i s s i p p  i      d i s m i s s
                           11 1 2 0 4
m i s s i s s i p p  i      d i s m i s s
                           11 1 2 0 4 3
m i s s i s s i p p  i      d i s m i s s
                           11 1 2 0 4 3 5

It is possible to compute the progressive index without looping or recursion, as the two FinnAPL functions demonstrate.

Why It Works

The basic idea of ⍺ pix ⍵ is to substitute for each item of and an equivalent representative, collectively c and d, whence the result obtains as c⍳d. The equivalent representative used here is ranking, specifically the ranking of the indices in .

The ranking of an array is a permutation of order ≢⍵. The smallest major cell is assigned 0; the next smallest is assigned 1; and so on. Ties are resolved by favoring the earlier-occurring cell. The ranking can be computed by ⍋⍋⍵. For example:

      x ⍪ ⍉⍪ ⍋⍋x
m i s s i s  s i p p i
4 0 7 8 1 9 10 2 5 6 3

      y ⍪ ⍉⍪ ⍋⍋y
d i s m i s s
0 1 4 3 2 5 6

⍺ pix ⍵ works on two different rankings of indices in :

⍋⍋⍺⍳⍺,⍵    rankings of indices in of and , favoring
⍋⍋⍺⍳⍵,⍺    rankings of indices in of and , favoring

The first ⍴⍺ items of the former are those for and the first ⍴⍵ of the latter are those for , and we get

pix ← {((⍴⍺)⍴⍋⍋⍺⍳⍺,⍵) ⍳ ((⍴⍵)⍴⍋⍋⍺⍳⍵,⍺)}

The second version depends on the following properties of permutations. Let p be a permutation. Then p[⍋p] ←→ ⍳≢p, the identity permutation, and therefore ⍋p is the inverse of p. Furthermore, p[p⍳⍳≢p] ←→ ⍳≢p and so p⍳⍳≢p is also the inverse of p. The inverse is unique (that’s why it’s called the inverse), therefore ⍋p ←→ p⍳⍳≢p.

      p←97?97         ⍝ a random permutation

      p[⍋p]    ≡ ⍳≢p
1
      p[p⍳⍳≢p] ≡ ⍳≢p
1
      (⍋p)     ≡ p⍳⍳≢p
1

The two rankings are permutations (because the leftmost functions are ) and we just need the first ⍴⍺ items of the former and the first ⍴⍵ items of the latter. Thus:

pixa ← {((⍋⍺⍳⍺,⍵)⍳⍳⍴⍺) ⍳ ((⍋⍺⍳⍵,⍺)⍳⍳⍴⍵)}

Performance

We note that both versions of pix contain the expressions ⍺⍳⍺,⍵ and ⍺⍳⍵,⍺, but the latter is just a rotation of the former. Thus:

pixb ← {i←⍺⍳⍺,⍵ ⋄ ((⍴⍺)⍴⍋⍋i) ⍳ ((⍴⍵)⍴⍋⍋(⍴⍺)⌽i)}
pixc ← {i←⍺⍳⍺,⍵ ⋄ ((⍋i)⍳⍳⍴⍺) ⍳ ((⍋(⍴⍺)⌽i)⍳⍳⍴⍵)}

Which is faster? The answer may surprise.

      x←?1e6⍴3e5
      y←?2e5⍴3e5

      cmpx 'x pixb y' 'x pixc y'
  x pixb y → 9.15E¯2 |  0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  x pixc y → 9.21E¯2 |  0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

A few factors about the Dyalog APL interpreter are relevant to this performance:

  • Computing ⍺⍳⍵,⍺ as a rotation of an already computed i←⍺⍳⍺,⍵ produces a worthwhile speed-up, although only on a relatively small part of the overall computation.
          i←x⍳x,y
          cmpx '(⍴x)⌽i' 'x⍳y,x'
      (⍴x)⌽i → 5.00E¯4 |     0% ⎕⎕                            
      x⍳y,x  → 7.19E¯3 | +1337% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
    
  • Both and have special code for small range data.
          s←?1e6⍴5e5           ⍝ small range
          t←s ⋄ t[t⍳⌈/t]←2e9   ⍝ large range
    
          cmpx 's⍳s' 't⍳t'
      s⍳s → 5.87E¯3 |    0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕                     
      t⍳t → 2.00E¯2 | +240% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
    
          cmpx '⍋s' '⍋t'
      ⍋s → 3.25E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕     
      ⍋t → 3.84E¯2 | +18% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
    
  • ⍋⍵ has special code when is a permutation.
          p←1e6?1e6           ⍝ p is a permutation
          q←p ⋄ q[999999]←⊃q  ⍝ q is not; both are small-range
    
          cmpx '⍋p' '⍋q'
      ⍋p → 5.81E¯3 |    0% ⎕⎕⎕                           
    * ⍋q → 5.71E¯2 | +882% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
    
  • We saw previously that if p is a permutation then ⍋p ←→ p⍳⍳⍴p. The special code for ⍋p makes the two expressions run at roughly the same speed. The slight advantage for ⍋⍋x versus (⍋x)⍳⍳⍴x would increase if and when ⍋⍋ is recognized as an idiom.
          cmpx '⍋p' 'p⍳⍳⍴p'
      ⍋p    → 6.02E¯3 |  0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕   
      p⍳⍳⍴p → 6.57E¯3 | +9% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
    
          cmpx '⍋⍋x' '(⍋x)⍳⍳⍴x'
      ⍋⍋x      → 3.16E¯2 |  0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ 
      (⍋x)⍳⍳⍴x → 3.25E¯2 | +2% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
    
    

A General Solution

Index-of works on cells rather than just scalars. Likewise, progressive index-of can also be extended to work on cells. The core algorithm remains the same. The generalization obtains by first reshaping to have the same rank as (having major cells with the same shape), applying the core algorithm, and then reshaping its result to have the same leading shape as the original . Thus:

pixd←{
  m←≢⍺
  r←0⌊1-⍴⍴⍺
  n←×/r↓⍴⍵
  i←⍺⍳⍺⍪(n,1↓⍴⍺)⍴⍵
  (r↓⍴⍵) ⍴ ((⍋i)⍳⍳m) ⍳ ((⍋m⌽i)⍳⍳n)
}

   xx              yy
mmmm            dddd
iiii            iiii
ssss            ssss
ssss            mmmm
iiii            iiii
ssss            ssss
ssss            ssss
iiii     
pppp     
pppp                                  x
iiii                               mississippi

   ⍴xx             ⍴yy                y
11 4            7 4                dismiss

   xx pixd yy                         x pixd y
11 1 2 0 4 3 5                     11 1 2 0 4 3 5

   xx pixd 3 5 4⍴yy                   x pixd 3 5⍴y
11  1  2  0  4                     11  1  2  0  4
 3  5 11  7  6                      3  5 11  7  6
11 10 11 11 11                     11 10 11 11 11

Postscript
After having written the above, I discovered an alternative exposition on progressive index-of by Bob Smith entitled Anatomy of an Idiom. Adám Brudzewsky has produced a Stack Exchange lesson and a Jupyter Notebook based on Smith’s text.

There is also an exposition in J on the same topic, with a more verbose but easier-to-understand derivation.

2018 APL Problem Solving Competition: Phase I Problems Sample Solutions

The following are my attempts at the Phase I problems of the 2018 APL Problem Solving Competition. There are not necessarily “right answers” as personal style and taste come into play. More explanation of the code is provided here than common practice. All solutions pass all the tests specified in the official problem description.

The solutions for problems 1 and 3 are due to Brian Becker, judge and supremo of the competition. They are better than the ones I had originally.

1. Oh Say Can You See?

Given a vector or scalar of skyscraper heights, compute the number of skyscrapers which can be seen from the left. A skyscraper hides one further to the right of equal or lesser height.

   visible ← {≢∪⌈\⍵}

Proceeding from left to right, each new maximum obscures subsequent equal or lesser values. The answer is the number of unique maxima. A tacit equivalent is ≢∘∪∘(⌈\).

2. Number Splitting

Split a non-negative real number into the integer and fractional parts.

   split ← 0 1∘⊤

The function ⍺⊤⍵ encodes the number in the number system specified by numeric vector (the bases). For example, 24 60 60⊤sec expresses sec in hours, minutes, and seconds. Such expression obtains by repeated application of the process, starting from the right of : the next “digit” is the remainder of the number on division by a base, and the quotient of the division feeds into the division by the next base. Therefore, 0 1⊤⍵ divides by 1; the remainder of that division is the requisite fractional part and the quotient the integer part. That integer part is further divided by the next base, 0. In APL, remaindering by 0 is defined to be the identity function.

You can have a good argue about the philosophy (theology?) of division by 0, but the APL definition in the context of ⍺⊤⍵ gives practically useful results: A 0 in essentially says, whatever is left. For example, 0 24 60 60⊤sec expresses sec as days/hours/minutes/seconds.

3. Rolling Along

Given an integer vector or scalar representing a set of dice, produce a histogram of the possible totals that can be rolled.

   roll ← {{⍺('*'⍴⍨≢⍵)}⌸,+/¨⍳⍵}

   roll 5 3 4
3  *
4  ***
5  ******
6  *********
7  ***********
8  ***********
9  *********
10 ******
11 ***
12 *

⍳⍵ produces all possible rolls for the set of dice (the cartesian product of ⍳¨⍵) whence further application of +/¨ and then , produce a vector of all possible totals. With the vector of possible totals in hand, the required unique totals and corresponding histogram of the number of occurrences obtain readily with the key operator . (And rather messy without the key operator.) For each unique total, the operand function {⍺('*'⍴⍨≢⍵)} produces that total and a vector of * with the required number of repetitions.

The problem statement does not require it, but it would be nice if the totals are listed in increasing order. At first, I’d thought that the totals would need to be explicitly sorted to make that happen, but on further reflection realized that the unique elements of ,+/¨⍳⍵ (when produced by taking the first occurrence) are guaranteed to be sorted.

4. What’s Your Sign?

Find the Chinese zodiac sign for a given year.

   zodiac_zh ← {(1+12|⍵+0>⍵) ⊃ ' '(≠⊆⊢)' Monkey Rooster Dog Pig Rat Ox Tiger Rabbit Dragon Snake Horse Goat'}

Since the zodiac signs are assigned in cycles of 12, the phrase 12| plays a key role in the solution. The residue (remainder) function | is inherently and necessarily 0-origin; the 1+ accounts for the 1-origin indexing required by the competition. Adding 0>⍵ overcomes the inconvenient fact that there is no year 0.

Essentials of the computation are brought into sharper relief if each zodiac sign is denoted by a single character:

   zzh ← {(1+12|⍵+0>⍵)⊃'猴雞狗豬鼠牛虎兔龍蛇馬羊'}

The Chinese Unicode characters were found using https://translate.google.com and then copied and pasted into the Dyalog APL session.

5. What’s Your Sign? Revisited

Find the Western zodiac sign for a given month and day.

   zodiac_en←{
     d←12 2⍴ 1 20  2 19  3 21  4 20  5 21  6 21  7 23  8 23  9 23  10 23  11 22  12 22
     s←13⍴' '(≠⊆⊢)' Capricorn Aquarius Pisces Aries Taurus Gemini Cancer Leo Virgo Libra Scorpio Sagittarius'
     (1+d⍸⍵)⊃s
   }

For working with irregular-sized non-overlapping intervals, the pre-eminent function is , interval index. As with the Chinese zodiac, essentials of the computation are brought into sharper relief if each zodiac sign is denoted by a single character:

   zen←{
     d←12 2⍴ 1 20  2 19  3 21  4 20  5 21  6 21  7 23  8 23  9 23  10 23  11 22  12 22
     (1+d⍸⍵)⊃13⍴'♑♒♓♈♉♊♋♌♍♎♏♐'
   }

The single-character signs, Unicode U+2648 to U+2653, were found by Google search and then confirmed by https://www.visibone.com/htmlref/char/cer09800.htm. It is possible that the single-character signs do not display correctly in your browser; the line of code can be expressed alternatively as (1+d⍸⍵)⊃13⍴⎕ucs 9800+12|8+⍳12.

6. What’s Your Angle?

Check that angle brackets are balanced and not nested.

   balanced ← {(∧/c∊0 1)∧0=⊃⌽c←+\1 ¯1 0['<>'⍳⍵]}

In APL, functions take array arguments, and so too indexing takes array arguments, including the indices (the “subscripts”). This fact is exploited to transform the argument string into a vector c of 1 and ¯1 and 0, according to whether a character is < or > or “other”. For the angles to be balanced, the plus scan (the partial sums) of c must be a 0-1 vector whose last element is 0.

The closely related problem where the brackets can be nested (e.g. where the brackets are parentheses), can be solved by checking that c is non-negative, that is, ∧/(×c)∊0 1 (and the last element is 0).

7. Unconditionally Shifty

Shift a boolean vector by , where positive means a right shift and negative means a left shift.

   shift ← {(≢⍵)⍴(-⍺)⌽⍵,(|⍺)⍴0}

The problem solution is facilitated by the rotate function ⍺⌽⍵, where a negative means rotate right and positive means rotate left. Other alternative unguarded code can use or (take or drop) where a negative means take (drop) from the right and positive means from the left.

8. Making a Good Argument

Check whether a numeric left argument to ⍺⍉⍵ is valid.

   dta ← {0::0 ⋄ 1⊣⍺⍉⍵}

This is probably the shortest possible solution: Return 1 if ⍺⍉⍵ executes successfully, otherwise the error is trapped and a 0 is returned. A longer but more insightful solution is as follows:

   dt ← {((≢⍺)=≢⍴⍵) ∧ ∧/⍺∊⍨⍳(≢⍴⍵)⌊(×≢⍺)⌈⌈⌈/⍺}

The first part of the conjunction checks that the length of is the same as the rank of . (Many APL authors would write ⍴⍴⍵; I prefer ≢⍴⍵ because the result is a scalar.) The second part checks the following properties on :

  • all elements are positive
  • the elements (if any) form a dense set of integers (from 1 to ⌈/⍺)
  • a (necessarily incorrect) large element would not cause the to error

The three consecutive ⌈⌈⌈, each interpreted differently by the system, are quite the masterstroke, don’t you think? ☺

9. Earlier, Later, or the Same

Return ¯1, 1, or 0 according to whether a timestamp is earlier than, later than, or simultaneous with another.

   ts ← {⊃0~⍨×⍺-⍵}

The functions works by returning the first non-zero value in the signum of the difference between the arguments, or 0 if all values are zero. A tacit solution of same: ts1 ← ⊃∘(~∘0)∘× - .

10. Anagrammatically Correct

Determine whether two character vectors or scalars are anagrams of each other. Spaces are not significant. The problem statement is silent on this, but I am assuming that upper/lower case is significant.

   anagram ← {g←{{⍵[⍋⍵]}⍵~' '} ⋄ (g ⍺)≡(g ⍵)}

The function works by first converting each argument to a standard form, sorted and without spaces, using the local function g, and then comparing the standard forms. In g, the idiom {⍵[⍋⍵]} sorts a vector and the phrase ⍵~' ' removes spaces and finesses the problem of scalars.

A reasonable tacit solution depends on the over operator , contemplated but not implemented:

     f⍥g ⍵  ←→  f g ⍵
   ⍺ f⍥g ⍵  ←→  (g ⍺) f (g ⍵)

A tacit solution written with over: ≡⍥((⊂∘⍋ ⌷ ⊢)∘(~∘' ')). I myself would prefer the semi-tacit ≡⍥{{⍵[⍋⍵]}⍵~' '}.