News and Recommendations Regarding Dyalog Versions 17.1, 18.0 and 18.1

Background

Dyalog version 18.0 contained the largest number of optimised algorithms in the history of Dyalog APL. Unfortunately, since its release, we have found that our process for review and testing of optimisations was insufficient to cope with the quantity and nature of the optimisations. Many of the new algorithms had edge cases that escaped both the existing regression tests and the new tests that were written as part of development.

In August 2020, after the issues were reported following the release of Dyalog version 18.0, we issued a caution against the use of version 18.0 in production. We added significant additional testing during September and October, and subsequently lifted the caution in November. In April 2021, a user encountered an additional defect leading to incorrect results in a rare case of membership (∊), and in June a crash in an extremely rare case of “where” (monadic ⍸). Since then, our own internal testing has found a defect in interval index (dyadic ⍸).

Current Status

We have made patches available as defects have been detected and fixed, and “Issue 4” of version 18.0, made available to all users on 22 June 2021, includes fixes for all known issues with optimised code.

Download Dyalog version 18.0 Issue 4: Commercial users  Non-commercial users

At this time, we are executing two high priority projects, one aimed at increasing safety in the short term, the other at completely eliminating problematic optimisations within a few months:

  • We have further increased our internal testing of version 18.0, to verify the correctness of the optimisations that are still in the distributed product.
  • We are holding back the release of version 18.1, which was originally scheduled for release this month, while we remove most of the optimisations that went into version 18.0. We hope to release version 18.1 before the end of the 3rd Quarter of 2021, but will not let it out the door until we have complete confidence in this new version.

Recommendations

Dyalog Ltd recommends that users of Dyalog APL take the following action:

  • If you are using version 17.1 or earlier, plan to skip version 18.0 and upgrade directly to version 18.1 when it becomes available.
  • If you are using version 18.0, apply the latest patches or reinstall using Issue 4 when it becomes available. Continue to monitor our DSS e-mail broadcasts and apply patches quickly if we should find and fix any further issues that you think might impact your application. When version 18.1 becomes available, plan to upgrade to it at your earliest convenience.

Support for Versions 17.1 and 18.0

Considering this extraordinary situation, we have decided to extend support for version 17.1 for one additional release cycle. In other words, version 17.1 will be supported until we release the 4th subsequent version of Dyalog APL. Note that we are about to provide an updated version 17.1 installer (“Issue 3”) which collects all updates to version 17.1 since the original release.

Version 18.0 will be supported for the normal number of cycles, that is, until we release the 3rd following version. However, Dyalog Ltd recommends upgrading to version 18.1 as soon as it becomes available and you are able to schedule the upgrade.

Non-Commercial Versions

Given the nature of these defects, it is our intention to update our non-commercial distributions of version 18.0, although this process may lag behind the distribution of patches to clients paying for support by some weeks.

Conclusion

We apologise for the significant inconvenience that we know this is causing for some of you and thank you for your patience. We have made significant changes to our internal procedures regarding risk assessment, verification, and testing requirements for changes to existing primitive functions, to avoid a recurrence.

Over the next few releases, we do expect to re-apply, using our new processes, those optimisations whose performance benefits we believe are significant enough to justify the risk of making changes.

Sincerely,

Morten Kromberg
CTO, Dyalog Ltd.

Highlights of the 2020 Problem Solving Competition – Phase II

With Dyalog’s APL Problem Solving Competition 2021 in full swing, it’s time to highlight some of the excellent solutions that were submitted to last year’s edition.

Stefan Kruger works for IBM making databases. While he tries to learn at least one new programming language a year, he got hooked on APL and participated in the competition. This is his perspective on some solutions that the judges picked out – call it the “Judges’ Pick”, if you like; smart, novel, or otherwise noteworthy solutions that can serve as an inspiration.

This blog post is also available as an interactive Jupyter Notebook document.


By Stefan Kruger

I’ll show a cool solution or two to each Phase II problem and dive into the details of a couple. If you need to refresh your memory of what the problems looked like, there’s a PDF of the Phase II problems.

Oh, and note that at the time of writing there is still plenty of time to take part in the current edition of the competition (and really, who knew bowling was so complicated?) – there are some juicy cash prizes to be won.

Problem 1: Take a Dive (1 task)

Level of Difficulty: Low

So let’s kick off with problem 1. The task was to calculate the score of an Olympic dive, consisting of a technical difficulty rating and a vector containing either 3, 5 or 7 judges’ scores. Only the central three ordered judges’ scores should be considered, which should be summed and multiplied by the technical difficulty rating.

Here is a cunning trick that wasn’t at all obvious:

∇ score←dd DiveScore scores;sorted;cenzored;rotator
  ⍝ 2020 APL Problem Solving Competition Phase II
  ⍝ Problem 1, Task 1 - DiveScore
   
  sorted←{⍵[⍋⍵]}scores
   
  ⍝  0 1 2 rotates score indexes to 123, 23451 or 3456712
  ⍝  So three center values always goes first
  ⍝  51 = (0 1 2∧.= 3 5 7 ∘.|⍳100) ⍳ 1
  rotator←51
 
  cenzored←3↑rotator⌽sorted
  score←⍎2⍕dd+.×cenzored
∇
      2.9 2.6 2.7 DiveScore¨(7 7.5 6.5 8 8 7.5 7)(9.5 8 8.5)(7.5 7 7 8.5 8)
63.8 67.6 60.75

This contestant figured out that if a vector of length 3, 5 or 7 is rotated 51 steps, then the original central three items will always end up at the beginning. No, really. It turns out that 51 is the first number X such that 0 1 2≡3 5 7|X. They tabulated the options and picked the first solution, guessing that it’d be less than 100:

      ⍸0 1 2∧.=3 5 7∘.|⍳100
51
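For readers more at home outside APL, the same search can be cross-checked in a few lines of Python. This is only a sketch: the names `first_rotation` and `dive_score` are made up for illustration and are not part of the contestant's submission, and the rounding to two decimals mirrors the `2⍕` formatting in the listing above.

```python
def first_rotation(lengths=(3, 5, 7), shifts=(0, 1, 2), limit=100):
    """Smallest X in 1..limit with X mod length == shift for every pair."""
    for x in range(1, limit + 1):
        if all(x % n == s for n, s in zip(lengths, shifts)):
            return x
    return None


def dive_score(difficulty, scores, rotation=51):
    """Sort, rotate left by `rotation`, keep the first three, then scale."""
    ordered = sorted(scores)
    k = rotation % len(ordered)          # effective rotation for this length
    rotated = ordered[k:] + ordered[:k]  # central three items now lead
    return round(difficulty * sum(rotated[:3]), 2)


print(first_rotation())                               # 51
print(dive_score(2.9, [7, 7.5, 6.5, 8, 8, 7.5, 7]))   # 63.8
```

Rotating by 51 is the same as rotating by 0, 1 or 2 for lengths 3, 5 and 7 respectively, which is exactly the number of low scores to skip in each case.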

But there is another way – this is one of those situations where the Chinese Remainder Theorem comes in handy, especially since it’s available on APLcart:

      3 5 7 {m|⍵+.×⍺(⊣×⊢|∘⊃{0=⍵:1 0 ⋄ (⍵∇⍵|⍺)+.×0 1,⍪1,-⌊⍺÷⍵})¨⍨⍺÷⍨m←×/⍺} 0 1 2 ⍝ https://aplcart.info?q=chinese
51
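If the APLcart one-liner is hard to read, the textbook construction behind it can be sketched in Python. This is not a transliteration of the APL above; it is the standard Chinese Remainder Theorem recipe, assuming pairwise coprime moduli and Python 3.8 or later (for `math.prod` and the three-argument `pow` with a negative exponent). The name `crt` is illustrative.

```python
from math import prod


def crt(moduli, residues):
    """Return x with x % m == r for each (m, r); moduli must be pairwise coprime."""
    total_modulus = prod(moduli)
    x = 0
    for m, r in zip(moduli, residues):
        others = total_modulus // m           # product of the other moduli
        x += r * others * pow(others, -1, m)  # pow(..., -1, m) is the modular inverse
    return x % total_modulus


print(crt((3, 5, 7), (0, 1, 2)))  # 51
```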

If you figured that out, award yourself a well-deserved pat on the back. For us mortals, we probably all did something rather more pedestrian:

DiveScore ← {
    d ← 2-2÷⍨7-≢⍵       ⍝ How many items should we drop each side?
    ⍺+.×(-d)↓d↓⍵[⍋⍵]
}

Problem 2 – Another Step in the Proper Direction (1 task)

Level of Difficulty: Medium

Problem 2 builds upon Problem 5 from Phase I. In short, we are asked to write a function Steps that takes a two-element vector to the right, defining a start and end value, and an optional left integer argument that tweaks how we generate values from start to end. The complexity here comes from the many combinations of behaviours from what exactly is given as the left argument: integer or float? positive or negative? Also, the range must be inclusive, even if a floating-point step size means that the end point is overshot. I took this on thinking it would be trivial – it wasn’t.

Here’s a great solution that manages to combine this functionality with a call to a single dfn:

∇ steps←{p}Steps fromTo;segments;width
  width ← |-/fromTo
  :If 0=⎕NC'p' ⍝ No left argument: same as Problem 5 of Phase I
      segments ← 0,⍳width
  :ElseIf p>0 ⍝ p is the step size
      segments ← p {⍵⌊⍺×0,⍳⌈⍵÷⍺} width
  :ElseIf p=0 ⍝ As if we took zero step
      segments ← 0
  :EndIf
  ⍝ Take into account the start point and the direction.
  steps ← fromTo {(⊃⍺)+(-×-/⍺)×⍵} segments
∇
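The positive-step branch is the interesting one: it generates multiples of the step size and clamps them to the total width, so the end point is always included. A rough Python rendering of the branches shown in the listing follows; the function name `steps` and the decision to raise an error for a negative step are assumptions of this sketch, since that case is not covered by the listing above.

```python
import math


def steps(start, end, step=None):
    """Inclusive range from start to end, mirroring the branches listed above."""
    width = abs(start - end)
    if step is None:                       # no left argument: unit steps
        segments = list(range(int(width) + 1))
    elif step > 0:                         # step size given: clamp the overshoot
        count = math.ceil(width / step)
        segments = [min(width, step * i) for i in range(count + 1)]
    elif step == 0:                        # zero step: just the start point
        segments = [0]
    else:
        raise NotImplementedError("negative step is not covered by this sketch")
    direction = (end > start) - (end < start)   # +1, 0 or -1
    return [start + direction * s for s in segments]


print(steps(1, 10, 4))   # [1, 5, 9, 10]
print(steps(10, 1, 4))   # [10, 6, 2, 1]
```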

I ended up with something more convoluted, with a few ugly special cases, and shamelessly borrowing from dfns.iotag:

Steps ← {
    range ← {
        r ← ⍺-s×⎕IO-⍳⌊1-(⍺-⊃⍵)÷s←×/1↓⍵,(⍺>⊃⍵)/¯1 ⍝ "inspired" by dfns.iotag
        (⊃⍵)≠⊃⊖r: r,⊃⍵ ⋄ r   ⍝ Ensure endpoint is included – yeuch :(
    }
    ⍺ ← ⍬
    (b e) ← ⍵
    ⍺≡⍬: b range e        ⍝ No ⍺
    ⍺=0: b                ⍝ Zero step; return start point
    ⍺>0: b range e ⍺      ⍝ Positive ⍺
    len ← (e-b)÷count←⌊-⍺ ⍝ Negative ⍺
    len=0: b/⍨1+count     
    b range e len
}

Problem 3 – Past Tasks Blast (1 task)

Level of Difficulty: Medium

The task here was to scrape the Dyalog APL Problem Solving Competition webpage to extract all links to PDF files. We get the suggestion to use either Dyalog’s HttpCommand or shell out to a system mechanism for fetching a web page.

To use HttpCommand, we first need to load it:

      ]load HttpCommand
#.HttpCommand

Here’s a slightly tweaked competition submission, showing great flair in how to process XML:

PastTasks ← {
    url ← ⍵
    r ← (HttpCommand.Get url).Data  ⍝ get page contents
    (d n c a t) ← ↓⍉⎕XML r          ⍝ depth; name; content; attributes; type
    (k v) ← ↓⍉ ⊃⍪/ ((,'a')∘≡¨n)/a   ⍝ extract key-value pairs of <a> elements
    urls ← ('href'∘≡¨k)/v           ⍝ get URLs
    pdfs ← ('.pdf'∘≡¨¯4↑¨urls)/urls ⍝ filter .pdfs
    base ← ⊃⌽⊃('base'∘≡¨n)/a        ⍝ base URL
    base∘,¨pdfs
}

The problem statement suggests that a regex-based solution might be tolerable. Here’s a stab at that approach:

PastTasks ← {
    body ← (HttpCommand.Get ⍵).Data
    pdfs ← '<a href="(.+?\.pdf)"'⎕S'\1'⊢body
    base ← '<base href="(.+?)"'⎕S'\1'⊢body
    base,¨pdfs
}
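To see what the two patterns pick up without fetching anything, here is a small offline sketch in Python using the same regular expressions. The sample page, its URLs and the name `past_tasks` are invented for illustration; the real competition page will differ.

```python
import re

# Hypothetical page fragment, one link per line.
SAMPLE_PAGE = """<html><head>
<base href="https://example.com/competition/" />
</head><body>
<a href="2019/tasks.pdf">2019 tasks</a>
<a href="about.html">About</a>
<a href='2018/tasks.pdf'>2018 tasks</a>
</body></html>"""


def past_tasks(body):
    """Collect .pdf links and prefix them with the page's base URL."""
    pdfs = re.findall(r'<a href="(.+?\.pdf)"', body)
    base = re.findall(r'<base href="(.+?)"', body)[0]
    return [base + pdf for pdf in pdfs]


print(past_tasks(SAMPLE_PAGE))
# ['https://example.com/competition/2019/tasks.pdf']
```

Note that the single-quoted link is silently missed, which is the kind of markup quirk a real parser would handle for us.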

So which is the “better” solution? Well, the first approach has a number of advantages: firstly, it is much more robust (provided that the web page is valid XHTML, which we are told is a given), meaning that we can abdicate responsibility for dealing with markup quirks (single vs double quotes, whitespace etc) to the built-in ⎕XML system function, and secondly, there is that (in)famous quote from Jamie Zawinski:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. – jwz

Mixing a liberal helping of regular expressions in with APL is perhaps not helping APL’s unfair reputation for being write-only.

However, when dealing with patterns in textual data, as we unquestionably are here, regular expressions – even in a powerful language like APL – are sharp tools that are hard to beat, and any programmer worth their salt owes it to themselves to master them. In the case above, had the data not neatly been parseable as XML, it would have been more awkward to solve a problem like this relying only on APL primitives.

Problem 4 – Bioinformatics (2 tasks)

Level of Difficulty: Medium

The two tasks making up Problem 4, revp and sset, are borrowed from Project Rosalind, which is a Bioinformatics problem collection that often has great APL affinity. We are also given a hint that one benefits from understanding modular multiplication, as this isn’t built into Dyalog APL.

Here is a great example:

revp ← {                    ⍝ r ← revp dna
   dnaNum ← 'ACGT'⍳⍵        ⍝ Convert to 1..4 so that A+T = C+G = 5
   FindRevp ← {             ⍝ Given chunk size, extract positions and build the output format
       chunks ← ⍵,/dnaNum
       isRevp ← (⊢≡5-⌽)¨chunks
       ⍵,⍨⍪⍸isRevp
   }
   ⊃⍪/FindRevp¨4 6 8 10 12  ⍝ Test against all chunk sizes and collect results
}
sset ← {          ⍝ r←sset n
   bin ← 2⊥⍣¯1⊢⍵  ⍝ Binary digits
   arr ← ⌽2*bin   ⍝ Repeated squaring: Starting from MSB and 1, square ⍵, multiply ⍺, modulo m
   mod ← 1000000
   {mod|⍺×⍵*2}/arr,1
}
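Both functions translate quite directly into a scalar language, which can help when checking one's understanding. The Python below is a sketch under the same assumptions as the APL (uppercase ACGT input, chunk sizes 4 to 12, modulus 1000000); it reuses the names `revp` and `sset` purely for easy comparison.

```python
def revp(dna, sizes=(4, 6, 8, 10, 12)):
    """(position, length) of every chunk equal to its own reverse complement."""
    complement = str.maketrans("ACGT", "TGCA")
    hits = []
    for size in sizes:
        for start in range(len(dna) - size + 1):
            chunk = dna[start:start + size]
            if chunk == chunk.translate(complement)[::-1]:
                hits.append((start + 1, size))     # 1-based positions
    return hits


def sset(n, mod=1_000_000):
    """2**n modulo mod by repeated squaring, most significant bit first."""
    acc = 1
    for bit in bin(n)[2:]:
        acc = acc * acc % mod          # square for every bit
        if bit == "1":
            acc = acc * 2 % mod        # multiply in a 2 where the bit is set
    return acc


print(revp("TCAATGCATGCGGGTCTATATGCAT")[:3])  # [(5, 4), (7, 4), (17, 4)]
print(sset(3))                                 # 8
```

Reducing modulo 1000000 after every multiplication keeps the intermediate values small, which is the point of the hint about modular multiplication.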

This contestant also saw fit to include their test suite; a nice touch! Roger Hui’s version of assert has become the de facto standard, and the contestant puts it to good use:

Assert ← {⍺←'assertion failure' ⋄ 0∊⍵:⍺ ⎕SIGNAL 8 ⋄ shy←0} ⍝ Roger Hui's Assert
RevpTest ← {
   s ← 'TCAATGCATGCGGGTCTATATGCAT'
   ans ← revp s
   Assert 8 2≡⍴ans:
   Assert 5 4 7 4 17 4 18 4 21 4 4 6 6 6 20 6≡∊ans:
  
   header ← 'Contest2020/Data/'  ⍝ Change as needed
   data1 ← ∊1↓⊃⎕NGET (header,'rosalind_revp_1_dataset.txt') 1
   ans1 ← ↑⍎¨⊃⎕NGET (header,'rosalind_revp_1_output.txt') 1
   data2 ← ∊1↓⊃⎕NGET (header,'rosalind_revp_2_dataset.txt') 1
   ans2 ← ↑⍎¨⊃⎕NGET (header,'rosalind_revp_2_output.txt') 1
   Assert ans1 ≡ revp data1:
   Assert ans2 ≡ revp data2:
   'Test passed'
}
SsetTest ← {
   Assert 8 = sset 3:
   Assert 551872 = sset 857:
   Assert 935424 = sset 870:
   'Test passed'
}

Problem 5 – Future and Present Value (2 tasks)

Level of Difficulty: Medium

Problem 5 is some hedge fund maths, or something where my eyes glazed over before I fully understood the ask. What is this, K‽

This solution is impressively compact – I removed the comments to highlight the APL artistry on display: no less than three scans, count ’em!

rr ← {AR×+\⍺÷AR←×\1+⍵} 
pv ← {+/⍺÷×\1+⍵}

Here’s how the competitor outlined how their solution works:

This can be calculated elegantly with the following operations:

  1. Find the accumulated interest rate (AR) for each term (AR←×\1+⍵).
  2. Deprecate the cashflow amounts by dividing them by AR. This finds the present value of all the amounts.
  3. Accumulate all the present values of the amounts to find the total present value at each term.
  4. Multiply by AR to find future values at each term.

This way the money that was invested or withdrawn in a term is not changed for that term, but the money that came from the previous terms is multiplied by the current interest rate for each term arriving to the correct recurrent relation:

Step 2) amounts[i]/AR[i] ⍝ ≡ PV[i]
Step 3) amounts[i]/AR[i] + APV[i-1]
Step 4) amounts[i] + APV[i-1]×AR[i]
amounts[i] + APV[i-1]×AR[i-1]×(1+rate[i])
amounts[i] + r[i-1]×(1+rate[i]) ⍝ ≡ r[i]
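The equivalence between the scan formulation and the recurrence can be checked numerically. In this Python sketch, `rr` and `pv` follow the four steps above using `itertools.accumulate`, while `rr_loop` (a name introduced here) applies the recurrence r[i] = amounts[i] + r[i-1]×(1+rate[i]) directly; the sample figures are made up.

```python
from itertools import accumulate
from operator import mul


def rr(amounts, rates):
    """Future value at each term: AR × running sum of amounts ÷ AR."""
    ar = list(accumulate((1 + r for r in rates), mul))     # accumulated rate
    running_pv = accumulate(a / f for a, f in zip(amounts, ar))
    return [f * p for f, p in zip(ar, running_pv)]


def pv(amounts, rates):
    """Total present value of the cashflow."""
    ar = accumulate((1 + r for r in rates), mul)
    return sum(a / f for a, f in zip(amounts, ar))


def rr_loop(amounts, rates):
    """The same result via the recurrence, one term at a time."""
    out, balance = [], 0.0
    for a, r in zip(amounts, rates):
        balance = a + balance * (1 + r)
        out.append(balance)
    return out


amounts, rates = [100, 100, -50], [0.0, 0.1, 0.05]   # hypothetical cashflow
print(rr(amounts, rates))
print(rr_loop(amounts, rates))
```

The two lists agree up to floating-point rounding.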

Problem 6 – Merge (1 task)

Level of Difficulty: Medium

Mail merge – gotta love it. Your spam folder is full of bad examples of this: “Dear $FIRSTNAME, do you want to purchase a bridge?” We’re given a template file with patterns such as @firstname@ which are to be replaced with values stored in a JSON file. Here’s a smart approach from a competitor who knows their way around the @ operator:

Merge ← {
   templateFile ← ⍺
   jsonFile ← ⍵
   template ← ⊃⎕NGET templateFile
   ns ← ⎕JSON⊃⎕NGET jsonFile

   getValue ← {
       0=⍴⍵:,'@'   ⍝ '@@'         → ,'@'
       6::'???'    ⍝ ~⍵∊ns.⎕NL ¯2 → '???'
       ⍕ns⍎⍵       ⍝  ⍵∊ns.⎕NL ¯2 → ⍕ns.⍵
   }
   ∊getValue¨@(⍴⍴1 0⍨)'@'(1↓¨=⊂⊢)template
}

The key insight here is that since each template starts and ends with the same marker, we can partition the data on sections beginning with @ and then we’ll have a vector where every other element is a template to be substituted. Here’s an example of this:

      ↑('@'(1↓¨=⊂⊢) '@title@ @firstname@ @lastname@, would you be interested in the Brooklyn Bridge?') (1 0 1 0 1 0)
┌─────┬─┬─────────┬─┬────────┬─────────────────────────────────────────────────┐
│title│ │firstname│ │lastname│, would you be interested in the Brooklyn Bridge?│
├─────┼─┼─────────┼─┼────────┼─────────────────────────────────────────────────┤
│1    │0│1        │0│1       │0                                                │
└─────┴─┴─────────┴─┴────────┴─────────────────────────────────────────────────┘

I added the second row for clarity to show the alternating templates. Cool, huh? However, this only works correctly if the data leads with a template. Consider:

      '@'(1↓¨=⊂⊢) 'Dear @firstname@ @lastname@, or maybe the Golden Gate?'
┌─────────┬─┬────────┬───────────────────────────┐
│firstname│ │lastname│, or maybe the Golden Gate?│
└─────────┴─┴────────┴───────────────────────────┘

We still have the alternating templates, but the prefix (Dear ) is lost. We can tweak the Merge function a bit to cater for this if we need to:

Merge ← {
    templateFile ← ⍺
    jsonFile ← ⍵
    template ← ⊃⎕NGET templateFile
    ns ← ⎕JSON⊃⎕NGET jsonFile
    first ← template⍳'@'
    first>≢template: template          ⍝ No templates at all
    prefix ← (first-1)↑template        ⍝ Anything preceding the first '@'?

    getValue ← {
        0=⍴⍵:,'@'   ⍝ '@@'         → ,'@'
        6::'???'    ⍝ ~⍵∊ns.⎕NL ¯2 → '???'
        ⍕ns⍎⍵       ⍝  ⍵∊ns.⎕NL ¯2 → ⍕ns.⍵
    }
    ∊prefix,getValue¨@(⍴⍴1 0⍨)'@'(1↓¨=⊂⊢)template
}
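The "every other piece is a key" idea carries over to other languages too. As a sketch, Python's `str.split` keeps the text before the first marker, so no separate prefix handling is needed there. The function name `merge`, the use of a plain dictionary in place of the JSON namespace, and the sample data are assumptions of this example; balanced `@` markers are assumed throughout.

```python
def merge(template, values):
    """Replace @key@ with values[key]; '@@' becomes '@', unknown keys '???'."""
    pieces = template.split("@")
    out = []
    for i, piece in enumerate(pieces):
        if i % 2 == 0:
            out.append(piece)                          # literal text
        elif piece == "":
            out.append("@")                            # '@@' is an escaped '@'
        else:
            out.append(str(values.get(piece, "???")))  # key lookup
    return "".join(out)


person = {"firstname": "Ann", "lastname": "Lee"}
print(merge("Dear @firstname@ @lastname@, or maybe the Golden Gate?", person))
# Dear Ann Lee, or maybe the Golden Gate?
```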

Now, the competition is pitched such that “proper array solutions” are preferred – and for good reasons, most of the time. However, it’s hard to overlook some industrial regex action in this case. Strictly for Perl-fans:

Merge ← {
    mrg ← ⎕JSON⊃⎕NGET ⍵
    keys ← mrg.⎕NL¯2
    vals ← mrg.⍎¨keys

    ('@',¨(keys,'' '[^@]+'),¨'@')⎕R((⍕¨vals),'@' '???')⊃⎕NGET ⍺
}

Problem 7 – UPC (3 tasks)

Level of Difficulty: Medium

Problem 7 had us learning more about bar codes than we ever thought necessary. Read them, write them, verify them, scan them – forwards and backwards no less. Good scope for stretching your array muscles on this one. The eagle-eyed amongst you may have spotted that the verification aspect is a simplified version of Luhn’s algorithm, which a certain Morten Kromberg used to illustrate APL’s array capabilities at JIO a while back.

Here’s a good solution:

CheckDigit ← (10|∘-+.×∘(11⍴3 1))          ⍝ Computes the check digit for a UPC-A barcode.

UPCRD ← 114 102 108 66 92 78 80 68 72 116 ⍝ Right digits of a UPC-A barcode, base 10.
bUPCRD ← ⍉2∘⊥⍣¯1⊢UPCRD                    ⍝ Bit matrix with one right digit per row.
WriteUPC ← {
   ⍝ Writes the bits of a UPC-A barcode.  
   ~((11∘=≢)∧(∧/0∘≤∧≤∘9))⍵: ¯1            ⍝ Check for simple errors
   b ← bUPCRD[⍵,CheckDigit ⍵;]  
   1 0 1, (,~6↑b), 0 1 0 1 0, (,6↓b), 1 0 1 
}
ReadUPC ← {
   ⍝ Reads a UPC-A barcode into its digits.
   ~(∧/0∘≤∧≤∘1)⍵: ¯1                 ⍝ Input isn't a bit vector
   95≠≢⍵: ¯1                         ⍝ Number of bits must be 95
   (b l m r e) ← ⍵ ⊂⍨ (∊¯1∘↓,⌽) (3↑1)(42↑1)(5↑1)
   
   b ∨⍥(≢∘1 0 1) e: ¯1               ⍝ Wrong patterns for the guards
   m≢0 1 0 1 0: ¯1
   bits ← ↓12 7⍴ l,r
   C ← (↓bUPCRD)∘⍳ ~@(⍳6)            ⍝ Convert bits to digits
   tf ← ~∧/10 > nums ← C bits        ⍝ Should we try flipping the bits?
   nums ← (nums×1-tf) + tf×C⌽↓⌽↑bits
   ∨/10=nums: ¯1                     ⍝ Bits simply aren't right
   (¯1↑nums)≠CheckDigit 11↑nums: ¯1  ⍝ Bad check digit
   nums
}
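The check digit and the writing step are easy to try out by hand. The Python sketch below follows the same scheme as `CheckDigit` and `WriteUPC`: weights 3 1 3 1 …, right-hand digit patterns taken from `UPCRD`, and inverted patterns for the left half. The function names are illustrative, and the sample digits are just an example.

```python
RIGHT_PATTERNS = [114, 102, 108, 66, 92, 78, 80, 68, 72, 116]  # as UPCRD above


def check_digit(digits):
    """Check digit for the first 11 digits of a UPC-A code."""
    weights = [3, 1] * 5 + [3]
    return -sum(d * w for d, w in zip(digits, weights)) % 10


def write_upc(digits):
    """95 bits of a UPC-A barcode, or -1 for obviously bad input."""
    if len(digits) != 11 or any(not 0 <= d <= 9 for d in digits):
        return -1
    full = list(digits) + [check_digit(digits)]
    rows = [[int(b) for b in format(RIGHT_PATTERNS[d], "07b")] for d in full]
    left = [1 - b for row in rows[:6] for b in row]    # left half is inverted
    right = [b for row in rows[6:] for b in row]
    return [1, 0, 1] + left + [0, 1, 0, 1, 0] + right + [1, 0, 1]


sample = [0, 3, 6, 0, 0, 0, 2, 9, 1, 4, 5]
print(check_digit(sample))       # 2
print(len(write_upc(sample)))    # 95
```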

Problem 8 – Balancing the Scales (1 task)

Level of Difficulty: Hard

Our task is to partition a set of numbers into two groups of equal sum if this is possible, or return ⍬ (an empty vector) if not. This is a well-known NP-complete problem called The Partition Problem and, as such, has no known polynomial-time exact solution. The problem statement indicates that we only need to consider a set of 20 numbers or fewer, which is a bit of a hint on what kind of solution is expected.

This problem, in common with many other NP problems, also has a plethora of interesting heuristic solutions: polynomial algorithms that whilst not guaranteed to always find the optimal solution will either get close, or be correct for a significant subset of the problem domain in a fraction of the time the exact algorithms would take.

However, it’s clear that Dyalog expects us to give an exact solution, and has given us an upper bound on the input data length. Finally, we’re offered the cryptic advice that

Understanding the nuances of the problem is the key to developing a good algorithm.

Yes, thank you, master Yoda.

Here’s a great, efficient solution:

Balance←{
   sum←1⊥⍵
   2|sum: ⍬   ⍝ Lists with an odd sum cannot be split into equal parts.
   halfsum←sum÷2
  
   ⍝ A partitioning method based on the algorithm by Horowitz and Sahni.
   ⍝ The basic idea of the algorithm is to split the input into two parts,
   ⍝ and then generate all subset sums for these parts. Then the problem
   ⍝ becomes finding a sum of two subset sums from different parts
   ⍝ equal to the desired value. Instead of sorting the sums and comparing
   ⍝ them like in the original algorithm, standard APL searching primitives
   ⍝ ∊ and ⍳ are used. Another key idea is to generate the subset sums
   ⍝ in a specific order, so that the nth subset sum in the vectors a and b
   ⍝ is the sum of the elements chosen by the binary representation of n.
   ⍝ This means that we can get the elements of the solution sum
   ⍝ without having to generate anything but the sums.
   horowitzsahni←{
       s←⍵(↑{⍺⍵}↓)⍨⌊2÷⍨≢⍵                          ⍝ Split the input.
       a b←⊃¨(⊢,+)/¨s,¨0                           ⍝ Generate the subset sums.
       indexes←a {(⊢,⍵⍳⍺⌷⍨(≢⍺)⌊⊢)1⍳⍨⍺∊⍵} halfsum-b ⍝ Search for solution indexes.
       indexes[2]>≢b: ⍬
       ⍵ {(⍺/⍨~⍵)(⍵/⍺)} ∊(2⍴¨⍨≢¨s)⊤¨indexes-1      ⍝ Get the solution from the indexes.
   }
  
   ⍝ A simple exhaustive search. It uses the same binary representation
   ⍝ idea as the horowitzsahni function.
   exhaustive←{
       i←halfsum⍳⍨⊃(⊢,+)/⍵,0
       i>2*≢⍵: ⍬
       ⍵ {(⍺/⍨~⍵)(⍵/⍺)} (2⍴⍨≢⍵)⊤i-1
   }

   ⍝ The exhaustive method performs better than the Horowitz-Sahni method
   ⍝ for small input sizes. 14 seems to be a reasonable cutoff point.
   14>≢⍵: exhaustive ⍵
   horowitzsahni ⍵
}

There are a number of clever touches here – there are actually two different solutions, an exhaustive search and an implementation of the algorithm due to Horowitz and Sahni, which, although still exponential, is known to be one of the fastest for certain subsets and input sizes. A switch based on input size checks for the crossover point and chooses the fastest option. And this is fast – five times faster than that of the Grand Prize winner, and four orders of magnitude faster than the slowest solution.
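The split-in-two idea can be sketched in Python as well. The version below is a simplified take on the same approach, not a port of the contestant's code: it assumes integer inputs, uses a dictionary lookup where the APL uses ∊ and ⍳, and the names `subset_sums` and `balance` are illustrative. As in the APL, subset sums are generated so that the binary representation of an index says which elements were chosen.

```python
def subset_sums(values):
    """Sums of all subsets; bit k of the index says whether values[k] is in."""
    sums = [0]
    for v in values:
        sums = sums + [s + v for s in sums]
    return sums


def balance(numbers):
    """Split into two groups of equal sum, or return None if impossible."""
    total = sum(numbers)
    if total % 2:
        return None                            # odd totals cannot be halved
    half = total // 2
    mid = len(numbers) // 2
    left, right = numbers[:mid], numbers[mid:]
    a, b = subset_sums(left), subset_sums(right)
    index_in_b = {}
    for j, s in enumerate(b):
        index_in_b.setdefault(s, j)            # remember first index per sum
    for i, s in enumerate(a):
        j = index_in_b.get(half - s)
        if j is not None:                      # a[i] + b[j] hits the target
            chosen = [bool(i >> k & 1) for k in range(len(left))]
            chosen += [bool(j >> k & 1) for k in range(len(right))]
            picked = [x for x, c in zip(numbers, chosen) if c]
            rest = [x for x, c in zip(numbers, chosen) if not c]
            return rest, picked
    return None


d = [10, 81, 98, 27, 28, 5, 1, 46, 63, 99, 25, 39, 84, 87, 76, 85, 78, 64, 41, 93]
one, two = balance(d)
print(sum(one), sum(two))   # 565 565
```

For 20 numbers this examines two tables of 1024 sums each rather than about a million bit patterns, which is where the speed comes from.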

Such a performance spread is intriguing, so there are clearly lessons to be learned here. When I tried this problem, I ended up with a pretty straight-forward (a.k.a. naive) brute force search:

Balance ← {⎕IO←0
    total ← +/⍵
    2|total: ⍬             ⍝ Sum must be divisible by 2
    psum ← total÷2         ⍝ Our target partition sum
    bitp ← ⍉2∘⊥⍣¯1⍳2*≢⍵    ⍝ All possible bit patterns up to ≢⍵
    idx ← ⍸<\psum=bitp+.×⍵ ⍝ First index of partition sum = target
    ⍬≡idx: ⍬               ⍝ If we have no 1s, there is no solution
    part ← idx⌷bitp        ⍝ Partition corresponding to solution index
    (part/⍵)(⍵/⍨~part)     ⍝ Compress input by solution pattern and inverse
}

If you come to APL from a scalar language, that approach must seem incredibly wasteful: make all bit patterns. Try all sums. Search for the right one, if it exists. But as it turns out, this is APL home turf advantage. Let’s try to demonstrate this point. If you did this “loop and branch”, you’d iterate over the bit patterns and stop once you find the first solution – in fact, for the test data in the problem specification, the first solution appears at around the 1500th bit pattern if you generate them as I do above. The vector version would need to consider the whole space of around

      ¯1+2*20
1048575

a million or so, so quite a difference. Surely, in this case the scalar approach should be way faster? Only one way to find out. We can make a scalar version in several ways – here’s the “Scheme” version:

BalanceScalar ← {⎕IO←0     ⍝ Warning: this is not the APL Way, as we shall see.
    total ← +/⍵
    2|total: ⍬             ⍝ Sum must be divisible by 2
    psum ← total÷2         ⍝ Our target partition sum
    data ← ⍵
    bitp ← ↓⍉2∘⊥⍣¯1⍳2*≢⍵   ⍝ Pre-compute the bit patterns
    {                      ⍝ Try one sum after the other, halt on first solution
        0=⍵: ⍬
        patt ← ⍵⊃bitp
        psum=patt+.×data: (patt/data)(data/⍨~patt) ⍝ Exit on first solution found
        ∇¯1+⍵
    } ¯1+≢bitp
}

Dyalog’s got game when it comes to tail call optimisation, right? OK, let’s race:

      'cmpx'⎕CY'dfns'
      d ← 10 81 98 27 28 5 1 46 63 99 25 39 84 87 76 85 78 64 41 93
      cmpx 'Balance d' 'BalanceScalar d'
  Balance d       → 2.7E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕            
* BalanceScalar d → 3.9E¯2 | +43% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

Vectorisation, Boolean vectors and primitive functions win the day. We didn’t go completely scalar, to be fair, as we still pre-computed all the binary patterns.

But back to the task at hand – let’s pit ourselves against the intellectual might of Horowitz and Sahni:

horowitzsahni←{
    sum←1⊥⍵
    2|sum: ⍬   ⍝ Lists with an odd sum cannot be split into equal parts.
    halfsum←sum÷2
    s←⍵(↑{⍺⍵}↓)⍨⌊2÷⍨≢⍵                          ⍝ Split the input.
    a b←⊃¨(⊢,+)/¨s,¨0                           ⍝ Generate the subset sums.
    indexes←a {(⊢,⍵⍳⍺⌷⍨(≢⍺)⌊⊢)1⍳⍨⍺∊⍵} halfsum-b ⍝ Search for solution indexes.
    indexes[2]>≢b: ⍬
    ⍵ {(⍺/⍨~⍵)(⍵/⍺)} ∊(2⍴¨⍨≢¨s)⊤¨indexes-1      ⍝ Get the solution from the indexes.
}
      cmpx 'horowitzsahni d' 'Balance d' 'BalanceScalar d'
  horowitzsahni d → 4.7E¯5 |      0%                                         
* Balance d       → 2.8E¯2 | +59266% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕            
  BalanceScalar d → 4.0E¯2 | +84466% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

Ouch! Well, told you my exhaustive search was naive. An impressive performance from the competitor – but also an impressive performance from Dyalog APL – even my knocked-up exhaustive search runs in a pretty decent 25–30ms or so, about half the time of my shoddy Python attempt (although out-speeding Python is a low bar). I’m keeping the above implementation of Horowitz/Sahni handy for the next edition of Advent of Code, where this problem always seems to crop up in some shape or form.

Problem 9 – Upwardly Mobile (1 task)

Level of Difficulty: Hard

And so for the final question. We were offered strong hints that a neat array-oriented solution might not be possible, but that the judges were prepared to be proven wrong.

Here’s a nicely compact, recursive solution:

∇ weights ← Weights filename;diag;FindWeights;start
    diag ← ↑(≠∘(⎕UCS 10)⊆⊢)⊃⎕NGET filename
    FindWeights ← {
        '┌┐│'∊⍨⊃⍵: ∇1↓⍵                    ⍝ if on any of these, go down        
        ⎕A∊⍨⊃⍵: ⎕A=⊃⍵                      ⍝ if on a letter, give weights
        r_disp ← '┐'⍳⍨0⌷⍵                  ⍝ otherwise, (i.e. on '┴'), find the displacement of right branch,
        l_disp ← -1+'┌'⍳⍨⌽0⌷⍵              ⍝ ...and the left branch
        wts ← ↑(∇r_disp⌽⍵)(∇l_disp⌽⍵)      ⍝ recurse,
        +⌿wts×[0]⌽(+/wts)×r_disp (-l_disp) ⍝ ...and calculate new weights
    }
    start ← diag⌽⍨⍸'┴│'∊⍨0⌷diag            ⍝ starting position attained by ⌽'ing to '┴' or '│'
    weights ← (~∘0÷∨/)FindWeights start    ⍝ remove 0s and get lowest weights
∇

Finally, someone took the suggestion that an array-based solution might not be possible as a personal challenge and produced the following:

Weights ← {
    m  ← ↑(⎕UCS 10)(≠⊆⊢)⊃⎕NGET ⍵ ⍝ no empty lines midway through so this is fine
    fm ← m='┴'               ⍝ fulcrum mask
    ER ← {+\1-⍵\¯2-⌿0⍪⍸⍵}    ⍝ distance to closest 1 to the left
      
    wa ← +/,m∊⎕A             ⍝ weight amount
    wi ← (⍳wa)@{⍵} m∊⎕A      ⍝ weight indexes
    fa ← +/,fm               ⍝ fulcrum amount
    fir← wa + ⍳fa            ⍝ fulcrum indexes (reduced)
    fi ← fir@{⍵} fm          ⍝ fulcrum indexes
    ai ← fi+wi               ⍝ all indexes
    ai+← ⍉(m∊'┌┐') {⍺\⍵/⍨⍵≠0}⍤1⍥⍉ 0@1⊢ai ⍝ extend indexes upwards to the ┌┐s that need them (exclude top ┴ as it isn't matched)
      
    ld ←  ER⍤1⊢ m='┌'        ⍝ distance to left
    rd ← ⌽ER⍤1⌽ m='┐'        ⍝ distance to right
    xp ← (⍴m)⍴⍳2⊃⍴m          ⍝ x position
    fml← ↓fm                 ⍝ fulcrum mask & its lines
    ail← ↓ai                 ⍝ all index lines
    GET← {⊃,/ail⌷⍨∘⊂¨fml/¨⍵} ⍝ get an item of ai for each fulcrum at x position ⍵
    lir← GET ↓xp-ld          ⍝ left indexes (reduced)
    rir← GET ↓xp+rd          ⍝ right indexes (reduced)
    ldr← fm /⍥, ld           ⍝ left distance (reduced)
    rdr← fm /⍥, rd           ⍝ right distance (reduced)
      
    in ← ↑⊃{(+/⍵[⍺])@(⊃⍺)⊢ ⍵}/ (↓⍉↑fir lir rir) , ⊂↓(⍳fa+wa)∘.=⍳wa ⍝ included weights for each index
    cf ← (ldr ×⍤¯1⊢ in[lir;]) - rdr ×⍤¯1⊢ in[rir;] ⍝ coefficients
    ws ← (1,(≢cf)⍴0) ⌹ ((2⊃⍴cf)↑1)⍪cf              ⍝ unscaled weights
    (⊢÷∨/) ws                                      ⍝ scale weights to integers
}

I take my hat off in admiration of the audacity: “An array solution might not be possible, eh? Hold my beer.”

So there we have it, a smörgåsbord of clever solutions to serve as an inspiration for us all. The 2020 edition of the competition sported a slightly simplified format where you were expected to tackle every problem instead of the approach in previous years where you had to make a subset selection from themed groups – this new approach remains for the current (2021) edition.

You are taking part, aren’t you?

APL Seeds ’21: Wednesday 31 March

Last Wednesday we hosted APL Seeds ’21, an event for those just starting their APL journey. Although we knew we had a good programme with some exceptional presenters in place, we very quickly had to increase our Zoom webinar limit to accommodate the 287 people who registered to attend! We were surprised and excited by the demographic, spanning 32 countries and with the vast majority having only basic or no APL experience.






The meeting started with a brief introduction from Dyalog’s Managing Director, Gitte Christensen, who shared her initial “Eureka” APL moment and gave some examples of situations in which APL is used today. Richard Park then took us on a whirlwind tour of APL’s past (including the very cool 1975 APL demonstration!) before demystifying the “beautiful squiggles” that define APL and introducing the modern resources that are available for learning APL (for a summary of these see the suggestions for learning resources available on our website).

The main presentations began with Rodrigo Girão Serrão giving a basic introduction to APL functions and syntax. Using the example of manually justifying text, he showed just how natural it is to process data in arrays by combining a few functions and operators. His initial exploration using a small snippet of text worked instantaneously and without issue on a whole book. After seeing this, hopefully you’ll feel an urge to learn some more – either because you got hooked (like Rodrigo did!) or simply because you want to learn how to think in an array-oriented way, which is very relevant in many situations today, such as when working with GPUs.

Martin Janiczek used a real-life example from the market insights and consumer trends company that he works for (GWI). Despite being a self-described “APL baby”, having learned APL for only around a month, he was able to get to grips with the tree structures that he wanted to use, and talked about how learning APL led him to change his overall approach to the problem. He achieved a highly-performant working prototype in two weeks and with only 172 lines of code, despite starting from the position of a complete APL beginner. Although ultimately his APL model was not taken into production, it inspired a complete rethink and new approach in the eventual product.

Conor Hoekstra (NVIDIA) describes himself as “not an APLer but a big fan”, and his enthusiasm is obvious and contagious! His explorations in APL are YouTube famous, and here again he deftly shows how APL can be used as a tool of thought to explore problems from many different angles with relative ease. He went through multiple different solutions to writing an All Equals function (is every element in a list the same as every other element?), playing with different primitives and comparing the performance of the solutions.

The final presentation came from Tomas Gustafsson, creator of the stunning Stormwind boating simulator. Tomas introduced the technology behind the 3D engine that he uses for his simulator, explaining the code that makes it all happen, before walking us through creating some simple 3D shapes (a rotating triangle and an icosahedron) and the pitfalls that this entails. Judging from the comments, he inspired several members of the audience to want to know more… so for those of you that do, his code examples will soon be available from the APL Seeds web page.

We hope everyone found the event useful and enjoyable (the feedback seems to indicate that you did – thank you!). Relevant materials have started to be uploaded to the APL Seeds ’21 web page, which also includes links to recordings of the presentations, all of which are on dyalog.tv.

Welcome Rodrigo Girão Serrão

The story of how Rodrigo got his first internship at Dyalog is, in his opinion, a textbook example of serendipity. As 2020 started, Rodrigo began actively participating in an online code golf community, where people try to solve programming challenges in as few bytes of code as possible. Whilst his golfing skills were possibly lacking, the challenges he posted were usually well received. Posting many challenges meant Rodrigo got exposed to answers in all sorts of programming languages, from C, Java, Python and JavaScript, to Jelly, 05AB1E, Husk…and APL. Because of the context and the way the code looked, Rodrigo first thought APL was one of those “esolangs” and not a serious programming language.

Rodrigo’s fascination with APL led him to start frequenting The APL Orchard chatroom, where a small number of brilliant people convened to discuss all things APL. Here he met Adám Brudzewsky, who was keen on teaching APL to newcomers, and so began Rodrigo’s journey to learn APL.

His interest in APL kept growing, and he found it to be a simple and expressive language that also appealed to his affinity for mathematics. One day, while Rodrigo was lurking in The APL Orchard, Adám asked him if he would be interested in taking an intern position at Dyalog…a few emails later it was established that Rodrigo would work as a part-time intern at Dyalog during the Summer of 2020. This enabled Dyalog to make the most of Rodrigo’s skills in teaching and technical writing, and meant Rodrigo could indulge his passion for sharing knowledge about mathematics and programming while still finishing his MSc in Applied Mathematics. After his internship, Rodrigo took some time to complete his MSc thesis before returning to Dyalog to finish what he had started and hopefully to take part in many other interesting projects. When he is not working for Dyalog, Rodrigo may be found leading a Portuguese APL meetup, writing a blog post for his website (mathspp.com), or maybe leading a workshop or course. Outside of work, Rodrigo likes to spend time with his loved ones, read fantasy books, eat chocolate, and watch silly comedy movies.

Welcome Shuhao Yang

Shuhao joined Dyalog straight after he completed his Master’s degree in quantum computing, which happened to be during the second COVID-related lockdown in the UK. This has given him quite an unusual experience of starting a career, as he has yet to meet a single member of the Dyalog team face to face! Shuhao obtained his Bachelor’s degree in mathematics – although he was interested in computer science, he studied mathematics as a way of looking for the root of CS and computing. He has broad interests across different software, including Matlab, Python and LaTeX, and has developed a solid knowledge of C++.

Shuhao enjoys the romantic theories in computer science and always wanted to work in one of the summit areas of CS – compilers, graphics and operating systems. He’s very happy that he now has the opportunity to work on the Dyalog APL interpreter.

The APL Orchard

If you have not already been there, I highly recommend visiting the APL Orchard (apl.chat). Adám Brudzewsky originally started this chat room on Stack Exchange to teach and answer questions, including many successful introductory sessions to people who wanted to learn APL. Since then, it has become an extremely active discussion forum, with a very wide range of interesting conversations, from regularly helping more newcomers get started with APL to theoretical discussions about the design of future array languages that will improve APL and perhaps compete against it some day.

Image: “Interested in APL?”, one of Adám’s infamous invitations to a personalised APL introduction

Several of the most active participants have written their own APL implementations and some are working on array languages that are quite different from APL. It is a lively crowd, generating lots of thought-provoking discussions that feel a lot like the big arguments that I remember witnessing between Iverson, Benkard, Bernecky, Falkoff, Trenchard More and others at APL conferences back in the days when the APL language was still being born.

The participants include a mix of current, past and future employees of Dyalog Ltd, and it is a source of much useful inspiration for future work on our product and healthy challenges to conventional wisdom, so I try to spend some time there every day.

Image: APL Orchard message statistics. The bottom of the image shows the all-time total number of participants and messages.

To the New APLers

The Orchard is sometimes a challenging environment for a Dyalog CTO. Many of the bright young minds with computer science backgrounds that arrive here are quick to latch on to the fact that many things about APL go against the grain of what they were taught at university, and that a few of the design decisions made during the 55 years since the first APL implementation were questionable and could do with rationalisation. There is a constant barrage of complaints that we are not working hard to “fix” the things that they consider to be wrong about APL. This is, of course, how things should be when youthful enthusiasm meets us “fossils”; I fully understand many of the points made and actually agree with quite a few of them.

It is sometimes hard work having to repeatedly defend why Dyalog does not do things like just fix index origin at 0 and give the customers a few years to refactor their code. The implication is often that we are incompetent, and occasionally there is the insinuation that we are driven by questionable commercial motives like trying to get rich by locking the customers in and then doing a minimum of work.

Evolution, not Revolution

In my opinion, a large part of the value of our product rests on the fact that our customers can do things like read new legislation that affects the way their code needs to compute something, write code to deal with it before lunch, and then expect the code to keep working without changes until the law changes again, which might be in four years (depending on which country you live in), or in a few decades, after the current programmers have retired. Meanwhile, the code might need to move from the mainframe via UNIX and DOS to Microsoft Windows or macOS and then on to Linux in the cloud, and whatever comes after that.

APLers actually do enjoy refactoring code *if* it is in order to meet new challenges in the markets that they serve with products based on APL, or to compute better results. Our duty is to make this easy for them. On the other hand, there is no business value in refactoring code because we decided to change the way the language works due to theoretical considerations – and it is our duty to protect them from that.

Breaking changes to Dyalog APL are quite simply not an option, at least not if users have to take immediate action. When we accidentally make a breaking change despite our best efforts not to, there is much wailing and gnashing of teeth, and we have to drop everything to fix it. That doesn’t mean we can’t make significant changes to how things work, but we must take an evolutionary approach, where the old way continues to work until everyone who needs it has moved on, possibly controlled by (sometimes undocumented) switches, like the one that forces APL to continue to give a DOMAIN ERROR on ¯1*0.5 to protect legacy code that relies on trapping the error in certain financial calculations.

Over the past decade, we have been able to move almost all of our user base from a 32-bit product with a fixed “alphabet” of 256 characters to using 64-bit APL with full Unicode support, in many cases without requiring any changes to application code. The stragglers are mostly major clients with support contracts that justify the additional cost of continuing to support the “legacy” version of the product until they are eventually able to move on. Once the Raspberry Pi stabilises as a 64-bit platform, we may finally be able to completely retire 32-bit Dyalog APL. I estimate that the “Classic” (non-Unicode) version probably needs to exist for another 5-10 years.

The next big challenges for us are to ensure that code can be moved to macOS and Linux (and “the cloud”) with a minimum of changes, and that both new and existing users have ways to integrate both new and legacy APL code nicely with modern source code management systems and continuous integration pipelines.

Existing Customers First

Our first priority at Dyalog is to support our existing clients and make sure that they remain competitive in their respective marketplaces.

Some contributors to the APL Orchard have suggested that this prevents us from being able to attract new users, which they believe can only happen if we quickly fix some of the “warts” in the language, or add some even more powerful language constructs.

First of all: not a single person who I consider to be a real prospect for writing substantial new application code in APL has brought up a single one of these issues as a reason why they might not want to adopt APL.

Secondly: if we do not first ensure our own financial stability, we will not have the resources to perform evolution, let alone revolution. For example, our investments in the APL Orchard, the APL Wiki, and new training materials for potential new users of APL all depend on this.

As they say in the pre-flight briefing: put on your own oxygen mask before helping others.

Thirdly: it is my very strong conviction that making breaking changes would violate the trust that exists between Dyalog and its customers, and I would personally consider it to be unethical. This may sound like a radical position, but it is one that I would expect to resonate with the younger generation. My personal motivation for working on software stems from witnessing the value that is created when people use software that I have helped create. Once someone starts using “my” product (and they continue to indicate that it is important to them by paying for support), I have an obligation to protect them from harm, if it is in my power.

Conclusion

Old timers: If you are looking for some interesting discussions about current and future array languages, want to help some newcomers, and would maybe like to learn some new tricks yourself, I suggest that you head over to the APL Orchard and dive in. It is a lot of fun! You might want to check out apl.wiki/APL_Orchard, which has some helpful hints and tips, before you start.

New folks: I hope that this account of some of the philosophy that drives our work at Dyalog helps to explain some of the decisions we make that you find puzzling. I fully recognise that it is hard to imagine that a software company might have customer relationships that started before you were born 😊.

There are a couple of other topics that I should possibly also explain my views on, such as the dilemma of whether APL is a Notation or a Programming Language, and why Dyalog APL is not open source, but this post is very long already. Depending on the response I get to this post, I may return to those in the future.