Discussion Point: DNA Analysis
Charles Brenner performs forensic analysis of DNA and DNA mixtures – an intensive mathematical process. Others use complicated statistical and Monte Carlo methods but by using APL as his tool of thought, Charles has devised techniques which are more accurate and many times faster than competing applications supported by multi-million dollar funding.
Discussion Point: A Different Kind of Selfie
In today’s session “The Tuning Pipeline”, Roger Hui noted that for the index-of family of functions, a faster algorithm is often possible if the left and right arguments are “the same”, up to a factor of two. “The same” means the that the values have identical references so that they can be compared with a simple and quick check. For example, x⍳x
is a selfie but x⍳0+x
or x⍳(⍴x)⍴x
are not.
For example, for the inverted-table index-of 8⌶
,
u←(' ',⎕a,⎕d)[?8e4 30⍴37]
x←u[?1e5⍴≢u;]
p←,⊂x
q←,⊂x
cmpx 'p(8⌶)p' 'p(8⌶)q'
p(8⌶)p → 3.98E¯3 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
p(8⌶)q → 8.72E¯3 | +118% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
Selfies occur in less obvious places – when finding uniques (∪x
), in the key operator, and in x∧.=⍉y
as well as in x⍳x
and ⍳⍨x
. You can try this yourself for various datatypes. A selfie which is not faster is an opportunity for a performance improvement.
Discussion Point: Sorting is Faster Than Grading
Also in “The Tuning Pipeline” session, Roger Hui and Kimmo Kekäläinen looked at some of the recent performance improvements in Dyalog. It has long had idioms for sorting ({⍵[⍋⍵]}
, {⍵[⍋⍵;]}
, etc.) Interestingly, for some common datatypes, sorting is faster than grading. You can try this yourself for various datatypes:
cmpx '{⍵[⍋⍵]}x' '⍋x' ⊣ x←?1e6⍴2e9
{⍵[⍋⍵]}x → 3.23E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
* ⍋x → 8.34E¯2 | +158% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
The point is written up in this essay on the J website. The webpage uses J but the explanation applies to APL (or any other language). Quoting from the webpage: “Grade needs to keep track of the argument array and the list of indices. Sort just needs to keep track of the first. … When the argument items can be manipulated efficiently, as would be the case when the items are machine units (1, 2, 4, or 8 bytes), then sort can eschew a separate index array.”
John Scholes’ “Distractions” – An Objective Review
An awed silence gripped the room as John unveiled ground-breaking techniques for minimising life’s distractions and maximising programming productivity. It is too early to say how far-reaching this approach will turn out be in real life situations and whether it will affect The Global Economy as a whole. John appeared to have all categories of distraction covered. Let’s see. Like Woodstock ’69 and The Isle of Wight ’70, in years to come you may be able to say “I was there”.
And Also …
Some of the many other things we saw and heard today:
- Dyalog has taken on two new employees since Dyalog ’13
- John Daintree can’t take selfies but embraces high resolution touch devices
- Data files used in Finnish pension microsimulations reduced to one sixth of their size when component file compression was enabled
- MyDyalog launched live at the user meeting
- Fiona wants to promote your APL application in a banner on the Dyalog homepage
- Morten computed Mandlebrot set images, performing the calculation in parallel with isolates running on servers in Holland and Hong Kong
Tomorrow…
Tomorrow’s schedule features five user presentations and four Dyalog presentations covering the themes of code management and reuse, performance, presentation tools and cryptography. In the evening Morten will demonstrate the progress he has made with the ‘bots (something that should be very familiar to regular readers of the Dyalog blog!).