50847534

⎕io←0 throughout.

I was re-reading A Mathematician’s Apology before recommending it to Suki Tekverk, our summer intern, and came across a statement that the number of primes less than 1e9 is 50847478 (§14, page 23). The function pco from the dfns workspace does computations on primes; ¯1 pco n is the number of primes less than n:

      )copy dfns pco

      ¯1 pco 1e9
50847534

The two numbers 50847478 and 50847534 cannot both be correct. A search of 50847478 on the internet reveals that it is incorrect and that 50847534 is correct. 50847478 is erroneously cited in various textbooks and even has a name, Bertelsen’s number, memorably described by MathWorld as “an erroneous name erroneously given the erroneous value of π(1e9) = 50847478.”

Although several internet sources give 50847534 as the number of primes less than 1e9, they don’t provide a proof. Relying on them would be doing the same thing — arguing from authority — that led other authors astray, even if it is the authority of the mighty internet. Besides, I’d already given Suki, an aspiring mathematician, the standard spiel about the importance of proof.

How do you prove the 50847534 number? One way is to prove correct a program that produces it. Proving pco correct seems daunting — it has 103 lines and two large tables. Therefore, I wrote a function from scratch to compute the number of primes less than , a function written in a way that facilitates proof.

sieve←{
  b←⍵⍴{∧⌿↑(×/⍵)⍴¨~⍵↑¨1}2 3 5
  b[⍳6⌊⍵]←(6⌊⍵)⍴0 0 1 1 0 1
  49≥⍵:b
  p←3↓{⍵/⍳⍴⍵}∇⌈⍵*0.5
  m←1+⌊(⍵-1+p×p)÷2×p
  b ⊣ p {b[⍺×⍺+2×⍳⍵]←0}¨ m
}

      +/ sieve 1e9
50847534

sieve ⍵ produces a boolean vector b with length such that b/⍳⍴b are all the primes less than . It implements the sieve of Eratosthenes: mark as not-prime multiples of each prime less than ⍵*0.5; this latter list of primes obtains by applying the function recursively to ⌈⍵*0.5. Some obvious optimizations are implemented:

  • Multiples of 2 3 5 are marked by initializing b with ⍵⍴{∧⌿↑(×/⍵)⍴¨~⍵↑¨1}2 3 5 rather than with ⍵⍴1.
  • Subsequently, only odd multiples of primes > 5 are marked.
  • Multiples of a prime to be marked start at its square.

Further examples:

      +/∘sieve¨ ⍳10
0 0 0 1 2 2 3 3 4 4

      +/∘sieve¨ 10*⍳10
0 4 25 168 1229 9592 78498 664579 5761455 50847534

There are other functions which are much easier to prove correct, for example, sieve1←{2=+⌿0=(⍳3⌈⍵)∘.|⍳⍵}. However, sieve1 requires O(⍵*2) space and sieve1 1e9 cannot produce a result with current technology. (Hmm, a provably correct program that cannot produce the desired result …)

We can test that sieve is consistent with pco and that pco is self-consistent. pco is a model of the p: primitive function in J. Its cases are:
   pco n   the n-th prime
¯1 pco n   the number of primes less than n
 1 pco n   1 iff n is prime
 2 pco n   the prime factors and exponents of n
 3 pco n   the prime factorization of n
¯4 pco n   the next prime < n
 4 pco n   the next prime > n
10 pco r,s a boolean vector b such that r+b/⍳⍴b are the primes in the half-open interval [r,s).

      ¯1 pco 10*⍳10      ⍝ the number of primes < 1e0 1e1 ... 1e9
0 4 25 168 1229 9592 78498 664579 5761455 50847534

      +/ 10 pco 0 1e9    ⍝ sum of the sieve between 0 and 1e9
50847534
                         ⍝ sum of sums of 10 sieves
                         ⍝ each of size 1e8 from 0 to 1e9
      +/ t← {+/10 pco ⍵+0 1e8}¨ 1e8×⍳10
50847534
                         ⍝ sum of sums of 1000 sieves
                         ⍝ each of size 1e6 from 0 to 1e9
      +/ s← {+/10 pco ⍵+0 1e6}¨ 1e6×⍳1000
50847534

      t ≡ +/ 10 100 ⍴ s
1
                         ⍝ sum of sums of sieves with 1000 randomly
                         ⍝ chosen end-points, from 0 to 1e9
      +/ 2 {+/10 pco ⍺,⍵}/ 0,({⍵[⍋⍵]}1000?1e9),1e9
50847534

      ¯1 pco 1e9         ⍝ the number of primes < 1e9
50847534
      pco 50847534       ⍝ the 50847534-th prime
1000000007
      ¯4 pco 1000000007  ⍝ the next prime < 1000000007
999999937
      4 pco 999999937    ⍝ the next prime > 999999937
1000000007     
      4 pco 1e9          ⍝ the next prime > 1e9
1000000007

                         ⍝ are 999999937 1000000007 primes?
      1 pco 999999937 1000000007
1 1
                         ⍝ which of 999999930 ... 1000000009
                         ⍝ are prime?
      1 pco 999999930+4 20⍴⍳80
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1

      +/∘sieve¨ 999999937 1000000007 ∘.+ ¯1 0 1
50847533 50847533 50847534
50847534 50847534 50847535

Constructions

Suki Tekverk, a summer intern, spent the last two weeks here studying APL. She will be a high school senior next September and is adept in math, so in addition to APL we also considered some math problems, proofs, and proof techniques, including the following:

Given line segments x and y, construct (using compass and straight-edge) line segments for the following values:
   x+y
   x-y
   x×y
   x÷y

The first two are immediate. Constructing the last two are straightforward if you are also given 1 (or some other reference length from which to construct 1). Can you construct x×y and x÷y without 1?

Constructing x÷y is impossible if you are not given 1 : From x and y alone you can not determine how they compare to 1. If you can construct x÷y, then you can construct x÷x and therefore relate x and y to 1, contradicting the previous statement.

Can you construct x×y without 1? I got stuck (and lazy) and posed the question to the J Chat Forum, and received a solution from Raul Miller in less than half an hour. Miller’s proof recast in terms similar to that for x÷y is as follows:

From x and y alone you can not determine how they compare to 1. If you can construct x×y, then you can construct x×x, whence:
   if x<x×x then x>1
   if x=x×x then x=1
   if x>x×x then x<1
contradicting the previous statement.

I last thought about this problem in my first year of college many decades ago. At the time Norman M. (a classmate) argued that there must be a 1 and then did the construction for x÷y using 1. I recall he said “there must be a 1” in the sense of “1 has to exist if x and y exist” rather than that “you have to use a 1 in the construction” or “you can not construct x÷y without using 1”. I don’t remember what we did with x×y; before Miller’s message I had some doubt that perhaps we were able to construct x×y without using 1 all those years ago. (Norman went on to get a Ph.D. at MIT and other great things.)

The Story of a Summer Intern

by Suki Tekverk

As a born and raised New Yorker, I thought I had seen it all, but two short weeks spent in Vancouver with Roger Hui and his family proved me very wrong! I went from being at the top of the totem pole to one of the most clueless people around. Luckily, these new experiences provoked new paths of thought and ultimately left me more prepared to face the world than I was when my plane shakily landed in this foreign place.

Upon my arrival, I was greeted by Stella, Roger’s wife, and their children, Nicholas and Rachel, who promptly made me a sign for my bedroom door, making me feel right at home.

The next morning, I went right to work, unsuccessfully writing a dfn for the Fibonacci sequence. After struggling for at least 40 minutes, Roger showed me a solution. This solution was nearly the exact definition of the Fibonacci sequence, setting a precedent for the uncomplicated and direct solutions that ensued. Over the next week we ripped through a Monty Hall simulation, recursive Peano addition and multiplication, recursive and non-recursive binomial coefficients, the Cantor set, the Sierpinski Carpet and more. Over this week, I not only discovered new realms of mathematics but also ate foreign food, hiked through a canyon, watched a Chinese film, made cookies, went to a horse ranch, listened to an orchestra performance, experienced sugar withdrawal and saw a sunset on a mountain. This was just the first week.

The main focus of week two was proofs. At first, I was skeptical of the value of writing proofs in APL as opposed to writing them on paper, but being able to run each step of the proof was quite useful. I started with proof by induction, then by contradiction. After learning the basics I proved the sum of an arithmetic series, sum of geometric series, sum of integers, sum of odd integers and sum of binomial coefficients. This week showed me how closely APL and mathematical notation align. However, I am still annoyed by the fact that the order of operations isn’t how I learnt it in school; it will take some getting used to. My next task will be buying a Windows computer for APL.

Overall, these two weeks in Vancouver were a great learning experience on many levels. Coding in APL has led me to consider possible career options as well as help me understand the nature of the language. Living in a city so far away from home made me question things I considered commonplace and forced me to realize that New York is only a small part of a much larger world.

It’s APL…but not as we know it!

by the Dyalog Duck
As the party behind Dyalog’s Twitter account I often search our feed for #APL or just APL to see if anyone is talking about us or APL in general and I am frequently surprised by what pops up. We have often had a giggle in the office over some of the things we find and it was suggested to me that others might find these interesting too hence this rather random blog post.

Probably the most well known alternative APL is the container shipping and ocean freight company American President Lines (@APLShipping). The Dyalog staff often stand at the railway crossing on our way to the village bakery for lunch and chuckle at the APL containers whizzing past on the train … we are easily amused!

My personal favourite other APL has to be Athletic Propulsion Labs (@APLrunning) with their ‘oh so gorgeous running shoes’ … their trainers are things that I desire on a daily basis!! For you non-shoe lovers out there, they are also the official footwear supplier for the Renault Sport Formula One team.

The APL alter-ego that comes up most frequently in my Twitter search is a musical one … it seems we also share #APL with a 4 member boyband from South Korea. This APL (@APL_world) is extremely popular with teenage girls from Japan and Korea who frequently tweet their undying love for the boys … it makes you wonder who thought it was a good idea to name a teenage pop sensation after a programming language!

Still in the musical world we also have an uber cool APL namesake in rapper apl.de.ap (@apldeap) who is also one of the founding members of the Black Eyed Peas.

Another APL that we are proud to share #APL with is the John Hopkins University Applied Physics Laboratory (@JHUAPL) in Maryland USA. NASA’s Juno spacecraft successfully entered Jupiter’s orbit in July carrying an instrument suite that includes the Jupiter Energetic Particle Detector Instrument (JEDI) built by the Johns Hopkins Applied Physics Laboratory. APL in space … how cool is that!

twitterapl_6The Association of Professional Landscapers (@The_APL) are the APL that fill my screen with greenness, tweeting some amazing photos of their garden and landscaping projects.

twitterapl_7Finally, if you are having a bad day and just need cute puppies and kittens to coo over, the Animal Protective League (@APLSpringfield) in Springfield USA is the perfect APL for you. As well as providing cuteness overload they also do a great job re-homing animals in need.

So there you go folks … a few of my favourite alternative APLs for your amusement. Of course there are lots of other APLs out there in Twitterland but frankly most them are just not that interesting … and a few, to spare your blushes, are just not the sort of thing we want to share!

A Memo Operator

Abstract

Some functions are simply stated and easily understood as multiply-recursive functions. For example, the Fibonacci numbers are {1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}. A drawback of multiple recursion is abysmal performance even for moderate-sized arguments, consequently motivating resorts to circumlocutions. The memo operator addresses the performance problem and restores multiple recursion to the toolkit of thought. We present a memo operator modelled as a 6-line d-operator.

⎕io←0 throughout.

The Man Who Knew Infinity

Jay sent out an e-mail asking who would be interested to go see the film The Man Who Knew Infinity. Fiona’s response: “Ah, Mr. 1729; I’m in.” John was game, and so was I.

The Number of Partitions
Nick found an essay by Stephen Wolfram on Ramanujan written on occasion of the film, wherein Wolfram said:

A notable paper [Ramanujan] wrote with Hardy concerns the partition function (PartitionsP in the Wolfram Language) that counts the number of ways an integer can be written as a sum of positive integers. …

In Ramanujan’s day, computing the exact value of PartitionsP[200] was a big deal — and the climax of his paper. But now, thanks to Ramanujan’s method, it’s instantaneous:

In[1]:= PartitionsP[200]
Out[1]= 3 972 999 029 388

A partition of a non-negative integer n is a vector v of positive integers such that n = +/v, where the order in v is not significant. For example, the partitions of the first seven non-negative integers are:

┌┐
││
└┘
┌─┐
│1│
└─┘
┌─┬───┐
│2│1 1│
└─┴───┘
┌─┬───┬─────┐
│3│2 1│1 1 1│
└─┴───┴─────┘
┌─┬───┬───┬─────┬───────┐
│4│3 1│2 2│2 1 1│1 1 1 1│
└─┴───┴───┴─────┴───────┘
┌─┬───┬───┬─────┬─────┬───────┬─────────┐
│5│4 1│3 2│3 1 1│2 2 1│2 1 1 1│1 1 1 1 1│
└─┴───┴───┴─────┴─────┴───────┴─────────┘
┌─┬───┬───┬─────┬───┬─────┬───────┬─────┬───────┬─────────┬───────────┐
│6│5 1│4 2│4 1 1│3 3│3 2 1│3 1 1 1│2 2 2│2 2 1 1│2 1 1 1 1│1 1 1 1 1 1│
└─┴───┴───┴─────┴───┴─────┴───────┴─────┴───────┴─────────┴───────────┘

Can we compute the partition counting function “instantaneously” in APL?

We will use a recurrence relation derived from the pentagonal number theorem, proved by Euler more than 160 years before Hardy and Ramanujan:

equation (11) in http://mathworld.wolfram.com/PartitionFunctionP.html. In APL:

      pn  ← {0≥⍵:0=⍵ ⋄ -/+⌿∇¨rec ⍵}
      rec ← {⍵ - (÷∘2 (×⍤1) ¯1 1 ∘.+ 3∘×) 1+⍳⌈0.5*⍨⍵×2÷3}

      pn¨0 1 2 3 4 5 6
1 1 2 3 5 7 11

      pn 30
5604

Warning: don’t try pn 200 because that would take a while! Why? pn 200 would engender pn being applied to each element of rec 200 and:

   rec 200
199 195 188 178 165 149 130 108 83 55 24 ¯10
198 193 185 174 160 143 123 100 74 45 13 ¯22

Each of the non-negative values would itself engender further recursive applications.

A Memo Operator

I recalled from J the memo adverb (monadic operator) M. Paraphrasing the J dictionary: f M is the same as f but may keep a record of the arguments and results for reuse. It is commonly used for multiply-recursive functions. Sounds like just the ticket!

The functionistas had previously implemented a (dyadic) memo operator. The version here is more restrictive but simpler. The restrictions are as follows:

  • The cache is constructed “on the fly” and is discarded once the operand finishes execution.
  • The operand is a dfn {condition:basis ⋄ expression} where none of condition, base and expression contain a and recursion is denoted by in expression
  • The arguments are scalar integers.
  • The operand function does not have ¯1 as a result.
  • A recursive call is on a smaller integer.

With these restrictions, our memo operator can be defined as follows:

M←{
 f←⍺⍺
 i←2+'⋄'⍳⍨t←3↓,⎕CR'f'
 0=⎕NC'⍺':⍎' {T←(1+  ⍵)⍴¯1 ⋄  {',(i↑t),'¯1≢T[⍵]  :⊃T[⍵]   ⋄ ⊃T[⍵]  ←⊂',(i↓t),'⍵}⍵'
          ⍎'⍺{T←(1+⍺,⍵)⍴¯1 ⋄ ⍺{',(i↑t),'¯1≢T[⍺;⍵]:⊃T[⍺;⍵] ⋄ ⊃T[⍺;⍵]←⊂',(i↓t),'⍵}⍵'
}

      pn M 10
42

      pn 10
42

Obviously, the lines in M are crucial. When the operand is pn, the expression which is executed is as follows; the characters inserted by M are shown in red.

{T←(1+⍵)⍴¯1 ⋄ {0≥⍵:0=⍵ ⋄ ¯1≢T[⍵]:⊃T[⍵] ⋄ ⊃T[⍵]←⊂-/+⌿∇¨rec ⍵}⍵}

The machinations produce a worthwhile performance improvement:

      pn M 30
5604
      cmpx 'pn M 30'
3.94E¯4
      cmpx 'pn 30'
4.49E1 
      4.49e1 ÷ 3.94e¯4
113959

That is, pn M 30 is faster than pn 30 by a factor of 114 thousand.

Can APL compute pn M 200 “instantaneously”? Yes it can, for a suitable definition of “instantaneous”.

      cmpx 'm←pn M 200'
4.47E¯3
      m
3.973E12
      0⍕m
 3972999029388

M also works on operands which are anonymous, dyadic, or have non-scalar results.

Fibonacci Numbers

   fib←{1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}

   fib¨ ⍳15
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377
   fib M¨ ⍳15
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377

   cmpx 'fib 30'
1.45E0 
   cmpx 'fib M 30'
1.08E¯4

Anonymous function:

      {1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}M¨ ⍳15
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377

Combinations
⍺ comb ⍵ produces a matrix of all the size- combinations of ⍳⍵. The algorithm is from the venerable APL: An Interactive Approach by Gilman and Rose.

      comb←{(⍺≥⍵)∨0=⍺:((⍺≤⍵),⍺)⍴⍳⍺ ⋄ (0,1+(⍺-1)∇ ⍵-1)⍪1+⍺ ∇ ⍵-1}

      3 comb 5
0 1 2
0 1 3
0 1 4
0 2 3
0 2 4
0 3 4
1 2 3
1 2 4
1 3 4
2 3 4

      3 (comb ≡ comb M) 5
1

      cmpx '8 comb M 16' '8 comb 16'
  8 comb M 16 → 1.00E¯3 |     0% ⎕                             
  8 comb 16   → 3.63E¯2 | +3526% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

A faster comb is possible without a memo operator, but it’s not easy. (See for example section 4.1 of my APL87 paper.) A memo operator addresses the performance problem with multiple recursion and restores it to the toolkit of thought.

Presented at the BAA seminar at the Royal Society of Arts on 2016-05-20

I-Beam Mnemonics

I-Beam () is an operator that takes as its operand a numeric code and derives a function which isn’t really considered to be part of the APL language – for example: something which could be experimental, which might provide access to parts of the interpreter that should only be accessed with care, or may set up specific conditions within the interpreter to help in testing. Principally, I-Beam functions exist for internal use within Dyalog but as of version 15.0 there are 55 I-Beam functions that are published and available for general use. How on earth are you supposed to remember all the different codes?

To an extent, you are not. It is perhaps no bad thing that they are a little difficult to remember: I-Beam functions are experimental, liable to be changed or even removed and should be used with care. The codes appear to be somewhat random partly because they are somewhat random – mostly selected on the whim of the developer who implemented the function. However, some codes were chosen less randomly than others – quite a few are grouped so that I-Beams that provide related functionality appear consecutively or at least close together but the most interesting ones are “named” – or, at least, given a numeric code that is derived from a name or otherwise memorable value. So here are some of those unofficial mnemonics that may help you better remember them too.

A favourite trick for deriving a code from a meaningful name is to devise a name containing the letters I, V, X, L, C, D and M, and then convert that into a number as if it were a Roman numeral. Thus:

  • “Inverted table IndeX of” is abbreviated to IIX, which is I-Beam 8
  • “Syntax Colouring” rather awkwardly becomes “Cyntax Colouring”, thence CC and I-Beam 200
  • “Called Monadically” is CM, or 900
  • “Memory Manager” is MM – I-Beam 2000
  • “Line Count” is LC, or rather L,C – 50100

Another favourite is to use numbers that look or sound like something else:

  • The compression I-Beam is 219, which looks a bit like “ZIP”
  • The case folding I-Beam is 819, or “BIg”
  • When support for function trains was in development an I-Beam was used to switch between different implementations – 1060, or “lOCO[motive]” (this I-Beam is no longer in use)
  • The number of parallel threads I-Beam is 1111 – four parallel lines
  • The fork I-Beam is 4000: 4000 is 4K; “Four K” said very quickly might sound like “Fork”

For others the function is associated, or can be associated, with something already numeric:

  • The I-Beam to update function timestamps is 1159 – a memorable time
  • The I-Beam to delete the content in unused pockets is 127 – the same as the ASCII code for the delete character
  • The draft JSON standard is RFC 7159 and support for it is implemented as 7159⌶
  • The now-deprecated I-Beam to switch between Random Number Generator algorithms is 16807 – the default value used to derive the random seed in a clear workspace

Tips for the use of I-Beam functions

  • Do not use I-Beam functions directly in application code – always encapsulate their use in cover-functions that can quickly be modified to protect your application in the event that Dyalog should change the behaviour of, or withdraw, an I-Beam function.
  • Do not use I-Beam functions that are not documented by Dyalog in the Language Reference or elsewhere – those designed to test the interpreter may have undesirable side effects.
  • Do let Dyalog know if you find an I-Beam function particularly useful and/or have suggestions for its development. Some new features are initially implemented as I-Beam functions to allow such feedback to shape their final design.