Using Dyalog APL to study the geographical distribution of vegetation species in northern tropical Africa
By Vibeke Ulmann and Nicolas Delcros
At Dyalog we have been wondering for some time about a recurring flurry of applications for educational licences from Africa. In this article we speak to Professor Michel Godron, who has been using APL for 20 years for statistical processing of databases holding various ecological measurements. He has now developed an application in Dyalog APL that supports a cooperation project between 4 universities in France, Ivory Coast, Bénin and Burkina Faso, which aims to study the evolution of landscapes in northern tropical Africa.
Overview of the project
Michel explains, "The African project we are currently working on targets the evolution of the landscapes over the period 2000-2015, and the timeframe for its implementation is 2014-2017. In another project we are monitoring the displacement over time of high-altitude species in the southern Alps due to climate change. In landscape ecology, standard statistical inference doesn't apply because we have no a priori distribution model, and because the samples are too sparse to represent the universe, which is strongly heterogeneous. By the way, this also applies to many other fields, e.g. sociology.
"Non-inferential statistics are not mainstream and are rarely taught, so I had to implement the methodology myself – simply because there is no commercial application software available. The users of the application are all students and scientists from 4 universities in 4 different countries. They open the workspace and use the APL functions directly to get text-mode (session) results."
Methodology – an example of non-inferential statistics
Michel uses an example, "Let's say we want to sample 50 places, and for each one note if there is a presence of Holly and/or Ivy. We then get the following contingency table:
Holly \ Ivy | Ivy present | Ivy absent | Total |
---|---|---|---|
Holly present | 2 | 4 | 6 |
Holly absent | 10 | 34 | 44 |
Total | 12 | 38 | 50 |
"As you can see the possible number of coexistences are 0, 1, 2, 3, 4, 5 or 6, and the probability of finding the 2 observed coexistences is given by: P = (6! 44! 12! 38!) / (2! 4! 10! 34! 50!) = 0,306
"The 7 possible contingency tables have the following probabilities, which sum up to unity:
Number of coexistences | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|---|
Probability | .173 | .379 | .306 | .116 | .021 | .001 | .00006 |
"The lower the probability in a table, the more information it carries about the relation between the two species."
NOTE: Léon Brillouin (Science and Information, Academic Press, 2nd ed., New York 1962) defines the level of information of an event as log2(1/P).
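To make the definition concrete, the short Python sketch below applies log2(1/P) to the probabilities listed in the article. The function name `information_bits` is an assumption introduced here; the probabilities are copied from the table as printed.

```python
from math import log2

# Probabilities quoted in the article for 0..6 coexistences.
probabilities = [0.173, 0.379, 0.306, 0.116, 0.021, 0.001, 0.00006]

def information_bits(p):
    """Information of an event with probability p, in bits: log2(1/p)."""
    return log2(1 / p)

for k, p in enumerate(probabilities):
    print(f"{k} coexistences: {information_bits(p):.2f} bits")
```

The rarer the table, the larger the value: the observed table (2 coexistences) carries a little under 2 bits, whereas the least likely table (6 coexistences) carries the most.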
Michel continues, "Once we have the probabilities of all the contingency tables between all the studied species, we can start building an ecological map of species, grouping them by descending information – that is, ascending probability. In doing so, it appears that groups of species are more or less linked to each other, like islands of an archipelago that may be linked by isthmuses or shoals. Hence the name 'archipelago algorithm'.
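The article does not spell out the steps of the archipelago algorithm, so the following Python sketch is only one plausible reading of the description: species pairs are linked in order of ascending probability, and linked species merge into groups ("islands"). The function name, the `threshold` parameter, the species Oak and Fern, and every probability except the 0.306 for Holly and Ivy are hypothetical.

```python
def group_by_ascending_probability(pair_probabilities, threshold):
    """Link species pairs from lowest probability upwards, up to threshold.

    Returns the resulting groups as a sorted list of sorted lists.
    This is an illustrative single-linkage grouping, not the author's code.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the chain as we walk it
            x = parent[x]
        return x

    # Register every species so that unlinked ones appear as singletons.
    for a, b in pair_probabilities:
        find(a)
        find(b)

    for (a, b), p in sorted(pair_probabilities.items(), key=lambda kv: kv[1]):
        if p > threshold:
            break
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for species in list(parent):
        groups.setdefault(find(species), []).append(species)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical pairwise probabilities (only Holly/Ivy comes from the article).
pairs = {
    ("Holly", "Ivy"): 0.306,
    ("Holly", "Oak"): 0.002,
    ("Ivy", "Fern"): 0.04,
    ("Ivy", "Oak"): 0.25,
    ("Holly", "Fern"): 0.4,
    ("Oak", "Fern"): 0.5,
}
print(group_by_ascending_probability(pairs, threshold=0.05))
```

Raising the threshold lets weaker links join the map, so separate islands progressively merge, which is the picture the archipelago metaphor suggests.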
"In this case the computation is done on a 2-state variable (presence/absence), but can be generalised to multi-state variables (e.g. altitude range) using Maxwell-Boltzmann's macro-state probability formula. But the bottom line is that no reference to the whole universe or to a model is made, so the statistical interpretation of the data is purely empirical and not biased by estimators."
Why use Dyalog APL?
The advantage of using Dyalog APL for developing this application is the ease of expressing matrix calculations (for example, the inner product) and the ease of developing functions transparently. The logical structure of APL is simply ideal for building the new 'non-inferential' statistical functions that are needed.
It took 4 people collaborating over 1 year to have the first demonstrable prototype version ready – specific to the first site of measurement. It is currently being adapted to cover the other sites of measurement.
The software development process
With regard to the process of developing the prototype, Michel comments, "Humility is a key requirement for being able to adapt to local constraints. But at least there were no data protection issues to overcome. And the overall outcome of the solution has been great, as we have been able to replace classical estimations with non-inferential analysis, for example for measuring biodiversity.
"We have dispensed with a user interface development, as all the students use the workspace in prompt mode from a Dyalog Version 14.0 development interpreter. We have implemented a process whereby they are allowed access to the whole workspace – although we have no security system to enforce that. Neither do we have any data security implemented – so the data are not kept in a secure environment. But then again, who, outside the project, would be interested in our data until we finish the final report?" Michel laughs.
Michel is currently also looking to extend the application to cover other ecological fields. However, the application is developed for scientific use only and there are no plans to make it into a commercial application. "It is intended for university scientists working with me, and currently 5 users are working with the application", says Michel.
Working with Dyalog the company
Asked about the collaboration with Dyalog, Michel states, "I have been extremely satisfied with the help provided by the people at support@dyalog.com during the development process – we have developed a very trusting relationship over time. I consider their technical support to score 10 out of 10. And as for Dyalog APL – well, that is truly exceptional! If I could point out one thing where there is room for improvement, it would be the lack of training material. I have had to develop my own initiation workspace."
For further information:
If you are interested in knowing more about landscape ecology, Michel recommends:
- Landscape Ecology, Richard T. T. Forman and Michel Godron, 1986, Wiley
If you want to know more about the archipelago algorithm used, he recommends you review:
- Écologie et évolution du monde vivant, M. Godron, 2012, Ed. L'Harmattan, chapter 5