Simple Statistics

Since APL by nature handles data as complete sets rather than concentrating on individual elements, it pairs naturally with statistics (which is likewise concerned with drawing inferences from the whole).  Here are some simple statistical calculations expressed as Dyalog Dynamic Functions (easily translated into the functional representations of other APLs).  The definitions below assume a vector argument.


      ssize←{⍴⍵}                         ⍝ Sample size 
      ssize 2 3 5              
3
      amean←{(+/⍵)÷⍴⍵}                   ⍝ Arithmetic mean
      amean 2 3 5
3.333333333
      gmean←{(×/⍵)*÷⍴⍵}                  ⍝ Geometric mean
      gmean 1 .5 .25                          
0.5
      hmean←{(⍴⍵)÷+/1÷⍵}                 ⍝ Harmonic mean
      hmean 4 6                                            
4.8
      variance←{(+/(⍵-amean ⍵)*2)÷¯1+⍴⍵} ⍝ Sample variance (divisor n-1)
      variance 1 2 3 4 5 6
3.5
      stdev←{(variance ⍵)*.5}            ⍝ Standard deviation
      stdev 1 2 3 4 5 6
1.870828693
      sterr←{(stdev ⍵)÷(⍴⍵)*.5}          ⍝ Standard error
      sterr 1 2 3 4 5 6
0.7637626158
      max←{⌈/⍵}                          ⍝ Maximum
      max ?10⍴5
4
      min←{⌊/⍵}                          ⍝ Minimum
      min ?10⍴5
0
      range←{(max ⍵)-min ⍵}              ⍝ Range
      range ?10⍴5       
4
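
Because ?10⍴5 draws a fresh random vector on every call, the three results above do not describe the same sample and will differ from run to run. As a reproducible check, the same functions can be applied to a fixed vector (the data here is made up purely for illustration):

      max 3 1 4 1 5
5
      min 3 1 4 1 5
1
      range 3 1 4 1 5
4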


In practice we would probably want to put some sort of protection around these functions so that they are not exposed to problematic arguments, such as an empty vector, or a single value passed to variance (where the divisor ¯1+⍴⍵ becomes zero).
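
A minimal sketch of such protection, assuming Dyalog dfn guards and ⎕SIGNAL. The names safeamean and safevariance, the messages, and the choice of DOMAIN ERROR (event 11) are illustrative and not part of the original definitions. The check matters because, by default, Dyalog evaluates 0÷0 as 1, so amean of an empty vector or variance of a single value quietly returns 1 rather than failing.

      safeamean←{0=×/⍴⍵:'empty sample'⎕SIGNAL 11 ⋄ amean ⍵}                  ⍝ Reject an empty argument
      safevariance←{2>×/⍴⍵:'need 2 or more values'⎕SIGNAL 11 ⋄ variance ⍵}   ⍝ Reject fewer than 2 values
      safeamean 2 3 5
3.333333333
      safevariance 1 2 3 4 5 6
3.5

With these wrappers, an argument such as ⍳0 or a one-element vector should signal an error instead of producing a misleading number. Other policies (returning 0, or an empty result) may suit some applications better.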

Equally, in practice we might extend the definitions to examine properties of a data array by column or row.  The axis numbers and results below assume index origin 0 (⎕IO←0).

      colamean← {(+/[0]⍵)÷0⊃⍴⍵}   ⍝ Columnwise arithmetic mean
      colamean 3 4⍴⍳12
4 5 6 7
      rowamean← {(+/[1]⍵)÷1⊃⍴⍵}   ⍝ Rowwise arithmetic mean
      rowamean 3 4⍴⍳12
1.5 5.5 9.5
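
The same approach extends to the other statistics. As a sketch, here is a columnwise sample variance that does not depend on the index origin: +⌿ reduces along the first axis whatever the setting of ⎕IO, and ⊃⍴⍵ picks the number of rows. The name colvariance and the local name m are introduced here for illustration only.

      colvariance←{m←(+⌿⍵)÷⊃⍴⍵ ⋄ (+⌿(⍵-(⍴⍵)⍴m)*2)÷¯1+⊃⍴⍵}   ⍝ Columnwise sample variance
      colvariance 3 4⍴⍳12
16 16 16 16

Each column of the test matrix holds three values spaced 4 apart, so the squared deviations sum to 32 and dividing by 2 (one less than the number of rows) gives 16 in every column.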


Page created 18 January 2010.
Copyright © Dogon Research 2009-2010