## repeated squaring

APL-related discussions - a stream of APL consciousness.
Not sure where to start a discussion ? Here's the place to be
Forum rules
This forum is for discussing APL-related issues. If you think that the subject is off-topic, then the Chat forum is probably a better place for your thoughts !

### repeated squaring

If function f “multiplies” then f⍨ squares; squaring n times raises to power 2*n and is therefore a fast way to exponentiate.

`      ×⍨ 39      ×⍨ ×⍨ ×⍨ ×⍨ 3 43046721      ×⍨⍣4 ⊢ 343046721      3 * 2*443046721`

One way to compute the n-th power for an arbitrary n (when n may be not a power of 2), is to find the squares corresponding to 1 bits in the binary representation of n, and then compute the product.

`      3*25847288609443      2⊥⍣¯1⊢251 1 0 0 1      ⍸ ⊖ 2⊥⍣¯1⊢250 3 4      {×⍨⍣⍵⊢3}¨ 0 3 43 6561 43046721      ×⌿ {×⍨⍣⍵⊢3}¨ 0 3 4847288609443`

The examples show that it would be nice if ⍣ accepts an array right operand. (In that case, the squaring would be done 4 times, rather than 0 times and 3 times and 4 times.) Until that blessed day, the repeated squaring computation can be coded as follows:

`      RS← {⊃ ⍺⍺⌿ b⌿ (⊢,⊂∘(⍺⍺⍨)∘⊃∘⊖)⍣(¯1+≢b) ⊂⍺ ⊣ b←⊖2⊥⍣¯1⊢⍵}              ⍝ └──────────────────────────┘              ⍝    ←→    ⍺⍺⍨⍣(⍸b)      3 ×RS 25847288609443      3 +RS 2575`

In the opening sentence above, multiplication is in quotes because the method works for other kinds of multiplication, such as matrix multiplication or ⍵[⍵] (as in the text fitting problem). The last example 3 +RS 25 illustrates that it works for + as “multiplication”, whose “squaring” is doubling and whose “exponentiation” is multiplication.

An Application: Fibonacci Numbers

We illustrate repeated squaring by some computations on the Fibonacci numbers.

`      fib← {1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}      fib¨ ⍳100 1 1 2 3 5 8 13 21 34`

The double recursion is quite inefficient: The number of function calls for fib n is fib n+1, which is exponential in n. More efficient formulations exist:

`      fib2← {⊃ +⍀∘⊖⍣⍵ ⍳2}      fib2¨ ⍳100 1 1 2 3 5 8 13 21 34      fib3← {⌊0.5+r÷⍨(2÷⍨1+r←5*0.5)*⍵}      fib3¨ ⍳100 1 1 2 3 5 8 13 21 34      cmpx 'fib 30' 'fib2 30' 'fib3 30'  fib 30  → 1.56E0  |    0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕  fib2 30 → 0.00E0  | -100%                                 fib3 30 → 0.00E0  | -100%                                     cmpx 'fib2 30' 'fib3 30'  fib2 30 → 9.75E¯6 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕  fib3 30 → 1.81E¯6 | -82% ⎕⎕⎕⎕⎕⎕`

Whither repeated squaring? The relationship (fib n)=(fib n-2)+(fib n-1) can be written in matrix form, and:

`      ⊢ A←2 2⍴0 1 1 10 11 1      A +.× 0 11 1      A +.× A +.× 0 11 2      A +.× A +.× A +.× 0 12 3      A +.× A +.× A +.× A +.× 0 13 5      A +.× A +.× A +.× A +.× A +.× 0 15 8     (A +.× A +.× A +.× A +.× A) +.× 0 15 8      fib 68      ¯1 ¯1↑(A+.×A+.×A+.×A+.×A)8`

fib n for n≥1 is the entry at the lower right corner of the (n-1)-th power of A. The last parenthesized expression, the (A +.× … +.× A), is where repeated squaring can be used.

`      fibm← {1≥⍵:⍵ ⋄ ⊃ ¯1 ¯1↑ A +.×RS ⍵-1}      fibm¨ ⍳100 1 1 2 3 5 8 13 21 34      cmpx 'fib2 50' 'fib3 50' 'fibm 50'  fib2 50 → 1.58E¯5 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕    fib3 50 → 1.73E¯6 | -90% ⎕⎕⎕                             fibm 50 → 1.70E¯5 |  +7% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕`

The last benchmark shows that the matrix method with repeated squaring is competitive with the other methods. We next show that it is superior to the other methods for large n.

For large n, fib3, based on computations on irrational numbers (5*0.5), is not in the running, at least not without a large amount of work (which we won’t do here). For large n, we first develop functions for extended precision arithmetic:

`      carry ← {(+⌿∧⍀0=t)↓t←{1↓+⌿1 0⌽0,0 10⊤⍵}⍣≡0,⍵}      xp    ← {carry (m↑⍺)+m↑⍵ ⊣ m←-(≢⍺)⌈≢⍵}              ⍝ plus      xt    ← {carry +⌿ (m-0 1) ⍴ (m←(≢⍺)+0,≢⍵) ↑ ⍺∘.×⍵}  ⍝ times      3 1 4 1 5 9 2 6 5 3 xp 2 7 1 8 2 83 1 4 1 8 6 4 4 8 1      3141592653 + 2718283141864481      3 1 4 1 5 9 2 6 5 3 xt 2 7 1 8 2 88 5 3 9 7 2 8 4 7 6 7 9 6 8 4      3141592653 × 271828853972847679684`

“Numbers” are vectors of decimal digits, most significant digit first. xp is extended precision plus and xt is extended precision times. (The FFT-based xtimes in the dfns workspace is faster for large numbers, but we don’t need it to prove the point.) Modifying fib2 and fibm for extended precision computations:

`      xfib2← {⊃ xp⍀∘⊖⍣⍵ ,¨⍳2}      xfib2¨ ⍳10┌─┬─┬─┬─┬─┬─┬─┬───┬───┬───┐│0│1│1│2│3│5│8│1 3│2 1│3 4│└─┴─┴─┴─┴─┴─┴─┴───┴───┴───┘      ⊢xA←,¨A┌─┬─┐│0│1│├─┼─┤│1│1│└─┴─┘      xfibm← {1≥⍵:,⍵ ⋄ ⊃ ¯1 ¯1↑ xA xp.xt RS ⍵-1}      xfibm¨ ⍳10┌─┬─┬─┬─┬─┬─┬─┬───┬───┬───┐│0│1│1│2│3│5│8│1 3│2 1│3 4│└─┴─┴─┴─┴─┴─┴─┴───┴───┴───┘`

Let us compute a Fibonacci number with 200 decimal digits. From fib3, we know that

`      r←5*0.5      (2÷⍨1+r)⍟r×1e200958.6666692945176      fib3 9587.255617127345179E199`

Release the hounds!

`      (xfib2 ≡ xfibm) 9581      t←xfibm 958      ⍴t200      t7 2 5 5 6 1 7 1 2 7 3 4 4 9 4 5 7 8 3 4 9 5 1 3 … 7 9 1 8 0 7 9      (54⍴10 1/1 0) \ 4 50 ⍴ ⎕d[t]7255617127 3449457834 9513639664 8012357907 43082018833452265955 2369453901 0959098708 8044945653 64650643087041102422 5291077532 1041899600 8519495018 07458257175818735941 5023021847 2032795681 1455446059 3757918079      {⎕fr←1287 ⋄ ¯40 ⍕ r÷⍨(2÷⍨1+r←5*0.5)*958} ⍬    ⍝ sanity check 7.255617127344945783495136396648071______E199`

How effective is repeated squaring (xfibm)?

`      cmpx 'xfib2 958' 'xfibm 958'  xfib2 958 → 1.53E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕  xfibm 958 → 4.07E¯3 | -74% ⎕⎕⎕⎕⎕⎕⎕⎕                            cmpx 'xfib2 2000' 'xfibm 2000'  xfib2 2000 → 4.50E¯2 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕  xfibm 2000 → 6.50E¯3 | -86% ⎕⎕⎕⎕`

Collected Definitions

`      ⍝RS    ← {⊃ ⍺⍺⌿ b⌿ (⊢,⊂∘(⍺⍺⍨)∘⊃∘⊖)⍣(¯1+≢b) ⊂⍺ ⊣ b←⊖2⊥⍣¯1⊢⍵}fib   ← {1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}fib2  ← {⊃ +⍀∘⊖⍣⍵ ⍳2}fib3  ← {⌊0.5+r÷⍨(2÷⍨1+r←5*0.5)*⍵}A     ← 2 2⍴0 1 1 1fibm  ← {1≥⍵:⍵ ⋄ ⊃ ¯1 ¯1↑ A +.×RS ⍵-1}carry ← {(+⌿∧⍀0=t)↓t←{1↓+⌿1 0⌽0,0 10⊤⍵}⍣≡0,⍵}xp    ← {carry (m↑⍺)+m↑⍵ ⊣ m←-(≢⍺)⌈≢⍵}              ⍝ plusxt    ← {carry +⌿ (m-0 1) ⍴ (m←(≢⍺)+0,≢⍵) ↑ ⍺∘.×⍵}  ⍝ timesxA    ← ,¨Axfib2 ← {⊃ xp⍀∘⊖⍣⍵ ,¨⍳2}xfibm ← {1≥⍵:,⍵ ⋄ ⊃ ¯1 ¯1↑ xA xp.xt RS ⍵-1}`

The initial portion of this post is translated from the J Wiki Essay Repeated Squaring, 2005-09-28.
Roger|Dyalog

Posts: 208
Joined: Thu Jul 28, 2011 10:53 am

### Re: repeated squaring

Roger|Dyalog wrote:For large n, fib3, based on computations on irrational numbers (5*0.5), is not in the running, at least not without a large amount of work (which we won’t do here).

It turns out that it is not a large amount of work and it was done already in 2005 in the J Wiki Essay Fibonacci Sequence, section Q and Z ring extensions. There, the explanation is rather cryptic, if I do say so myself:
Based on Binet’s formula … operations are done in Q[√5] and Z[√5] with powers computed by repeated squaring.

In this note we will provide a more leisurely derivation.

Binet’s formula states that:

`      fib n  ←→  (√5) ÷⍨ ((2÷⍨1+√5)*n) - (2÷⍨1-√5)*n      fib n  ←→  (2*n) ÷⍨ (√5) ÷⍨ ((1+√5)*n) - (1-√5)*n     ⍝ (*)`

We will do computations in the ring extension Z[√5] -- the integers extended with √5. A “number” is an ordered pair of integers (a,b) whose real value (value as a real number) is (a+b×√5). Plus and times for such numbers are done as follows:

`      ⍝rp← +                    ⍝ plus:  (a,b) rp (c,d) ←→ ((a+c),b+d)rt← (1 5 +.× ×) , +.×∘⊖  ⍝ times: (a,b) rt (c,d) ←→ (((a×c)+5×b×d),(a×d)+b×c)      3 4 rp 6 ¯29 2      3 4 rt 6 ¯2¯22 18`

(Misalignments in the display are due to defects in the APL Chat Forum software.)

In (*), the key computations are (1+√5)*n and (1-√5)*n or, in the ordered pair representation, 1 1 and 1 ¯1 to the power n. We can use the RS (repeated squaring) operator from the previous post to do the power:

`      RS← {⊃ ⍺⍺⌿ b⌿ (⊢,⊂∘(⍺⍺⍨)∘⊃∘⊖)⍣(¯1+≢b) ⊂⍺ ⊣ b←⊖2⊥⍣¯1⊢⍵}      1 1 rt RS 132134016 954368      1 ¯1 rt RS 132134016 ¯954368      fib3 ← {⌊0.5+r÷⍨(2÷⍨1+r←5*0.5)*⍵}      fib3 13233      (2 × 954368) ÷ 2*13233`

We observe that the first item of the ordered pairs are the same and the second item is the negation of each other. Whence in (*), ((1+√5)*n) - (1-√5)*n ↔ (1 1 rt RS n) - (1 ¯1 rt RS n) ↔ (0 2×1 1 rt RS n) whose real value is √5 × ⊃⌽(0 2×1 1 rt RS n). But this value is divided by √5, so we are left with just ⊃⌽(0 2×1 1 rt RS n), twice the value of the second item of 1 1 rt RS n. No computation involving √5 is required. The final division by 2*n can be avoided by halving at each step. Therefore:

`      fib4 ← {0=⍵:0 ⋄ ⊃⌽ 1 1 (÷∘2 rt) RS ⍵}      fib4¨ ⍳100 1 1 2 3 5 8 13 21 34      fib3¨ ⍳100 1 1 2 3 5 8 13 21 34`

For exact calculations of fib n for large n using this idea, we work with ordered pairs of extended precision numbers, that is, ordered pairs of vectors of decimal digits, and extend the operations in fib4. (carry, xp, xt, and xfibm are from the previous post, reproduced here.)

`      ⍝carry ← {(+⌿∧⍀0=t)↓t←{1↓+⌿1 0⌽0,0 10⊤⍵}⍣≡0,⍵}xp    ← {carry (m↑⍺)+m↑⍵ ⊣ m←-(≢⍺)⌈≢⍵}              ⍝ plusxt    ← {carry +⌿ (m-0 1) ⍴ (m←(≢⍺)+0,≢⍵) ↑ ⍺∘.×⍵}  ⍝ timesxA    ← ,¨Axfibm ← {1≥⍵:,⍵ ⋄ ⊃ ¯1 ¯1↑ xA xp.xt RS ⍵-1}halve ← {carry (⌊⍵÷2)+¯1↓0,5×2|⍵}qt    ← ((,¨1 5) xp.xt xt¨) , xp.xt∘⊖ xfib4 ← {1≥⍵:,⍵ ⋄ ⊃⌽ (,¨1 1) (halve¨ qt) RS ⍵}      xfib4¨ ⍳10┌─┬─┬─┬─┬─┬─┬─┬───┬───┬───┐│0│1│1│2│3│5│8│1 3│2 1│3 4│└─┴─┴─┴─┴─┴─┴─┴───┴───┴───┘      (xfib4 ≡ xfibm)¨ ⍳101 1 1 1 1 1 1 1 1 1      ⎕d[xfib4 300]222232244629420445529739893461909967206666939096499764990979600      fib3 3002.2223224462942268E62`

xfib4 is faster than xfibm:

`      cmpx 'xfibm 958' 'xfib4 958'  xfibm 958 → 3.87E¯3 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕  xfib4 958 → 2.94E¯3 | -25% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕             cmpx 'xfibm 2000' 'xfib4 2000'  xfibm 2000 → 6.40E¯3 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕  xfib4 2000 → 3.75E¯3 | -42% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕`

Finally, an aside: fib3 is defined as ⌊0.5+r÷⍨(2÷⍨1+r)*⍵ where r←5*0.5, which replaces the r÷⍨(2÷⍨1-r)*⍵ term in Binet’s formula by ⌊0.5+… . The maneuver derives from that the magnitude of the replaced term is always less than 0.5 (and diminishes exponentially with ⍵):

`      12 ¯6 ⍕ r÷⍨(2÷⍨1-r)*4 5⍴⍳20 4.47214E¯1 ¯2.76393E¯1  1.70820E¯1 ¯1.05573E¯1  6.52476E¯2 ¯4.03252E¯2  2.49224E¯2 ¯1.54029E¯2  9.51949E¯3 ¯5.88337E¯3  3.63612E¯3 ¯2.24725E¯3  1.38888E¯3 ¯8.58372E¯4  5.30503E¯4 ¯3.27869E¯4  2.02634E¯4 ¯1.25235E¯4  7.73994E¯5 ¯4.78354E¯5`
Roger|Dyalog

Posts: 208
Joined: Thu Jul 28, 2011 10:53 am