## Trying out Rank

General APL language issues

### Trying out Rank

Yesterday I had to code up a new solution to an old problem: Given a boolean matrix B and a corresponding matrix M of the same dimension, pick one item from each row of M where you find the first 1 in the corresponding row of B. The result is a vector with the same number of items as rows in M and B.

This is easily done with scatter-point indexing:

`      V←M[(⍳≢M),¨i]`

The trick is finding i.

In the old days of APL*Plus II, you could do this:

`      B⌊.⍳1`

to produce i directly. I think they had a slightly different implementation of inner product with non-scalar function -- this won't work in Dyalog.

So I tried:

`      (↓B)⍳¨1`

which does the job and apparently fast too.

However I have been looking for an excuse to use the rank operator and this looked like a possible place to try, so I have this:

`      ({⍵⍳1}⍤¯1)B`

This appears not to be a good use of rank because it is comparatively slow and to me looks significantly more obscure. Here are some timings, including adding the the row index for the final scatter point indexing:

`      bm←?1000000 10 ⍴2      cmpx exp3 exp4  (↓bm)⍳¨1     → 1.4E¯1 |    0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕                               ({⍵⍳1}⍤¯1)bm → 5.2E¯1 | +279% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕       cmpx exp1 exp2  (⍳≢bm),¨(↓bm)⍳¨1       → 3.2E¯1 |    0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕                             ↓(⍳≢bm)({⍺(⍵⍳1)}⍤¯1)bm → 9.8E¯1 | +205% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕`

My guess is that rank is not too useful here because the result of the "iota each" is already in the shape I need and there no need to restructure it, and perhaps other reasons.

Not sure if I have a real question here, or just ruminating, but since I tend to use only vectors and matrices and avoid higher ranking arrays, will rank tend to be less useful to me? Where should I look to use rank effectively?

Bonus question: is scatter-point indexing the best solution to the original problem?
paulmansour

Posts: 375
Joined: Fri Oct 03, 2008 4:14 pm

### Re: Trying out Rank

There are other options to that kind of indexing, one slightly better in speed but it does not use rank:

Code: Select all
`     i←?1000000⍴15    M←?1000000  15⍴10    cmpx  'i⌷⍤0 1⊢M' 'M[(⍳≢M),¨i]'  'i⌷¨↓M'  i⌷⍤0 1⊢M    → 3.3E¯1 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕  M[(⍳≢M),¨i] → 2.5E¯1 | -24% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕           i⌷¨↓M       → 2.1E¯1 | -38% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕                   wsreq¨  'i⌷⍤0 1⊢M' 'M[(⍳≢M),¨i]'  'i⌷¨↓M'41000832 31869120 40000132`
DanB|Dyalog

### Re: Trying out Rank

Thank you for posting your comments. These are relatively early days for the implementation of the rank operator, and feedback on what cases are actually used inform what special code are worth implementing and/or the priority of the possible special codes.

Code: Select all
`      b←?1e6 10⍴2      cmpx '(↓b)⍳¨1' 'b⍳⍤1⊢1'  (↓b)⍳¨1 → 1.45E¯1 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕                     b⍳⍤1⊢1  → 2.77E¯1 | +90% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕      i←9⌊b⍳⍤1⊢1  ⍝ the 9⌊ avoids INDEX ERROR in the following      m←?1e6 10⍴127      cmpx 'm[(⍳≢m),¨i]' 'i⌷⍤¯1⊢m' '(,m)[i+(1↓⍴m)×⍳≢m]'  m[(⍳≢m),¨i]        → 2.66E¯1 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕               i⌷⍤¯1⊢m            → 3.87E¯1 | +45% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕  (,m)[i+(1↓⍴m)×⍳≢m] → 6.88E¯3 | -98% ⎕                                       `

I bet that special code for ⍳⍤r and ⌷⍤r would be at least as good as the best of the alternatives. (And never bet against an implementer in this sort of thing. :-)
Roger|Dyalog

Posts: 238
Joined: Thu Jul 28, 2011 10:53 am

### Re: Trying out Rank

Well, it's a bit rough yet, but the following timings will obtain in version 15.0:

Code: Select all
`      i←?1e6⍴10      m←?1e6 10⍴127      cmpx 'm[(⍳≢m),¨i]' 'i⌷⍤¯1⊢m' '(,m)[i+(1↓⍴m)×⍳≢m]'  m[(⍳≢m),¨i]        → 3.19E¯1 |    0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕  i⌷⍤¯1⊢m            → 1.38E¯3 | -100%  (,m)[i+(1↓⍴m)×⍳≢m] → 7.75E¯3 |  -98% ⎕                                       `

A "side-by-side" comparison, including the best possible alternative:

Code: Select all
`      v←,m ⋄ j←i+(1↓⍴m)×⍳≢m  ⍝ pre-compute arrays for best possible alternative      cmpx 'i⌷⍤¯1⊢m' '(,m)[i+(1↓⍴m)×⍳≢m]' 'v[j]'  i⌷⍤¯1⊢m            → 1.25E¯3 |    0% ⎕⎕⎕⎕⎕⎕⎕⎕                                  (,m)[i+(1↓⍴m)×⍳≢m] → 6.35E¯3 | +407% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕  v[j]               → 2.31E¯3 |  +84% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕                         `

Will also do ⍳⍤r, which will be a lot more work than this.
Roger|Dyalog

Posts: 238
Joined: Thu Jul 28, 2011 10:53 am

### Re: Trying out Rank

Because my code still has to work in Dyalog 13.2 I don't actually have any examples of using the primitive but a quick search reveals seven distinct calls to an APL model direct operator (rk) four of which are dyadic catenate
`      ,rk l r`
one is monadic enclose of the major cells
`      ⊂rk ¯1`
and two are a dyadic and a monadic call in another utility operator in which the argument(s) and rank(s) are computed and the function is an operand.
`      (f ⍺)⍺⍺ rk(l r)g ⍵`
and
`      ⍺⍺ rk r g ⍵`
I believe I could come up with faster algorithms for each example of catenate but have chosen to anticipate the move to ⍤ for clarity. The others were the most obvious ways to do what had to be done.

Phil Last

Posts: 576
Joined: Thu Jun 18, 2009 6:29 pm

### Re: Trying out Rank

Both ⍺,⍤r⊢⍵ and ⊂⍤r⊢⍵ have special code in the initial release of ⍤ (version 14.0). We are working on improvements to ⊂⍤r⊢⍵ and ↓⍵ with a factor of about 1.5.

⍺⍺⍤r for a general ⍺⍺ can not be sped up by much.
Roger|Dyalog

Posts: 238
Joined: Thu Jul 28, 2011 10:53 am

### Re: Trying out Rank

Roger|Dyalog wrote:⍺⍺⍤r for a general ⍺⍺ can not be sped up by much.

Actually ⍺⍺⍤r for a general ⍺⍺ would benefit from the sped up to ⊂⍤r⊢⍵, which is one justification for that speed-up.
Roger|Dyalog

Posts: 238
Joined: Thu Jul 28, 2011 10:53 am

### Re: Trying out Rank

b⍳⍤1 0⊢0 or 1 has been "souped up", and:

Code: Select all
`      b←1=?1e6 10⍴2                                       cmpx 'b⍳⍤1 0⊢1' '(↓b)⍳¨1'  b⍳⍤1 0⊢1 → 9.25E¯3 |     0% ⎕⎕⎕                                       (↓b)⍳¨1  → 1.40E¯1 | +1408% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕`

The expression also illustrates of why it's better to pad the last dimension of a boolean array to be a multiple of 8:

Code: Select all
`      c←b,1e6 6⍴0      ⍴c1000000 16      cmpx 'c⍳⍤1 0⊢1' '(↓c)⍳¨1'  c⍳⍤1 0⊢1 → 6.63E¯3 |     0% ⎕⎕                                        (↓c)⍳¨1  → 1.39E¯1 | +2000% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕`
Roger|Dyalog

Posts: 238
Joined: Thu Jul 28, 2011 10:53 am