rotational duplicates
Forum rules
This forum is for discussing APL-related issues. If you think that the subject is off-topic, then the Chat forum is probably a better place for your thoughts !
This forum is for discussing APL-related issues. If you think that the subject is off-topic, then the Chat forum is probably a better place for your thoughts !
3 posts
• Page 1 of 1
rotational duplicates
Skip Cave posted an interesting puzzle on the J Programming Forum on 2020-05-16. A rewording of the problem is as follows:
Some solutions are presented in the following message.
.
.
.
.
.
.
.
.
.
.
.
.
.
Given an integer matrix, find the rows which are rotational duplicates of each other. Two rows r and s are rotational duplicates if r≡k⌽s for some integer k.
Some solutions are presented in the following message.
.
.
.
.
.
.
.
.
.
.
.
.
.
- Roger|Dyalog
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: rotational duplicates
Given an integer matrix, find the rows which are rotational duplicates of each other. Two rows r and s are rotational duplicates if r≡k⌽s for some integer k.
The crux of the problem is to find a signature (key) which uniquely identifies the rotational variants of a vector. The groups of rotational duplicates obtain by (sig mat){⊂⍵}⌸mat. The solutions here use as signature the lexicographically least (first in a sort) of the possible rotations of a vector.
Test Data
odometer← ⊢ ⊤⍤1 0 ⍳∘(×/)
x←,⍨ ⍪⍨ odometer 5⍴5
⍴x
6250 10
4↑x
0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 1
0 0 0 0 2 0 0 0 0 2
0 0 0 0 3 0 0 0 0 3
¯4↑x
4 4 4 4 1 4 4 4 4 1
4 4 4 4 2 4 4 4 4 2
4 4 4 4 3 4 4 4 4 3
4 4 4 4 4 4 4 4 4 4
sig0
sig0← (⊃∘⍋ ⌷ ⊢) ∘ (⍳∘≢ ⌽⍤0 1 ⊢) ⍤1
⍴ sig0 x
6250 10
⍴ q← ∪ sig0 x
629 10
(10↑q) (¯10↑q)
┌───────────────────┬───────────────────┐
│0 0 0 0 0 0 0 0 0 0│2 4 4 4 3 2 4 4 4 3│
│0 0 0 0 1 0 0 0 0 1│2 4 4 4 4 2 4 4 4 4│
│0 0 0 0 2 0 0 0 0 2│3 3 3 3 3 3 3 3 3 3│
│0 0 0 0 3 0 0 0 0 3│3 3 3 3 4 3 3 3 3 4│
│0 0 0 0 4 0 0 0 0 4│3 3 3 4 4 3 3 3 4 4│
│0 0 0 1 1 0 0 0 1 1│3 3 4 3 4 3 3 4 3 4│
│0 0 0 1 2 0 0 0 1 2│3 3 4 4 4 3 3 4 4 4│
│0 0 0 1 3 0 0 0 1 3│3 4 3 4 4 3 4 3 4 4│
│0 0 0 1 4 0 0 0 1 4│3 4 4 4 4 3 4 4 4 4│
│0 0 0 2 1 0 0 0 2 1│4 4 4 4 4 4 4 4 4 4│
└───────────────────┴───────────────────┘
(Misalignments in the display are due to defects in the APL Chat Forum software.)
sig0 applies independently to each row of the matrix. It finds all the rotations of a row, and selects the first in the sort of these rotations.
sig1
sig1← (⊃∘⍋ ⌷ ⊢) ∘ ((⍸ ⊢ = ⌊/) ⌽⍤0 1 ⊢) ⍤ 1
(sig0 ≡ sig1) x
1
The lexicographically least rotation of a vector necessarily begins with the smallest element. Instead of doing all rotations of a row, sig1 works with only those rotations which begins with the smallest element.
sig2
sig2←{
(n c)←⍴⍵
0 1↓ t ⌷⍨⊂ (⊂c×⍳n)⌷ ⍋t← (c⌿⍳n), ((n×c)⍴⍳c) ⌽⍤¯1 ⊢c⌿⍵
}
(sig0 ≡ sig2) x
1
sig2 is like sig0, working with all rotations of a row, but does so on the entire matrix at once and not separately on each row (not f⍤1).
sig3
sig3←{
(n c)←⍴⍵ ⍝ # rows and columns
r←(⍳n)×1+2×⌈/|,⍵ ⍝ the "radius" increment for each row
q←⍵+⍤¯1⊢r ⍝ increase each row by the radius
b←q=⍤¯1⌊/q ⍝ where elements equal a row minimum
s←+/b ⍝ # times that happens for each row
r -⍤¯1⍨ t ⌷⍨ ⊂ (⊂+\¯1↓0,s) ⌷ ⍋t← (c|⍸,b) ⌽ s⌿q
}
(sig0 ≡ sig3) x
1
Likewise, sig3 is sig1 reworked to work on the entire matrix at once. It assumes that the maximal element in the entire matrix (needed for the “radius”) is not so large as to consume all available precision. If the maximal element is indeed too large, do sig3⍳⍨x instead of just sig3 x.
Benchmarks
cmpx 'sig0 x' 'sig1 x' 'sig2 x' 'sig3 x'
sig0 x → 2.44E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
sig1 x → 1.99E¯2 | -19% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
sig2 x → 1.94E¯2 | -21% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
sig3 x → 2.73E¯3 | -89% ⎕⎕⎕
w←?100 600⍴10 ⍝ wide matrix
cmpx 'sig0 w' 'sig1 w' 'sig2 w' 'sig3 w'
sig0 w → 4.08E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
sig1 w → 3.84E¯3 | -91% ⎕
sig2 w → 1.24E¯1 | +203% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
sig3 w → 4.84E¯3 | -89% ⎕
n←?6000 10⍴10 ⍝ narrow matrix
cmpx 'sig0 n' 'sig1 n' 'sig2 n' 'sig3 n'
sig0 n → 2.40E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
sig1 n → 1.68E¯2 | -30% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
sig2 n → 1.85E¯2 | -23% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
sig3 n → 1.66E¯3 | -94% ⎕⎕
These benchmarks illustrate the rule-of-thumb that code that lets primitive functions “see” larger amounts of data tends to be faster.
- Roger|Dyalog
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: rotational duplicates
The “radius” technique used in sig3 of the previous post converts a function on a cell (here vectors) into an equivalent one on every cell of an entire array. The idea is that by adding a different multiple of the radius to different cells, the function operating on all the cells at once automatically produces independent results.
We illustrate this with an example. Suppose we want to sort the rows of matrix y (sort each row; funny language, English :-) One way is to do {⍵[⍋⍵]}⍤1 ⊢y:
With the radius approach, we first find the number 1+2×⌈/|,y, the “radius”:
When the rows are incremented by successive multiples of the radius, a value in a row is necessarily less than any of the values in the next row. If the incremented matrix is ravelled and then sorted, the values of a row stay together. A final reshape and subtraction of r produce the required result:
f invokes {⍵[⍋⍵]} once, independent of the number of rows in the argument. In contrast, {⍵[⍋⍵]}⍤1⊢y can (for all we know) invoke {⍵[⍋⍵]} on each row of y. The technique also works for grade ⍋ (and other functions).
Benchmarks done in version 17.1. The results may change if/when the {⍵[⍋⍵]}⍤1 and ⍋⍤1 implementations are improved.
(Misalignments in the display are due to defects in the APL Chat Forum software.)
We illustrate this with an example. Suppose we want to sort the rows of matrix y (sort each row; funny language, English :-) One way is to do {⍵[⍋⍵]}⍤1 ⊢y:
y
4 ¯18 17 52 ¯5 52 30 ¯10 ¯18 44
¯7 ¯13 16 ¯20 ¯14 21 59 20 6 34
20 ¯19 42 29 33 19 45 ¯35 ¯33 ¯34
{⍵[⍋⍵]}⍤1 ⊢y
¯18 ¯18 ¯10 ¯5 4 17 30 44 52 52
¯20 ¯14 ¯13 ¯7 6 16 20 21 34 59
¯35 ¯34 ¯33 ¯19 19 20 29 33 42 45
With the radius approach, we first find the number 1+2×⌈/|,y, the “radius”:
⌈/|,y
59
1+2×⌈/|,y
119
⊢ r← (⍳≢y) × 1+2×⌈/|,y
0 119 238
When the rows are incremented by successive multiples of the radius, a value in a row is necessarily less than any of the values in the next row. If the incremented matrix is ravelled and then sorted, the values of a row stay together. A final reshape and subtraction of r produce the required result:
y +⍤¯1 ⊢r
4 ¯18 17 52 ¯5 52 30 ¯10 ¯18 44
112 106 135 99 105 140 178 139 125 153
258 219 280 267 271 257 283 203 205 204
{⍵[⍋⍵]} , y+⍤¯1 ⊢r ⍝ newlines manually inserted into display
¯18 ¯18 ¯10 ¯5 4 17 30 44 52 52
99 105 106 112 125 135 139 140 153 178
203 204 205 219 257 258 267 271 280 283
(⍴y) ⍴ {⍵[⍋⍵]} , y+⍤¯1 ⊢r
¯18 ¯18 ¯10 ¯5 4 17 30 44 52 52
99 105 106 112 125 135 139 140 153 178
203 204 205 219 257 258 267 271 280 283
r -⍤¯1⍨ (⍴y) ⍴ {⍵[⍋⍵]} , y+⍤¯1 ⊢r
¯18 ¯18 ¯10 ¯5 4 17 30 44 52 52
¯20 ¯14 ¯13 ¯7 6 16 20 21 34 59
¯35 ¯34 ¯33 ¯19 19 20 29 33 42 45
f← {r -⍤¯1⍨ (⍴⍵)⍴ {⍵[⍋⍵]} , ⍵+⍤¯1 ⊢r←(⍳≢⍵)×1+2×⌈/|,⍵}
f y
¯18 ¯18 ¯10 ¯5 4 17 30 44 52 52
¯20 ¯14 ¯13 ¯7 6 16 20 21 34 59
¯35 ¯34 ¯33 ¯19 19 20 29 33 42 45
{⍵[⍋⍵]}⍤1 ⊢y
¯18 ¯18 ¯10 ¯5 4 17 30 44 52 52
¯20 ¯14 ¯13 ¯7 6 16 20 21 34 59
¯35 ¯34 ¯33 ¯19 19 20 29 33 42 45
f invokes {⍵[⍋⍵]} once, independent of the number of rows in the argument. In contrast, {⍵[⍋⍵]}⍤1⊢y can (for all we know) invoke {⍵[⍋⍵]} on each row of y. The technique also works for grade ⍋ (and other functions).
g ← {(⊃⌽⍴⍵) | (⍴⍵) ⍴ ⍋ , ⍵+⍤¯1 ⊢(⍳≢⍵)×1+2×⌈/|,⍵}
(⍋⍤1 ≡ g) y
1
Benchmarks done in version 17.1. The results may change if/when the {⍵[⍋⍵]}⍤1 and ⍋⍤1 implementations are improved.
x←?1e4 10⍴100
cmpx 'f x' '{⍵[⍋⍵]}⍤1 ⊢x'
f x → 2.67E¯3 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
{⍵[⍋⍵]}⍤1 ⊢x → 5.04E¯3 | +88% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
cmpx 'g x' '⍋⍤1 ⊢x'
g x → 3.32E¯3 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
⍋⍤1 ⊢x → 3.93E¯3 | +18% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
x←?1e4 5⍴100
cmpx 'f x' '{⍵[⍋⍵]}⍤1 ⊢x'
f x → 1.73E¯3 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
{⍵[⍋⍵]}⍤1 ⊢x → 4.44E¯3 | +156% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
cmpx 'g x' '⍋⍤1 ⊢x'
g x → 2.17E¯3 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
⍋⍤1 ⊢x → 3.77E¯3 | +73% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
x←?1e4 20⍴100
cmpx 'f x' '{⍵[⍋⍵]}⍤1 ⊢x'
f x → 4.46E¯3 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
{⍵[⍋⍵]}⍤1 ⊢x → 6.11E¯3 | +37% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
cmpx 'g x' '⍋⍤1 ⊢x'
g x → 5.75E¯3 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
⍋⍤1 ⊢x → 4.13E¯3 | -29% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
x←?1e4 80⍴100
cmpx 'f x' '{⍵[⍋⍵]}⍤1 ⊢x'
f x → 1.39E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
{⍵[⍋⍵]}⍤1 ⊢x → 1.02E¯2 | -27% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
cmpx 'g x' '⍋⍤1 ⊢x'
g x → 1.88E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
⍋⍤1 ⊢x → 5.58E¯3 | -71% ⎕⎕⎕⎕⎕⎕⎕⎕⎕
(Misalignments in the display are due to defects in the APL Chat Forum software.)
- Roger|Dyalog
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
3 posts
• Page 1 of 1
Who is online
Users browsing this forum: No registered users and 1 guest
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group