enumerating combinations
Forum rules
This forum is for discussing APL-related issues. If you think that the subject is off-topic, then the Chat forum is probably a better place for your thoughts !
This forum is for discussing APL-related issues. If you think that the subject is off-topic, then the Chat forum is probably a better place for your thoughts !
2 posts
• Page 1 of 1
enumerating combinations
A combination of size ⍺ from ⍳⍵ is a sorted vector of length ⍺ with unique items, all of which are in ⍳⍵. The following function from [Hui 1987, §4.1], rewritten in current Dyalog APL, generates all combinations of size ⍺ from ⍳⍵, in sorted order.
(Aside: 1+(⍺-1)∇(⍵-1) ←→ ⍺∇⍢(-∘1)⍵ where ⍢ is the prospective under AKA dual operator.)
In this note, we consider the problem of enumerating combinations: Given ⍺ and ⍵ and a combination c, determine the index of that combination in ⍺ comb ⍵. For example, for ⍺←4 and ⍵←6, the index of 0 2 3 4 is 6 and that of 1 2 3 5 is 11. The index is of course (⍺ comb ⍵)⍳c, but we want to compute the index without creating ⍺ comb ⍵.
In ifc1, k is the leading term of the combination in question. The sum (m-1)+.!(n-k)+⍳k accounts for the combinations whose leading terms are less than k. Using the sum directly as in ifc1 is not satisfactory for an enormous combination such as 23 1e4 ifc1 ¯23↑⍳1e4 (with k=9977).
What is this sum? To use a particular example, the terms of the sum are
and are adjacent entries in Pascal’s triangle, shown in green below:
This is the sum analyzed in Pascal’s Ladder, and the sum equals (m!n)-m!n-k (the blue (56) minus the red (4) ), the difference of two binomial coefficients instead of the sum of k binomial coefficients. Whence:
We now develop an enumeration which works for large combinations, by using the extended precision number facility nats, a monadic operator, from the dfns workspace. The results of nats-derived functions are character vectors of decimal digits; the arguments can be character vectors or scalars or ordinary integers. (We leave as an exercise for the reader the enumeration of humongous combinations, combinations where ⌈/c is 2*53 or larger, combinations whose elements require extended precision integers.)
It is relatively straightforward to model extended precision versions of functions, by replacing + by +nats, × by ×nats, etc.
Translating ifc into the extended precision domain, we get:
Collected Definitions
See Enumerating Permutation for the enumeration of permutations. Portions of this text previously appeared in the J Wiki essay Combination Index, 2005-11-11.
⍝
comb←{
(⍺=0)∨⍺=⍵:(1,⍺)⍴⍳⍺
c←(⍺-1)!(⍺-1)+⊖k←⍳1+⍵-⍺
(c⌿k), ⊃ ⍪⌿ (-c) ↑¨ ⊂1+(⍺-1)∇(⍵-1)
}
2 comb 4 3 comb 5 4 comb 6
0 1 0 1 2 0 1 2 3
0 2 0 1 3 0 1 2 4
0 3 0 1 4 0 1 2 5
1 2 0 2 3 0 1 3 4
1 3 0 2 4 0 1 3 5
2 3 0 3 4 0 1 4 5
1 2 3 0 2 3 4
1 2 4 0 2 3 5
1 3 4 0 2 4 5
2 3 4 0 3 4 5
1 2 3 4
1 2 3 5
1 2 4 5
1 3 4 5
2 3 4 5
(Aside: 1+(⍺-1)∇(⍵-1) ←→ ⍺∇⍢(-∘1)⍵ where ⍢ is the prospective under AKA dual operator.)
In this note, we consider the problem of enumerating combinations: Given ⍺ and ⍵ and a combination c, determine the index of that combination in ⍺ comb ⍵. For example, for ⍺←4 and ⍵←6, the index of 0 2 3 4 is 6 and that of 1 2 3 5 is 11. The index is of course (⍺ comb ⍵)⍳c, but we want to compute the index without creating ⍺ comb ⍵.
⍝
ifc1←{
(m n)←⍺
1≥m:⊃⍵,0
((m-1)+.!(n-k)+⍳k) + (⍺-1,1+k) ∇ 1↓⍵-1+k←⊃⍵
}
4 6 ifc1 0 2 3 4
6
4 6 ifc1⍤1 ⊢4 comb 6
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
(⍳4!6) ≡ 4 6 ifc1⍤1 ⊢4 comb 6
1
In ifc1, k is the leading term of the combination in question. The sum (m-1)+.!(n-k)+⍳k accounts for the combinations whose leading terms are less than k. Using the sum directly as in ifc1 is not satisfactory for an enormous combination such as 23 1e4 ifc1 ¯23↑⍳1e4 (with k=9977).
What is this sum? To use a particular example, the terms of the sum are
m←3 ⋄ n←8 ⋄ k←4
(m-1)!(n-k)+⍳k
6 10 15 21
and are adjacent entries in Pascal’s triangle, shown in green below:
This is the sum analyzed in Pascal’s Ladder, and the sum equals (m!n)-m!n-k (the blue (56) minus the red (4) ), the difference of two binomial coefficients instead of the sum of k binomial coefficients. Whence:
ifc← {(m n)←⍺ ⋄ -⌿ (m-⍳m) +.! n-(¯1↓0,1+⍵),⍪⍵}
4 6 ifc 0 2 3 4
6
4 6 ifc⍤1 ⊢4 comb 6
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
(⍳4!6) ≡ 4 6 ifc⍤1 ⊢4 comb 6
1
We now develop an enumeration which works for large combinations, by using the extended precision number facility nats, a monadic operator, from the dfns workspace. The results of nats-derived functions are character vectors of decimal digits; the arguments can be character vectors or scalars or ordinary integers. (We leave as an exercise for the reader the enumeration of humongous combinations, combinations where ⌈/c is 2*53 or larger, combinations whose elements require extended precision integers.)
)copy dfns nats ⍝ http://dfns.dyalog.com/n_nats.htm
3 +nats 4
7
(,'7') ≡ 3 +nats 4
1
3 ×nats 4
12
'12' ≡ 3 ×nats 4
1
3 ×nats '4'
12
It is relatively straightforward to model extended precision versions of functions, by replacing + by +nats, × by ×nats, etc.
xbin← {(0≥⍺)∨⍺≥⍵:⍕⍺=⍵ ⋄ (⊃ ×nats⌿ ⍵-d) ÷nats (⊃ ×nats⌿ 1+d←⍳⍺⌊⍵-⍺)}
11 ! 33
193536720
11 xbin 33
193536720
47 ! 99
4.47391E28
47 xbin 99
44739148260023940935799206928
Translating ifc into the extended precision domain, we get:
xfc← {(m n)←⍺ ⋄ ⊃ -nats⌿ (m-⍳m) (+nats).xbin n-(¯1↓0,1+⍵),⍪⍵}
4 6 ifc ¯4↑⍳6
14
4 6 xfc ¯4↑⍳6
14
4!6
15
23 1e4 xfc ¯23↑⍳1e4
3771461434429626616429953717057906562371311072521139025732118617119999
23 xbin 1e4
3771461434429626616429953717057906562371311072521139025732118617120000
23!1e4
3.77146E69
0 ⍕ 23!1e4
3771461434429628______________________________________________________
Collected Definitions
⍝
comb←{
(⍺=0)∨⍺=⍵:(1,⍺)⍴⍳⍺
c←(⍺-1)!(⍺-1)+⊖k←⍳1+⍵-⍺
(c⌿k), ⊃ ⍪⌿ (-c) ↑¨ ⊂1+(⍺-1)∇(⍵-1)
}
ifc1←{
(m n)←⍺
1≥m:⊃⍵,0
((m-1)+.!(n-k)+⍳k) + (⍺-1,1+k) ∇ 1↓⍵-1+k←⊃⍵
}
ifc ← {(m n)←⍺ ⋄ -⌿ (m-⍳m) +.! n-(¯1↓0,1+⍵),⍪⍵}
xbin ← {(0≥⍺)∨⍺≥⍵:⍕⍺=⍵ ⋄ (⊃ ×nats⌿ ⍵-d) ÷nats (⊃ ×nats⌿ 1+d←⍳⍺⌊⍵-⍺)}
xfc ← {(m n)←⍺ ⋄ ⊃ -nats⌿ (m-⍳m) (+nats).xbin n-(¯1↓0,1+⍵),⍪⍵}
See Enumerating Permutation for the enumeration of permutations. Portions of this text previously appeared in the J Wiki essay Combination Index, 2005-11-11.
- Roger|Dyalog
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: enumerating combinations
The previous post left some unfinished business, namely, producing the combination from the index. In this post we derive cfi (combination from index) and cfx (combination from extended-precision index), inverses of the ifc and xfc functions. First, cfi.
The expression ⊖(m-1)!(m-1)↓⍳n in cfi computes the same vector as c←(⍺-1)!(⍺-1)+⊖k←⍳1+⍵-⍺ in comb, with ⍺=m and ⍵=n. c[k] is the number of combinations whose leading item is k. For combination index ⍵ in cfi, the leading item of the corresponding combination is 1⍳⍨⍵<+⍀c.
In cfx, the binomial coefficients in c, (m-1)!(m-1)+⊖k←⍳1+n-m, and the number of them can both be very large. The sum +\c is seen to be equivalent to (m!n)-m!(m-1)+⊖⍳1+n-m using reasoning similar to the Pascal’s Ladder derivation for ifc.
The latter expression is amenable to binary search. Such search can avoid computing a binomial coefficient until it is needed.
As an intermediate step towards cfx, we write a version of cfi which uses ilead:
cfx obtains by extending ilead and cfi1 to use extended precision operations:
Examples:
The last example illustrates that since the number rows of m comb n with a leading 0 is (m-1)!(n-1), the
(m-1)!(n-1)-th row is necessarily 1+⍳m.
Collected Definitions
The computation of cfx is rather labored. One has the feeling that it can be simplified by additional insights into the mathematics.
⍝
cfi←{
(m n)←⍺
1≥m:m⍴⍵
s← +⍀ ⊖ (m-1) ! (m-1) ↓ ⍳n
k← 1 ⍳⍨ ⍵<s
k , (1+k) + (⍺-1,1+k)∇(⍵-k⌷0,s)
}
m←4 ⋄ n←9
(m,n) cfi 0
0 1 2 3
(m,n) cfi ¯1+m!n
5 6 7 8
comb←{
(⍺=0)∨⍺=⍵:(1,⍺)⍴⍳⍺
c←(⍺-1)!(⍺-1)+⊖k←⍳1+⍵-⍺
(c⌿k), ⊃ ⍪⌿ (-c) ↑¨ ⊂1+(⍺-1)∇(⍵-1)
}
(m comb n) ≡ (m,n) cfi⍤1 0 ⊢⍳m!n
1
The expression ⊖(m-1)!(m-1)↓⍳n in cfi computes the same vector as c←(⍺-1)!(⍺-1)+⊖k←⍳1+⍵-⍺ in comb, with ⍺=m and ⍵=n. c[k] is the number of combinations whose leading item is k. For combination index ⍵ in cfi, the leading item of the corresponding combination is 1⍳⍨⍵<+⍀c.
⊖(m-1)!(m-1)↓⍳n
56 35 20 10 4 1
⊢ c← (m-1)!(m-1)+⊖k←⍳1+n-m
56 35 20 10 4 1
⊢ c49 ←m comb n
0 1 2 3
0 1 2 4
0 1 2 5
0 1 2 6
0 1 2 7
...
4 5 7 8
4 6 7 8
5 6 7 8
{≢⍵}⌸c49[;0]
56 35 20 10 4 1
In cfx, the binomial coefficients in c, (m-1)!(m-1)+⊖k←⍳1+n-m, and the number of them can both be very large. The sum +\c is seen to be equivalent to (m!n)-m!(m-1)+⊖⍳1+n-m using reasoning similar to the Pascal’s Ladder derivation for ifc.
+\c
56 91 111 121 125 126
(m!n)-m!(m-1)+⊖⍳1+n-m
56 91 111 121 125 126
The latter expression is amenable to binary search. Such search can avoid computing a binomial coefficient until it is needed.
⍝
ilead←{
(m n)←⍺ ⋄ a←m!n ⋄ i←⍵
bs←{
⍵<⍺: (n-1)-⍵
d←a-m!j←⍺+⌊2÷⍨⍵-⍺
i<d: (j+1) ∇ ⍵
i>d: ⍺ ∇ (j-1)
n-j
}
(m-1)bs(n-1)
}
(m,n) ilead⍤1 0 ⍳m!n
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 5
⍉ {⍺,≢⍵} ⌸(m,n) ilead⍤1 0⍳m!n
0 1 2 3 4 5
56 35 20 10 4 1
As an intermediate step towards cfx, we write a version of cfi which uses ilead:
⍝
cfi1←{
f←{(q i z)←⍵ ⋄ (q-1+k),(i-(⍺!q)-⍺!q-k),⊂z,k←(⍺,q)ilead i}
(m n)←⍺
(⍳m) + +⍀ ⊃⌽ ⊃ f⌿ (1+⍳m),⊂n,⍵,⊂⍬
}
(m comb n) ≡ (m,n) cfi1⍤1 0 ⍳m!n
1
cfx obtains by extending ilead and cfi1 to use extended precision operations:
⍝
xlead←{
(m n)←⍺ ⋄ a←m xbin n ⋄ i←⍵
bs←{
⍵<⍺:(n-1)-⍵
d←a-nats m xbin⊢j←⍺+⌊2÷⍨⍵-⍺
⍎i<nats d: (j+1) ∇ ⍵
⍎i>nats d: ⍺ ∇ (j-1)
n-j
}
(m-1)bs(n-1)
}
cfx←{
f←{
(q i z)←⍵
(q-1+k), (⊂i -nats (⍺ xbin q) -nats ⍺ xbin q-k), ⊂z,k←(⍺,q)xlead i
}
(m n)←⍺
(⍳m) + +⍀ ⊃⌽ ⊃ f⌿ (1+⍳m),⊂n,(⊂⍵),⊂⍬
}
Examples:
5 ! 5e6
2.60416E31
⊢ x←5 xbin 5e6
26041614583369791656250001000000
5 5e6 cfx x -nats 1
4999995 4999996 4999997 4999998 4999999
¯5↑⍳5e6
4999995 4999996 4999997 4999998 4999999
99!299
1.38608E81
99 xbin 299
138608382108618824826112784210880186009348866858121623622101121910158544277466
9540
(1+⍳100) ≡ 100 300 cfx 99 xbin 299
1
The last example illustrates that since the number rows of m comb n with a leading 0 is (m-1)!(n-1), the
(m-1)!(n-1)-th row is necessarily 1+⍳m.
Collected Definitions
⍝
)copy dfns nats
comb←{
(⍺=0)∨⍺=⍵:(1,⍺)⍴⍳⍺
c←(⍺-1)!(⍺-1)+⊖k←⍳1+⍵-⍺
(c⌿k), ⊃ ⍪⌿ (-c) ↑¨ ⊂1+(⍺-1)∇(⍵-1)
}
ifc1←{
(m n)←⍺
1≥m:⊃⍵,0
((m-1)+.!(n-k)+⍳k) + (⍺-1,1+k) ∇ 1↓⍵-1+k←⊃⍵
}
ifc ← {(m n)←⍺ ⋄ -⌿ (m-⍳m) +.! n-(¯1↓0,1+⍵),⍪⍵}
xbin ← {(0≥⍺)∨⍺≥⍵:⍕⍺=⍵ ⋄ (⊃ ×nats⌿ ⍵-d) ÷nats (⊃ ×nats⌿ 1+d←⍳⍺⌊⍵-⍺)}
xfc ← {(m n)←⍺ ⋄ ⊃ -nats⌿ (m-⍳m) (+nats).xbin n-(¯1↓0,1+⍵),⍪⍵}
cfi ← {
(m n)←⍺
1≥m:m⍴⍵
s← +⍀ ⊖ (m-1) ! (m-1) ↓ ⍳n
k← 1 ⍳⍨ ⍵<s
k , (1+k) + (⍺-1,1+k)∇(⍵-k⌷0,s)
}
ilead←{
(m n)←⍺ ⋄ a←m!n ⋄ i←⍵
bs←{
⍵<⍺: (n-1)-⍵
d←a-m!j←⍺+⌊2÷⍨⍵-⍺
i<d: (j+1) ∇ ⍵
i>d: ⍺ ∇ (j-1)
n-j
}
(m-1)bs(n-1)
}
cfi1←{
f←{(q i z)←⍵ ⋄ (q-1+k),(i-(⍺!q)-⍺!q-k),⊂z,k←(⍺,q)ilead i}
(m n)←⍺
(⍳m) + +⍀ ⊃⌽ ⊃ f⌿ (1+⍳m),⊂n,⍵,⊂⍬
}
xlead←{
(m n)←⍺ ⋄ a←m xbin n ⋄ i←⍵
bs←{
⍵<⍺:(n-1)-⍵
d←a-nats m xbin⊢j←⍺+⌊2÷⍨⍵-⍺
⍎i<nats d: (j+1) ∇ ⍵
⍎i>nats d: ⍺ ∇ (j-1)
n-j
}
(m-1)bs(n-1)
}
cfx←{
f←{
(q i z)←⍵
(q-1+k), (⊂i -nats (⍺ xbin q) -nats ⍺ xbin q-k), ⊂z,k←(⍺,q)xlead i
}
(m n)←⍺
(⍳m) + +⍀ ⊃⌽ ⊃ f⌿ (1+⍳m),⊂n,(⊂⍵),⊂⍬
}
The computation of cfx is rather labored. One has the feeling that it can be simplified by additional insights into the mathematics.
- Roger|Dyalog
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
2 posts
• Page 1 of 1
Who is online
Users browsing this forum: No registered users and 1 guest
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group