As per my action item from the last meeting, here are alternatives for assigning values to a fixed number
of bins when the range_list contains duplicate values.
There seem to be 3 alternatives.
A) Eliminate duplicate values, sort values
- Determine the set of unique values. Let V be the cardinality of this set.
- Sort the values in increasing order.
- Evenly allocate the values to the bins. The first bin gets the first N values, the second bin gets the next N values, and so on. The last bin gets the remaining values. N is the truncated integer value of V/n, where n is the number of bins as specified by the user with the "bins b[n]" notation. Note that if V < n, bins 0 to V-1 get one value each, starting from the smallest value of the sorted set; the rest of the bins do not get any value.
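The steps in A can be sketched in Python as follows. This is only an illustration of the rule as worded above, not proposed LRM text. The names `allocate` and `option_a` are hypothetical, and the range_list is assumed to be already expanded into individual integer values.

```python
def allocate(values, n):
    """Distribute values over n bins: N = V // n per bin, remainder to the last bin."""
    v = len(values)
    if v < n:
        # Fewer values than bins: bins 0..V-1 get one value each, the rest stay empty.
        return [[x] for x in values] + [[] for _ in range(n - v)]
    size = v // n
    bins = [values[i * size:(i + 1) * size] for i in range(n - 1)]
    bins.append(values[(n - 1) * size:])  # last bin takes the remaining values
    return bins


def option_a(values, n):
    """Alternative A: drop duplicates, sort in increasing order, then allocate."""
    return allocate(sorted(set(values)), n)


bns1 = [9, 1, 3, 1, 7, 1, 5, 2, 1, 4, 8, 6]
print(option_a(bns1, 3))       # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(option_a([4, 2, 4], 3))  # V < n case: [[2], [4], []]
```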
B) Eliminate duplicate values, retain user order
- Determine the ordered set of unique values. If a value occurs more than once, the 1st occurrence of the value is retained and each subsequent repetition of the value is ignored. Let V be the cardinality of this set.
- Evenly allocate the values to the bins. The first bin ...<same text as in A>.
C) Retain duplicate values, retain user order
- If a fixed number of bins is specified and the 'range_list' contains duplicate values, the duplicate values are retained for the purpose of allocating values to bins.
- Determine the list of values (including duplicates), in the order given by the user. Let V be the number of values in this list.
- Evenly allocate the values to the bins. The first bin ...<same text as in A>.
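B and C differ from A only in how the value list is prepared before allocation. A minimal Python sketch, assuming each range_list item is modelled either as an int or as a (low, high) tuple standing in for [low:high]; `expand`, `allocate`, `option_b` and `option_c` are hypothetical names used only for illustration:

```python
def expand(range_list):
    """Flatten a range list into individual values, in the order the user wrote them."""
    values = []
    for item in range_list:
        if isinstance(item, tuple):       # (low, high) models [low:high]
            low, high = item
            values.extend(range(low, high + 1))
        else:
            values.append(item)
    return values


def allocate(values, n):
    """N = V // n values per bin; the last bin takes whatever remains."""
    size = max(len(values) // n, 1)       # V < n: one value per bin, rest empty
    bins = [values[i * size:(i + 1) * size] for i in range(n - 1)]
    bins.append(values[(n - 1) * size:])
    return bins


def option_b(range_list, n):
    """Alternative B: keep the first occurrence of each value, in user order."""
    return allocate(list(dict.fromkeys(expand(range_list))), n)


def option_c(range_list, n):
    """Alternative C: keep duplicates and user order."""
    return allocate(expand(range_list), n)


bns2 = [5, 6, (8, 11), (1, 9)]
print(option_b(bns2, 3))  # [[5, 6, 8], [9, 10, 11], [1, 2, 3, 4, 7]]
print(option_c(bns2, 3))  # [[5, 6, 8, 9, 10], [11, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
```

`dict.fromkeys` is used for the B case because Python dictionaries preserve insertion order, so the first occurrence of each value keeps its position.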
My preference is for option B. The user's order of values should be preserved, and the elimination of duplicate values is consistent with the '[]' form, where a bin is created for each (unique) value.
---------------------------------------------------------------------------
To illustrate, each of the alternatives is applied to:
bins bns1[3] = { 9, 1, 3, 1, 7, 1, 5, 2, 1, 4, 8, 6 };
bins bns2[3] = {5,6, [8:11], [1:9] };
A) bns1; the sorted unique values are [1:9]
1st bin: [1:3]
2nd bin: [4:6]
3rd bin: [7:9]
bns2; the sorted unique values are [1:11]
1st bin: [1:3]
2nd bin: [4:6]
3rd bin: [7:11]
B) bns1; the unique ordered values are {9,1,3,7,5,2,4,8,6}
1st bin: {9,1,3}
2nd bin: {7,5,2}
3rd bin: {4,8,6}
bns2; the unique ordered values are {5,6,8,9,10,11,1,2,3,4,7}
1st bin: {5,6,8}
2nd bin: {9,10,11}
3rd bin: {1,2,3,4,7}
C) bns1; there are 12 values
1st bin: {9,1,3,1}
2nd bin: {7,1,5,2}
3rd bin: {1,4,8,6}
bns2; there are 15 values
1st bin: {5,6,8,9,10}
2nd bin: {11,1,2,3,4}
3rd bin: {5,6,7,8,9}
- Ray
Received on Wed Sep 29 14:29:49 2004