Monday, 30 June 2008
GCJ Practice C: Egg Drop, Part V: Code for large data set
# egg_drop.rb
## Precalculation of G
$g=[]
101.times do |drops|
$g[drops] = []
101.times do |breaks|
$g[drops][breaks] = (drops == 0 or breaks == 0) ? 0 : 1+$g[drops-1][breaks-1]+$g[drops-1][breaks]
end
end
## Precalculation of H
$h=[]
100001.times do |drops|
if (drops % 100 == 0)
puts drops
end
$h[drops] = []
7.times do |breaks|
$h[drops][breaks] = (drops == 0 or breaks == 0) ? 0 : 1+$h[drops-1][breaks-1]+$h[drops-1][breaks]
end
end
$infinity = 1 << 32
## Definition of f
def f(drops, breaks)
return (breaks <= 0 or drops <= 0) ? 0 :
(breaks == 1) ? drops :
(drops >= 100 and breaks >= 7) ? $infinity :
(drops >= 100000 and breaks >= 2) ? $infinity :
(drops >= 100) ? $h[drops][breaks] :
(breaks >= drops) ? $g[drops][drops] :
$g[drops][breaks]
end
case_number = 0
## Open files for reading and writing
File.open('I:\CodeJam\C-large.out','w') do |out_file|
File.open('I:\CodeJam\C-large.in','r') do |in_file|
while line = in_file.gets
## Parse input - skip first line
in_floors, in_drops, in_breaks = line.split.map do |word|
word.to_i()
end
if in_drops
puts line
## Calculate output
out_floors, out_drops_upper, out_breaks_upper = in_floors, in_drops, in_breaks
out_drops_lower, out_breaks_lower = 0, 0
if (f(in_drops,in_breaks) >= $infinity)
out_floors = -1
else
out_floors = f(in_drops,in_breaks)
end
while (out_drops_lower + 1 < out_drops_upper)
out_drops_middle = (out_drops_lower + out_drops_upper) / 2
if (f(out_drops_middle, in_breaks) >= in_floors)
out_drops_upper = out_drops_middle
else
out_drops_lower = out_drops_middle
end
end
while (out_breaks_lower + 1 < out_breaks_upper)
out_breaks_middle = (out_breaks_lower + out_breaks_upper) / 2
if (f(in_drops, out_breaks_middle) >= in_floors)
out_breaks_upper = out_breaks_middle
else
out_breaks_lower = out_breaks_middle
end
end
out_drops, out_breaks = out_drops_upper, out_breaks_upper
## Put output into required format
case_number += 1
outString = "Case ##{case_number}: #{out_floors} #{out_drops} #{out_breaks}"
out_file.puts outString
end
end
end
end
GCJ Practice C: Egg Drop, Part III: Solution for large data set
In working out how to scale this problem to the large data set, we need to identify what the problem areas of code are that will take too long with the larger bounds of the data. The problematic areas are highlighted in the below code:
Set f(D,B)=0 for D in [0..100] and B in [0..100]
For D in [1..100] do
For B in [1..100] do
Set f(D,B)=1+f(D-1,B)+f(D-1,B-1)
Next B
Next D
Read in F,D,B
If f(D,B) >= 2^32
Set F2 = -1
else
Set F2 = f(D,B)
end if
Set D2 = D
While (f(D2-1,B) >= F) do
Subtract 1 from D2
End while
Set B2 = B
While (f(D,B2-1) >= F) do
Subtract 1 from B2
End while
output F2, D2, B2
The latter two issues are with trying to find the boundary between the breaking and non-breaking regions by looking one at a time. These clearly cry out for a binary search.
Suppose that we know that the numbers [1,X] are split into two (possibly empty) sets Low and High, such that if i is in Low, then so are all smaller numbers, and if i is in High, then so are all bigger numbers. Suppose also that we want to know the largest element of Low (0 if Low is empty). The following pseudocode will find it:
Set i = X
While (i > 0 and i notin Low)
Subtract 1 from i
End while
print i
However, in worst-case running-time, Low will be empty and this will take X steps. Binary search will find the answer much quicker. The following pseudocode indicates how binary search works:
Set lowerBound = 0
Set upperBound = X+1
while (lowerBound + 1) < upperBound
midvalue = floor((lowerBound+uppperBound)/2)
lowerBound = midValue
else
upperBound = midValue
end if
end while
print lowerBound
Note that each time we run through the loop, lowerBound is 0 or is in Low, and upperBound is X+1 or is in High. Thus when the loop completes, lowerBound is the highest element of Low (or 0 if Low is empty), and upperBound is the smallest element of High or (X+1 if High is empty)
Thus we can replace the part of the code looking for D2 with the following:
Set D2u = D # Smallest currently known value such that f(.,B) >= F
Set D2l = 0 # Largest currently known value such that f(.,B) < F
while (D2l + 1) < D2u
D2m = floor((D2u+D2l) / 1)
if f(D2m,B) >= F
Set D2u = D2m
else
Set D2l = D2m
end if
end while
Set D2 = D2u
We can do similarly with the code looking for B2.
However, the first block of code suffers from a much more chronic problem. We are trying to calculate the values of f(D,B) for all D and B in the range, and there are just too many such D,B to do them all in the required time. We have to replace this with some quicker way to only calculate the values of f which we need.
The method I used during the contest was to find a non-recursive formula for f and just apply this as a function. Note that the non-recursive formula I found was still to slow to work out f(2^30,2^30), but I could put in a quick test to see if f(D,B)>2^32, and then quickly calculate f(D,B) if it were not that large.
However, this is more complicated than this problem needs. All it really needs is to spend some time looking at the f(D,B) values calculated in the previous problem. Almost all of them are over 2^32, and so their precise values are not needed by the program. In particular, f(100,7)>2^32, and hence f(D,B)>2^32 for all D>=100 and B>=7.
Some quick experimenting will show that f(100000, 2)>2^32 and hence f(D,B)>2^32 for all D>=100000 and B>=2. Thus for D>=100000, the only important value of f(D,.) is f(D,1)=D.
Thus we can replace the original calculation section with the following:
Initialize a double array G of size 101,101
Initialize a double array H of size 100001,7
Set G[D,B]=0 for D in [0..100] and B in [0..100]
For D in [1..100] do
For B in [1..100] do
Set G[D,B]=1+G[D-1,B]+G[D-1,B-1]
Next B
Next D
Set H[D,B]=0 for D in [0..100000] and B in [0..6]
For D in [1..100] do
For B in [1..6] do
Set H[D,B] = G[D,B]
Next B
Next D
For D in [101..100000] do
For B in [1..6] do
Set H[D,B]=1+H[D-1,B]+H[D-1,B-1]
Next B
Next D
Define function f(D,B) as:
If (D >= 100 and B >= 7) return 2^32
If (D >= 100000 and B >= 2) return 2^32
If (B = 1) return D
If (D <= 0) return 0
If (B <= 0) return 0
If (D >= 100) {
# Since we are here, we know that neither B <= 0 or B = 1 are true
# Thus B >= 2
# We therefore know, since we are here, that D >= 100000 is not true
# Also, since we are here, we know that D >= 100 and hence B >= 7 is not true
# Thus 0 <= B <= 6 and 0 <= D <= 99999, so D,B is in the range of H
return H[D,B]
}
# Now we know that 0 <= D < 100
if (B > D) Set B = D # You can't break eggs you don't drop
# We now know that 0 <= B < 100 also, so D,B i sin the range of G
return G[D,B]
End Define
In my next post, I will put this all together for some code that solves the large data set.
GCJ Practice C: Egg Drop, Part III: Solution for small data set
The discussion above is enough to fully solve the small data set. Here is the solution is pseudocode:
Set f(D,B)=0 for D in [0..100] and B in [0..100]
For D in [1..100] do
For B in [1..100] do
Set f(D,B)=1+f(D-1,B)+f(D-1,B-1)
Next B
Next D
Read in F,D,B
If f(D,B) >= 2^32
Set F2 = -1
else
Set F2 = f(D,B)
end if
Set D2 = D
While (f(D2-1,B) >= F) do
Subtract 1 from D2
End while
Set B2 = B
While (f(D,B2-1) >= F) do
Subtract 1 from B2
End while
output F2, D2, B2
Here is the same solution in Ruby:
# egg_drop.rb
## Precalculation of f
f=[]
101.times do |drops|
f[drops] = []
101.times do |breaks|
f[drops][breaks] = (drops == 0 or breaks == 0) ? 0 : 1+f[drops-1][breaks-1]+f[drops-1][breaks]
end
end
case_number = 0
## Open files for reading and writing
File.open('I:\CodeJam\C-small.out','w') do |out_file|
File.open('I:\CodeJam\C-small.in','r') do |in_file|
while line = in_file.gets
## Parse input - skip first line
in_floors, in_drops, in_breaks = line.split.map do |word|
word.to_i()
end
if in_drops
## Calculate output
out_floors, out_drops, out_breaks = in_floors, in_drops, in_breaks
if (f[in_drops][in_breaks] >= (1 <<>
out_floors = -1
else
out_floors = f[in_drops][in_breaks]
end
while (f[out_drops-1][in_breaks] >= in_floors)
out_drops -= 1
end
while (f[in_drops][out_breaks-1] >= in_floors)
out_breaks -= 1
end
## Put output into required format
case_number += 1
out_file.puts "Case # #{case_number}: #{out_floors} #{out_drops} #{out_breaks}"
end
end
end
end
This solves the small data set in a couple of seconds. However, since the number of drops and breaks can get as large as 2,000,000,000, the double loop at the start (and the associated array) are not feasible. Thus we will need a different idea to solve the large data set.
GCJ Practice C: Egg Drop, Part II: Recursive Solution, Sample Cases
The reformulation of the problem was as follows:
X is a variable from the range {0, 1, ..., F}.
We wish to evaluate X asking at most D questions of the form "Is X >= f?", from which we are allowed at most B answers "No".
We write Solvable(F,D,B) if it is possible to come up with a strategy that will work no matter what F is.
PART I
Suppose your first question is "Is X >= i". Then there are two possibilities:
- The answer is yes, and you are looking to evaluate whether X = i, i+1, ..., F, using at most D-1 questions and at most B "No" responses.
- The answer is no, and you are looking to evaluate whether X = 0, ..., i-1, using at most D-1 questions and at most B-1 "No" responses.
You can guarantee solvability in the first case if and only if Solvable(F-i, D-1, B) (with floor i becoming the new floor 0). You can guarantee solvability in the second case if and only if Solvable(i-1, D-1, B-1).
PART II
It is clear that Solvable(F,D,B) implies Solvable(F',D,B) for all F' < F. It is also clear that since you are asking D yes no questions, you can only get 2^D possible sets of responses, and so Solvable(2^D,D,B) is not true (as this would require different sets of responses for floors 0, 1, ..., 2^D).
Thus there is a largest integer F such that Solvable(F,D,B) holds (call it f(D,B)) and then Solvable(F,D,B) is equivalent to F <= f(D,B).
PART III
If D=0, you can't ask any questions, and so there can be no floors. If B = 0, you can't ask any questions you don't know the answer to, and so there can be no floors. Thus f(D,0)=f(0,B)=0.
If D>0 and B>0, then F <= f(D,B) iff Solvable(F,D,B) iff there exists i such that Solvable(i-1,D-1,B-1) and Solvable(F-i,D-1,B), i.e. iff there exists i such that i <= f(D-1,B-1) and F-(i+1) <= f(D-1,B).
This states that F <= f(D,B) iff F-(f(D-1,B-1)+1) <= f(D-1,B), which is iff F <= f(D-1,B)+f(D-1,B-1)+1.
Thus f(D,B)=f(D-1,B)+f(D-1,B-1)+1.
If we define f(D,B) by f(D,0)=f(0,B)=0 and f(D,B)=f(D-1,B)+f(D-1,B-1)+1, then Solvable(F,D,B) is equivalent to F <= f(D,B).
In the problem we are given values F,D,B for which Solvable(F,D,B) is true, and asked to return the largest F, F_max, such that Solvable(F_max, D, B) is true, the smallest D, D_min, such that Solvable(F, D_min, B) is true and the smallest B, B_min, such that Solvable(F, D, B_min) is true.
The smallest D_min such that Sovable(F,D_min,B) is true is the smallest D_min such that F <= f(D_min, B), and similarly B_min is the smallest value such that F <= f(D, B_min).
Thus, with the two sample inputs being F,D,B = (3,3,3) and (7,5,3), we need to know the values of f(D,B) for all 0<=D<=5 and 0<=B<=3.
We can compute these using f(0,B)=f(D,0)=0 and the recursive formula
f(D,B)=f(D-1,B-1)+f(D-1,B)+1:
f(0,0)=0 f(0,1)=0 f(0,2)=0 f(0,3)=0
f(1,0)=0 f(1,1)=1 f(1,2)=1 f(1,3)=1
f(2,0)=0 f(2,1)=2 f(2,2)=3 f(2,3)=3
f(3,0)=0 f(3,1)=3 f(3,2)=6 f(3,3)=7
f(4,0)=0 f(4,1)=4 f(4,2)=10 f(4,3)=14
f(5,0)=0 f(5,1)=5 (5,2)=15 f(5,3)=25
So, for (F,D,B)=(3,3,3):
F_max = f(3,3)=7.
D_min = (smallest D such that f(.,3)>=3) = 2
B_min = (smallest B such that f(3,.)>=3) = 1
for (F,D,B)=(7,5,3)
F_max = f(5,3)=25
D_min = (smallest D such that f(.,3)>=7) = 3
B_min = (smallest B such that f(5,.)>=7) = 2
GCJ Practice C: Egg Drop, Part I: Clarification
Since some people had difficulty understanding the problem, I will attempt to state it clearer:
There is a building with F floors. We are interested in knowing from which floors we can drop an egg without it breaking. We make the following assumptions:
- If you drop two eggs from the same floor, then either both will break or neither will break. It does not depend on time of day, whether the egg is a chicken egg or an easter egg, or anything other than the floor from which the egg is dropped.
- If it breaks when you drop it from the k'th floor, it will also break from any higher floor.
- (Equivalent to the previous) If it does not break when you drop it from the k'th floor, it iwll also not break from any lower floor.
If we define X to be the highest floor from which you can drop an egg and it will not break, it is clear that:
- X takes some value in the range {0,1,...,F}. (X=0 means the egg breaks from every floor, while X=F means the egg breaks from no floor).
- Working out for precisely which floors an egg will break is equivalent to working out the value of X (f <= X means that the egg does not break from floor f, while f > X means that it does)
- Dropping an egg from floor f is equivalent to asking "Is X >= f". A yes response is equivalent to an intact egg, and a no response is equivalent to a broken egg.
Suppose you have a 100 floor building (F = 100). One way to answer the question is to go to all the floors in any old order, and drop an egg off each. If D>=100 and B>=100, then you have solved the building. This shows Solvable(100, 100, 100) is true.
However, you are wasting effort (and eggs). Suppose now, that you drop the first egg from the 30th floor. One of two things will happen:
- It will break, and you then know that it breaks from the 30th, 31st, 32nd, ..., 100th floors. You still have to discover about the 1st, 2nd, ..., 29th. You have broken 1 egg and dropped 1 egg.
- It does not break, and you then know that it will not break from the 1st, 2nd, ..., 29th, 30th floor. You are left to discover about the 31st, ..., 100th. You have broken 0 eggs and dropped 1 egg.
This shows that Solvable(100,71,70) holds.

















































