Google Interview Answers, part 0

This is the first in a series of posts where I will attempt to answer some of the interesting interview questions which are given to programming candidates at Google. Unless otherwise noted, I answered these questions of my own accord and did not look them up.

This post is literate haskell, so you can copy/paste it into a file with the extension .lhs and it will compile in ghc.

> {-# LANGUAGE NoMonomorphismRestriction #-}
> import Data.List
> import Data.Maybe
> import Control.Monad
> import System.Random

Question: Given a function which produces a random integer in the range 1 to 5, write a function which produces a random integer in the range 1 to 7.

I’m going to slightly modify the question. First, I want to deal with integers starting with 0, so the ranges will be 0 to 4 and 0 to 6. Second, random numbers are a little tricky in Haskell, since they violate referential integrity. My answers could be written monadically to handle this, but instead I will write functions that take an infinite list (or lists) of integers, assumed to be uniformly random and in the range of 0 to 4. Here’s my first answer:

> random_0_to_6_take_0 = head

head is the haskell function that returns the first member of a list. Since the first member is given as a random integer in the range 0 to 4, it is also a random number in the range 0 to 6. But that’s not the intent of this question. Let’s get a little closer to meeting that intent.

> random_0_to_6_take_1 ns = (sum $ take 6 ns) `div` 4

take returns the first n (in this case 6) members of a list. sum adds them up. So, this function sums up 6 random numbers to get 0-24, and integer divides the result by 4 to get a number from 0 to 6.

Again, this answer matches the specification of the question, but likely not the intent. This will produce a random integer between 0 and 6, with 5 and 6 as possible outputs, but the distribution of the outputs won’t be uniform. They’ll be biased towards the middle, just as two 6-sided dice are most likely to add up to 7 when rolled. Additionally, this function will produce 6 only once in 15625 calls (on average). Let’s fix that:

> random_0_to_6_take_2 ns0 ns1 =
>   let n = fromJust $
>            find (< 21) $
>            zipWith (\x y -> x + (5 * y)) ns0 ns1
>   in
>     n `div` 3

zipWith traverses two lists memberwise, mapping each pair of members using the given function. Since x and y are uniformly distributed between 0 and 4, x + (5 * y) is uniformly distributed between 0 and 24. find returns the first member satisfying the given predicate, which nets us an integer, uniformly distributed between 0 and 20. Dividing by 3 and truncating gives an integer, uniformly distributed between 0 and 6. Mission accomplished! Now to make it work.

> skip1 [] = []
> skip1 (hd:tl) = hd:(skip2 tl)

> skip2 [] = []
> skip2 (hd:tl) = skip1 tl 

> take_2 lst = random_0_to_6_take_2 (skip1 lst) (skip2 lst)

> aggregate = map (\g -> (head g, length g)) . group . sort

> create_many ls f = zipWith ($) (replicate 1000000 f) ls

> random_list :: RandomGen g => g -> [Int]
> random_list = randomRs (0, 4)

> run ls f = putStrLn . show . aggregate $ create_many ls f

> functions = [random_0_to_6_take_0, random_0_to_6_take_1, take_2]

> main = do
>   gen <- getStdGen
>   let lists = map random_list $ map mkStdGen $ randoms gen
>   mapM_ (run lists) functions

And here’s an output I got. Notice that in 1,000,000 iterations, the second function produced 6 only 63 times.

[(0,200092),(1,200329),(2,199494),(3,199689),(4,200396)]
[(0,5456),(1,94158),(2,344817),(3,395923),(4,146038),(5,13545),(6,63)]
[(0,142393),(1,142973),(2,142851),(3,143253),(4,142905),(5,142912),(6,142713)]

Overall, this was great fun. What an excellent question! I hope my answer is enlightening.