Median of Medians -
i have read order statistics find k-th smallest (or largest) element in array of size n in linear time o(n).
there 1 step needs find median of medians.
- split array [n/5] parts. each part has 5 elements.
- find median in each part. (we have [n/5] numbers now)
- repeat step 1 , 2 until have last number. (i.e. recursive)
t(n) = t(n/5) + o(n) , can t(n) = o(n).
but, true that, number not median of medians, median of medians of medians of medians of medians of medians, if have large array.
please consider array has 125 elements.
first, split 25 parts , find 25 medians. then, split these 25 numbers 5 parts , find 5 medians, finally, obtain number median of medians of medians. (not median of medians)
the reason why care that, can understand there @ [3/4]*n elements smaller (or larger) median of medians. if not median of medians median of medians of medians? in worse case there must less elements smaller (or larger) pivot, means pivot closer bound of array.
if have large array, , found median of medians of medians of medians of medians of medians. in worst case pivot found can still close bound , time complexity in case?
i made dataset of 125 elements. result 9?
0.8 0.9 1 inf inf 1.8 1.9 2 inf inf 6.8 6.9 7 inf inf inf inf inf inf inf inf inf inf inf inf 2.8 2.9 3 inf inf 3.8 3.9 4 inf inf 7.8 7.9 8 inf inf inf inf inf inf inf inf inf inf inf inf 4.8 4.9 5 inf inf 5.8 5.9 6 inf inf 8.8 8.9 9 inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf inf
where inf means number large enough.
let's denote median of medians of medians of ... [median of]* = m.
first, believe median of medians algorithm (to select pivot) not recursive. algorithm goes follows:
- split elements in groups of 5
- find median of each group
- find median of medians , use pivot.
median of medians smaller 3n/10 elements , larger 3n/10 elements, not 3n/4. have n/5 numbers after selecting medians. median of median greater/smaller half of numbers, n/10. each of numbers median itself, it's greater/smaller 2 numbers, giving 2n/10 numbers. in total, n/10 + 2n/10 = 3n/10 numbers.
to address second question, after collecting group of 5's in example dataset , calculating medians, have following sequence:
1, 2, 7, inf, inf 3, 4, 8, inf, inf 5, 6, 9, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf.
so median of medians indeed 9.
your proposed [median of]* algorithm's runtime be:
t(n) = o(n * log(n))
now let's try analyze how many numbers have less/greater m. have following groups:
- depth 1: n/5 elements medians
- depth 2: n/25 elements medians
- ...
- depth i: n/(5^i) elements medians
each group less/greater 2 elements of previous depth, less/greater 2 elements of previous depth, , on:
calculating in total, our m greater/less (n * (2^k) + k * n) /((2^k) * (5^k)). depth = 1 median of medians, 3n/10.
now assuming depth [log_5 (n)], i.e. n = 5^k, get:
5^k * (k + 2^k)/(5^k * 2^k) -> 1.
Comments
Post a Comment