Wednesday, September 1, 2010

For sets of integers, lookup speed goes: Array > Hash > Set

You have some set of integers and you need to query it repeatedly to determine if a member is in it or not.

require 'benchmark'
require 'set'

ar = (0...1000).to_a

ar_index = []
hash_index = {}
ar.each do |v|
  ar_index[v] = true
  hash_index[v] = true
end

set_index = Set.new(ar)
access = (0...1000).to_a

TIMES = 10000

Benchmark.bmbm do |r|
  r.report("array index") { TIMES.times { access.each {|v| ar_index[v] } } }
  r.report("hash index") { TIMES.times { access.each {|v| hash_index[v] } } }
  r.report("hash include") { TIMES.times { access.each {|v| hash_index.include?(v) } } }
  r.report("set include") { TIMES.times { access.each {|v| set_index.include?(v) } } }
end
The output (ruby 1.9.1p378 (2010-01-10 revision 26273) [x86_64-linux]):

                   user     system      total        real
array index    0.730000   0.000000   0.730000 (  0.745341)
hash index     0.990000   0.000000   0.990000 (  1.003517)
hash include   1.310000   0.000000   1.310000 (  1.332846)
set include    1.970000   0.000000   1.970000 (  2.024808)

No comments: