I have a variable containing a string like this in Ruby 1.9.3:

#HELLO
#HELLO
#HELLO
#HELLO
#WORLD
#WORLD
#WORLD
#WORLD
#FOO
#BAR
#WORLD

I'd like it to be transformed into something like:

4 times #HELLO end
4 times #WORLD end
#FOO
#BAR
#WORLD

That is to say, I'd like consecutive duplicate lines to be collapsed into one, with the count alongside.

Is there a clever way of doing this using Ruby's functional programming features or other techniques?

Cydonia7
  • You can start by counting duplicates after splitting on "\n": http://stackoverflow.com/questions/1765368/how-to-count-duplicates-in-ruby-arrays – oldergod Jul 03 '12 at 08:50

4 Answers

If you're on a Unix-like box you can probably pipe your output through uniq -c. You may need to clean the output up slightly using sed after that, but it should be relatively simple.

However, I'm sure there's a neat pure-Ruby solution too.
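
As a rough illustration of what a pure-Ruby counterpart to uniq -c might look like, here is a minimal sketch built on Enumerable#chunk, which groups adjacent elements that share a block value (available since Ruby 1.9.2). The method name collapse_runs is hypothetical, and the output format is taken from the question.

```ruby
# Hypothetical helper: collapse runs of identical consecutive lines.
# Enumerable#chunk only groups *adjacent* elements with the same block value,
# so a later, separate occurrence of a line starts a new group.
def collapse_runs(text)
  text.lines.map(&:chomp).chunk { |line| line }.map do |line, run|
    run.size > 1 ? "#{run.size} times #{line} end" : line
  end.join("\n")
end

input = "#HELLO\n#HELLO\n#HELLO\n#HELLO\n" \
        "#WORLD\n#WORLD\n#WORLD\n#WORLD\n" \
        "#FOO\n#BAR\n#WORLD"

puts collapse_runs(input)
```

Because chunk is order-sensitive, the trailing #WORLD stays on its own line instead of being merged into the earlier run of four.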

Michael Anderson

Try this:

str = "#HELLO
#HELLO
#HELLO
#HELLO
#WORLD
#WORLD
#WORLD
#WORLD
#FOO
#BAR
#WORLD"

result = ""
identical_lines = 1
str << "\n " # sentinel line so each_cons(2) also emits the last real line (mutates str)

str.lines.each_cons(2) do |line1,line2|
  if line1 == line2
    identical_lines += 1
  elsif identical_lines > 1
    result << "#{identical_lines} times #{line1.chomp} end\n"
    identical_lines = 1
  else
    result << line1
  end
end

puts result

This program outputs:

4 times #HELLO end
4 times #WORLD end
#FOO
#BAR
#WORLD
Patrick Oscity

Something like this:

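# Note: the Hash tallies every occurrence of a line, not just consecutive runs,
# so for the sample input the trailing #WORLD is merged into the earlier group.
text.each_line.each_with_object(Hash.new(0)) do |e,h|
  h[e.chomp] += 1
end.map do |k,v|
  v > 1 ? "#{v} times #{k} end" : k
end.tap do |array|
  File.open(...) { |f| array.each { |e| f.puts e } }
end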
megas

For large amounts of data you should use real compression rather than reinvent the wheel, but just for fun:

s = %q{#HELLO
#HELLO
#HELLO
#HELLO
#WORLD
#WORLD
#WORLD
#WORLD
#FOO
#BAR
#WORLD}

s.split.inject([[]]) { |m, s| !s.empty? && (m[-1][0] != s) ? (m << [s,1]) :  m[-1][1] += 1;m }.drop 1
#=>[["#HELLO", 4], ["#WORLD", 4], ["#FOO", 1], ["#BAR", 1], ["#WORLD", 1]]

I start by splitting the string into an array, then fold (inject) over it, collapsing consecutive duplicates and collecting the result in a two-dimensional array of [line, count] pairs.
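
The array of pairs still has to be turned into the text the question asks for. A minimal sketch of that last step, assuming the [line, count] layout shown above (format_runs is a hypothetical name):

```ruby
# Hypothetical helper: render [line, count] pairs in the format from the question.
def format_runs(pairs)
  pairs.map do |line, count|
    # Only runs longer than one line get the "N times ... end" wrapper.
    count > 1 ? "#{count} times #{line} end" : line
  end.join("\n")
end

pairs = [["#HELLO", 4], ["#WORLD", 4], ["#FOO", 1], ["#BAR", 1], ["#WORLD", 1]]
puts format_runs(pairs)
```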

peter