I have a file where I search for specific lines, like this:

<ClCompile Include="..\..\..\Source\fileA.c" />
<ClCompile Include="..\..\..\Tests\fileB.c" />

In my script I can find these lines and extract only the path string between the double quotes. When I find one, I save it to an array (which I use later in my code). It looks like this:

source_path_array = []

File.open(file_name) do |f|
  f.each_line do |line|
    if line =~ /<ClCompile Include="..\\/
      source_path = line.scan(/".*.c"/)

      ###### Add path to array ######
      source_path_array << source_path
    end
  end
end

So far, everything is OK. Later in my script I write the array to another file, on a line starting with "Source Files":

f.puts "Source Files= #{source_path_array.flatten.join(" ")}"

The result then looks like this:

Source Files= "..\..\..\Source\fileA.c" "..\..\..\Tests\fileB.c"

I would like to have the output in this form:

Source Files=..\..\..\Source\fileA.c
Source Files=..\..\..\Tests\fileB.c

As you can see, each path should be on a separate line, prefixed with the string "Source Files=" and without the double quotes. Any ideas? Maybe my approach with the array is not the best, either.

JohnDoe
  • If your file is XML, you might be [better served](http://stackoverflow.com/a/1732454/231788) with an actual XML parser like [Nokogiri](http://nokogiri.org/). – Félix Saparelli Nov 30 '15 at 11:21
  • You are right, the file is XML. I didn't use an XML parser because there is not much "parsing work" to do: there are only some simple strings that I have to find. – JohnDoe Nov 30 '15 at 12:19
  • As my edited answer demonstrates, using an actual XML parser can be _simpler_ than using the "simple regex" method. – Félix Saparelli Nov 30 '15 at 13:07

2 Answers

Don't use #join, then. Use #each or #map. Also, you can use #gsub to remove the quotes:

source_path_array.flatten.each do |path|
  # Strip one leading and one trailing double quote
  f.puts "Source Files=#{path.gsub(/\A"|"\z/, '')}"
end

or

# Braces (not do ... end) so the block binds to #map rather than to #puts
lines = source_path_array.flatten.map { |path|
  "Source Files=#{path.gsub(/\A"|"\z/, '')}"
}
f.puts lines.join("\n")

The second version writes everything with a single #puts call, so it is probably more I/O efficient.
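
As a quick check of the two variants, here is a minimal sketch that writes to a StringIO instead of a real file. The sample paths and the names strip_quotes, out_each and out_join are assumptions made for illustration. The #map block uses braces because a do ... end block written directly after `f.puts expr.map` would bind to #puts rather than to #map.

```ruby
require 'stringio'

# Hypothetical sample data: quoted paths, as the flattened scan results hold them.
paths = ['"..\..\..\Source\fileA.c"', '"..\..\..\Tests\fileB.c"']

# Strip one leading and one trailing double quote.
strip_quotes = ->(path) { path.gsub(/\A"|"\z/, '') }

# Variant 1: one puts call per path.
out_each = StringIO.new
paths.each { |path| out_each.puts "Source Files=#{strip_quotes.call(path)}" }

# Variant 2: build all lines first, then write them with a single puts call.
out_join = StringIO.new
out_join.puts paths.map { |path| "Source Files=#{strip_quotes.call(path)}" }.join("\n")
```

Both buffers should end up with the same two "Source Files=" lines, one per path.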


For this to work (and as an answer to the second part of your question), the source_path_array should contain strings. Here's a way to obtain this:

regex = /<ClCompile Include="(\.\.\\[^"]+)/
File.open(file_name) do |f|
  f.each_line do |line|
    regex.match(line) do |matches|
      source_path_array << matches[1] 
    end
  end
end
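
To see why the capture group matters, the sketch below compares a #scan call without a capture group against #match with one, on a single sample line (the line value is an illustrative assumption). String#scan returns an array of matches, so appending its result to another array nests arrays, and a pattern without a capture group keeps the quotes.

```ruby
# Illustrative input line, modelled on the ones in the question.
line = '<ClCompile Include="..\..\..\Source\fileA.c" />'

# scan without a capture group: an array holding the whole match, quotes included.
scanned = line.scan(/".*\.c"/)

# match with a capture group: the path alone, as a plain string.
regex = /<ClCompile Include="(\.\.\\[^"]+)/
captured = regex.match(line)[1]
```

Here scanned is a one-element array whose string still carries the quotes, while captured is the bare path string.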

If you don't mind reading the entire file in memory at once, this is slightly shorter:

regex = /<ClCompile Include="(\.\.\\[^"]+)/
File.read(file_name).each_line do |line|
  regex.match(line) do |matches|
    source_path_array << matches[1] 
  end
end
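
When the whole file is in memory anyway, String#scan with a capture group can collect every path in one call, without splitting into lines first. This is only a sketch: the content string stands in for File.read(file_name), and its contents are made up for illustration.

```ruby
# Stand-in for File.read(file_name); the contents are illustrative.
content = <<~'XML'
  <ItemGroup>
    <ClCompile Include="..\..\..\Source\fileA.c" />
    <ClCompile Include="..\..\..\Tests\fileB.c" />
    <ClInclude Include="..\..\..\Source\fileA.h" />
  </ItemGroup>
XML

regex = /<ClCompile Include="(\.\.\\[^"]+)/

# With one capture group, scan returns one single-element array per match;
# flatten turns that into a flat array of path strings.
source_path_array = content.scan(regex).flatten
```

The ClInclude line is skipped because the pattern only matches ClCompile elements.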

Finally, here's an example using Nokogiri:

require 'nokogiri'
source_path_array = File.open(file_name) { |f| Nokogiri::XML(f) }
  .css('ClCompile')
  .map { |el| el['Include'] }
  .select { |path| path.to_s.start_with?('..\\') } # keep only relative "..\" paths

All of these parse out the quotes, so you can remove the #gsub from the first portion.


All together now:

require 'nokogiri'
doc = File.open(file_name) { |source| Nokogiri::XML(source) }
paths = doc.css('ClCompile').map { |el| el['Include'] }
           .select { |path| path.to_s.start_with?('..\\') }
f.puts paths.map { |path| "Source Files=#{path}" }.join("\n")

and let's not loop twice (#map then #join) when once (a single #reduce) is doable:

require 'nokogiri'
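doc = File.open(file_name) { |source| Nokogiri::XML(source) }
lines = doc.css('ClCompile').reduce('') do |memo, el|
  path = el['Include'].to_s
  # Only relative "..\" paths contribute a line
  path.start_with?('..\\') ? memo + "Source Files=#{path}\n" : memo
end
f.puts lines.chomp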
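
For comparison, here is a small sketch of the two-pass and single-pass ways of building the output string, on a plain array so it runs without Nokogiri. The paths array and the names two_pass and single_pass are assumptions for illustration, and #each_with_object is swapped in for #reduce because it appends to one buffer in place. Whether the single pass is measurably faster for a handful of paths is not something this example establishes.

```ruby
# Hypothetical, already unquoted paths.
paths = ['..\..\..\Source\fileA.c', '..\..\..\Tests\fileB.c']

# Two passes: map builds the lines, join concatenates them.
two_pass = paths.map { |path| "Source Files=#{path}" }.join("\n")

# One pass: append each line to a single mutable buffer, then drop the last newline.
single_pass = paths.each_with_object(+'') do |path, buffer|
  buffer << "Source Files=#{path}\n"
end.chomp
```
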
Félix Saparelli
  • Have tried the first version. The result is like this: Source Files=# Source Files=# – JohnDoe Nov 30 '15 at 12:21
  • I have it, instead of gsub i used delete: f.puts "Source Files=#{path.delete('"')}" ..... Your idea helped me, thx! :) – JohnDoe Nov 30 '15 at 12:42
  • Ah yes. That's because of the result of `#scan`. I erroneously assumed the array contained strings, not `Enumerator`s. I have added three refactors of your initial parser that correct this. – Félix Saparelli Nov 30 '15 at 13:05

Thanks to @Félix Saparelli:

The following worked for me:

source_path_array.flatten.each do |path|
  f.puts "Source Files=#{path.delete('"')}"
end
JohnDoe