There are multiple ways to do this. This is untested, but here's the gist of how I'd go about it:
```ruby
require 'find'

REPLACEMENTS = {
  'target1' => 'replacement1',
  'target2' => 'replacement2'
}

Find.find('./somedir') do |path|
  next if File.directory?(path)
  next unless File.extname(path) == '.feature'

  # Write the edited copy to a scratch file alongside the original.
  new_path = path + '.new'
  File.open(new_path, 'w') do |fo|
    File.foreach(path) do |line|
      REPLACEMENTS.each do |target, replacement|
        line.gsub!(target, replacement)
      end
      fo.puts line
    end
  end

  # Swap the files, keeping the original as a backup.
  old_path = path + '.old'
  File.rename(path, old_path)
  File.rename(new_path, path)
  # File.unlink(old_path)
end
```
There are several things I consider important when writing production code in an enterprise:
- Write code that is scalable. This means it won't die or slow to a crawl if a file is much larger than you expect. `File.foreach` reads files line-by-line, which a lot of people assume means it runs more slowly or involves more work. Testing file I/O, specifically slurping files vs. reading line-by-line, shows that as file sizes grow, `read` (AKA slurping) slows drastically. Unless you are absolutely, totally sure your files will never get over 1 MB, use `foreach`.
- Use a nice starting structure to contain your target words and their replacements. A Hash is a good starting point, as the targets/keys can be sub-strings or regular expressions. Explaining regular expressions is off-topic for this question, but Ruby's `gsub` with regular expressions as hash keys is a great combination. There are lots of answers here showing how to do it, plus the documentation has examples.
- `Find.find` isn't well known; people tend to jump to `Dir[...]` or `Dir.glob`, but I prefer Find. The documentation has a nice example of how to use it. Find seems to scale better, especially when you have to walk huge directories.
- Modifying files and then saving them is usually not done safely, because people assume their code or system will never act up. That's not a good assumption. This code opens a new file, then reads the old one line-by-line. Each line is searched for the targets, the replacements are made, and the line is written to the new file. Once the old file is processed, the new file is closed (as a by-product of using a block with `open`). Then the old file is renamed, and the new file is renamed to the old file's name. That leaves a backup in place in case there was a failure. Then, optionally, you could delete the old file.
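As a quick sketch of the Hash idea above, mixing a string key with a regular-expression key (the patterns and sample text here are invented for illustration):

```ruby
# Hash keys may be plain strings or regular expressions; gsub accepts
# both, so one loop handles literal and pattern-based replacements alike.
REPLACEMENTS = {
  /\bfoo\b/i => 'bar',    # regexp key: case-insensitive whole word
  'colour'   => 'color'   # string key: literal sub-string
}

line = 'Foo likes British colour words'
REPLACEMENTS.each { |pattern, replacement| line.gsub!(pattern, replacement) }
line # => "bar likes British color words"
```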
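And a minimal sketch of why Find scales well on big trees: `Find.prune` lets you skip entire subtrees without ever descending into them (the `.git` and `.feature` names here are just example choices):

```ruby
require 'find'

# Collect '.feature' files under root, pruning any '.git' directories
# so Find never walks into them at all.
def feature_files(root)
  found = []
  Find.find(root) do |path|
    if File.directory?(path)
      Find.prune if File.basename(path) == '.git'
    elsif File.extname(path) == '.feature'
      found << path
    end
  end
  found
end
```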
This results in very low overhead, will run very fast and should scale/grow nicely.
This task can also be easily accomplished using the command-line tools `find` and `sed` in *nix, and it'll run very fast and be extremely scalable, so it's worth researching that path too; it's surprising how easily some file tasks can be done at the prompt or in a shell script.