
-Edit- This isn't a duplicate, because I'm using one file to compare against another; everything I found on SO loops through a single file without a second input file to compare against.

A list of filenames, filenames.csv, looks like this:

"aaabbccdd-3ksdfs"
"asdfdsbh-kkdkdsd"
"asdfds123221sssa"

I have another file, onelongstring.txt, that only contains one massive string:

asfdsafsdafs//sdfasdschasdjs//akdasdfshcie//asdfdsbh-kkdkdsd...

What I would like to know is whether each of the values in filenames.csv exists in onelongstring.txt.

I tried something like this:

for i in filenames.csv; do grep $i onelongstring.txt > countiffound.txt; done

But it wasn't working. Then I realized that if I just did a loop like this:

for i in filenames.csv; do echo $i; done

My output would just be the name of the file rather than each line of the content.

Is there a way to do this on the bash command line (OS X) instead of having to write a bash script?
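A minimal sketch of the kind of one-liner being asked for, along the lines of the while read loop the comments below point to. The quote-stripping and the countiffound.txt output file are assumptions carried over from the attempt above, and grep -F treats each name as a fixed string rather than a regex:

while IFS= read -r name; do
    name=${name//\"/}                                                 # strip the surrounding double quotes
    grep -qF -- "$name" onelongstring.txt && printf '%s\n' "$name"    # keep only the names that are found
done < filenames.csv > countiffound.txt

Joined with semicolons, the same loop can be pasted as a single line at the prompt.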

chowpay
  • Can you add proper input/output and remove the `...` from the `onelongstring.txt` – Inian Jan 30 '18 at 07:17
  • Are the strings between `//` supposed to be found in their entirety or do you look for a solution which finds a match in a substring anywhere between `//` separators? (Why are you using such a ludicrous file format anyway?) – tripleee Jan 30 '18 at 08:04
  • Have you googled `iterate over lines of file bash`? – hek2mgl Jan 30 '18 at 08:04
  • Writing a one-liner and writing a script is basically the same thing. In a one-liner you can't use line breaks but that's a superficial constraint as far as Bash is concerned. The text in a script can be pasted at the command line and vice versa. – tripleee Jan 30 '18 at 08:05
  • I don't think this Q is a duplicate of [Looping through the content of a file in Bash](https://stackoverflow.com/questions/1521462/looping-through-the-content-of-a-file-in-bash), since that Q involves one file, but this question has *two* files. – agc Jan 30 '18 at 08:32
  • I don't think this should be marked as duplicate because that's not only about looping through the file, but also finding each string. This is a one-liner: allfound=1; onelongstring=$(cat onelongstring.txt); while read line; do if [[ ! "//$onelongstring//" =~ ^.*\/\/${line//\"/}\/\/.*$ ]]; then allfound=0; break; fi; done – Paulo Amaral Jan 30 '18 at 08:34
  • Pending reopen, try this: `tr -s '/' '\n' < onelongstring.txt | grep -f - filename.csv > countiffound.txt` – agc Jan 30 '18 at 08:36
  • hey @chowpay if you replace your for loop with the while loop in the answer that people were linking you to, then that fixes your problem – Hans Z Jan 30 '18 at 22:33
  • @HansZ it got too confusing, ended up writing a python script that just looked through all the variables in one file and checked if it existed in another. I was hoping there was just a simple in line bash I could do instead. Thanks though! – chowpay Jan 31 '18 at 21:51
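A hedged take on the tr/grep pipeline suggested in the comments above: process substitution is used instead of -f - (not every grep accepts - as a pattern file), -F matches the substrings literally, and the grep -v '^$' guards against an empty pattern if the long string starts or ends with //. It prints the lines of filenames.csv whose name occurs somewhere in onelongstring.txt:

grep -Ff <(tr -s '/' '\n' < onelongstring.txt | grep -v '^$') filenames.csv > countiffound.txt

Note that this is a substring match, so a short fragment of the long string could in principle match inside an unrelated filename; the awk answer below compares whole tokens instead.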

1 Answer


The problem is mostly in cleaning and separating the strings in the two files. So, using awk with your data, but with the ... removed from onelongstring.txt:

$ awk 'NR==FNR {                                  # first file: onelongstring.txt, split into records on // or newline by RS
    a[$0]                                         # remember every substring as an array key
    next
}
gsub(/^"|"$/,"")&&($0 in a==0) { print $0 }       # second file: strip the quotes, print names not seen in the first file
' RS="(//|\n)" onelongstring.txt filenames.csv
aaabbccdd-3ksdfs
asdfds123221sssa
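
Note that this prints the names from filenames.csv that were not found between the // separators in onelongstring.txt. If the list of names that are present is wanted instead, a hedged variant of the same idea (assuming an awk such as gawk that supports a regular-expression RS) drops the ==0 test:

$ awk 'NR==FNR { a[$0]; next }
gsub(/^"|"$/,"") && ($0 in a)
' RS="(//|\n)" onelongstring.txt filenames.csv

With the sample data shown above, this would print asdfdsbh-kkdkdsd.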
James Brown