I have two files:

  • file_a = a list of strings
  • file_b = data (file_b could also be a directory containing several such files)

What is the best way to accomplish the following task?

Scan file_b, then display and save every line that contains at least one string found in file_a.

E.g. file_a contains the following strings (in my case the list is very long):

01101

11001

11101

file_b

01101:11100:10001

11111:11100:10001

01111:11100:11001

11101:11111:11110

Based on this example, lines 1, 3 and 4 contain at least one of the strings.

Ilario Pierbattista
Enrik S
  • You can first scan file_a and store its strings in a list (or a hash/set for better lookup performance), then iterate over file_b and check for each line whether it contains one of them. What have you got so far? – shahaf Apr 28 '18 at 09:19
  • So far I have been working on a Python / pickle script to accomplish that task, but I am kind of stuck on how to pull those strings from that list. – Enrik S Apr 28 '18 at 09:30
  • It's a simple `if element in list` statement, e.g. `if 'a' in ['a', 'b', 'c']`. Post your code with sufficient input and the desired output so more people can come and help... – shahaf Apr 28 '18 at 09:33

1 Answer

You can read the lines of both files with the readlines() method of the file object returned by open(), then iterate over the lines of file_b to check whether any string from file_a appears in each line. Since you haven't told us more about the format of your files or what you have tried so far, I'll just give a rough sketch.

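# read the search strings, one per line; strip newlines and skip blank lines
with open('file_a', 'r') as f1:
    strings = set(s.strip() for s in f1.readlines() if s.strip())

with open('file_b', 'r') as f2:
    lines = f2.readlines()

# print every line of file_b that shares at least one field with file_a
for line in lines:
    tmp = line.strip()
    fields = set(tmp.split(':'))  # assumes colon-separated fields, as in the example
    if strings & fields:
        print(tmp)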

See "Find intersection of two nested lists?" for more on intersecting two lists.
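If the strings may occur anywhere in a line rather than as whole colon-separated fields, a substring test is needed instead of a set intersection. The following is only a minimal sketch under some assumptions: file_a holds one string per line, file_b is either a single text file or a directory of text files, and the helper names `load_patterns` and `matching_lines` are made up for illustration.

```python
import os


def load_patterns(path):
    """Return the non-empty lines of the pattern file, stripped of whitespace."""
    with open(path, 'r') as f:
        return [line.strip() for line in f if line.strip()]


def matching_lines(patterns, target):
    """Yield each line under target (a file or a directory) containing any pattern."""
    if os.path.isdir(target):
        # assumption: every regular file directly inside the directory is scanned
        names = sorted(os.listdir(target))
        paths = [os.path.join(target, name) for name in names]
        paths = [p for p in paths if os.path.isfile(p)]
    else:
        paths = [target]
    for path in paths:
        with open(path, 'r') as f:
            for line in f:
                line = line.rstrip('\n')
                # substring test: true as soon as one pattern occurs in the line
                if any(p in line for p in patterns):
                    yield line
```

Looping over `matching_lines(load_patterns('file_a'), 'file_b')` and writing each result to an output file would cover the "save output" part of the question. The `any(...)` test costs one substring search per pattern per line, so for a very long list a tool such as `grep -F -f file_a file_b` may well be faster.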

lefloxy
  • Thank you for your response, lefl. "Find intersection of two nested lists" is way too complicated for that tiny task. Basically, I have been tweaking some code to run egrep and it's working just fine. – Enrik S Apr 28 '18 at 13:12
  • @EnrikShaumann it's true that it can be handy to find the intersection of two nested lists. You can do it in a less optimized way using the `if` condition as @shahaf said in his comment. – lefloxy Apr 30 '18 at 11:24
  • @EnrikShaumann If my answer helped, you can consider accepting it so that the question can be closed. – lefloxy Apr 30 '18 at 11:25
  • I have been doing some extensive research on "Find intersection of two nested lists"; if you have any useful links on that, please share them. Thanks again for your help. – Enrik S May 03 '18 at 20:47
  • @EnrikS Sorry for the late response (I was away for a while). Intersections are mostly based on sort and filter algorithms, and you can do it with simple loops over the iterables. What usually varies between the different algorithms is the computational cost (some divide-and-conquer approaches can be faster). You can look at [this](https://stackoverflow.com/questions/497338/efficient-list-intersection-algorithm?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa) and [this](https://eugene-eeo.github.io/blog/intersection-algorithms.html) – lefloxy May 07 '18 at 09:56
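
To illustrate the "sort and filter" idea from the comment above, here is a small sketch of a two-pointer intersection over two sorted lists, next to the built-in set intersection for comparison. The function name `intersect_sorted` is hypothetical, and the sample data is taken from the question's example.

```python
def intersect_sorted(a, b):
    """Intersect two sorted lists with a two-pointer walk, O(len(a) + len(b))."""
    i = j = 0
    common = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # record the shared value once, even if it repeats in the inputs
            if not common or common[-1] != a[i]:
                common.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # a[i] is too small to match anything left in b
        else:
            j += 1  # b[j] is too small to match anything left in a
    return common


strings = sorted(['01101', '11001', '11101'])
fields = sorted('01111:11100:11001'.split(':'))
print(intersect_sorted(strings, fields))    # merge-based intersection
print(sorted(set(strings) & set(fields)))   # hash-based, same result
```

Sorting costs O(n log n) up front, so for a one-off check the set version is usually simpler; the merge only pays off when the inputs are already sorted.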