I'm pretty stuck here. Let's say I have a text file (example.txt) that looks like this:

Generic line 1() 46536.buildsomething  
Generic line 2() 98452.constructsomething  
Something I'm interested in seeing  
Another common line() blablabla abc945  
Yet another common line() runningoutofideashere.923954  
Another line I'm interested in seeing  
Line I don't care about 1() yaddayaddayadda  
Line I don't care about 2() yaddayaddayadda  
Generic line 3() 23485.buildsomething  
Yet some other common line  

I now have an exclusion text file (exclusions.txt) containing portions of lines to not print:

Generic  
common  
don't care about

The idea is I want to open up the example.txt file, open up the exclusions.txt file, then print any line in example.txt that does not contain any line in exclusions.txt.

What I've tried so far (without any success whatsoever):

textfile = open("example.txt", "r")
textfile = textfile.readlines()

exclusionslist = []
exclusions = open("exclusions.txt", "r")
exclusions = exclusions.readlines()
for line in exclusions:
    exclusionslist.append(line.rstrip('\n'))

for excline in exclusions:
    for line in textfile:
        if exline not in line:
            print line

I think I know what the problem is, but I have no idea how to fix it. I think I just need to tell Python that if a line in textfile contains any line in exclusions, do not print it.

kooper
  • Do you *have* a problem? What is your question? –  Oct 04 '12 at 10:50
  • I think I explained what my problem is pretty well in the last sentence there, even though it doesn't have a question mark. See answer below for the solution to that problem. – kooper Oct 05 '12 at 06:58

2 Answers

You're making it needlessly complicated:

with open("example.txt", "r") as text, open("exclusions.txt", "r") as exc:
    exclusions = [line.rstrip('\n') for line in exc]
    for line in text:
        if not any(exclusion in line for exclusion in exclusions):
            print line
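
For anyone who wants to try the same idea without the two files on disk, here is a minimal sketch written for Python 3 (so `print` is a function rather than a statement). The name `filter_lines` and the inline sample data are assumptions made for illustration and are not part of the code above. It also skips blank exclusion entries, since an empty string is contained in every line and would otherwise suppress all output.

```python
def filter_lines(lines, exclusions):
    """Return the lines that contain none of the exclusion substrings."""
    # Strip the newline from each exclusion, as in the answer above.
    patterns = [e.rstrip("\n") for e in exclusions]
    # Assumption: a blank exclusion should be ignored, because '' is
    # a substring of every string and would exclude everything.
    patterns = [p for p in patterns if p]
    return [line for line in lines
            if not any(p in line for p in patterns)]


# Hypothetical in-memory stand-ins for example.txt and exclusions.txt.
sample = [
    "Generic line 1() 46536.buildsomething\n",
    "Something I'm interested in seeing\n",
    "Another common line() blablabla abc945\n",
    "Another line I'm interested in seeing\n",
    "Line I don't care about 1() yaddayaddayadda\n",
]
excluded = ["Generic\n", "common\n", "don't care about\n"]

kept = filter_lines(sample, excluded)
for line in kept:
    # The lines still end in '\n', so suppress print's own newline.
    print(line, end="")
```

Passing `end=""` avoids the blank line between results that appears when a line that already ends in `\n` is printed as-is.
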
Tim Pietzcker
  • The with open() bit doesn't seem to be too happy with my Python (using 2.5, probably should've mentioned that originally), but the rest of it works absolutely beautifully. Thanks for the quick help with this! – kooper Oct 04 '12 at 11:04
  • @user1719723: If you can, upgrade to Python 2.7 (or 3.3 if you dare). If you can't do that, add `from __future__ import with_statement` at the top of your script, and never worry about having to close a file again, even if your script aborts. In Python 2.5 you will have to use two nested `with` blocks, though; the comma-separated form was only added in Python 2.7 (and 3.1): http://stackoverflow.com/questions/893333/multiple-variables-in-python-with-statement – Tim Pietzcker Oct 04 '12 at 11:23
  • Are there any pitfalls to using `with` instead of `try`/`finally`? – Mark Oct 04 '12 at 11:27
  • @MarkRibau: I don't think so; I find the use of `with` clearer and easier. – Tim Pietzcker Oct 04 '12 at 11:31
  • @TimPietzcker It seems that the file used in the `with` only gets closed at the next garbage collection? At least, that is the behavior we are seeing. Attempting to re-use a file after a `with` block was intermittently failing, but doing an explicit `gc.collect()` after the `with` block made it stop failing. [Stand Alone Python v2.7.1, SCons v2.1.0] – Mark Oct 11 '12 at 10:14
  • @MarkRibau: No, according to the docs, the file is closed when the `with` block is left. As to why you might still have been able to access it intermittently, I can only speculate that it's a problem similar to the one described in [this famous analogy by Eric Lippert](http://stackoverflow.com/q/6441218/20670). Also, I'd love to see the code that you used to test that. This would make an excellent question for this site: "How come I can still access my file sometimes after I've left the `with` block?" – Tim Pietzcker Oct 11 '12 at 12:30
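
To make the two points from this comment thread concrete, here is a small sketch. It assumes Python 3 and creates two throwaway files with made-up contents in a temporary directory so that it runs on its own. It uses the nested form of `with` (the form mentioned above for Python 2.5, where it would also need the `__future__` import and the `print` statement) and checks each file object's `closed` attribute once the blocks are left.

```python
import os
import shutil
import tempfile

# Create two throwaway files so the sketch is self-contained.
tmpdir = tempfile.mkdtemp()
text_path = os.path.join(tmpdir, "example.txt")
exc_path = os.path.join(tmpdir, "exclusions.txt")
with open(text_path, "w") as f:
    f.write("Generic line 1()\nSomething I'm interested in seeing\n")
with open(exc_path, "w") as f:
    f.write("Generic\n")

kept = []
# Nested form: one `with` per file instead of a comma-separated list.
with open(text_path, "r") as text:
    with open(exc_path, "r") as exc:
        exclusions = [line.rstrip("\n") for line in exc]
        for line in text:
            if not any(e in line for e in exclusions):
                kept.append(line)

# Both file objects report closed once their `with` blocks are left,
# without any explicit close() or gc.collect() call.
print(text.closed, exc.closed)
print(kept)

# Remove the temporary directory and its files.
shutil.rmtree(tmpdir)
```
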

Seems like you would want:

textfile = open("example.txt", "r")
textfilelines = textfile.readlines()

exclusions = open("exclusions.txt", "r")
exclusionlines = exclusions.readlines()
for x in range(len(exclusionlines)):
    exclusionlines[x] = exclusionlines[x].strip("\n")

for line in textfilelines:
    found = False
    for exclude in exclusionlines:
        if exclude in line:
            found = True
    if not found:
        print line

This could probably be compressed using some magic syntax, but that'd be a lot harder to read. Depending on the output you want, you might need to strip the trailing \n from each entry in textfilelines, since print adds a newline of its own.
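
As one possible tightening of the loop above, here is a sketch that stops checking as soon as an exclusion matches and strips the trailing newline before printing. It is written for Python 3, and the function name `unexcluded` and the sample data are hypothetical. It relies on the `for ... else` construct, whose `else` branch runs only when the inner loop finishes without hitting `break`, so no `found` flag is needed.

```python
def unexcluded(lines, exclusions):
    """Return lines (newline stripped) that match no exclusion."""
    result = []
    for line in lines:
        for exclude in exclusions:
            if exclude in line:
                # One match is enough; skip the remaining exclusions.
                break
        else:
            # Reached only if the inner loop never hit `break`.
            result.append(line.rstrip("\n"))
    return result


# Hypothetical stand-ins for the contents of the two files.
lines = [
    "Generic line 3() 23485.buildsomething\n",
    "Yet some other common line\n",
    "Something I'm interested in seeing\n",
]
exclusions = ["Generic", "common", "don't care about"]

for line in unexcluded(lines, exclusions):
    print(line)
```
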

Mark