Possible Duplicate:
Compare two different files line by line and write the difference in third file - Python

The logic in my head works something like this: for each line in the import file, check whether it contains any of the items in the existing-user string list. If it contains any one of the items from that list, leave that line out of the result file.

filenew = open('new-user', 'r')
filexist = open('existing-user', 'r')
fileresult = open('result-file', 'r+')
xlines = filexist.readlines()
newlines = filenew.readlines()
for item in newlines:
    if item contains an item from xlines
        break
    else fileresult.write(item)
filenew.close()
filexist.close()
fileresult.close()

I know this code is all jacked up but perhaps you can point me in the right direction.

Thanks!

Edit ----

Here is an example of what is in my existing user file....

allyson.knanishu
amy.curtiss
amy.hunter
amy.schelker
andrea.vallejo
angel.bender
angie.loebach

Here is an example of what is in my new user file....

aimee.neece,aimee,neece,aimee.neece@faculty.asdf.org,aimee neece,aimee neece,"CN=aimee neece,OU=Imported,dc=Faculty,dc=asdf,dc=org"
alexis.andrews,alexis,andrews,alexis.andrews@faculty.asdf.org,alexis andrews,alexis andrews,"CN=alexis andrews,OU=Imported,dc=Faculty,dc=asdf,dc=org"
alice.lee,alice,lee,alice.lee@faculty.asdf.org,alice lee,alice lee,"CN=alice lee,OU=Imported,dc=Faculty,dc=asdf,dc=org"
allyson.knanishu,allyson,knanishu,allyson.knanishu@faculty.asdf.org,allyson knanishu,allyson knanishu,"CN=allyson knanishu,OU=Imported,dc=Faculty,dc=asdf,dc=org"

New code from @mikebabcock ... thanks.

outfile = file("result-file.txt", "w")
lines_to_check_for = [ parser(line) for line in file("existing-user.txt", "r") ]
for line in file("new-user.txt", "r"):
    if not parser(line) in lines_to_check_for:
        outfile.write(line)

Added an import statement for the parser... I am receiving the following error...

C:\temp\ad-import\test-files>python new-script.py
Traceback (most recent call last):
  File "new-script.py", line 7, in <module>
    lines_to_check_for = [ parser(line) for line in file("existing-user.txt", "r
     ") ]
  TypeError: 'module' object is not callable

Thanks!

mpmackenna
  • we need to see the file content and the list? – Ashwini Chaudhary Oct 15 '12 at 18:57
  • Code? Looks like some English in there. If only computers took direct commands. – TheZ Oct 15 '12 at 18:57
  • You are going to have to do a better job telling us what an "item" is, and how those files are formatted. Examples would help. – Larry Lustig Oct 15 '12 at 19:01
  • Consider http://stackoverflow.com/questions/7757626/compare-two-different-files-line-by-line-and-write-the-difference-in-third-file as a possible duplicate for alternate answers. – mikebabcock Oct 15 '12 at 19:21
  • Apparently now my issue is that I don't know how to use the parser module... guess I will take a few extra moments and do some reading... Thanks! – mpmackenna Oct 15 '12 at 20:12
  • My function 'parser' is fictional and does whatever you need to do to parse data from your strings; eg "string.split(',')" or other. – mikebabcock Oct 16 '12 at 13:05
  • Also it would be nice if you'd upvote or accept answers and comments you appreciated. – mikebabcock Oct 17 '12 at 00:03
  • I am unable to upvote because my reputation is too low. I am a new member. I am still working on understanding all the formalities of the site. Thank you for all of your help. – mpmackenna Oct 17 '12 at 13:13
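
The TypeError in the edit above says that the name `parser` refers to a module, and a module cannot be called like a function. As the comment above notes, `parser` was only a placeholder for whatever parsing the data needs. A minimal sketch of such a function, assuming the username is the text before the first comma (the helper `filter_new_users` and the shortened sample lines are illustrative, not from the original post):

```python
def parser(line):
    """Hypothetical helper: return the username, i.e. the text before the first comma."""
    return line.strip().split(",")[0]


def filter_new_users(new_lines, existing_lines):
    """Return the lines from new_lines whose username is not in existing_lines."""
    existing = {parser(line) for line in existing_lines}
    return [line for line in new_lines if parser(line) not in existing]


# Shortened sample data in the shape shown in the question.
existing_lines = ["allyson.knanishu\n", "amy.curtiss\n"]
new_lines = [
    "alice.lee,alice,lee,alice.lee@faculty.asdf.org\n",
    "allyson.knanishu,allyson,knanishu,allyson.knanishu@faculty.asdf.org\n",
]

kept = filter_new_users(new_lines, existing_lines)
print(kept)
```

With a function defined this way, no import statement is needed for `parser`.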

3 Answers

Assuming I understand what you want to do, use set intersection :)

for line in newlines:
    if set(line.split()) & set(xlines):  # set intersection
        print("overlap between xlines and current line")
        break
    else:
        fileresult.write(line)  # write the line currently being checked
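
For the comma-separated sample data in the question, the loop above would need a few adjustments before the intersection can find anything. The following is only a sketch under some assumptions: fields are split on commas instead of whitespace, trailing newlines are stripped from both files, and matching lines are skipped with `continue` so the rest of the file is still processed. The function name `lines_without_overlap` is made up for illustration.

```python
def lines_without_overlap(new_lines, existing_lines):
    """Keep the lines from new_lines that share no comma-separated field
    with the names listed in existing_lines."""
    existing = {name.strip() for name in existing_lines}
    kept = []
    for line in new_lines:
        fields = set(line.strip().split(","))
        if fields & existing:  # set intersection is non-empty: user already exists
            continue
        kept.append(line)
    return kept


xlines = ["allyson.knanishu\n", "amy.curtiss\n"]
newlines = [
    "aimee.neece,aimee,neece,aimee.neece@faculty.asdf.org\n",
    "allyson.knanishu,allyson,knanishu,allyson.knanishu@faculty.asdf.org\n",
]

result = lines_without_overlap(newlines, xlines)
print(result)
```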
Joran Beasley
  • My apologies for being so vague in my initial post. This method seems to return all entries from newlines because it tries to match exactly, whereas I would like to match on a new line containing an entry from existing. Thanks for taking the time to post your ideas on the subject. I will try to be more specific in the future. – mpmackenna Oct 15 '12 at 20:27

If the input files have one item per line (so that checking for an existing element in the lists returned by readlines() is fine), you are looking for a list membership test:

if item in xlines:
    break

To point out some more Python stuff: make a set from the list you test for membership (because each test will then take constant time on average instead of linear time as with a list):

xlines = set(filexist.readlines())
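
A small sketch of the set-based membership test, assuming one username per line. `io.StringIO` stands in for the existing-user file so the example runs without any files on disk, and stripping the newlines is an addition here so that a bare name can be looked up:

```python
import io

# Stand-in for the existing-user file from the question.
filexist = io.StringIO("allyson.knanishu\namy.curtiss\namy.hunter\n")

# Build the set once; strip newlines so bare names compare equal.
xlines = {line.strip() for line in filexist}

print("amy.hunter" in xlines)
print("alice.lee" in xlines)
```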

Also, you can use the with statement to avoid closing the files explicitly and get clearer code (like the first example here).
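
As a rough sketch of how the question's loop might look with the with statement, assuming whole-line matching with one item per line: the function name `write_new_only` and the temporary demo files are invented for the example, while the file names follow the question.

```python
import os
import tempfile


def write_new_only(new_path, existing_path, result_path):
    """Copy lines from new_path to result_path unless they also appear in existing_path."""
    with open(existing_path) as filexist:
        xlines = set(filexist)  # one entry per line, newline included
    # Both files are closed automatically when the block ends.
    with open(new_path) as filenew, open(result_path, "w") as fileresult:
        for item in filenew:
            if item not in xlines:
                fileresult.write(item)


# Demo with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    new_path = os.path.join(workdir, "new-user")
    existing_path = os.path.join(workdir, "existing-user")
    result_path = os.path.join(workdir, "result-file")
    with open(new_path, "w") as f:
        f.write("amy.hunter\nalice.lee\n")
    with open(existing_path, "w") as f:
        f.write("amy.hunter\namy.curtiss\n")
    write_new_only(new_path, existing_path, result_path)
    with open(result_path) as f:
        result = f.read()

print(result)
```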

Joe M.

I presume this is what you want to do:

# open() works in both Python 2 and 3; file() exists only in Python 2
outfile = open("outfile.txt", "w")
lines_to_check_for = [ line for line in open("list.txt", "r") ]
for line in open("testing.txt", "r"):
    if line not in lines_to_check_for:
        outfile.write(line)
outfile.close()

This will read all the lines in list.txt into an array, and then check each line of testing.txt against that array. All the lines that aren't in that array will be written out to outfile.txt.

Here's an updated example based on your new question:

newusers = open("newusers.txt", "w")
# strip the trailing newline so the names compare cleanly
existing_users = [ line.strip() for line in open("users.txt", "r") ]
for line in open("testing.txt", "r"):
    # grab each comma-separated user, take the portion left of the @, if any
    new_users = [ entry.split("@")[0] for entry in line.strip().split(",") ]
    for user in new_users:
        if user not in existing_users:
            newusers.write(user + "\n")
newusers.close()
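
Since the sample lines in the question contain a quoted field with commas inside it, a plain split(",") breaks that field apart. One possible alternative, sketched here under the assumption that the username is always the first column, is to let the standard csv module do the parsing and compare only that column. The function name `new_user_rows` is made up, and `io.StringIO` stands in for the two files:

```python
import csv
import io


def new_user_rows(new_file, existing_file):
    """Return the parsed rows from new_file whose first column is not
    listed in existing_file (one username per line)."""
    existing = {line.strip() for line in existing_file if line.strip()}
    kept = []
    for row in csv.reader(new_file):
        if row and row[0] not in existing:
            kept.append(row)
    return kept


existing_file = io.StringIO("allyson.knanishu\namy.curtiss\n")
new_file = io.StringIO(
    'alice.lee,alice,lee,alice.lee@faculty.asdf.org,alice lee,alice lee,'
    '"CN=alice lee,OU=Imported,dc=Faculty,dc=asdf,dc=org"\n'
    'allyson.knanishu,allyson,knanishu,allyson.knanishu@faculty.asdf.org,'
    'allyson knanishu,allyson knanishu,'
    '"CN=allyson knanishu,OU=Imported,dc=Faculty,dc=asdf,dc=org"\n'
)

rows = new_user_rows(new_file, existing_file)
print(rows)
```

The quoted distinguished name stays in one piece this way, and csv.writer could be used to write the kept rows back out in the same format.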
mikebabcock
  • If memory usage is a problem for that first array, you could make a much less quick but more memory efficient nested loop that reads list.txt one line at a time for each line of testing.txt. – mikebabcock Oct 15 '12 at 19:04
  • Your top example is really close. I think the issue is that I need to check whether the line from my new-users file contains one of the lines from my existing-users file, rather than matching it exactly. Here is the code I derived from your assistance. I am not sure about the parser command; I received an error that parser was not defined. Do I need an import statement to use "parser"? Sorry my first post was so vague and my "code" so terrible. – mpmackenna Oct 15 '12 at 19:52
  • Guess you can't put code in comments... I am going to edit my original post and insert my new code... Please tell me if I am missing something. – mpmackenna Oct 15 '12 at 19:52
  • My second example is very close to what you want, but I'll edit ... – mikebabcock Oct 16 '12 at 02:16