Compare two Linux group files with Python

Question

I need to compare two Linux group files with Python and find the users missing from each group. I used the code below, but it fails when the users are listed in a different order.

with open('group1', 'r') as file1:
    with open('group2', 'r') as file2:
        same = set(file1).difference(file2)

same.discard('\n')

with open('some_output_file.txt', 'w') as file_out:
    for line in same:
        file_out.write(line)

For example,

group1:
test:x:1234:mike,john,scott
test2:x:1234:mike,john
test3:x:1234:tim,dustin,Alex

group2:
test:x:1234:mike,scott,john
test2:x:1234:mike,john,scott
test3:x:1234:dustin,tim

The ideal output would be:

missing group1:
test2:scott

missing group2:
test3:Alex

Should I take each user and compare it? What would be the best way to compare two files?

Your posted code merely compares whole lines, finding missing lines from one of the two files. — Prune, Aug 21 '19 at 19:19
When asking about homework (1) **Be aware of your school policy**: asking here for help may constitute cheating. (2) Specify that the question is homework. (3) **Make a good faith attempt** to solve the problem yourself first (include your code in your question). (4) **Ask about a specific problem** with your existing implementation; see [Minimal, complete, verifiable example](https://stackoverflow.com/help/minimal-reproducible-example). Also, [here](https://meta.stackoverflow.com/questions/334822/how-do-i-ask-and-answer-homework-questions) is guidance on asking homework questions. — Prune, Aug 21 '19 at 19:19

score 2 · Accepted Answer · answered Aug 21 '19 at 19:33

This should work:

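def create_dict_from_file(filename):
    """Read a group file and return a dict mapping each group name to the
    list of users in that group"""
    with open(filename, 'r') as file1:
        # splitlines() avoids the empty trailing entry that split('\n') leaves
        all_groups = file1.read().splitlines()
    return {
        # first field is the group name, last field is the user list;
        # "if user" drops the empty string produced by a group with no members
        one_line.split(':')[0]: [
            user for user in one_line.split(':')[-1].split(',') if user
        ]
        for one_line in all_groups
        if one_line.strip()  # skip blank lines
    }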


def create_missing_element(reference, other, key):
    """Return {key: users} for the users of reference that are absent from
    other, or an empty dict if there are none"""
    missing_in_other = set(reference) - set(other)
    if missing_in_other:
        return {key: missing_in_other}
    return {}


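file_1_groups = create_dict_from_file('group1')
file_2_groups = create_dict_from_file('group2')

all_missing_group1 = {}  # users present in group2 but missing from group1
all_missing_group2 = {}  # users present in group1 but missing from group2
# Walk the groups of both files so that a group found in only one of them
# does not raise a KeyError
for key in set(file_1_groups) | set(file_2_groups):
    users_1 = file_1_groups.get(key, [])
    users_2 = file_2_groups.get(key, [])
    all_missing_group1.update(create_missing_element(users_2, users_1, key))
    all_missing_group2.update(create_missing_element(users_1, users_2, key))

print(all_missing_group1)
print(all_missing_group2)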

I'll leave writing the result to a file to you.
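For that last step, something like the sketch below could work. The helper name `write_missing`, the variable names, and the `group:user` line layout (copied from the ideal output in the question) are assumptions for illustration. The sample data is the question's expected result, not output captured from a run.

```python
def write_missing(path, sections):
    """Write each labelled {group: users} dict as 'group:user1,user2' lines.

    `sections` is a list of (label, missing_dict) pairs. This helper and its
    output layout are illustrative assumptions, not part of the answer's code.
    """
    lines = []
    for label, missing in sections:
        lines.append('missing {}:'.format(label))
        for group in sorted(missing):
            users = ','.join(sorted(missing[group]))
            lines.append('{}:{}'.format(group, users))
        lines.append('')  # blank line between sections
    text = '\n'.join(lines) + '\n'
    with open(path, 'w') as file_out:
        file_out.write(text)
    return text


# Sample data taken from the ideal output in the question.
missing_from_group1 = {'test2': {'scott'}}
missing_from_group2 = {'test3': {'Alex'}}

written = write_missing('some_output_file.txt', [
    ('group1', missing_from_group1),
    ('group2', missing_from_group2),
])
print(written)
```

Sorting the groups and users keeps the file stable between runs, since sets have no guaranteed order.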

A set is a Python data structure that cannot contain duplicates and is easy to manipulate, which makes it convenient for finding missing elements.

I use a dict comprehension to build the dictionary with the group name as the key (the first element of the line when splitting on :) and the users as the value (the last element of the line when splitting on :). The user field is split again with , as the separator, giving the users as a list, which is easy to handle in Python.
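To make the splitting concrete, here is a small sketch of what happens to a single line. The sample lines come from the question; the variable names are only illustrative.

```python
line = 'test3:x:1234:tim,dustin,Alex'

# Splitting on ':' gives the four fields of a group entry.
fields = line.split(':')
print(fields)
# ['test3', 'x', '1234', 'tim,dustin,Alex']

group_name = fields[0]          # first field: the group name
users = fields[-1].split(',')   # last field: comma-separated users
print(group_name, users)
# test3 ['tim', 'dustin', 'Alex']

# The same two expressions inside a dict comprehension, over several lines.
entries = ['test:x:1234:mike,john,scott', 'test2:x:1234:mike,john']
groups = {
    entry.split(':')[0]: entry.split(':')[-1].split(',')
    for entry in entries
}
print(groups)
# {'test': ['mike', 'john', 'scott'], 'test2': ['mike', 'john']}
```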

Didn't expect to get a final product. It worked perfectly! — Mike, Aug 21 '19 at 19:42

score 1 · Answer 2 · answered Aug 21 '19 at 19:20

Parse each list of names you are comparing into a set, then take the set difference.

Here is an example of how you can compare sets of names.

s1 = set(['jay', 'kevin', 'billy'])
s2 = set(['billy', 'jay'])
s3 = set(['billy', 'jay', 'kevin'])
print(s1 - s2)
# {'kevin'}
print(s3 - s1)
# set()

Parsing the names into a set I'll leave up to you to figure out.
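One possible way to do that parsing, as a sketch: the helper name `parse_users` is hypothetical, and it assumes the colon-separated layout shown in the question, with the comma-separated users in the last field. The `empty` group line is made up to show the edge case.

```python
def parse_users(line):
    """Return the set of user names in the last field of one group line."""
    members = line.strip().split(':')[-1]
    # An empty member field yields an empty set rather than {''}.
    return {name for name in members.split(',') if name}


line1 = 'test2:x:1234:mike,john'
line2 = 'test2:x:1234:mike,john,scott'

print(parse_users(line2) - parse_users(line1))
# {'scott'}
print(parse_users(line1) - parse_users(line2))
# set()
print(parse_users('empty:x:1235:'))
# set()
```

Because the comparison is done on sets, the order in which the users appear on each line no longer matters.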
