
I have the following code:

import csv
import subprocess

from subprocess import check_output

# Writing the pacman command output to file in csv format

sysApps = check_output(["pacman", "-Qn"])
sysAppsCSV = csv.DictReader(sysApps.decode('ascii').splitlines(),
                        delimiter=' ', skipinitialspace=True,
                        fieldnames=[ 'name', 'version'])  # Thanks to https://stackoverflow.com/a/8880768/5565713  jcollado

with open('pacman.csv', 'w') as csvfile:
    rows_sys = csv.writer(csvfile)
    rows_sys.writerow(sysAppsCSV)

# Writing the pip command output in csv format

pipApps = check_output(["pip", "list"])
pipAppsCSV = csv.DictReader(pipApps.decode('ascii').splitlines(),
                        delimiter=' ', skipinitialspace=True,
                        fieldnames=[ 'name', 'version'])  # Thanks to https://stackoverflow.com/a/8880768/5565713  jcollado

with open('pip.csv', 'w') as csvfile:
    rows_pip = csv.writer(csvfile)
    rows_pip.writerow(pipAppsCSV)

# Comparing the files

I want to compare the two files (not necessarily the files themselves; the contents of the variables already created would also do) and get the entries that appear only in pip.csv. In other words, I want to know what is in pip.csv and is not in pacman.csv. The example from here doesn't apply to my situation, but I will output the result in a similar way, by listing the name and version.

EDIT: @Greg Sadetsky Thanks for the suggestion. I used your example to simplify my code, but it doesn't resolve my problem: I can't compare the lists that way. I made some progress, but I'm still not getting the desired output:

import csv
import subprocess

from subprocess import check_output

#Initializing variables

results_sys = ""
results_pip = ""

# Running the linux commands

sys_apps = set(check_output(["pacman", "-Qn"]).splitlines())
pip_apps = set(check_output(["pip", "list"]).splitlines())

# Saving the outputs of the commands in to a CSV format

for row in sys_apps:
    result = row.decode('ascii').split(sep=" ")
    with open('pacman.csv', 'a') as csvfile:
        rows_sys = csv.writer(csvfile)
        rows_sys.writerow(result)

for row in pip_apps:
    result = row.decode('ascii').split(sep=" ")
    with open('pip.csv', 'a') as csvfile:
        rows_sys = csv.writer(csvfile)
        rows_sys.writerow(result)

# Opening the files and comparing the results

with open('pacman.csv', 'r') as pacmanCSV:
    sys_apps = pacmanCSV.readlines()
    for row in sys_apps:
        apps = row.split(",")
        results_sys = results_sys + " " + apps[0]

with open('pip.csv', 'r') as pipCSV:
    pip_apps = pipCSV.readlines()
    for row in pip_apps:
        apps = row.split(",")
        results_pip = results_pip + " " + apps[0]


results_final = "List of apps installed from pip:\n################################"
for val in results_pip:
    if val not in results_sys:
        results_final = results_final + "\n" + val


print(results_final)

When I run this code, the output contains only a few capital letters instead of package names (example screenshot: Imgur, cropped version).

OK, so after reading about sets I did this:

r1 = set(results_pip)
r2 = set(results_sys)

print(r1 - r2)

But I get similar results, only the first letters in caps appear.

Sergiu
  • http://stackoverflow.com/questions/15864641/python-difflib-comparing-files http://stackoverflow.com/questions/977491/comparing-2-txt-files-using-difflib-in-python – Sam Mar 07 '16 at 21:51
  • The problem is that both `results_sys` and `results_pip` are strings to which you continuously append bits of string (i.e., `results_sys + " " + apps[0]`). If you iterate over a string as you do in `for val in results_pip`, you will be iterating over the letters in that string one by one... which is not what you want to do. I'll edit my answer with a solution for your new version – Greg Sadetsky Mar 08 '16 at 22:01

1 Answer


You can compare the two package lists using sets and easily figure out what packages are in one list and absent from the other.

Do you absolutely need to go through CSV files? Are you simply looking for the difference between the output of pacman and the output of pip? If so, I've created a simpler example below.

Note: I don't have pacman on my machine, but I'll suppose that its output format is similar to pip's. If not, you'll have to adjust the code.

from subprocess import check_output

sys_apps = set(check_output(["pacman", "-Qn"]).splitlines())
pip_apps = set(check_output(["pip", "list"]).splitlines())

# show packages present in sys_apps that are absent from pip_apps
print(sys_apps - pip_apps)
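If the two commands do not print identical lines for the same package, comparing raw lines will rarely match, so it may help to compare package names only. Below is a minimal sketch, assuming each relevant line has the shape `name version` separated by whitespace and that pip may print a `Package Version` header followed by a dashed rule. The `parse_packages` helper and the sample strings are hypothetical stand-ins for the real command output:

```python
def parse_packages(text):
    """Map package name -> version for lines shaped like 'name version'."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, a 'Package Version' header, and a dashed rule.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        packages[parts[0]] = parts[1]
    return packages


# Hypothetical sample output; with the real commands this would be
# check_output([...]).decode() for pacman and pip respectively.
pacman_out = "requests 2.9.1-1\nsix 1.10.0-1\n"
pip_out = "Package Version\n------- -------\nrequests 2.9.1\nflask 0.10.1\n"

sys_pkgs = parse_packages(pacman_out)
pip_pkgs = parse_packages(pip_out)

# Names present in the pip output but absent from the pacman output.
only_in_pip = sorted(set(pip_pkgs) - set(sys_pkgs))
for name in only_in_pip:
    print(name, pip_pkgs[name])
```

Keeping a dict of name to version means the set difference is computed on names while the version stays available for the final listing.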

EDIT:

1- Why go through the trouble of writing the CSV files, reading them back, and only then comparing the sets? Why not simply check the difference between sys_apps and pip_apps? I'll suppose that you need to write to CSV files, read back from those files, and then compare their content.

2- I see that you're mixing Python 2 and Python 3 code (you have a "sep" argument to split, but you're also calling "decode" on a string). Which version of Python are you using?

3- I see that you've changed your code a bit. As I explained in my comment to your question, by doing `for val in results_pip` you are iterating over that string's characters, which is probably not what you want to do (you probably wanted to iterate over the elements of a list).
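To illustrate point 3, here is a small sketch of why only capital letters survive. The package names are hypothetical and chosen only for illustration. Iterating over a string yields single characters, and `in` on a string is a substring test, so a lowercase letter is "found" as soon as it occurs anywhere in the other string:

```python
# Hypothetical package names, used only for illustration.
names = ["Flask", "requests"]

# Build one long string the same way the question's loop does.
as_string = ""
for name in names:
    as_string = as_string + " " + name

# Iterating over a string yields single characters.
chars = [val for val in as_string]

# Iterating over a list yields whole package names.
items = [val for val in names]

# Each character of " Flask" is tested as a substring of sys_string.
# The space and every lowercase letter occur somewhere in it; "F" does not.
sys_string = " requests six flake8"
leftover = [c for c in " Flask" if c not in sys_string]
print(leftover)  # prints ['F']
```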

I'll only post another version of the lower half of your code:

# Opening the files and comparing the results

with open('pacman.csv', 'r') as pacmanCSV:
    sys_apps = pacmanCSV.readlines()

with open('pip.csv', 'r') as pipCSV:
    pip_apps = pipCSV.readlines()

print("List of apps installed from pip:\n################################")

print(set(pip_apps) - set(sys_apps))

As you'll see, I'm not splitting the lines from the CSV files on commas, since you can compare full lines, i.e. package names including the versions (I think it'd be important to check whether you have different versions of packages installed via pip). If you absolutely want to compare the package names only (not the versions), you can change the two `with` blocks to the following:

with open('pacman.csv', 'r') as pacmanCSV:
    sys_apps = [app.split(',')[0] for app in pacmanCSV.readlines()]

with open('pip.csv', 'r') as pipCSV:
    pip_apps = [app.split(',')[0] for app in pipCSV.readlines()]

This splits each line on the comma, keeps only the package name, and builds a list of all of the package names, which becomes sys_apps and pip_apps.
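Note that these two blocks produce lists, so they still need to be wrapped in `set(...)` before subtracting. As an alternative sketch, the `csv` module can do the splitting and the names can go straight into sets. The `read_names` helper and the file contents are hypothetical; `io.StringIO` stands in for the opened pacman.csv and pip.csv files:

```python
import csv
import io


def read_names(csvfile):
    """Return the set of first-column values from an open CSV file."""
    # 'if row' skips empty rows, which csv.reader returns as empty lists.
    return {row[0] for row in csv.reader(csvfile) if row}


# Hypothetical file contents; in practice pass open('pacman.csv', newline='')
# and open('pip.csv', newline='') instead of these in-memory files.
pacman_csv = io.StringIO("requests,2.9.1-1\nsix,1.10.0-1\n")
pip_csv = io.StringIO("requests,2.9.1\nflask,0.10.1\n")

# Package names present in pip.csv but absent from pacman.csv.
only_in_pip = read_names(pip_csv) - read_names(pacman_csv)
print(sorted(only_in_pip))  # prints ['flask']
```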

Let me know if this helps!

Greg Sadetsky