Python list through text file and removing lines that match a duplicated value

Question

I'm currently iterating through a text file and getting back the output shown below. To make my script more effective, I would like to drop the duplicate lines that share the same item value (e.g. 181) and keep just one of each; see the example below.

Log file to be parsed.

{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313851, "time": "2015-02-26T08:46:14.070298", "item": 180, }

Python code.

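#!/usr/bin/env python

with open("tras.json") as infile:
    for line in infile:
        if "time" in line:
            # whitespace-separated tokens 4-5: the '"time":' key and its value
            time=line.split()[4:6]

        if "item" in line:
            # tokens 6-7: the '"item":' key and its value
            item=line.split()[6:8]
            # Python 2 print statement; prints the two lists joined together
            print time + item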

Current output.

['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '180,']

Desired output.

['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '180,']

Cheers,

Phillip

You forgot to include the Python code you use to solve this problem. You also forgot to describe what problem you have with this code. — , Feb 27 '15 at 15:51
Sorry for that, I've just added the code used to iterate. Phillip — Phillip Bailey, Feb 27 '15 at 16:00
1) This Python code will not produce this output. 2) What does not work with your code? — , Feb 27 '15 at 16:00
`print time + item` will not produce output like `['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']`. — , Feb 27 '15 at 16:15
The if statements probably need to be indented. I'm guessing this is a typo in your question. This might be instructive: http://stackoverflow.com/questions/19483351/converting-json-string-to-dictionary-not-list-python — joel goldstick, Feb 27 '15 at 17:08

score 1 · Answer 1 · answered Feb 27 '15 at 15:56

A complete answer would require more knowledge of your domain, but I hope this example code is helpful:

foundNumbers=set()
clearedData=list()
for dataItem in dataList:
    if dataItem[-1] not in foundNumbers:
        foundNumbers.add(dataItem[-1])
        clearedData.append(dataItem)
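
Applied to the log lines from the question, the same set-based idea might look like the sketch below. It assumes every line has the layout shown in the question, so that `line.split()` puts the time tokens at positions 4-5 and the item tokens at positions 6-7. The function name `unique_by_item` and the `sample` list are made up for illustration. The trailing comma before `}` means the lines are not valid JSON, which is why the sketch keeps the question's `split()` approach instead of parsing them with the `json` module.

```python
def unique_by_item(lines):
    """Yield the time/item tokens of the first line seen for each item value.

    Hypothetical helper: assumes the whitespace layout shown in the question.
    """
    seen_items = set()
    for line in lines:
        if "time" not in line or "item" not in line:
            continue  # skip lines that lack either field
        tokens = line.split()
        if len(tokens) < 8:
            continue  # line is too short to hold both key/value pairs
        time = tokens[4:6]  # the '"time":' key and its value
        item = tokens[6:8]  # the '"item":' key and its value
        key = item[-1]      # the item value token, e.g. '181,'
        if key not in seen_items:
            seen_items.add(key)
            yield time + item


# Illustrative sample copied from the log excerpt in the question.
sample = [
    '{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }',
    '{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }',
    '{"id": "242", "status": 61313851, "time": "2015-02-26T08:46:14.070298", "item": 180, }',
]

for row in unique_by_item(sample):
    print(row)
```

Because a file object is iterable line by line, the open `infile` from the question could be passed in place of `sample`. A set is used for `seen_items` so that each membership check stays cheap even when the log is long.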
