0

Right now I have a Python list that looks like this:

['',     '2015-10-21 00:00:03', 'jp/ja/fedex/inet/label/international' ]
[398798, '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic'      ]
[878787, '2015-10-21 00:00:16', 'us/en/fedex/fedexcares/home'          ]
['87878', '',                   'cn/zhs/fedex/inet/label/international']
['',     '2015-10-21 00:00:18', ''                                     ]
[5454,   '2015-10-21 00:00:19', 'us/en/fedex/sameday/main tracking'    ]
['',     '2015-10-21 00:00:21', 'sg/en/fedex/inet/label/international' ]

This 2D list has 3 columns and more than ten thousand rows. As you can see, some rows are missing the element at [0], some at [1], and some at [2], while others have all three. I need to delete every row that does not have all three elements.

In other words, any row that is missing at least one element needs to be deleted. So, for the list above, rows [0], [3], [4], [5], and [6] need to be deleted.

After the deletion, the list should look like this:

[398798, '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic'      ]
[878787, '2015-10-21 00:00:16', 'us/en/fedex/fedexcares/home'          ]

I'm thinking about this:

for i in range(len(D)):          # D is the name of my list
    if D[i][0] =='' or D[i][1]=='' or D[i][2] =='':
        del D[i]

But this does not work: range(len(D)) is computed once, while each del shortens the list, so the remaining rows shift position, some are skipped, and the loop eventually indexes past the end of the list.

I also thought about this:

for item in D:
    if item[0]=='' or item[1]=='' or item[2] =='':
        del item

This does not work at all either: the list is left unchanged.

I would really appreciate it if you could come up with something.

pp_
JY078
  • Why should `row[5]` be deleted? – Robᵩ Mar 04 '16 at 18:58
  • Possible duplicate: http://stackoverflow.com/questions/1207406/remove-items-from-a-list-while-iterating-in-python . The answer you seek may be found in this other question. – Robᵩ Mar 04 '16 at 19:04

2 Answers

3

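I would use D = list(filter(all, D)) or perhaps D = list(filter(lambda x: '' not in x, D)), depending upon your exact definition of "empty": the first drops any row containing a falsy value (such as '', 0, or None), while the second drops only rows containing an empty string. The list() call is needed on Python 3, where filter returns an iterator.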

Consider this program:

from pprint import pprint

D = [
    ['',     '2015-10-21 00:00:03', 'jp/ja/fedex/inet/label/international' ],
    [398798, '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic'      ],
    [878787, '2015-10-21 00:00:16', 'us/en/fedex/fedexcares/home'          ],
    ['87878', '',                   'cn/zhs/fedex/inet/label/international'],
    ['',     '2015-10-21 00:00:18', ''                                     ],
    [5454,   '2015-10-21 00:00:19', 'us/en/fedex/sameday/main tracking'    ],
    ['',     '2015-10-21 00:00:21', 'sg/en/fedex/inet/label/international' ],
]

# list() is needed on Python 3, where filter returns an iterator
D2 = list(filter(all, D))
D3 = list(filter(lambda x: '' not in x, D))
assert D2 == D3

pprint(D2)
pprint(D3)
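The two predicates only disagree when a row holds a falsy value other than the empty string. Here is a minimal sketch of that difference, assuming Python 3; the `rows` data is made up for illustration and is not from the question:

```python
# Hypothetical sample: the second row has a legitimate 0 in column 0.
rows = [
    ['', '2015-10-21 00:00:03', 'jp/ja'],
    [0,  '2015-10-21 00:00:10', 'us/en'],
    [42, '2015-10-21 00:00:16', 'us/en'],
]

# all() rejects a row if any element is falsy: '', 0, None, ...
by_truthiness = list(filter(all, rows))

# "'' not in row" rejects only rows that contain an empty string.
by_empty_string = list(filter(lambda row: '' not in row, rows))

print(by_truthiness)    # only the row starting with 42
print(by_empty_string)  # the rows starting with 0 and 42
```

If 0 can be a valid ID in your data, the second form is probably the safer choice.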
Robᵩ
1

For the record, it would have been helpful if you'd shown your sample data as an actual list that I could have copied and pasted.

The all function returns True only if all elements of its argument are true. For example:

>>> all([1, 2, 3])
True
>>> all(['', 2, 3])
False
>>> all([1, 2, 0])
False

By iterating over your list-of-lists in a list comprehension it's relatively easy to produce what you want.

tlist = [
    ['',     '2015-10-21 00:00:03', 'jp/ja/fedex/inet/label/international' ],
    [398798, '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic'      ],
    [878787, '2015-10-21 00:00:16', 'us/en/fedex/fedexcares/home'          ],
    ['87878', '',                   'cn/zhs/fedex/inet/label/international'],
    ['',     '2015-10-21 00:00:18', ''                                     ],
    [5454,   '2015-10-21 00:00:19', 'us/en/fedex/sameday/main tracking'    ],
    ['',     '2015-10-21 00:00:21', 'sg/en/fedex/inet/label/international' ]]
result = [r for r in tlist if all(r)]

result will now contain

[[398798, '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic'],
 [878787, '2015-10-21 00:00:16', 'us/en/fedex/fedexcares/home'],
 [5454, '2015-10-21 00:00:19', 'us/en/fedex/sameday/main tracking']]
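If the existing list object has to be updated in place (for example because other names refer to it), slice assignment is one option. This is a minimal sketch rather than the only way to do it; `rows` and `alias` are illustrative names, not from the question:

```python
rows = [
    ['',      '2015-10-21 00:00:03', 'jp/ja/fedex/inet/label/international'],
    [398798,  '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic'],
    ['87878', '',                    'cn/zhs/fedex/inet/label/international'],
    [5454,    '2015-10-21 00:00:19', 'us/en/fedex/sameday/main tracking'],
]
alias = rows  # a second reference to the same list object

# Slice assignment replaces the contents without rebinding the name,
# so every reference to the list sees the filtered rows.
rows[:] = [r for r in rows if all(r)]

print(alias is rows)  # True
print(len(alias))     # 2
```

Unlike del inside a loop, the comprehension builds the filtered rows first and only then writes them back, so nothing is skipped.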
holdenweb