
I have a list that contains a variety of links, for example:

imagelinks = [
    'http://24.media.tumblr.com/e13983b2fcfa441eb18861cf3e9bc0e9/tumblr_mzzdmmkoS81r2gyhfo1_500.jpg',
    'A%2F%2F31.media.tumblr.com%2Favatar_c3eb4dbb6150_64.png',
    'http://31.media.tumblr.com/avatar_c3eb4dbb6150_16.png',
    'http://24.media.tumblr.com/tumblr_lyrqzcl2Mf1rnn3koo1_1280.jpg',
    # ...
]

and so on. What I want to do is keep only the links that end with 1280. So I wrote this piece of code to clean up the list:

def cleanImageLinks():
    global imagelinks
    removed = 0
    for link in imagelinks:
        if link[27:33] == 'avatar':
            imagelinks.remove(link)
            removed += 1
        elif link[len(link)-6:len(link)-4] == '16':
            imagelinks.remove(link)
            removed += 1
        elif link[len(link)-6:len(link)-4] == '40':
            imagelinks.remove(link)
            removed += 1
        elif link[len(link)-6:len(link)-4] == '00':
            imagelinks.remove(link)
            removed += 1
        elif link[len(link)-6:len(link)-4] == '28':
            imagelinks.remove(link)
        elif link[0] == "A":
            imagelinks.remove(link)
            removed += 1
        else:
            pass
    print str(removed) + " entries removed!"

At the end I get "436 entries removed!", but when I print the list, I can still find links that I do not want all over it. Since the list has more than 2000 entries, 436 is not many. What can I do?

emilanov

1 Answer


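Don't modify the list while you iterate over it: each removal shifts the remaining items left, so the loop skips the item that follows. Create a new list with a list comprehension instead: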

In [1036]: imagelinks = [
      ...: 'http://24.media.tumblr.com/e13983b2fcfa441eb18861cf3e9bc0e9/tumblr_mzzdmmkoS81r2gyhfo1_500.jpg',
      ...: 'A%2F%2F31.media.tumblr.com%2Favatar_c3eb4dbb6150_64.png',
      ...: 'http://31.media.tumblr.com/avatar_c3eb4dbb6150_16.png',
      ...: 'http://24.media.tumblr.com/tumblr_lyrqzcl2Mf1rnn3koo1_1280.jpg']

In [1043]: newlinks=[i for i in imagelinks if i.split('.')[-2].endswith('1280')]
      ...: print newlinks
      ...: print '%d links are removed.'%(len(imagelinks)-len(newlinks))
#outputs:
['http://24.media.tumblr.com/tumblr_lyrqzcl2Mf1rnn3koo1_1280.jpg']
3 links are removed.
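
To make the cause of the leftover links concrete, here is a minimal sketch of what happens when items are removed during iteration, plus two ways around it. The sample list and the helper name `is_wanted` are made up for illustration, and the check (file name without extension ends in `_1280`) is an assumption about which links should stay:

```python
import os.path

# Hypothetical sample data, not the real list from the question.
links = ['a_16.png', 'b_16.png', 'c_1280.jpg', 'd_40.png', 'e_40.png']

def is_wanted(link):
    # Assumption: keep a link only if its name, minus the extension, ends in "_1280".
    return os.path.splitext(link)[0].endswith('_1280')

# Removing while iterating: every removal shifts the later items left by one,
# so the loop never inspects the item that followed the removed one.
buggy = list(links)
for link in buggy:
    if not is_wanted(link):
        buggy.remove(link)
print(buggy)   # ['b_16.png', 'c_1280.jpg', 'e_40.png']

# Option 1: iterate over a copy and remove from the original.
fixed = list(links)
for link in list(fixed):
    if not is_wanted(link):
        fixed.remove(link)
print(fixed)   # ['c_1280.jpg']

# Option 2: filter with a comprehension and assign back through a slice,
# which updates the same list object (no "global" statement needed).
in_place = list(links)
in_place[:] = [link for link in in_place if is_wanted(link)]
print(in_place)   # ['c_1280.jpg']
```

The slice assignment in option 2 is handy when other code holds a reference to the same list, as with the global `imagelinks` in the question.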
zhangxaochen