3

I have a bunch of lists containing strings, and I want to remove date strings such as '2017-09-11', '2017-09-17', and '2015-09-11' from these lists. How can I do it?

WNT5 = ['RBPMS', 'TRIM2', 'GPM6A', 'TACC1', '2017-09-06', 'PARVA', 'RPS28', 'MAN1C1', 'LOXL2', 'PTPRB', 'STAG2', 'SFRS15', 'PDS5B', 'SWAP70', 'ZMIZ2', 'TPD52', 'OGT', 'RSU1', 'TGFBR3', 'NFAT5', 'ANGPT1', 'SLC25A36', 'NFIB', 'FBXO9', 'N4BP2L2', 'CCDC69', 'MYH11', 'LPP', 'USP34', 'ITIH5', 'GLS', 'SORBS2', 'TMEM43', 'ANK3', 'PSIP1', 'SYNPO2', 'C9orf5', 'BCL2', 'NSMAF', 'MLXIP', 'PDE8B', 'RABGAP1', 'RPS15A', 'NLRP12', 'AKAP1', 'PLK1S1', 'SLC4A4', 'COBLL1', 'ARHGEF7', 'CD47', 'TMEM132A', 'TNK2', 'WWC1', 'RPL22', 'NMT2', 'TNXB', 'SCPEP1', 'TTLL5', 'MAGI1', 'GOLGA2B', 'TIMELESS', 'ITPR1', 'ALMS1', 'TLE2', 'MAPT', 'DIP2A', 'PCGF3', 'CYP3A4', 'RALGPS1', 'N4BP2L1', 'DIO2', 'PPP1R3C', 'LRIG1', 'NSMCE4A', 'GPX2', 'SETBP1', 'SLC6A16', 'ARL5A']
Ajml

6 Answers

2

Using a list comprehension, you get a new list without the date strings:

>>> def is_date_string(s):
...     # return re.search(r'^\d{4}-\d{2}-\d{2}$', s)
...     return '-' in s and s[:4].isdigit()  # NOTE not perfect, change as you need
... 
>>> [s for s in WNT5 if not is_date_string(s)]
['RBPMS', 'TRIM2', 'GPM6A', 'TACC1', 'PARVA', 'RPS28',
 'MAN1C1', 'LOXL2', 'PTPRB', 'STAG2', 'SFRS15', 'PDS5B', 'SWAP70',
 'ZMIZ2', 'TPD52', 'OGT', 'RSU1', 'TGFBR3', 'NFAT5', 'ANGPT1',
 'SLC25A36', 'NFIB', 'FBXO9', 'N4BP2L2', 'CCDC69', 'MYH11', 'LPP',
 'USP34', 'ITIH5', 'GLS', 'SORBS2', 'TMEM43', 'ANK3', 'PSIP1',
 'SYNPO2', 'C9orf5', 'BCL2', 'NSMAF', 'MLXIP', 'PDE8B', 'RABGAP1',
 'RPS15A', 'NLRP12', 'AKAP1', 'PLK1S1', 'SLC4A4', 'COBLL1', 'ARHGEF7',
 'CD47', 'TMEM132A', 'TNK2', 'WWC1', 'RPL22', 'NMT2', 'TNXB',
 'SCPEP1', 'TTLL5', 'MAGI1', 'GOLGA2B', 'TIMELESS', 'ITPR1', 'ALMS1',
 'TLE2', 'MAPT', 'DIP2A', 'PCGF3', 'CYP3A4', 'RALGPS1', 'N4BP2L1',
 'DIO2', 'PPP1R3C', 'LRIG1', 'NSMCE4A', 'GPX2', 'SETBP1', 'SLC6A16',
 'ARL5A']

To replace WNT5, assign the result of the list comprehension back to it:

WNT5 = [s for s in WNT5 if not is_date_string(s)]

or use slice assignment (to replace the items in place):

WNT5[:] = [s for s in WNT5 if not is_date_string(s)]
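
If the prefix check turns out to be too loose for your data, a stricter variant could look like the sketch below. It assumes the dates are always zero-padded yyyy-mm-dd; `DATE_RE`, `is_iso_date_string`, and the short `genes` list are illustrative names, not part of the code above.

```python
import re

# Assumption: dates are always zero-padded yyyy-mm-dd.
DATE_RE = re.compile(r'\d{4}-\d{2}-\d{2}')

def is_iso_date_string(s):
    # fullmatch succeeds only when the whole string has the date shape
    return DATE_RE.fullmatch(s) is not None

# Shortened, illustrative list
genes = ['RBPMS', '2017-09-06', 'TACC1', '2015-09-11', 'C9orf5']
filtered = [s for s in genes if not is_iso_date_string(s)]
print(filtered)
```

This only checks the shape of the string, so something like '9999-99-99' would also be treated as a date.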
falsetru
  • 1
    The 5th element of your output is a date string, so I'm going to venture to say something might be a bit off – Brad Solomon Jul 02 '17 at 03:39
  • @BradSolomon, I pasted wrong output. I fixed it. Thank you for your feedback. – falsetru Jul 02 '17 at 03:40
  • better to assign it to a list slice. read here: https://stackoverflow.com/a/1208792/4385319 – technusm1 Jul 02 '17 at 03:42
  • @falsetru I think the list comprehension looks good otherwise, although will this work for mm-dd-yyyy format? – Brad Solomon Jul 02 '17 at 03:44
  • @BradSolomon, Replacing `return '-' in s and s[:4].isdigit()` with `return '-' in s and s[:2].isdigit()` will handle such case. But current one should be enough to handle *'2017-09-11', '2017-09-17', '2015-09-11'*. – falsetru Jul 02 '17 at 03:46
  • @downvoter, Any reason for downvote? Please let me know how to improve the answer. – falsetru Jul 02 '17 at 04:09
  • @falsetru I tried, it did not eliminate the date string. `link = ['RBPMS', 'TRIM2', 'GPM6A', 'TACC1', '2017-09-06', 'PARVA', 'RPS28', 'MAN1C1']` `def is_date_string(s):` `# return re.search(r'^\d{4}-\d{2}-\d{2}$')` `return '-' in s and s[:4].isdigit() # NOTE not perfect, change as you need [s for s in link if not is_date_string(s)]` `print(link)` `['RBPMS', 'TRIM2', 'GPM6A', 'TACC1', '2017-09-06', 'PARVA', 'RPS28', 'MAN1C1']` – Ajml Jul 03 '17 at 02:16
  • 1
    @Nguyen, list comprehension does not change the list, but returns a new list. You should assign the result back to `link`. `link = [s for s in link if not is_date_string(s)]` or `link[:] = [s for s in link if not is_date_string(s)]` – falsetru Jul 03 '17 at 02:33
1

To remove an item from a list, you can use the remove method like so:

WNT5.remove('2017-09-06')

This deletes only the first occurrence of that element and raises a ValueError if the element is not in the list. To delete every matching element, you can use a list comprehension.

>>> WNT5 = [x for x in WNT5 if len(x) != 10]
>>> print(WNT5)

This assumes the only strings of length 10 are the date strings.

Hope it helps!

EDIT

I answered a little late, and everyone had better answers, but I also stumbled across this function in another SO question that might be useful:

from dateutil.parser import parse
def is_date(string):
    try: 
        parse(string)
        return True
    except (ValueError, OverflowError):  # parse() can raise either on non-date input
        return False

With this function you can check that the strings you exclude really are dates, in any format dateutil can parse.

Example:

>>> is_date("1990-12-1")
True
>>> is_date("xyznotadate")
False
>>> WNT5 = [x for x in WNT5 if not is_date(x)]
>>> print(WNT5)
cosinepenguin
0

The question is not completely specified, but I think it might suffice to explain how to manipulate a dictionary like a list, even though you specified a list in your question.

mydict = {'2017-04-11':22, '2017-04-12':23, '2017-04-13': 128}
newkeys = list(mydict.keys())
newkeys.remove('2017-04-12')
newvals = [mydict[keptkey] for keptkey in newkeys]
newdict = dict(zip(newkeys, newvals))

Once you have the newkeys list, you can remove elements from it any way you'd like.
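
As a rough sketch of the same idea in one step, a dict comprehension can drop every key that looks like a date. The sample data and the `DATE_KEY` pattern are assumptions made for illustration (keys in zero-padded yyyy-mm-dd form mixed with ordinary keys).

```python
import re

# Assumed key format for dates: zero-padded yyyy-mm-dd
DATE_KEY = re.compile(r'\d{4}-\d{2}-\d{2}')

# Hypothetical data mixing date keys and ordinary keys
mydict = {'2017-04-11': 22, 'TACC1': 23, '2017-04-13': 128, 'OGT': 7}

# Keep only the entries whose key is not shaped like a date
newdict = {key: value for key, value in mydict.items()
           if not DATE_KEY.fullmatch(key)}
print(newdict)
```

This builds a new dictionary and leaves mydict unchanged.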

John Haberstroh
0
import datetime
nwnt = len(WNT5)
for k, s in enumerate(reversed(WNT5)):
    try:
        datetime.datetime.strptime(s, '%Y-%m-%d') # adjust format to your liking
        del WNT5[nwnt - k - 1]
    except ValueError:
        pass
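
The loop above deletes by index while walking the list backwards, so the indexes still to be visited stay valid. The same strptime check can also be wrapped in a helper and used in a list comprehension; this is only a sketch, and `parses_as_date` and the short `genes` list are illustrative names.

```python
from datetime import datetime

def parses_as_date(s, fmt='%Y-%m-%d'):
    # True when strptime accepts the string for the given format
    try:
        datetime.strptime(s, fmt)
        return True
    except ValueError:
        return False

# Shortened, illustrative list
genes = ['RBPMS', '2017-09-06', 'TACC1', '2015-09-11', 'OGT']

# Slice assignment replaces the contents of the existing list object
genes[:] = [s for s in genes if not parses_as_date(s)]
print(genes)
```

Unlike a pure pattern match, strptime also validates the calendar, so a string such as '2017-02-30' is not treated as a date.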
AGN Gazer
-1

Calling remove() on a list while iterating over that same list (as others pointed out) is not the best option, because the loop can skip elements. Instead, you can do one of the following:

Iterate over a copy made with list(original_list):

# makes a copy of the list to iterate rather than original
for item in list(WNT5):
    # assumes dates are yyyy-mm-dd and all contain '-'
    # split() returns a list
    # a string without '-' comes back as a one-item list, so this won't raise
    if (len(item) == 10) and (len(item.split('-')) == 3):
        WNT5.remove(item)
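
To see why the copy matters, here is a small sketch comparing both loops on a made-up list with two adjacent dates. `looks_like_date` repeats the heuristic from the snippet above; the list contents are illustrative.

```python
def looks_like_date(item):
    # Same heuristic as above: length 10 and three '-'-separated parts
    return len(item) == 10 and len(item.split('-')) == 3

# Removing while iterating the same list: the element that slides into
# the freed slot is never visited, so the second date survives.
direct = ['A', '2017-09-06', '2017-09-07', 'B']
for item in direct:
    if looks_like_date(item):
        direct.remove(item)

# Iterating a copy: every original element is visited.
via_copy = ['A', '2017-09-06', '2017-09-07', 'B']
for item in list(via_copy):
    if looks_like_date(item):
        via_copy.remove(item)

print(direct)
print(via_copy)
```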

Build a new, filtered list with a generator function:

def is_not_date(items):
    # yield only the items that do not look like yyyy-mm-dd dates
    for item in items:
        if not ((len(item) == 10) and (len(item.split('-')) == 3)):
            yield item

new_WNT5 = list(is_not_date(WNT5))

There could be a more Pythonic way of doing this (maybe with datetime?).

Honestly, more information is needed to provide a solid solution:

  • Are they all the same format?
  • Are they all strings?
  • What is the scope of the problem?
pstatix
-1

You can also try a regular-expression approach:

import re
# raw string avoids invalid-escape warnings; '-' needs no escaping outside a character class
result_list = [element for element in WNT5 if re.search(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", element) is None]

With this approach you can add more date patterns if you want.
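
As a sketch of what adding more patterns could look like, several date shapes can be combined with alternation in one compiled pattern. The extra formats and the sample list are assumptions for illustration; adjust them to whatever formats actually occur in your data.

```python
import re

# Assumed date shapes; extend the alternation as needed
DATE_PATTERNS = re.compile(
    r'\d{4}-\d{2}-\d{2}'     # yyyy-mm-dd
    r'|\d{2}-\d{2}-\d{4}'    # mm-dd-yyyy or dd-mm-yyyy
    r'|\d{4}/\d{2}/\d{2}'    # yyyy/mm/dd
)

# Illustrative list with one date per format
items = ['RBPMS', '2017-09-06', '09-11-2017', '2015/09/11', 'TACC1']

# fullmatch requires the whole string to be a date, not just part of it
result_list = [element for element in items
               if DATE_PATTERNS.fullmatch(element) is None]
print(result_list)
```

Using fullmatch instead of search keeps strings that merely contain a date somewhere inside them.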

Yaman Jain