I have a table in a CSV file:

    date        group
0   2015-01-02  WODKA
1   2015-01-02  PIWO
2   2015-01-02  2015-01-02
3   2015-01-03  WODKA
4   2015-01-03  PIWO
5   2015-01-03  2015-01-03
6   2015-01-03  WODKA
7   2015-01-03  PIWO

I would like to replace every date in the "group" column with the word "sum", but my code does not work:

import pandas as pd
import numpy as np
from datetime import datetime as dt

x = pd.read_csv("C:\\Users\\dell\\Desktop\\list_1.csv", sep=';')
x.group = x.group.replace(dt, 'sum')
Tomasz Przemski
  • Why do you *think* that would work? `dt` is the `datetime` class object; do you have a bunch of references to that class in your `group` column? – juanpa.arrivillaga Oct 25 '17 at 20:54

2 Answers

We can update the rows where the value in `group` can be parsed as a datetime:

In [40]: df.loc[pd.to_datetime(df['group'], errors='coerce').notnull(), 'group'] = 'sum'

In [41]: df
Out[41]:
         date  group
0  2015-01-02  WODKA
1  2015-01-02   PIWO
2  2015-01-02    sum
3  2015-01-03  WODKA
4  2015-01-03   PIWO
5  2015-01-03    sum
6  2015-01-03  WODKA
7  2015-01-03   PIWO
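
A self-contained sketch of the same idea, in case it helps to run it without the CSV file. The in-memory frame below is a stand-in for the question's data, and the explicit `format="%Y-%m-%d"` is an assumption added here: it makes the check strict and avoids format guessing, at the cost of the multi-format flexibility of the one-liner above.

```python
import pandas as pd

# Stand-in for the CSV from the question, rebuilt in memory.
df = pd.DataFrame({
    "date": ["2015-01-02"] * 3 + ["2015-01-03"] * 5,
    "group": ["WODKA", "PIWO", "2015-01-02",
              "WODKA", "PIWO", "2015-01-03", "WODKA", "PIWO"],
})

# Values that do not parse as dates become NaT; notna() marks the ones that did.
# The explicit format is an assumption (ISO dates only), not part of the answer.
is_date = pd.to_datetime(df["group"], format="%Y-%m-%d", errors="coerce").notna()

# Overwrite only the rows flagged as dates.
df.loc[is_date, "group"] = "sum"
print(df)
```

Dropping the `format` argument should bring back the more permissive parsing used in the answer.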

Or use a regular expression (note: the first solution is much more flexible, as it supports different date formats):

In [46]: df['sum'] = df['group'].str.replace(r'^\d{4}-\d{2}-\d{2}', 'sum', regex=True)

In [47]: df
Out[47]:
         date       group    sum
0  2015-01-02       WODKA  WODKA
1  2015-01-02        PIWO   PIWO
2  2015-01-02  2015-01-02    sum
3  2015-01-03       WODKA  WODKA
4  2015-01-03        PIWO   PIWO
5  2015-01-03  2015-01-03    sum
6  2015-01-03       WODKA  WODKA
7  2015-01-03        PIWO   PIWO
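
One thing to keep in mind with the pattern above: it is anchored only at the start, so a value that merely begins with a date would have just that prefix replaced. A possible stricter variant, sketched here under the assumption that only whole-cell dates should change, uses `Series.str.fullmatch` (available in newer pandas versions) together with `Series.mask`. The sample values, including `"2015-01-03 promo"`, are hypothetical.

```python
import pandas as pd

# Hypothetical sample: the question's values plus one entry that only starts with a date.
group = pd.Series(["WODKA", "PIWO", "2015-01-02", "2015-01-03 promo"])

# fullmatch requires the whole string to match, so no ^ or $ anchors are needed.
is_date = group.str.fullmatch(r"\d{4}-\d{2}-\d{2}")

# mask() replaces the entries where the condition is True and keeps the rest.
result = group.mask(is_date, "sum")
print(result.tolist())
```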
MaxU - stand with Ukraine

Or use a trick based on the special character `-` (note: I recommend MaxU's answer):

df.group.replace({'-':np.nan},regex=True).fillna('sum')
Out[449]: 
0    WODKA
1     PIWO
2      sum
3    WODKA
4     PIWO
5      sum
6    WODKA
7     PIWO
Name: group, dtype: object
BENY
  • This reminds me of our last discussion that prompted a question in itself :). Where _are_ you learning these tricks, I've not seen anything like them when I'm searching for my own problems? – roganjosh Oct 25 '17 at 21:01
  • Source code, not docs, then? Do they explicitly state as comments in the source code that such approaches will give the desired results, or do you identify these approaches yourself from understanding the code? – roganjosh Oct 25 '17 at 21:06
  • @roganjosh https://github.com/pandas-dev/pandas/blob/v0.20.3/pandas/core/generic.py#L3678-L3926 – BENY Oct 25 '17 at 21:08
  • I've found both examples really cool in that they make me question the inner workings, but I have to be honest that I would find the intended function of both very ambiguous if I ever inherited the code base. The answer by MaxU is, in terms of the API to me, explicit in what it wants to achieve. – roganjosh Oct 25 '17 at 21:10
  • @roganjosh BTW, see the comment I left under MaxU's answer :-) – BENY Oct 25 '17 at 22:14
  • I did and upvoted it :) I didn't want to get involved but I did think that was the answer to be accepted. I do find your answer interesting though, it gives me something to read up on, if nothing else but to spot a "gotcha" when my code does something unusual :) – roganjosh Oct 25 '17 at 22:17
  • @roganjosh Man, if you knew me better, you would find I never ever fight for any point (reputation) related thing on SO. :-) – BENY Oct 25 '17 at 22:18
  • @roganjosh I am asking it as a question :-) https://stackoverflow.com/questions/46944650/replace-value-by-using-regex-to-np-nan – BENY Oct 26 '17 at 02:02