
I have a DataFrame in which I am trying to convert the values in "LoginTime" to 24-hour format based on whether "Timing" contains "am" or "pm".

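from io import StringIO

import pandas as pd

data = """
LoginDate  LoginTime Timing  StudentId
2021-03-23   12       am      3574
2021-03-23   12       am      3574
2021-03-23   12       am      2512
2021-03-23   12       am      2692
2021-03-23   12       am      3064
"""

# Raw string so that \s is passed to the regex engine unchanged
df = pd.read_csv(StringIO(data.strip()), sep=r'\s+')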

I am using the following logic to convert the values:

for index in df.index:
    if (df.loc[index,"Timing"] == "pm"):
        df.loc[index, "LoginTime"] = df.loc[index, "LoginTime"] + 12

However, this gives me the following error:

    ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11688/1623466071.py in <module>
      1 for index in df.index:
----> 2     if (df.loc[index,"Timing"] == "pm"):
      3         df.loc[index, "LoginTime"] = df.loc[index, "LoginTime"] + 12

c:\users\admin\appdata\local\programs\python\python39\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1535     @final
   1536     def __nonzero__(self):
-> 1537         raise ValueError(
   1538             f"The truth value of a {type(self).__name__} is ambiguous. "
   1539             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Note that I have set the index of the DataFrame to "LoginDate", which is of datetime type. When I change the index to plain integers (0, 1, 2, 3, ...) and keep "LoginDate" as an ordinary column, the error disappears and the code runs properly.

How do I make the code work while keeping "LoginDate" as the index?

mukunda
  • Welcome to Stack Overflow. Images are not the best way to ask a question; they are difficult to read and make the scenario hard to reproduce. Can you instead add the sample data as text or as a dataframe for easy reproducibility? – heretolearn Aug 28 '21 at 04:49
  • Please post a running example. Instead of an image, just initialize your df in code so that we don't have to do that in the answers. – tdelaney Aug 28 '21 at 04:49
  • Seems like `df.loc[df["Timing"].eq('pm'), 'LoginTime'] += 12` instead of the shown code would work. – Henry Ecker Aug 28 '21 at 04:52
  • I think this is a dup of https://stackoverflow.com/questions/45313889/how-to-add-value-to-column-conditional-on-other-column – tdelaney Aug 28 '21 at 04:54
  • The code @HenryEcker shows is better, because it's vectorized. `df.loc[df["Timing"].eq('pm'), 'LoginTime']` will reference all cells in that column where the condition is true. – smci Aug 28 '21 at 04:54
  • Nominally, the problem is that your `==` comparison created a series of True/False values. Should that series be considered True if it has a single True, or perhaps all of them need to be True, or maybe it's that the series isn't empty? That's the ambiguity the error mentions. But there is a faster way to perform the operation on the entire dataframe. – tdelaney Aug 28 '21 at 04:57
  • @heretolearn I apologize for the image. I couldn't figure out how to reproduce the dataframe output from Jupyter notebook into code that I could paste here. – mukunda Aug 28 '21 at 05:46
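
To make the ambiguity described in the comments concrete, here is a minimal sketch. The two-row frame is hypothetical (it is not the asker's data) and assumes, as in the question, that several rows share the same "LoginDate" label. With a repeated label, `df.loc[label, "Timing"]` appears to return a Series rather than a single string, so the `if` has nothing unambiguous to test.

```python
import pandas as pd

# Hypothetical two-row frame; both rows share the same LoginDate label.
df = pd.DataFrame(
    {"LoginTime": [12, 3], "Timing": ["am", "pm"]},
    index=pd.to_datetime(["2021-03-23", "2021-03-23"]).rename("LoginDate"),
)

label = df.index[0]
selected = df.loc[label, "Timing"]  # repeated label: every matching row comes back
print(type(selected).__name__)

try:
    if selected == "pm":  # element-wise comparison yields a boolean Series
        pass
except ValueError as exc:
    print("ValueError:", exc)

# With a unique integer index the same lookup returns a single value.
unique = df.reset_index()
print(type(unique.loc[0, "Timing"]).__name__)
```

This would also explain why switching to a 0, 1, 2, ... index made the loop run: each label then identifies exactly one row.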

2 Answers


You could try this:

df["LoginTime"] = np.where(df["Timing"] == "pm", df["LoginTime"] + 12, df["LoginTime"])
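
For reference, a self-contained sketch of the line above with "LoginDate" kept as the index. The sample rows are illustrative rather than the asker's (one "pm" row is included so the change is visible), and `parse_dates`/`index_col` are one assumed way of setting up the datetime index.

```python
from io import StringIO

import numpy as np
import pandas as pd

# Illustrative data (not the asker's): one "am" row and one "pm" row.
data = """
LoginDate  LoginTime Timing  StudentId
2021-03-23   12       am      3574
2021-03-23    3       pm      2512
"""

df = pd.read_csv(
    StringIO(data.strip()),
    sep=r"\s+",
    parse_dates=["LoginDate"],
    index_col="LoginDate",  # keep LoginDate as the (duplicated) index
)

# Whole-column operation: no per-label lookup, so duplicate labels do not matter.
df["LoginTime"] = np.where(df["Timing"] == "pm", df["LoginTime"] + 12, df["LoginTime"])
print(df["LoginTime"].tolist())  # [12, 15]
```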
heretolearn

Do not use a loop for this operation; use a vectorized approach:

df['LoginTime'] = df['LoginTime'].where(df['Timing'].ne('pm'), df['LoginTime']+12)

This is simpler to read and more efficient.
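
One caveat about adding 12 to every "pm" row: 12 pm would become 24, and 12 am would stay 12 instead of becoming 0. Below is a hedged sketch of a variant that handles both cases. It assumes "LoginTime" holds 12-hour clock values (1 to 12) and uses a hypothetical four-row frame, not the asker's data.

```python
import pandas as pd

# Hypothetical rows covering the edge cases 12 am and 12 pm.
df = pd.DataFrame(
    {"LoginTime": [12, 9, 12, 3], "Timing": ["am", "am", "pm", "pm"]},
    index=pd.to_datetime(["2021-03-23"] * 4).rename("LoginDate"),
)

# hour % 12 maps 12 -> 0; adding 12 on "pm" rows then yields 12..23.
df["LoginTime"] = df["LoginTime"] % 12 + df["Timing"].eq("pm") * 12
print(df["LoginTime"].tolist())  # [0, 9, 12, 15]
```

Both operands come from the same DataFrame and share its index, so the arithmetic is row-by-row even though the "LoginDate" labels repeat.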

mozway