0

I am trying to apply changes to a dataframe for values only returned (to the best of my knowledge) by using groupby. So what I want is to find the minimum date values for each company so that I can apply the number 0 to first value in several columns (in this case df2['Research and Development Expense Lag'] and df2['Capital Expenditures Lag']). Here is what I have so far, a groupby that returns those minimum date values for each company:

df2.groupby('Ticker Symbol').apply(lambda d: \
            d[d['Data Date'] == d['Data Date'].min()])
Deke Marquardt
  • 111
  • 1
  • 9
  • 1
    you can use `transform` instead of `apply` – Pablo C May 17 '21 at 00:02
  • I provided an answer below, but am not sure exactly what you mean by "apply to *first* value". Are there multiple rows where date is the minimum? If so, how do you plan to identify the *first* row? A good question on Stack Overflow should include a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example), with sample input and expected output. – AlexK May 17 '21 at 00:22
  • See also [How to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/15497888). – Henry Ecker May 17 '21 at 00:26

1 Answers1

0

You are on the right track. You can get the index values for those rows and then use them with .loc[] to change values in those two columns:

df2.loc[df2.groupby('Ticker Symbol').apply(
        lambda d: d[d['Data Date'] == d['Data Date'].min()]
    )
    .index
    .get_level_values(1),
    ['Research and Development Expense Lag', 'Capital Expenditures Lag']
] = 0

The .get_level_values(1) function serves to extract the second level of the MultiIndex. The first level will contain Ticker Symbol values.

AlexK
  • 2,855
  • 9
  • 16
  • 27