
I have a df that is indexed by date and has many columns. Inside a function I work with one row at a specific date, and I then iterate over various dates, selecting one row to modify at a time.

There are many calculations for the row, and I'm finding that repeatedly writing

df.loc[current_date, 'select_columns']

gets messy.

I copied the entire row into a Series:

r = pd.Series(df.loc[current_date, :])

That way I could just work with, say:

r[field_name]

With this method I am able to view and update the data, and then reassign the Series to the df row when the calculations are done.

While this does work, my question is: is there a better, more Pythonic way to access one row in a dataframe for many calculations?
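The copy, modify, and reassign workflow described above can be sketched roughly as follows. This is only a minimal illustration: the column names (`cash`, `units`), the dates, and the arithmetic are hypothetical stand-ins, not the actual portfolio logic.

```python
import pandas as pd

# Hypothetical data; column names and values are assumptions for illustration.
df = pd.DataFrame(
    {"cash": [100.0, 200.0, 300.0], "units": [10.0, 20.0, 30.0]},
    index=pd.to_datetime(["2019-02-13", "2019-02-14", "2019-02-15"]),
)
current_date = pd.Timestamp("2019-02-14")

# Take an explicit copy of the row so the edits below stay local to r.
r = df.loc[current_date].copy()

# Scalar-style calculations on the row (placeholder arithmetic).
r["units"] += 5.0
r["cash"] -= 50.0

# Write the finished row back to the dataframe in a single assignment.
df.loc[current_date] = r
```

Calling `.copy()` makes it explicit that `r` is a working copy, so nothing reaches `df` until the final assignment.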

anky
run-out
  • The answer depends on the nature of the transformation(s) you're applying to the individual values. If these transformations can be "vectorized" into functions that accept and return entire series at a time, you're in luck. Could you edit into your answer the specific function(s) you're applying to each row? – Peter Leimbigler Feb 16 '19 at 03:11
  • Yes I will update the question in about an hour. Thank you – run-out Feb 16 '19 at 03:16
  • There are 4275 rows and 44 columns in the dataframe. I will probably work with about 10-15 rows. I'm rebalancing an investment portfolio as part of an analysis, and each row represents a day where the maximum or minimum stock/fixed income limits are exceeded. For each row, I must make decisions to buy and sell certain units, update other values in the row like cash and total_value, and then propagate this through the dataframe. Slicing the df is messy. Copying the row to pd.Series works; I was just wondering if there was a better, more Pythonic way to do this that I'm not aware of. – run-out Feb 16 '19 at 04:14

1 Answer


IIUC, you are trying to find a nice way to iterate over rows and get certain values from each row. You can iterate over rows using iterrows and retrieve values as you mentioned:

for ix, row in df.iterrows():
    row[field_names]
    ...

If you're interested in setting values in the dataframe, row[field_names] = val will not work, because the row you get back is generally a copy rather than a view of the dataframe. This related post may be of interest. Generally, you'll have to use "one operation" (such as df.loc[row, col] = value) to set values. Chained operations won't reliably affect the original dataframe. That is, df.loc[row, :][col] = value assigns to an intermediate object and will not affect the original df. The same goes for the example I gave of row[field_names] = val, for the same reason.
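As a rough illustration of the difference, here is a minimal sketch. The frame, its column names, and its values are made up for the example. It uses mixed dtypes on purpose, since that is the case where iterrows is certain to hand back a copy of each row.

```python
import pandas as pd

# Hypothetical frame with mixed dtypes; names and values are illustrative only.
df = pd.DataFrame({"units": [10, 20], "cash": [100.5, 200.5]}, index=["d1", "d2"])

# Editing the row object from iterrows changes a copy, not df.
for ix, row in df.iterrows():
    row["units"] = 0
units_after_row_edit = df["units"].tolist()

# A single .loc call with both the row and column label writes into df itself.
for ix in df.index:
    df.loc[ix, "units"] = 0
units_after_loc = df["units"].tolist()
```

After the first loop `df` still holds its original `units` values; only the second loop, which sets each value in one indexing step, changes them.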

busybear
  • Thank you for your comment. I'm actually isolating the first row after filtering the df using a multitude of masks, so iterrows will probably not be appropriate. What I'm trying to do is work with the one row, do scalar-type logical conditions and value assignments, and then move on from working with the row. – run-out Feb 16 '19 at 04:18