0

I have a excel file like this:

|Time |Position |Symbol |Type |Volume |Price| |-------|-----------|-------|-------|-------|-----| |2022.04.01 15:39:38 |1111 |US500 |sell |1 |4 |552.40| |2022.04.01 16:34:36 |22222 |US500 |sell |0.5 |4 |543.20| orders

The "orders" is a merged column spanning columns from Time to Price.

At the moment I'm selecting what I want by:

df = df.iloc[:446,:]

but it is not dynamic, and the length of the table file varies every time.

How can I select the table from the beginning until (and excluded) the row with the string "orders"?

Thanks.

Steven
  • 63
  • 2
  • 12

1 Answers1

0

All you need to do is to calculate your magic number 446 dynamically. There are many ways of achieving this. I would start with the solution to this question here:

Numpy first occurrence of value greater than existing value

Solution would be like

stop_at = np.argmax(df["orders_containing_columnname"]=="orders")
df = df.iloc[:stop_at,:]
Patrick H.
  • 168
  • 7
  • thanks that works, if "orders" is under a specific column name, but in my case "orders" is written in a merged multi column row, so I don't have a specific column name I guess – Steven Aug 12 '22 at 12:26
  • 1
    @Steven In a pandas DataFrame each row must have the same number of columns, so it should be a regular column. You might find the right name by `print(df.columns)`. Alternatively use iloc to get the correct columns series if you know the id. (Might be 6 from the table in your question `df.iloc[:,6]`) – Patrick H. Aug 12 '22 at 12:34