-1

I have this pandas data frame which has 65 columns Full month:

ID  Name    Date    2022-12-1-IN    2022-12-1-OUT   2022-12-2-IN    2022-12-2-OUT ... 2022-12-31-IN 2022-12-31-OUT


n_cols = df.shape[1]
    
    # Create a list of the new column names
    new_col_names = []
    for i in range(3, n_cols):
        if (i - 3) % 2 == 0:
            new_col_names.append('2022-12-0' + str(((i - 4) // 2) + 1) + '-IN')
        else:
            new_col_names.append('2022-12-0' + str(((i - 4) // 2) + 1) + '-OUT')
    
    df.columns = new_col_names

My expected results:

ID  Name    Date    2022-12-01-IN    2022-12-01-OUT   2022-12-02-IN    2022-12-02-OUT ... 2022-12-31-IN 2022-12-31-OUT




lif new_len != old_len:
---> 70     raise ValueError(
     71         f"Length mismatch: Expected axis has {old_len} elements, new "
     72         f"values have {new_len} elements"
     73     )

ValueError: Length mismatch: Expected axis has 65 elements, new values have 62 elements

What I'm doing wrong?

Or any better solution to replace all my dates with this format yyyy-mm-dd-IN and yyyy-mm-dd-OUT.

Ex: 2022-12-01-IN, 2022-12-01-OUT,... 2022-12-02-IN ...

I have a range of dates from 2022-1-1 to 2022-12-1

Simon GIS
  • 1,045
  • 2
  • 18
  • 37
  • Did you understand what the error message is saying? `new_col_names` is only 62 elements. The reason is because you're skipping the first three. – wjandrea Feb 09 '23 at 18:34
  • Please read [How to ask a good question](/help/how-to-ask) for tips like how to write a good title and to make a [mre]. For specifics, see [How to make good reproducible pandas examples](/q/20109391/4518341). – wjandrea Feb 09 '23 at 18:37

1 Answers1

1

you can use regex to pad single digits found between two - with a zero:

import re
df.columns = [re.sub(r'(?<=-)(\d)(?=-)', r'0\1', x) for x in df.columns]

or directly:

df.columns = df.columns.str.replace(r'(?<=-)(\d)(?=-)', r'0\1', regex=True)
Tranbi
  • 11,407
  • 6
  • 16
  • 33