I have a set of CSV files that I load into a data frame. Each file has a different length.

I want to equalize the length of all columns to the mean column length. However, I don't want to just add values at the end; I want to stretch or compress each column linearly (depending on whether it is shorter or longer than the mean), meaning I want to insert or remove values inside the column itself.

Any suggestions?

Thank you

1 Answer

Try the code below, and see the answer linked in the code comment for how to extrapolate your values:

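import pathlib

import pandas as pd

# Load every CSV file in the current directory into its own DataFrame
data = []
for filename in pathlib.Path('.').glob('*.csv'):
    df = pd.read_csv(filename)
    data.append(df)

# Mean number of rows across all files (integer division)
mean_len = sum(len(df) for df in data) // len(data)

for idx, df in enumerate(data):
    if len(df) > mean_len:
        # Longer than the mean: keep only the first mean_len rows
        data[idx] = df[:mean_len]
    else:
        # do stuff here to extrapolate your data
        # check https://stackoverflow.com/a/35959909/15239951
        pass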
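
If you want to stretch or compress the values linearly rather than truncate or pad, one possible approach is to resample each column onto an evenly spaced grid with `numpy.interp`. This is only a minimal sketch: the helper name `resample_frame` is made up for illustration, and it assumes every column is numeric and each frame has at least two rows.

```python
import numpy as np
import pandas as pd


def resample_frame(df, target_len):
    """Linearly resample every column of df to target_len rows.

    Hypothetical helper: assumes all columns are numeric and len(df) >= 2.
    """
    # Positions of the existing rows and of the new rows on a common 0..1 axis
    old_pos = np.linspace(0.0, 1.0, num=len(df))
    new_pos = np.linspace(0.0, 1.0, num=target_len)
    # Interpolate each column at the new positions
    return pd.DataFrame(
        {col: np.interp(new_pos, old_pos, df[col].to_numpy(dtype=float))
         for col in df.columns}
    )


# Small demonstration with made-up values
short = pd.DataFrame({"a": [0.0, 10.0, 20.0]})
long = pd.DataFrame({"a": [0.0, 1.0, 2.0, 3.0, 4.0]})
stretched = resample_frame(short, 5)   # 3 rows -> 5 rows
compressed = resample_frame(long, 3)   # 5 rows -> 3 rows
```

With the list of frames from the loop above, something like `data = [resample_frame(df, mean_len) for df in data]` would bring every frame to the mean length, handling both the shorter and the longer case the same way. The first and last values of each column are preserved; the values in between are linearly interpolated.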
Corralien