Difference between elements when reading from multiple files

Question

I am trying to get the difference between each element after reading multiple CSV files. Each CSV file has 13 rows and 128 columns, and I am trying to get the column-wise difference.

I read the files using:

data = [pd.read_csv(f, index_col=None, header=None) for f in _temp]

I get a list of all samples.

According to this, I have to use .diff() to get the difference, which goes something like this:

data.diff()

This works, but instead of getting the difference between each row within the same sample, I get the difference between each row of one sample and the corresponding row of another sample.

Is there a way to separate this and let the difference happen within each sample?

Edit

OK, I am able to get the difference between the data elements by doing this:

_local = pd.DataFrame(data)

_list = []
_a = _local.index

for _aa in _a:
    _list.append(_local[0][_aa].diff())

flow = pd.DataFrame(_list, index=_a)

I am creating too many DataFrames. Is there a better way to do this?

could you give a minimum complete sample of what `data` looks like after you've read it in? — michael_j_ward, Jun 08 '16 at 02:48
If I understand you correctly, you want to find the difference between adjacent columns of the dataframe. Say `column[1] - column[0]` elementwise and so on? — Nickil Maveli, Jun 08 '16 at 09:26

score 1 · Answer 1 · answered Jun 08 '16 at 02:50

Here is a relatively efficient way to read your DataFrames one at a time and calculate the differences between consecutive files, which are stored in a list, df_diff.

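import pandas as pd

df_diff = []
# header=None keeps the first row as data, so each frame is (13, 128)
df_old = pd.read_csv(_temp[0], index_col=None, header=None)
for f in _temp[1:]:
    df = pd.read_csv(f, index_col=None, header=None)
    # difference between the previous file and the current one
    df_diff.append(df_old - df)
    df_old = df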
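As a rough, self-contained sketch of the loop above, the following uses small in-memory CSV strings in place of the real files. The `csv_texts` values and the `consecutive_file_diffs` helper are hypothetical and exist only for illustration. Note that this computes differences between consecutive files, not between rows or columns inside one file.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the files in _temp: three tiny header-less CSVs.
csv_texts = ["1,2,3\n4,5,6\n", "2,4,6\n8,10,12\n", "0,0,0\n1,1,1\n"]


def consecutive_file_diffs(sources):
    """Return [file0 - file1, file1 - file2, ...] as a list of DataFrames."""
    diffs = []
    previous = pd.read_csv(sources[0], header=None)
    for src in sources[1:]:
        current = pd.read_csv(src, header=None)
        diffs.append(previous - current)  # element-wise, aligned on row/column labels
        previous = current
    return diffs


between_files = consecutive_file_diffs([io.StringIO(text) for text in csv_texts])
print(len(between_files))  # one result per adjacent pair of files
```

Only two frames are held in memory at a time, which is the main appeal of this pattern when there are many files.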

answered Jun 08 '16 at 02:50

Alexander


OK. What is the output of `df.shape`? I first want to ensure that you are reading the files correctly. It should be (13, 128). – Alexander Jun 08 '16 at 03:52

score 1 · Answer 2 · edited Apr 13 '17 at 12:40

Since your code works, you should really post this on https://codereview.stackexchange.com/

(PS: The leading "_" is not really Pythonic. Please avoid it; it makes your code harder to read.)

_local = pd.DataFrame(data)
_list = [_local[0][_aa].diff() for _aa in _local.index]
flow = pd.DataFrame(_list, index=_local.index)
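If the goal is a difference computed inside each sample, one possible alternative is to call `DataFrame.diff` on each frame in the list directly, without wrapping the list in another DataFrame. The sketch below is only an illustration under assumptions: the two small frames stand in for the real (13, 128) samples, and because the question does not make clear whether adjacent rows or adjacent columns are wanted, both are shown.

```python
import pandas as pd

# Hypothetical stand-in for `data`: two samples, as read with header=None.
data = [
    pd.DataFrame([[1, 4, 9], [2, 3, 5]]),
    pd.DataFrame([[10, 20, 40], [0, 5, 5]]),
]

# Difference between adjacent columns within each sample (first column is NaN).
column_diffs = [df.diff(axis=1) for df in data]

# Difference between adjacent rows within each sample (first row is NaN).
row_diffs = [df.diff(axis=0) for df in data]

# Optional: combine the per-sample results into one frame with a
# (sample, row) MultiIndex so each sample can still be selected separately.
stacked = pd.concat(
    column_diffs,
    keys=list(range(len(column_diffs))),
    names=["sample", "row"],
)
print(stacked.loc[1])  # the column-wise differences of the second sample
```

Because `diff` runs separately on each frame, values from one sample never leak into another.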

answered Jun 08 '16 at 05:27

Merlin

