0

I am very new to Python and have very little experience. I've managed to get some code working by copying and pasting and substituting the data I have, but I've been looking up how to select data from a dataframe but can't make sense of the examples and substitute my own data in.

The overarching goal: (if anyone could actually help me write the entire thing, that would be helpful, but highly unlikely and probably not allowed)

I am trying to use scipy to fit the curve of a temperature change when two chemicals react. There are 40 trials. The model I am hoping to use is a generalized logistic function with six parameters. All I need are the 40 functions, and nothing else. I have no idea how to achieve this, but I will ask another question when I get there.

The current issue:

I had imported 40 .csv files, compiled/shortened the data into 2 sections so that there are 20 trials in 1 file. Now the data has 21 columns and 63 rows. There is a title in the first row for each column, and the first column is a consistent time interval.

However, each trial is not necessarily that long. One of them does, though. So I've managed to write the following code for a dataframe:

import pandas as pd

df = pd.read_csv("~/Truncated raw data hcl.csv")
print(df)

It prints the table out, but as expected, there are NaNs where there exists no data.

So I would like to know how to arrange it into workable array with 2 columns , time and a trial like an (x,y) for a graph for future workings with numpy or scipy such that the rows that there is no data would not be included.

Part of the .csv file begins after the horizontal line. I'm too lazy to put it in a code block, sorry. Thank you.


time,1mnaoh trial 1,1mnaoh trial 2,1mnaoh trial 3,1mnaoh trial 4,2mnaoh trial 1,2mnaoh trial 2,2mnaoh trial 3,2mnaoh trial 4,3mnaoh trial 1,3mnaoh trial 2,3mnaoh trial 3,3mnaoh trial 4,4mnaoh trial 1,4mnaoh trial 2,4mnaoh trial 3,4mnaoh trial 4,5mnaoh trial 1,5mnaoh trial 2,5mnaoh trial 3,5mnaoh trial 4
0.0,23.2,23.1,23.1,23.8,23.1,23.1,23.3,22.0,22.8,23.4,23.3,24.0,23.0,23.8,23.8,24.0,23.3,24.3,24.1,24.1
0.5,23.2,23.1,23.1,23.8,23.1,23.1,23.3,22.1,22.8,23.4,23.3,24.0,23.0,23.8,23.8,24.0,23.4,24.3,24.1,24.1
1.0,23.2,23.1,23.1,23.7,23.1,23.1,23.3,22.3,22.8,23.4,23.3,24.0,23.0,23.8,23.8,24.0,23.5,24.3,24.1,24.1
1.5,23.2,23.1,23.1,23.7,23.1,23.1,23.3,22.4,22.8,23.4,23.3,24.0,23.0,23.8,23.8,23.9,23.6,24.3,24.1,24.1
2.0,23.3,23.2,23.2,24.2,23.6,23.2,24.3,22.5,23.0,23.7,24.4,24.1,23.1,23.9,24.4,24.2,23.7,24.5,24.7,25.1
2.5,24.0,23.5,23.5,25.4,25.3,23.3,26.4,22.7,23.5,25.8,27.9,25.1,23.1,23.9,27.4,26.8,23.8,27.2,26.7,28.1
3.0,25.4,24.4,24.1,26.5,27.8,23.3,28.5,22.8,24.6,28.6,31.2,27.2,23.2,23.9,30.9,30.5,23.9,31.4,29.8,31.3
3.5,26.9,25.5,25.1,27.4,29.9,23.4,30.1,22.9,26.4,31.4,34.0,30.0,23.3,24.2,33.8,34.0,23.9,35.1,33.2,34.4
4.0,27.8,26.5,26.2,27.9,31.4,23.4,31.3,23.1,28.8,34.0,36.1,32.6,23.3,26.6,36.0,36.7,24.0,37.7,35.9,36.8
4.5,28.5,27.3,27.0,28.2,32.6,23.5,32.3,23.1,31.2,36.0,37.5,34.8,23.4,30.0,37.7,38.7,24.0,39.7,38.0,38.7
5.0,28.9,27.9,27.7,28.5,33.4,23.5,33.1,23.2,33.2,37.6,38.6,36.5,23.4,33.2,39.0,40.2,24.0,40.9,39.6,40.2
5.5,29.2,28.2,28.3,28.9,34.0,23.5,33.7,23.3,35.0,38.7,39.4,37.9,23.5,35.6,39.9,41.2,24.0,41.9,40.7,41.0
6.0,29.4,28.5,28.6,29.1,34.4,24.9,34.2,23.3,36.4,39.6,40.0,38.9,23.5,37.3,40.6,42.0,24.1,42.5,41.6,41.2
6.5,29.5,28.8,28.9,29.3,34.7,27.0,34.6,23.3,37.6,40.4,40.4,39.7,23.5,38.7,41.1,42.5,24.1,43.1,42.3,41.7
7.0,29.6,29.0,29.1,29.5,34.9,28.8,34.8,23.5,38.6,40.9,40.8,40.2,23.5,39.7,41.4,42.9,24.1,43.4,42.8,42.3
7.5,29.7,29.2,29.2,29.6,35.1,30.5,35.0,24.9,39.3,41.4,41.1,40.6,23.6,40.5,41.7,43.2,24.0,43.7,43.1,42.9
8.0,29.8,29.3,29.3,29.7,35.2,31.8,35.2,26.9,40.0,41.6,41.3,40.9,23.6,41.1,42.0,43.4,24.2,43.8,43.3,43.3
8.5,29.8,29.4,29.4,29.8,35.3,32.8,35.4,28.9,40.5,41.8,41.4,41.2,23.6,41.6,42.2,43.5,27.0,43.9,43.5,43.6
9.0,29.9,29.5,29.5,29.9,35.4,33.6,35.5,30.5,40.8,41.8,41.6,41.4,23.6,41.9,42.4,43.7,30.8,44.0,43.6,43.8
9.5,29.9,29.6,29.5,30.0,35.5,34.2,35.6,31.7,41.0,41.8,41.7,41.5,23.6,42.2,42.5,43.7,33.9,44.0,43.7,44.0
10.0,30.0,29.7,29.6,30.0,35.5,34.6,35.7,32.7,41.1,41.9,41.8,41.7,23.6,42.4,42.6,43.8,36.2,44.0,43.7,44.1
10.5,30.0,29.7,29.6,30.1,35.6,35.0,35.7,33.3,41.2,41.9,41.8,41.8,23.6,42.6,42.6,43.8,37.9,44.0,43.8,44.2
11.0,30.0,29.7,29.6,30.1,35.7,35.2,35.8,33.8,41.3,41.9,41.9,41.8,24.0,42.9,42.7,43.8,39.3,,43.8,44.3
11.5,30.0,29.8,29.7,30.1,35.8,35.4,35.8,34.1,41.4,41.9,42.0,41.8,26.6,43.1,42.7,43.9,40.2,,43.8,44.3
12.0,30.0,29.8,29.7,30.1,35.8,35.5,35.9,34.3,41.4,42.0,42.0,41.9,30.3,43.3,42.7,43.9,40.9,,43.9,44.3
12.5,30.1,29.8,29.7,30.2,35.9,35.7,35.9,34.5,41.5,42.0,42.0,,33.4,43.4,42.7,44.0,41.4,,43.9,44.3
13.0,30.1,29.8,29.8,30.2,35.9,35.8,36.0,34.7,41.5,42.0,42.1,,35.8,43.5,42.7,44.0,41.8,,43.9,44.4
13.5,30.1,29.9,29.8,30.2,36.0,36.0,36.0,34.8,41.5,42.0,42.1,,37.7,43.5,42.8,44.1,42.0,,43.9,44.4
14.0,30.1,29.9,29.8,30.2,36.0,36.1,36.0,34.9,41.6,,42.2,,39.0,43.5,42.8,44.1,42.1,,,44.4
14.5,,29.9,29.8,,36.0,36.2,36.0,35.0,41.6,,42.2,,40.0,43.5,42.8,44.1,42.3,,,44.4
15.0,,29.9,,,36.0,36.3,,35.0,41.6,,42.2,,40.7,,42.8,44.1,42.4,,,
15.5,,,,,36.0,36.4,,35.1,41.6,,42.2,,41.3,,,,42.4,,,
U13-Forward
  • 69,221
  • 14
  • 89
  • 114
Ilyankor
  • 171
  • 3
  • 8

1 Answers1

1

To convert a whole DataFrame into a numpy array, use

df = df.values()

If i understood you correctly, you want seperate arrays for every trial though. This can be done like this:

data = [df.iloc[:, [0, i]].values() for i in range(1, 20)]

which will make a list of numpy arrays, every one containing the first column with temperature and one of the trial columns.

Flob
  • 898
  • 1
  • 5
  • 14
  • To clarify, am I typing `data = [df.iloc(0, i).values() for i in range(1, 20)]` or putting a value in, like `data = [df.iloc(0, 1).values()]` ? Both produce an error "_call_() takes in 1 or 2 positional arguments but 3 were given". – Ilyankor Dec 24 '18 at 03:27
  • use `[]`, not `()` (look at the exact syntax in my answer) – Flob Dec 24 '18 at 03:29
  • depends on what you want. The first one will store all arrays for all trials in one list, the other one will give you one array for the trialnumber you put in – Flob Dec 24 '18 at 03:30
  • Thank you, but entering `data = [df.iloc[0, 1].values()]` says numpy.float64' object has no attribute 'values'. – Ilyankor Dec 24 '18 at 03:50
  • oopsie, forgot the first argument, should work now – Flob Dec 24 '18 at 03:58
  • Gladly :-) have a nice evening – Flob Dec 24 '18 at 04:06