I have a time series dataset in a .csv file that I want to process with Pandas (using Canopy). The column names from the file are a mix of strings and isotopic numbers.
cycles 40 38.02 35.98 P4
0 1 1.1e-8 4.4e-8 7.7e-8 8.8e-7
1 2 2.2e-8 5.5e-8 8.8e-8 8.7e-7
2 3 3.3e-8 6.6e-8 9.9e-8 8.6e-7
I would like this DataFrame to look like this
cycles 40 38 36 P4
0 1 1.1e-8 4.4e-8 7.7e-8 8.8e-7
1 2 2.2e-8 5.5e-8 8.8e-8 8.7e-7
2 3 3.3e-8 6.6e-8 9.9e-8 8.6e-7
The .csv files won't always have exactly the same column names; they numbers could be slightly different from file to file. To handle this, I've sampled the column names and rounded the values to the nearest integer.This is what my code looks like so far:
import pandas as pd
import numpy as np
df = {'cycles':[1,2,3],'40':[1.1e-8,2.2e-8,3.3e-8],'38.02':[4.4e-8,5.5e-8, 6.6e-8],'35.98':[7.7e-8,8.8e-8,9.9e-8,],'P4':[8.8e-7,8.7e-7,8.6e-7]}
df = pd.DataFrame(df, columns=['cycles', '40', '38.02', '35.98', 'P4'])
colHeaders = df.columns.values.tolist()
colHeaders[1:4] = list(map(float, colHeaders[1:4]))
colHeaders[1:4] = list(map(np.around, colHeaders[1:4]))
colHeaders[1:4] = list(map(int, colHeaders[1:4]))
colHeaders = list(map(str, colHeaders))
I tried df.rename(columns={df.loc[ 1 ]:colHeaders[ 0 ]}, ...), but I get this error:
TypeError: 'Series' objects are mutable, thus they cannot be hashed
I've read this post as well as the pandas 0.17 documentation, but I can't figure out how to use it to selectively replace the column names in a way that doesn't require me to assign new column names manually like this post.
I'm fairly new to Python and I've never posted on StackOverflow before, so any help would be greatly appreciated.