I have to do some data analysis on a data set, which is handed to me as a .json file. I am using pandas to handle the data.

"columns":["id","timestamp","offset_freq","reprate_freq"]
"index":[0,1,2,3,4 ... ]

I want to be able to handle the data as regular vectors like so.

 id = [ ... ]
 timestamp = [ ... ]
 offset_freq = [ ... ]
 reprate_freq = [ ... ]

But if I import my .json file with:

data=pd.read_json("comb_201601.json", orient='split')

And print one of the columns I get:

print(data['offset_freq'])
0         20000000.495
1         19999998.966
2         20000000.910
3         19999998.539
4         20000000.887
5         19999999.694
              ...
680204    20000000.024
Name: offset_freq, dtype: float64

What can I do to just get

print(data['offset_freq'])
20000000.495
19999998.966
20000000.910
19999998.539
20000000.887
19999999.694
     ...
20000000.024
Nillo
  • Possible duplicate of [how to print dataframe without index](http://stackoverflow.com/questions/24644656/how-to-print-dataframe-without-index) – OneCricketeer Mar 01 '16 at 21:08
  • It is not because I want it to print it differently. I want a new vector which looks like what I printed the second time. – Nillo Mar 01 '16 at 21:10
  • That's an automated comment. Anyways, the answer there looks like exactly what you want to do... – OneCricketeer Mar 01 '16 at 21:11
  • Please try `print(data['offset_freq'].to_string(index=False))` – OneCricketeer Mar 01 '16 at 21:13
  • Then you are misunderstanding. The printing is not important. I really just want to isolate the data into a new vector which I can work with, containing only the data and not the index and name. – Nillo Mar 01 '16 at 21:15
  • Btw print(data['offset_freq'].to_string(index=False)) just gives me the error: TypeError: to_string() got an unexpected keyword argument 'index' – Nillo Mar 01 '16 at 21:16
  • Weird.. [the documentation](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_string.html) says that's a valid argument. And you can freely work with that column as-is. The indices and the name are just metadata that aren't part of the column data. – OneCricketeer Mar 01 '16 at 21:26
  • My bad, that's a `Series`, not a `DataFrame` – OneCricketeer Mar 01 '16 at 21:31
  • You could turn your series into a numpy array without index with data['offset_freq'].values – Benjamin Mar 01 '16 at 21:36

1 Answer

data['offset_freq'] returns a pandas Series, which carries an index and a name alongside the values.

If you would like to convert that to a Python list, you can follow this example.

>>> import pandas as pd
>>> s = pd.Series([4, 5, 6])
>>> s
0    4
1    5
2    6
dtype: int64
>>> s.tolist()
[4, 5, 6]

In your case, you'll want

data['offset_freq'].tolist() for a Python list

or

data['offset_freq'].values for a NumPy array.
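
To put the two options side by side, here is a minimal sketch. The small DataFrame is a hypothetical stand-in for the one loaded from comb_201601.json (only three values, copied from the output printed in the question), and the names offset_list and offset_array are just illustrative. It also uses to_numpy(), which I'm assuming is available (pandas 0.24 or later); on older versions .values gives the same array.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame loaded from comb_201601.json;
# the three values are copied from the output shown in the question.
data = pd.DataFrame({
    "offset_freq": [20000000.495, 19999998.966, 20000000.910],
})

column = data["offset_freq"]       # pandas Series: values plus index and name

offset_list = column.tolist()      # plain Python list of floats
offset_array = column.to_numpy()   # NumPy array; column.values also works

print(offset_list)
print(type(offset_array), offset_array.shape)
```

Neither result carries the index or the name, so either can be used as a plain vector. The NumPy array is probably the better fit for numerical work on roughly 680,000 samples.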

OneCricketeer