I have a problem. My CSV file does not use "," as the delimiter; it is laid out like an ordinary Excel export, with whitespace between the columns:

# 2016-01-01: Prices/Volumes for Market                 
23-24   24,57
22-23   30,1
21-22   29,52
20-21   33,07
19-20   35,34
18-19   37,41

I am only interested in reading the second column, e.g. 24,57 in the first data line. The data has no header. How should I proceed?

pd.read_csv(f,usecols = [2])

This does not work, I think because no columns are identified. Thanks for your help!

inneb

2 Answers

Try this:

pd.read_csv(f, delim_whitespace=True, names=['desired_col_name'], usecols=[1])

Alternatively, you might want to use pd.read_fwf.
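For what it's worth, here is a minimal sketch of the read_csv approach against the sample from the question. It assumes the "# 2016-01-01: ..." line really is part of the file and that the comma is a decimal mark; the column name "price" and the io.StringIO stand-in for the file are only for illustration. Newer pandas releases deprecate delim_whitespace in favour of sep=r"\s+", so the sketch uses the latter.

```python
import io

import pandas as pd

# Sample copied from the question; io.StringIO stands in for the real file.
raw = """# 2016-01-01: Prices/Volumes for Market
23-24   24,57
22-23   30,1
21-22   29,52
"""

prices = pd.read_csv(
    io.StringIO(raw),
    sep=r"\s+",       # any run of whitespace separates the fields
    comment="#",      # skip the leading "# 2016-01-01: ..." line
    header=None,      # the data has no header row
    usecols=[1],      # second column only (indices are 0-based)
    names=["price"],  # hypothetical column name
    decimal=",",      # parse "24,57" as the float 24.57
)

print(prices)
```

Note that usecols is 0-based, so the second column is index 1, not 2.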

MaxU - stand with Ukraine
  • Hey, I figured out it works with sep=None and engine='python' as arguments. Thanks for your help! :) – inneb Aug 14 '17 at 15:19

Maybe it is not suitable to read it as CSV.

Try using a regular expression and processing the file line by line:

https://docs.python.org/2/library/re.html

For example:

>>> import re
>>> # Raw string; \s+ and \d+ also cover rows such as "22-23   30,1"
>>> pattern = r'(\d{2})-(\d{2})\s+(\d+),(\d+)'
>>> re.search(pattern, "23-24   24,57").group(1)
'23'
>>> re.search(pattern, "23-24   24,57").group(2)
'24'
>>> re.search(pattern, "23-24   24,57").group(3)
'24'
>>> re.search(pattern, "23-24   24,57").group(4)
'57'

To read a file line by line in Python, see the question "How to read large file, line by line in python".
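Putting the two pieces together, a rough sketch of the line-by-line approach could look like this. The function name read_prices and the in-memory sample are made up for illustration, and the pattern assumes every data row looks like "HH-HH", whitespace, then a number with a decimal comma.

```python
import io
import re

# "HH-HH", whitespace, then a number written with a decimal comma.
LINE_RE = re.compile(r"^(\d{2})-(\d{2})\s+(\d+),(\d+)$")


def read_prices(lines):
    """Return the second column as floats, skipping lines that do not match."""
    prices = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # e.g. the "# 2016-01-01: ..." line or blank lines
        whole, frac = match.group(3), match.group(4)
        prices.append(float(f"{whole}.{frac}"))
    return prices


# io.StringIO stands in for an open file; both yield one line at a time.
sample = io.StringIO(
    "# 2016-01-01: Prices/Volumes for Market\n"
    "23-24   24,57\n"
    "22-23   30,1\n"
)
print(read_prices(sample))
```

With a real file you would pass the file object instead, e.g. inside a `with open(path) as fh:` block, since iterating over it also yields one line at a time.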

kkpoon