I want to assign column names to a data set, reading them from a CSV file. My code works perfectly fine as follows:

DF2 = pd.DataFrame(data=DF1,index=SKU, 
   columns=['USER1','USER2','USER3','USER4','USER5','USER6'])

for 6 columns.

I have 50+ columns, and I want to read the column names from a locally stored CSV file named USERID rather than typing out the list. How can I do it?

The following code did not work:

USERID = pd.read_csv("C:\EVALUATE\USERID.csv")
DF2 = pd.DataFrame(data=DF1,index=SKU, columns=USERID)

Any suggestions?

Francesco
Anu

1 Answer

Does the file have to be in CSV format? If not, you could simply feed the column names in from standard input as a stream of whitespace-separated words, splitting the input lines and then chaining the pieces together:

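import fileinput
import itertools

import pandas as pd

# Flatten the whitespace-separated names from all input lines into one list
USERID = list(itertools.chain.from_iterable(line.split() for line in fileinput.input()))
DF2 = pd.DataFrame(data=DF1, index=SKU, columns=USERID)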

Then, given that you have a file USERID.txt which looks like this:

USER1 USER2
USER3 
USER4 USER5
USER6

...you can run, e.g., python DF2.py < USERID.txt in either a POSIX shell or the Windows Command Prompt, and list(USERID) would be ['USER1','USER2','USER3','USER4','USER5','USER6'].
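
As a quick check of the splitting step without involving a shell, the same expression can be run against an in-memory copy of the sample file. This is only a minimal sketch: `sample` is a stand-in for the contents of USERID.txt and `names` is an illustrative variable, neither of which is part of the original script.

```python
import io
import itertools

# Stand-in for the contents of USERID.txt shown above.
sample = io.StringIO("USER1 USER2\nUSER3 \nUSER4 USER5\nUSER6\n")

# Split every line on whitespace and flatten the per-line lists into one list.
names = list(itertools.chain.from_iterable(line.split() for line in sample))
print(names)
```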

The only downside to this is that you can't have column names with whitespace in them, but it would be easy to change the code and the data format to accommodate that requirement.
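
One possible way to make that change, sketched here under the assumption that the names are stored comma-separated (as in a CSV file) rather than whitespace-separated, is to parse the lines with the standard csv module. The helper name `read_column_names` and the sample data are hypothetical and only serve the illustration.

```python
import csv
import io


def read_column_names(stream):
    """Return all non-empty cells from a CSV stream as a flat list of names."""
    names = []
    for row in csv.reader(stream):
        for cell in row:
            cell = cell.strip()
            if cell:  # skip empty cells, e.g. from trailing commas
                names.append(cell)
    return names


# Hypothetical sample data; a name may now contain spaces.
sample = io.StringIO("USER 1,USER 2\nUSER 3,\nUSER 4,USER 5\nUSER 6\n")
column_names = read_column_names(sample)
print(column_names)
```

With a real file, the stream would come from open(path, newline="") instead of io.StringIO, and the resulting list could be passed as the columns argument in the same way as USERID.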

Lastly, if, for some reason, you really don't want to pipe the data from standard input, you can read it directly in Python like so:

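import itertools

import pandas as pd

# Raw string so the backslashes in the Windows path are not treated as escapes
with open(r"C:\EVALUATE\USERID.txt", "r") as USERID_instream:
    USERID = list(itertools.chain.from_iterable(line.split() for line in USERID_instream))
DF2 = pd.DataFrame(data=DF1, index=SKU, columns=USERID)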
Community
errantlinguist