
I have a file:

F1.txt

CDUS,CBSCS,CTRS,CTRS_ID
0,0,0.000000000375,056572
0,0,4.0746,0309044
0,0,0.6182,0971094
0,0,15.4834,075614

I want to put the column names and their dtypes into a dictionary, with each column name as the key and the dtype of that column as the value.

My read statement has to be like this:

csv = pandas.read_csv('F1.txt', dtype={'CTRS_ID': str})

I'm expecting something like this:

data = {'CDUS':'int64','CBSCS':'int64','CTRS':'float64','CTRS_ID':'str'}

Can someone help me with this? Thanks in advance.

Marek

1 Answer


You can use dtypes to find the type of each column and then transform the result to a dictionary with to_dict. Also, if you want a string representation of the type, you can convert the dtypes output to string:

import pandas

csv = pandas.read_csv('F1.txt', dtype={'CTRS_ID': str})
csv.dtypes.astype(str).to_dict()

Which gives the output:

{'CBSCS': 'int64', 'CDUS': 'int64', 'CTRS': 'float64', 'CTRS_ID': 'object'}

This is actually the right result, since pandas stores strings with the object dtype. I don't have enough expertise to elaborate on this, but here are a couple of references:

pandas distinction between str and object types

pandas string data types "pandas doesn't support the internal string types (in fact they are always converted to object)" [from pandas maintainer @Jeff]
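If the dictionary should say 'str' rather than 'object', as in the expected output in the question, one option is to relabel string-like columns after reading. The following is only a minimal sketch: the sample rows are read from an in-memory buffer instead of the file, and `dtype_label` is a hypothetical helper name, not a pandas function. It checks for both the object dtype and a dedicated string dtype, since newer pandas versions may report the latter.

```python
import io

import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype

# Sample rows from the question, held in memory instead of read from disk.
SAMPLE = """CDUS,CBSCS,CTRS,CTRS_ID
0,0,0.000000000375,056572
0,0,4.0746,0309044
0,0,0.6182,0971094
0,0,15.4834,075614
"""


def dtype_label(dtype):
    """Hypothetical helper: report object/string columns as 'str'."""
    if is_object_dtype(dtype) or is_string_dtype(dtype):
        return "str"
    return str(dtype)


df = pd.read_csv(io.StringIO(SAMPLE), dtype={"CTRS_ID": str})

# Build {column name: dtype label} from the per-column dtypes.
data = {col: dtype_label(dt) for col, dt in df.dtypes.items()}
print(data)
```

Note that this labels every object column as 'str', including columns that hold mixed Python objects, so it is only appropriate when the object columns are known to contain text.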

FLab
  • I like this answer - very neat and extremely simple – MaxU - stand with Ukraine May 16 '17 at 11:44
  • It's not working. Please check this code: `csv2 = pandas.read_csv('F1.txt'); dict1 = csv2.dtypes.astype(str).to_dict(); print dict1` – Marek May 16 '17 at 11:46
  • This gives the output: `{'CTRS': 'object', 'CDUS': 'object', 'BOARD_ID': 'object', 'CBSCS': 'object'}` – Marek May 16 '17 at 11:46
  • I am afraid there might be something unexpected in your file. I saved your sample input to a txt file, imported it using pd.read_csv, and obtained the expected output. I would say my solution is working (as it gives a dictionary with the types of the columns), but your data don't have the types you were expecting – FLab May 16 '17 at 11:57
  • OK, I found the issue. I was passing `dtype=str` in my read_csv statement. – Marek May 16 '17 at 12:02
  • Please check my edit in the question. Can you suggest a solution? – Marek May 16 '17 at 12:05
  • Just to be clear: you import CTRS_ID as a string, but then you want to convert it to int? (One effect of this would be losing the leading zero, for example) – FLab May 16 '17 at 12:06
  • Sorry, that was a typo – Marek May 16 '17 at 12:10
  • Also to confirm, if you use my solution now you would get 'object' type for CTRS_ID, is that right? – FLab May 16 '17 at 12:15
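The two points raised in the comments (passing `dtype=str` for the whole file, and the leading zero in CTRS_ID) can be reproduced with a small sketch. It assumes the sample rows from the question are read from an in-memory buffer rather than from the file; the variable names are only illustrative.

```python
import io

import pandas as pd

# Sample rows from the question, held in memory instead of read from disk.
SAMPLE = """CDUS,CBSCS,CTRS,CTRS_ID
0,0,0.000000000375,056572
0,0,4.0746,0309044
0,0,0.6182,0971094
0,0,15.4834,075614
"""

# dtype=str applies to every column, so nothing is inferred as a number.
all_text = pd.read_csv(io.StringIO(SAMPLE), dtype=str)
all_text_types = all_text.dtypes.astype(str).to_dict()

# Restricting dtype to CTRS_ID lets pandas infer the other columns.
per_column = pd.read_csv(io.StringIO(SAMPLE), dtype={"CTRS_ID": str})
per_column_types = per_column.dtypes.astype(str).to_dict()

# With no dtype at all, CTRS_ID is parsed as an integer and loses its leading zero.
inferred = pd.read_csv(io.StringIO(SAMPLE))

print(all_text_types)
print(per_column_types)
print(inferred["CTRS_ID"].iloc[0], per_column["CTRS_ID"].iloc[0])
```

With `dtype=str` every column gets the same text dtype, which matches the all-'object' dictionary reported above. Passing the dtype only for CTRS_ID keeps the numeric columns numeric and preserves the leading zero.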