I am reading a .tsv in pandas with the following commands:
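import pandas as pd

Gene_Data = pd.read_csv(
    genes_input_path,
    sep='\t+',          # regex separator: one or more tabs (needs the python engine)
    header=None,
    engine='python',
    names=col_names,
    usecols=col_to_use,
    comment='#',        # skip header/comment lines starting with '#'
)
Gene_Data.Stop = Gene_Data.Stop.astype('float32')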
The input data looks something like this:
chr7 HAVANA gene 117287120 117715971 . + . ID=ENSG00000001626.16;gene_id=ENSG00000001626.16;gene_type=protein_coding;gene_name=CFTR;level=1;hgnc_id=HGN
The Stop column is the column at index 4 (0-based). When I perform the astype conversion to float32, the Stop value changes to 117715968 instead of staying at the actual value of 117715971.
When I disable the type conversion, the column stays int64 and the value is correct. I don't understand why the conversion changes the underlying value. Does anyone have any thoughts?
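To rule out read_csv and my file, here is a minimal sketch that reproduces the change with only the standard library. It assumes pandas' float32 is the usual 4-byte IEEE 754 single, which is what struct's "f" format packs; the variable names are just for illustration.

```python
import struct

stop = 117715971  # Stop value from the sample row above

# Round trip through a 4-byte IEEE 754 single (assumed equivalent to float32)
as_float32 = struct.unpack("<f", struct.pack("<f", float(stop)))[0]

# Same round trip through an 8-byte double (float64) for comparison
as_float64 = struct.unpack("<d", struct.pack("<d", float(stop)))[0]

print(int(as_float32))  # 117715968
print(int(as_float64))  # 117715971
```

The 4-byte round trip gives the same 117715968 I see in the DataFrame, while the 8-byte one keeps 117715971, so the change seems to come from the float32 format itself rather than from pandas or the parsing step.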