
I have a text file with many values in scientific notation. However, the exponent is written with D (e.g. 2.0D-05) instead of E (e.g. 2.0E-05).

LATI    LONGI   AREA    CO2
   -0.5548999786D+01        0.3167600060D+02        0.1000000000D+07        0.1375607300D+08
   -0.1823500061D+02        0.3668500137D+02        0.1000000000D+07        0.6878036500D+07
   -0.6650000215D+00        0.2960499954D+02        0.7500000000D+06        0.5086381000D+07
   -0.9671999931D+01        0.2264999962D+02        0.1000000000D+07        0.2657306000D+08
   -0.1321700001D+02        0.4895299911D+02        0.6893938750D+06        0.8595105000D+07
   -0.1152099991D+02        0.2493499947D+02        0.1000000000D+07        0.2615907200D+08

How can I replace all the D's with E's?

Based on another Stack Overflow answer, I wrote the following loop, but it's very slow and there is probably an easier way.

for ind in range(len(df_fires.LATI)):
    val = df_fires.LATI[ind]
    df_fires.LATI[ind] = float(val.replace('D','E'))

    val = df_fires.LONGI[ind]
    df_fires.LONGI[ind] = float(val.replace('D','E'))

Example file: https://www.dropbox.com/s/5glujwqux6d0msh/test.txt?dl=0

SugaKookie

2 Answers


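Try sed to replace all D's with E's in the file. Do this before parsing the file with Python. Note that this replaces every D in the file, including any in column names, and that the output is written with > so that rerunning the command overwrites test_new.txt instead of appending to it.

sed -e 's:D:E:g' test.txt > test_new.txt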

If you want to keep this in python, try this solution https://stackoverflow.com/a/11332274/5196039
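
For a pure-Python route, here is a minimal sketch of one option (not necessarily what the linked answer does): let pandas convert the values while it reads the file. It assumes the file is whitespace-delimited as in the question; the helper name fortran_float is made up for illustration, and the sample text stands in for test.txt.

```python
import io

import pandas as pd

# Two sample rows copied from the question; for the real file, pass its path
# (e.g. "test.txt") to read_csv instead of the StringIO object.
sample = """LATI    LONGI   AREA    CO2
   -0.5548999786D+01        0.3167600060D+02        0.1000000000D+07        0.1375607300D+08
   -0.1823500061D+02        0.3668500137D+02        0.1000000000D+07        0.6878036500D+07
"""


def fortran_float(text):
    """Parse a D-exponent literal such as '0.1D+07' (hypothetical helper name)."""
    return float(text.replace("D", "E"))


cols = ["LATI", "LONGI", "AREA", "CO2"]
df_fires = pd.read_csv(
    io.StringIO(sample),
    sep=r"\s+",  # columns are separated by runs of whitespace
    converters={name: fortran_float for name in cols},
)
print(df_fires.dtypes)
```

Because the converters are applied only to the named columns, header names and any other columns are left untouched.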

swagrov

You can use apply to run your function on every element of a column.

I'm not sure whether it will be faster, since I only have a small dataset, but it is definitely less code:

import pandas as pd

columns = ['LATI', 'LONGI', 'AREA', 'CO2']
data = [
    ['-0.5548999786D+01', '0.3167600060D+02', '0.1000000000D+07', '0.1375607300D+08'],
    ['-0.1823500061D+02', '0.3668500137D+02', '0.1000000000D+07', '0.6878036500D+07'],
    ['-0.6650000215D+00', '0.2960499954D+02', '0.7500000000D+06', '0.5086381000D+07'],
    ['-0.9671999931D+01', '0.2264999962D+02', '0.1000000000D+07', '0.2657306000D+08'],
    ['-0.1321700001D+02', '0.4895299911D+02', '0.6893938750D+06', '0.8595105000D+07'],
    ['-0.1152099991D+02', '0.2493499947D+02', '0.1000000000D+07', '0.2615907200D+08'],
]

df = pd.DataFrame(columns=columns, data=data)
for column_name in columns:
    df[column_name] = df[column_name].apply(lambda x: x.replace('D', 'E'))

Output from df:

                LATI        ...                      CO2
0  -0.5548999786E+01        ...         0.1375607300E+08
1  -0.1823500061E+02        ...         0.6878036500E+07
2  -0.6650000215E+00        ...         0.5086381000E+07
3  -0.9671999931E+01        ...         0.2657306000E+08
4  -0.1321700001E+02        ...         0.8595105000E+07
5  -0.1152099991E+02        ...         0.2615907200E+08
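
Note that the values above are still strings. If floats are wanted, as in the question's loop, a sketch along these lines may help; it swaps the element-wise lambda for the vectorized .str.replace followed by astype(float), and assumes every column holds strings in the D format. The two-column frame is a cut-down stand-in for the data above.

```python
import pandas as pd

# Cut-down stand-in for the question's data: every column holds D-format strings.
df = pd.DataFrame({
    "LATI": ["-0.5548999786D+01", "-0.1823500061D+02"],
    "LONGI": ["0.3167600060D+02", "0.3668500137D+02"],
})

# Replace D with E across each whole column, then cast the column to float.
df = df.apply(lambda col: col.str.replace("D", "E", regex=False).astype(float))
print(df.dtypes)
```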
cullzie
  • I tried that, but I get `AttributeError: ("'int' object has no attribute 'replace'", 'occurred at index DAY')` at the `updated_df` line. – SugaKookie Feb 26 '19 at 01:37
  • @shizishan Can you give us the data for the original dataframe in the question? – cullzie Feb 26 '19 at 01:38
  • I added a link in the question. https://www.dropbox.com/s/5glujwqux6d0msh/test.txt?dl=0 – SugaKookie Feb 26 '19 at 01:42
  • I've updated my answer to change only the columns shown in the question. Your dataset probably has columns with int values, which causes the error you see, since ints don't have a replace method. You can use apply on just the list of columns you want to convert. – cullzie Feb 26 '19 at 01:44
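
To illustrate the point in the comments, here is a rough sketch that skips non-string columns so the AttributeError on ints does not come up. The DAY values are invented for the example (only the column name appears in the error message), and picking columns with pd.api.types.is_string_dtype is just one way to do it; an explicit list of column names works as well.

```python
import pandas as pd

# DAY is a hypothetical integer column; the other values come from the question.
df = pd.DataFrame({
    "DAY": [1, 2],
    "LATI": ["-0.5548999786D+01", "-0.1823500061D+02"],
    "CO2": ["0.1375607300D+08", "0.6878036500D+07"],
})

# Keep only the columns that hold strings, so integer columns are never touched.
string_cols = [c for c in df.columns if pd.api.types.is_string_dtype(df[c])]

for c in string_cols:
    df[c] = df[c].str.replace("D", "E", regex=False).astype(float)

print(df.dtypes)
```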