0

I have the following list:

[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06
  1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]

When I return its type, I get:

<class 'str'>

Is that because of the scientific notation (e.g. e-04)?

In that case, how can I convert the above to a list of floats?

Thanks.

EDIT

The above list snippet comes from this CSV file under the "Feature" column.

Simplicity
  • What you posted is neither a string nor syntactically valid. Please provide a [mcve]. My guess is that the above is surrounded by quote marks that you did not include, but in that case there is no mystery as to why it is a string. – John Coleman Nov 30 '21 at 15:40
  • I would change your title to say "Converting scientific notation to a float". – Gabriel G. Nov 30 '21 at 15:40
  • Does this answer your question? [Convert Scientific Notation to Float](https://stackoverflow.com/questions/25099626/convert-scientific-notation-to-float) – nikeros Nov 30 '21 at 15:43
  • I added the CSV file where the data is coming from to my question. – Simplicity Nov 30 '21 at 16:35
  • Does this answer your question? [Parsing the string-representation of a numpy array](https://stackoverflow.com/questions/43879345/parsing-the-string-representation-of-a-numpy-array) – Sam Mason Nov 30 '21 at 18:10

3 Answers

1

It looks like you have NumPy's string representation of an array. As the question I linked in the comments shows, there doesn't seem to be a clean way to parse this back in general, but in your case that may not matter. Pandas and NumPy can get there fairly easily:

import pandas as pd
import numpy as np

# read in the data
df = pd.read_csv("features_thresholds.csv")

# use numpy to parse that column
df.Feature = df.Feature.apply(lambda x: np.fromstring(x[2:-2], sep=' '))

Note that x[2:-2] trims off the leading [[ and trailing ]]. Otherwise this is standard Pandas usage that most data science tutorials cover.

Sam Mason
  • If you can, I'd suggest asking whoever gave you that file to provide it in some other format that's easier to parse, e.g. a JSON dump of the array would likely be more portable than what you've got. – Sam Mason Nov 30 '21 at 18:26
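To illustrate the JSON suggestion, here is a minimal sketch of the round trip using only the standard library. The values are the first three from the question; how the producer of the file would actually export the data is an assumption.

```python
import json

# First three values from the question, as a nested list of floats.
features = [[1.01782362e-05, 1.93798303e-04, 7.96163586e-05]]

# What the producer of the file could write out instead of the NumPy repr.
encoded = json.dumps(features)

# What the consumer reads back: nested lists of real floats, no string surgery.
decoded = json.loads(encoded)

print(type(decoded), type(decoded[0][0]))
```

Because JSON has a defined number syntax (scientific notation included), the parsed values come back as floats directly.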
0

What you posted must be part of a string literal:

s = '[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06 1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]'

In which case

list(map(float, s.lstrip('[').rstrip(']').split()))

evaluates to

[1.01782362e-05, 0.000193798303, 7.96163586e-05, 5.08812627e-06, 1.39600188e-05, 0.000394912873, 0.000233748418, 1.22856018e-05]
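The same idea also seems to work on the value exactly as shown in the question, line break included, because split() with no arguments splits on any run of whitespace. A small sketch, where parse_row is a hypothetical helper name and not something from the question:

```python
# The cell as it appears in the question, including the line break.
s = """[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06
  1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]"""


def parse_row(text):
    """Hypothetical helper: drop the brackets, split on whitespace, convert."""
    # strip("[]") removes any run of '[' or ']' from both ends.
    return [float(tok) for tok in text.strip().strip("[]").split()]


values = parse_row(s)
print(len(values), type(values[0]))
```

float() accepts scientific notation such as 1.93798303e-04 directly, so the notation itself is not what makes the value a string.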
John Coleman
0

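We can use Python's ast (Abstract Syntax Trees) module to parse it. Note that replacing each space with a comma only works when the values are separated by single spaces, and that the result is a nested list: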

import ast
x = '[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06 1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]'
x = ast.literal_eval(x.replace(" ",","))
print(x)
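If the string still contains the line break and extra indentation from the question, a plain replace(" ", ",") would produce invalid syntax. One possible workaround, sketched below, inserts commas only between adjacent numbers using a regular expression. The helper name parse_numpy_like is hypothetical, and the sketch assumes a single row (it does not add commas between "]" and "[").

```python
import ast
import re

x = """[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06
  1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]"""


def parse_numpy_like(text):
    """Hypothetical helper: turn a one-row NumPy-style repr into nested lists."""
    # Replace whitespace that sits between the end of one number and the
    # start of the next with a comma; whitespace next to brackets is untouched.
    with_commas = re.sub(r"(?<=[0-9.])\s+(?=[-+0-9.])", ", ", text.strip())
    # literal_eval only evaluates Python literals, so it is safe on untrusted text.
    return ast.literal_eval(with_commas)


result = parse_numpy_like(x)
print(result)
```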