I'm importing data from a database into a Python data frame. I want to use the data for further analysis, but I need to clean it up a little first. Currently, the required column is formatted like `('2275.1', '1950.4')`. The output I need is just `2275.1` and `1950.4`.

Can someone please help?

user4943236
- Did you try `str.replace()`? – TigerhawkT3 Jun 26 '15 at 06:44
- I'm voting to close this question as off-topic because SO is not a code-writing service; please show your efforts. – EdChum Jun 26 '15 at 08:16
- Hi EdChum, I did try a few approaches, but I wasn't able to solve it. I also searched for existing solutions and didn't find anything, so I posted. – user4943236 Jun 27 '15 at 07:37
5 Answers
You can simply do this:

    import re
    test_str = "('2275.1', '1950.4')"
    print(re.findall(r"\b\d+(?:\.\d+)?\b", test_str))

or, if you want `float` values:

    print(list(map(float, re.findall(r"\b\d+(?:\.\d+)?\b", test_str))))
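To apply the same pattern to a whole column, one option is to wrap it in a small helper. This is only a sketch: `extract_floats` and the `column` list are hypothetical names standing in for the real data frame column, and the pattern is the one used above, so it does not handle signs or exponents.

```python
import re

# Same pattern as above: an integer part with an optional fractional part.
NUMBER_RE = re.compile(r"\b\d+(?:\.\d+)?\b")

def extract_floats(text):
    """Return every number found in `text` as a float."""
    return [float(match) for match in NUMBER_RE.findall(text)]

# Hypothetical column values, each formatted like the example in the question.
column = ["('2275.1', '1950.4')", "('10', '3.25')"]
cleaned = [extract_floats(value) for value in column]
print(cleaned)
```

Each element of `cleaned` is a list of floats for one row, so the two values can be split into separate columns afterwards.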

vks
Try `ast.literal_eval`, which evaluates its argument as a constant Python expression:

    import ast
    data = ast.literal_eval("('2275.1', '1950.4')")
    # data is now the Python tuple ('2275.1', '1950.4')
    x, y = data
    # x is '2275.1' and y is '1950.4'
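If every value in the column is a string like this, the same call can be applied row by row. The sketch below assumes a hypothetical list `column` in place of the real data frame column and additionally converts each element to `float`. Note that `ast.literal_eval` raises `ValueError` or `SyntaxError` on input that is not a valid Python literal, so malformed rows would need separate handling.

```python
import ast

# Hypothetical column values; the real ones would come from the data frame.
column = ["('2275.1', '1950.4')", "('100.0', '200.5')"]

first_values = []
second_values = []
for raw in column:
    x, y = ast.literal_eval(raw)  # parse the string into a tuple of two strings
    first_values.append(float(x))
    second_values.append(float(y))

print(first_values)
print(second_values)
```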

nneonneo
I assume the string you provided is actually output from Python, so it is a tuple containing two strings that represent numbers. If so, and you would like to get rid of the `'` quotes, you have to convert the strings to a numeric type such as `float`:

    a = ('2275.1', '1950.4')
    a = [float(aI) for aI in a]
    print(a)  # [2275.1, 1950.4]

jhoepken
This is one way to do it:

    import re
    x = "'('2275.1', '1950.4')'"
    y = re.findall(r'\d+\.\d', x)
    for i in y:
        print(i)

Output:

    2275.1
    1950.4
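Note that `\d+\.\d` matches exactly one digit after the decimal point, which is enough for the sample values. A quick comparison, using a hypothetical value with two fractional digits, suggests how `\d+\.\d+` would behave differently:

```python
import re

# Hypothetical value with more than one digit after the decimal point.
x = "('2275.15', '1950.4')"

one_digit = re.findall(r"\d+\.\d", x)    # stops after the first fractional digit
all_digits = re.findall(r"\d+\.\d+", x)  # consumes the whole fractional part

print(one_digit)
print(all_digits)
```

If the real column can contain values with more than one fractional digit, the second pattern is probably the safer choice.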

Joe T. Boka
Here is a non-regex approach:

    data = ('2275.1', '1950.4')
    result = data[0]   # index 0 is the first element of the tuple
    result2 = data[1]  # index 1 is the second element
    print(result)
    print(result2)

Output:

    2275.1
    1950.4

Roy Holzem