Removing parentheses and comma

Question

I'm importing data from a database into a Python data frame. I want to use the data for further analysis, but I need to clean it up a little first. Currently, the required column is formatted like ('2275.1', '1950.4'). The output I need should contain only the values 2275.1 and 1950.4. Can someone please help?

I'm voting to close this question as off-topic because SO is not a code-writing service; please show your efforts. – EdChum, Jun 26 '15 at 08:16
Hi EdChum, I did try a few approaches, but I wasn't able to solve it. I also searched for existing solutions and didn't find anything, which is why I posted. – user4943236, Jun 27 '15 at 07:37

vks · Answer 1 · 2015-06-26T06:56:39.190

0

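import re

test_str = "('2275.1', '1950.4')"  # sample value from the question
print(re.findall(r"\b\d+(?:\.\d+)?\b", test_str))  # ['2275.1', '1950.4']

You can simply do this, which returns the numbers as strings.

or

print(list(map(float, re.findall(r"\b\d+(?:\.\d+)?\b", test_str))))  # [2275.1, 1950.4]

If you want float values.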
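
To clean a whole column rather than a single string, the same idea can be wrapped in a function. The following is only a sketch: the helper name `extract_numbers` and the sample values are made up for illustration, and the pattern is a slight variant of the one above that also accepts a leading minus sign.

```python
import re

# Hypothetical helper (not part of the original answer).
# Matches integers and decimals, with an optional leading minus sign.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")

def extract_numbers(text):
    """Return every number found in text as a float."""
    return [float(m) for m in NUMBER.findall(text)]

# Made-up sample values standing in for the database column.
column = ["('2275.1', '1950.4')", "('10', '-3.25')"]
cleaned = [extract_numbers(value) for value in column]
print(cleaned)  # [[2275.1, 1950.4], [10.0, -3.25]]
```

If the data frame is a pandas DataFrame (an assumption, since the question does not say), a function like this could be passed to `Series.apply` to process the column row by row.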

edited Jun 26 '15 at 06:56

answered Jun 26 '15 at 06:44

vks

score 0 · Answer 2 · answered Jun 26 '15 at 06:48

0

Try ast.literal_eval, which evaluates its argument as a constant Python expression:

import ast

data = ast.literal_eval("('2275.1', '1950.4')")
# data is now the Python tuple ('2275.1', '1950.4')

x, y = data
# x is '2275.1' and y is '1950.4'
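
Since `ast.literal_eval` raises an exception on input that is not a valid Python literal, it may be worth guarding the call when the column can contain bad rows. A minimal sketch, assuming each cell is a string; the function name `parse_pair` and the choice to return `None` on failure are illustrative, not part of the answer:

```python
import ast

def parse_pair(text):
    """Parse a string such as "('2275.1', '1950.4')" into a tuple of floats.

    Returns None when the text is not a valid literal or an item is not numeric.
    """
    try:
        items = ast.literal_eval(text)
        return tuple(float(item) for item in items)
    except (ValueError, SyntaxError, TypeError):
        return None

print(parse_pair("('2275.1', '1950.4')"))  # (2275.1, 1950.4)
print(parse_pair("N/A"))                   # None
```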

answered Jun 26 '15 at 06:48

nneonneo

score 0 · Answer 3 · answered Jun 26 '15 at 06:57

0

I assume that the string you provided is actually output printed by Python, so it is a tuple containing two strings that represent numbers. If so, and you would like to get rid of the ' characters, you have to convert the strings to a numeric type such as float:

a = ('2275.1', '1950.4')
a = [float(s) for s in a]  # convert each string to a float
print(a)
# [2275.1, 1950.4]
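
When there are many such tuples, the same conversion can be applied to each row and the result transposed into two separate sequences. This is a sketch with made-up rows, assuming every row is already a tuple of numeric strings:

```python
# Made-up rows; each one is already a tuple of numeric strings.
rows = [('2275.1', '1950.4'), ('2300.0', '1975.8')]

# Convert every item to float, then transpose the rows into two columns.
converted = [tuple(float(item) for item in row) for row in rows]
first, second = zip(*converted)

print(first)   # (2275.1, 2300.0)
print(second)  # (1950.4, 1975.8)
```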

answered Jun 26 '15 at 06:57

jhoepken

Thank you for your reply. The original output is – user4943236 Jun 27 '15 at 07:38

Joe T. Boka · Accepted Answer · 2015-06-26T07:21:09.587

0

This is one way to do it:

import re
x = "'('2275.1', '1950.4')'"
y = re.findall(r'\d+\.\d+', x)  # \d+ after the dot keeps every decimal digit
for i in y:
    print(i)

Output:

2275.1
1950.4
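
One thing to watch with this kind of pattern is how many digits it allows after the decimal point, and whether values without a decimal point should match at all. The comparison below uses a made-up input to show the difference; which variant is right depends on what the real column contains.

```python
import re

x = "('2275.15', '1950')"  # made-up value: two decimals and a plain integer

one_digit = re.findall(r"\d+\.\d", x)        # exactly one digit after the dot
all_digits = re.findall(r"\d+\.\d+", x)      # one or more digits after the dot
optional = re.findall(r"\d+(?:\.\d+)?", x)   # decimal part is optional

print(one_digit)   # ['2275.1']
print(all_digits)  # ['2275.15']
print(optional)    # ['2275.15', '1950']
```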

edited Jun 26 '15 at 07:21

answered Jun 26 '15 at 07:05

Joe T. Boka

score 0 · Answer 5 · answered Jun 26 '15 at 11:18

0

Here is a non-regex approach, assuming the value is already a tuple:

data = ('2275.1', '1950.4')

result = data[0]   # index 0 is the first element of the tuple
result2 = data[1]  # index 1 is the second element

print(result)
print(result2)

Output:

2275.1
1950.4
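
Indexing only works this way when the value is a real tuple. If the column holds the text "('2275.1', '1950.4')" instead, index 0 would return the character "(". A non-regex sketch for that case, using only string methods; the function name `split_pair` is made up for illustration:

```python
def split_pair(text):
    """Split a string like "('2275.1', '1950.4')" without using regex."""
    inner = text.strip().strip("()")  # drop the surrounding parentheses
    parts = inner.split(",")          # split on the comma
    # Remove surrounding spaces and quote characters, then convert to float.
    return [float(p.strip().strip("'\"")) for p in parts]

print(split_pair("('2275.1', '1950.4')"))  # [2275.1, 1950.4]
```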

answered Jun 26 '15 at 11:18

Roy Holzem
