-1

I have a string:

ip = 'MULTIPOLYGON (((1790.0 15563.0, 1791.0 15553.0, 1790.0 15551.0, 1789.0 15549.0, 1791.0 15547.0)), ((1752.0 15451.0, 1753.0 15449.0, 1762.0 15449.0, 1764.0 15451.0, 1761.0 15451.0, 1758.0 15454.0, 1756.0 15453.0)))'

I would like to convert it to the following format, but with the values as int:

[((1790.0 , 15563.0) , (1791.0 , 15553.0) , (1790.0 , 15551.0) , (1789.0 , 15549.0) , (1791.0 , 15547.0)) ,((1752.0 , 15451.0) , (1753.0 , 15449.0) , (1762.0 , 15449.0) , (1764.0 , 15451.0) , (1761.0 , 15451.0) , (1758.0 , 15454.0) , (1756.0 , 15453.0))]

What I have tried so far:

ip = 'MULTIPOLYGON (((1790.0 15563.0, 1791.0 15553.0, 1790.0 15551.0, 1789.0 15549.0, 1791.0 15547.0)), ((1752.0 15451.0, 1753.0 15449.0, 1762.0 15449.0, 1764.0 15451.0, 1761.0 15451.0, 1758.0 15454.0, 1756.0 15453.0)))'

st = ip[15:-2]
#s = st.split(',')
x = st.replace(", ", ") (")
res=[x.replace(" ", " , ")]
list(map(int, res))

This produces the above format as a string, not as ints. When I try to convert it, I get the following error: ValueError: invalid literal for int() with base 10. How do I solve it?

disukumo

3 Answers

1

Try a regular expression:

import re

ip = 'MULTIPOLYGON (((1790.0 15563.0, 1791.0 15553.0, 1790.0 15551.0, 1789.0 15549.0, 1791.0 15547.0)), ((1752.0 15451.0, 1753.0 15449.0, 1762.0 15449.0, 1764.0 15451.0, 1761.0 15451.0, 1758.0 15454.0, 1756.0 15453.0)))'

output = [tuple(map(lambda x: int(float(x)),i.split())) for i in re.findall(r'(\d[\d.\s]+\d)', ip)]

print(output)
[(1790, 15563), (1791, 15553), (1790, 15551), (1789, 15549), (1791, 15547), (1752, 15451), (1753, 15449), (1762, 15449), (1764, 15451), (1761, 15451), (1758, 15454), (1756, 15453)]
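
Note that this output is one flat list, so the boundary between the two polygons is lost. If the grouping shown in the question is needed, one possible variation is to capture each innermost parenthesised run first and then split it into points. This is only a sketch: the helper name parse_rings is made up for illustration, and it assumes every ring is a plain "x y, x y, ..." list with no nested parentheses inside it.

```python
import re

ip = 'MULTIPOLYGON (((1790.0 15563.0, 1791.0 15553.0, 1790.0 15551.0, 1789.0 15549.0, 1791.0 15547.0)), ((1752.0 15451.0, 1753.0 15449.0, 1762.0 15449.0, 1764.0 15451.0, 1761.0 15451.0, 1758.0 15454.0, 1756.0 15453.0)))'

def parse_rings(wkt):
    """Hypothetical helper: return one tuple of (x, y) int pairs per ring."""
    rings = []
    # Each innermost "(...)" run holds the comma-separated points of one ring.
    for ring in re.findall(r'\(([^()]+)\)', wkt):
        points = tuple(
            tuple(int(float(n)) for n in point.split())
            for point in ring.split(',')
        )
        rings.append(points)
    return rings

rings = parse_rings(ip)
print(rings)
```

For polygons that contain holes, this would return every ring as its own entry rather than nesting holes under their polygon, so it may need adjusting for such input.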
Epsi95
1

This might help you:

from re import compile

# Raw string avoids invalid-escape warnings; each group captures one coordinate.
pattern = compile(r"(\d+\.\d+) (\d+\.\d+)")
ip = 'MULTIPOLYGON (((1790.0 15563.0, 1791.0 15553.0, 1790.0 15551.0, 1789.0 15549.0, 1791.0 15547.0)), ((1752.0 15451.0, 1753.0 15449.0, 1762.0 15449.0, 1764.0 15451.0, 1761.0 15451.0, 1758.0 15454.0, 1756.0 15453.0)))'
st = ip[15:-2]
print([tuple(float(y) for y in x) for x in pattern.findall(st)])

which gives the result:

[(1790.0, 15563.0), (1791.0, 15553.0), (1790.0, 15551.0), (1789.0, 15549.0), (1791.0, 15547.0), (1752.0, 15451.0), (1753.0, 15449.0), (1762.0, 15449.0), (1764.0, 15451.0), (1761.0, 15451.0), (1758.0, 15454.0), (1756.0, 15453.0)]
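
The result above contains floats, while the question asks for ints. int() rejects strings such as '1790.0' with the same kind of ValueError quoted in the question, so the value has to go through float() first. A minimal sketch of that conversion, using a shortened sample of the input and an illustrative helper name (to_int is not part of the code above):

```python
from re import compile

pattern = compile(r"(\d+\.\d+) (\d+\.\d+)")
sample = "1790.0 15563.0, 1791.0 15553.0"  # shortened sample of the input

def to_int(text):
    """Illustrative helper: '1790.0' -> 1790 (int('1790.0') would raise ValueError)."""
    return int(float(text))

pairs = [tuple(to_int(y) for y in x) for x in pattern.findall(sample)]
print(pairs)  # [(1790, 15563), (1791, 15553)]
```

Keep in mind that int() truncates toward zero, so this is only lossless when the fractional part is .0, as it is in the sample data.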

Jonathan1609
1

This is not pretty, but it will do what you want, and might be faster than the regex equivalent:

[tuple(tuple(int(float(n))
             for n in pair.split())
       for pair in t.split(', '))
 for t in ip[16:-3].split(')), ((')]

If you have Python 3.9+, you can replace ip[16:-3] with the more intuitive ip.removeprefix('MULTIPOLYGON (((').removesuffix(')))').
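
Put together with the prefix and suffix stripping mentioned above, a runnable sketch might read as follows. It assumes Python 3.9+ and that the string always starts with 'MULTIPOLYGON (((' and ends with ')))'; the names body and result are only for illustration.

```python
ip = 'MULTIPOLYGON (((1790.0 15563.0, 1791.0 15553.0, 1790.0 15551.0, 1789.0 15549.0, 1791.0 15547.0)), ((1752.0 15451.0, 1753.0 15449.0, 1762.0 15449.0, 1764.0 15451.0, 1761.0 15451.0, 1758.0 15454.0, 1756.0 15453.0)))'

# Requires Python 3.9+ for str.removeprefix / str.removesuffix.
body = ip.removeprefix('MULTIPOLYGON (((').removesuffix(')))')

result = [
    tuple(
        # "1790.0 15563.0" -> (1790, 15563)
        tuple(int(float(n)) for n in pair.split())
        for pair in ring.split(', ')
    )
    # ")), ((" separates one polygon from the next.
    for ring in body.split(')), ((')
]
print(result)
```

If the prefix or suffix is absent, removeprefix and removesuffix return the string unchanged, so malformed input would surface as a ValueError from float() rather than being silently mis-sliced.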

McSinyx