I'm trying to read the server states from the Guild Wars 2 API. To do that I match the server name, which is followed by an optional language specifier and a `",\n` that I intend to match with `.*`, and after that comes the population. But instead of stopping at the first occurrence of "population", the match runs up to the last one. Can someone tell me why (and how to fix this)?

Edit: I found a workaround: replacing .* with .{,20} works.

Relevant part of the API response:
"name": "Riverside [DE]",
"population": "Full"


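import re
from urllib.request import urlopen

with urlopen('https://api.guildwars2.com/v2/worlds?ids=all') as api:
    s = api.read()
    s = s.decode('utf-8')
    # Intended: skip from the server name to the "population" key that follows it
    search = re.search(r'''Riverside.*"population": "''', s, re.S)
    print(search)
    # Cut off everything up to the end of the match, then read the state word
    s = s[search.span()[1]:]
    state = re.search(r'[a-zA-Z]*', s)
    print(state)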
Eumel

1 Answer

There are two options:

  1. Use .*? (with a trailing question mark), which makes the quantifier lazy so the match stops at the first occurrence. I don't think this is the better solution, though.
  2. Instead, once you have the data, parse it as JSON and do your manipulation on the resulting objects:
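import json
from urllib.request import urlopen

with urlopen('https://api.guildwars2.com/v2/worlds?ids=all') as api:
    s = api.read()
    s = s.decode('utf-8')
    jsondata = json.loads(s)
    # filter() returns a lazy iterator in Python 3 and cannot be indexed,
    # so build a list of the matching worlds instead
    filtered_data = [a for a in jsondata if "Riverside" in str(a["name"])]
    print(filtered_data[0]["population"])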
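
To try the JSON approach without hitting the network, here is a minimal sketch. It assumes the response is a JSON array of objects with "name" and "population" keys, as the excerpt in the question suggests. The SAMPLE string and the population_of helper are hypothetical names introduced only for this illustration, and the second sample entry is made up.

```python
import json

# Hand-written stand-in for the API response (an assumption, not real data):
# only the Riverside entry mirrors the excerpt in the question.
SAMPLE = '''[
  {"name": "Riverside [DE]", "population": "Full"},
  {"name": "Example World", "population": "High"}
]'''


def population_of(worlds, name_fragment):
    """Return the population of the first world whose name contains
    name_fragment, or None if no world matches (hypothetical helper)."""
    return next(
        (w["population"] for w in worlds if name_fragment in w["name"]),
        None,
    )


worlds = json.loads(SAMPLE)
print(population_of(worlds, "Riverside"))
print(population_of(worlds, "Nowhere"))
```

Using next() with a default avoids the IndexError that indexing [0] would raise when no world matches.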
  • Why does the ? lead to stopping at the first instance? – Eumel Sep 26 '17 at 16:31
  • Because that makes it a lazy quantifier, which means it matches the shortest possible string. – chakradhar kasturi Sep 26 '17 at 16:45
  • Oh, and here I thought that would be the default, because no one would ever want the longest one... – Eumel Sep 26 '17 at 16:47
  • @Eumel [Greediness](https://stackoverflow.com/questions/2301285/what-do-lazy-and-greedy-mean-in-the-context-of-regular-expressions) is often about performance. If you're trying to match `a*b`, you'll match `aaaaaaaaaaaaaaaaaaaaaaaaaaaab` faster if `*` is greedy. In cases where you don't care *what* matches as long as it *does* match, you end up wanting greedy quantifiers a lot. – trent Sep 27 '17 at 14:10
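
The difference between the greedy pattern from the question, the lazy variant from the answer, and the bounded workaround from the question's edit can be sketched on a small sample. The text below is a hand-written stand-in for the response (the second entry is made up), and population_after is a hypothetical helper, not part of any API.

```python
import re

# Hand-written stand-in for the response text (an assumption);
# only the first two lines mirror the excerpt in the question.
text = '''"name": "Riverside [DE]",
"population": "Full"
"name": "Example World",
"population": "High"
'''


def population_after(pattern, s):
    """Match pattern with re.S, then return the run of letters that
    directly follows the end of the match (hypothetical helper)."""
    m = re.search(pattern, s, re.S)
    return re.match(r'[a-zA-Z]*', s[m.end():]).group()


# Greedy: .* grabs as much as possible, so it backs off only to the LAST key.
greedy = population_after(r'Riverside.*"population": "', text)
# Lazy: .*? grabs as little as possible, so it stops at the FIRST key.
lazy = population_after(r'Riverside.*?"population": "', text)
# Bounded: at most 20 characters, which cannot reach the second key here.
bounded = population_after(r'Riverside.{,20}"population": "', text)

print(greedy)
print(lazy)
print(bounded)
```

On this sample the greedy pattern ends up at the last entry, while the lazy and bounded patterns both stay on the Riverside entry. The bounded form only works as long as the gap between the name and the key stays within 20 characters.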