0

I’m trying to search a long string of characters for a country name. The country name is sometimes more than one word, such as Costa Rica.

Here is my code:

eol = len(CountryList)
for c in range(0, eol):
   country = str(CountryList[c])
   countrymatch = re.search(country, fullsampledata)
   if countrymatch:
      ...

fullsampledata is a long string with all the data in one line. I’m trying to parse out the country by cycling through a list of valid country names. If the country is only one word, such as ‘Holland’, it finds it. However, if the country is two or more words, such as ‘Costa Rica’, it doesn’t find it. Why?

MissMay
  • Possible duplicate of [Does Python have a string 'contains' substring method?](https://stackoverflow.com/questions/3437059/does-python-have-a-string-contains-substring-method) – zwer Apr 06 '18 at 01:41
  • I cannot replicate your problem. What do CountryList and fullsampledata look like? – Jrod24 Apr 06 '18 at 01:46
  • fullsampledata = "5421/2017 3/01/2017 Green Beans Costa Rica 127 Pleasant View Drive". CountryList is a list of valid countries, so for this example CountryList[27] = "Costa Rica". So they should match up. The problem is that I don't know where in the fullsampledata string the country is going to be. – MissMay Apr 06 '18 at 01:59
  • So I'm running `import re CountryList = ["Holland", "Costa Rica"] fullsampledata = "5421/2017 3/01/2017 Green Beans Costa Rica 127 Pleasant View Drive" eol = len(CountryList) for c in range(0, eol): country = str(CountryList[c]) countrymatch = re.search(country, fullsampledata) if countrymatch: print country` and it seems to be working – Jrod24 Apr 06 '18 at 02:03
  • So I found that the problem is that some of the countries have special characters in them, such as, 'Holland (the Netherlands)' and India/Pakistan. So the search wasn't returning what I was expecting. How would I state my search pattern to include special characters that are in the middle of a string? – MissMay Apr 06 '18 at 03:31
  • Well, that is interesting, but it is the opposite of the problem you asked about above. If you are interested in a solution to the new problem, I suggest re-asking the question or up-voting this response, and I will do my best to provide a solution to your problem as I understand it. – Jrod24 Apr 08 '18 at 03:30
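On the special-character follow-up in the comments: parentheses are regex metacharacters, so a name like 'Holland (the Netherlands)' is read by re.search as a pattern containing a group rather than as literal text. A minimal sketch of one way around this, assuming the names should be matched literally, is to pass each name through re.escape() first. The list contents, the sample string, and the helper name find_countries below are illustrative assumptions, not the asker's real data or code.

```python
import re

# Hypothetical sample data, modeled on the examples given in the comments.
country_list = ["Holland (the Netherlands)", "India/Pakistan", "Costa Rica"]
fullsampledata = (
    "5421/2017 3/01/2017 Green Beans Holland (the Netherlands) "
    "127 Pleasant View Drive"
)


def find_countries(countries, text):
    """Return the countries that occur literally in text (hypothetical helper)."""
    found = []
    for country in countries:
        # re.escape() backslash-escapes regex metacharacters such as ( and ),
        # so the name is matched as literal text rather than as a pattern.
        if re.search(re.escape(country), text):
            found.append(country)
    return found


matches = find_countries(country_list, fullsampledata)
print(matches)
```

If no regex features are needed at all, a plain substring test such as `country in fullsampledata` avoids the escaping question entirely.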

2 Answers

1

You can search for a substring in a string using the string methods .find() and .index(). Both return the position of the first match; .find() returns -1 when there is no match:

fullsampledata = "hwfekfwekjfnkwfehCosta Ricakwjfkwfekfekfw"
fullsampledata.find("Morocco")

-1

fullsampledata.index("Costa Rica")

17

So you can write your if statement as follows:

fullsampledata = "hwfekfwekjfnkwfehCosta Ricakwjfkwfekfekfw"
country = "Costa Rica"
# find() returns -1 when the substring is absent; index() would raise ValueError instead
if fullsampledata.find(country) != -1:
   # Found
   pass
else:
   # Not Found
   pass
JahKnows
  • Awesome! That worked! Thank you. I'll read about the .index feature to understand it. Can you explain why a two word pattern doesn't work? Just so I understand how the search works better. The documentation doesn't really explain much. – MissMay Apr 06 '18 at 02:27
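For anyone reading up on .index after this answer, here is a small sketch of how it differs from .find, reusing the answer's sample string. The one behavioral difference is what happens when the substring is absent: str.find returns -1, while str.index raises ValueError. The helper name locate is a hypothetical wrapper added only for illustration.

```python
text = "hwfekfwekjfnkwfehCosta Ricakwjfkwfekfekfw"


def locate(haystack, needle):
    """Return the position of needle in haystack, or None if it is absent."""
    try:
        # index() raises ValueError when needle does not occur in haystack.
        return haystack.index(needle)
    except ValueError:
        return None


found_at = text.find("Costa Rica")        # position of the first match
missing_find = text.find("Morocco")       # find() signals "absent" with -1
missing_index = locate(text, "Morocco")   # the wrapper turns ValueError into None

print(found_at, missing_find, missing_index)
```

Because of that difference, a comparison against -1 only makes sense with .find; with .index the "not found" case has to be handled with try/except.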
0
In [1]: long_string = 'asdfsadfCosta Ricaasdkj asdfsd asdjas USA alsj'

In [2]: 'Costa Rica' in long_string
Out[2]: True

You don't have your code properly shown and I'm a little too lazy to parse it. Hope this helps.

dbs.83