I'm trying to sanitise a string of URLs and extract only the IDs from it (dropping the URL part). While debugging, I can see that the value I'm looking for is printed, but it is not returned (the function returns None).
This is the code:
def _sanitize_urls(urls=None):
    redact_list = [
        ("abcd.google.com", 3),
        ("xyz.yahoo.com", 4),
    ]
    urls_sanitized = []
    redact_found = [redact for redact in redact_list if redact[0] in urls]
    if redact_found:
        urls = urls.split(" ")
        print(urls)
        urls_sanitized = [
            words.split("/")[redact_found[0][1]] if redact_found[0][0] in words else words for words in urls
        ]
        print(urls_sanitized)
        urls_sanitized = " ".join(urls_sanitized)
        print(urls_sanitized)
    redact_found = [redact for redact in redact_list if redact[0] in urls_sanitized]
    print(redact_found)
    if not redact_found:
        print(urls_sanitized)
        return urls_sanitized
    else:
        _sanitize_urls(urls_sanitized)


def main():
    urls = "https://abcd.google.com/ID-XXXX and https://xyz.yahoo.com/Id/ID-XXXX"
    redact_exists = _sanitize_urls(urls)
    print(redact_exists)


if __name__ == "__main__":
    main()
The output I'm expecting is "ID-XXXX and ID-XXXX". The output I'm getting right now is None.
Here is the debug output from my side:
['https://abcd.google.com/ID-XXXX', 'and', 'https://xyz.yahoo.com/Id/ID-XXXX']
['ID-XXXX', 'and', 'https://xyz.yahoo.com/Id/ID-XXXX']
ID-XXXX and https://xyz.yahoo.com/Id/ID-XXXX
[('xyz.yahoo.com', 4)]
['ID-XXXX', 'and', 'https://xyz.yahoo.com/Id/ID-XXXX']
['ID-XXXX', 'and', 'ID-XXXX']
ID-XXXX and ID-XXXX
[]
ID-XXXX and ID-XXXX
None
As you can see, print shows the correct value right up to the last step, but that value is not returned to main; None is returned instead. Any ideas?
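One likely cause, for what it's worth: the else branch calls _sanitize_urls(urls_sanitized) recursively but does not return its result, so the outer call reaches the end of the function and Python hands back None. The sketch below is only an illustration under that assumption. The name sanitize_urls_fixed and the module-level REDACT_LIST are hypothetical, the debug prints are dropped, and the host/index pairs are copied from the code above.

```python
# Hypothetical variant of _sanitize_urls; the name and layout are assumptions,
# only the host/index pairs come from the original post.
REDACT_LIST = [
    ("abcd.google.com", 3),
    ("xyz.yahoo.com", 4),
]


def sanitize_urls_fixed(urls):
    redact_found = [redact for redact in REDACT_LIST if redact[0] in urls]
    if not redact_found:
        # Base case: no known host is left in the string, so return it as-is.
        return urls
    host, index = redact_found[0]
    words = [
        word.split("/")[index] if host in word else word
        for word in urls.split(" ")
    ]
    # The key change: return the result of the recursive call. Without this
    # return, the outer call falls off the end of the function and yields None.
    return sanitize_urls_fixed(" ".join(words))


if __name__ == "__main__":
    text = "https://abcd.google.com/ID-XXXX and https://xyz.yahoo.com/Id/ID-XXXX"
    print(sanitize_urls_fixed(text))  # prints: ID-XXXX and ID-XXXX
```

In the original function the equivalent change would be writing return _sanitize_urls(urls_sanitized) in the else branch. Note that this sketch still assumes every matching URL has enough "/"-separated parts for the configured index; a shorter URL would raise an IndexError.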