I am trying to remove the extra characters from this URL (see the snapshot):

https://test.com/ABC]VR͜

URL Snapshot

but I can't get it into a valid URL format. I tried this code:

    import re

    p = re.compile(r'(?:\w*://)?(?:.*?\.)?(?:([a-zA-Z-1-9]*)\.)?([a-zA-Z-1-9]*\.[a-zA-Z]{1,}).*')

    domain = 'https://test.com/ABC]VR͜'

    print(p.match(domain))

Code Snapshot

but it still gives this output:

https://test.com/ABC]\x1dV\x1dR͜\x1d

whereas I need output like:

https://test.com/ABCVR

So, is there a built-in method in Python to do that? If it has to be done manually, is there a way to do it without regular expressions? And if there is no way without a regular expression, how can it be done with one?

Thank you.

SAM
  • https://stackoverflow.com/questions/1547899/which-characters-make-a-url-invalid This question shows which characters need to be removed, but it does not show any method or programmatic way to remove them. – SAM Jan 28 '21 at 06:01
  • If your goal is simply to replace invalid characters, why not use [`re.sub`](https://docs.python.org/3/library/re.html#re.sub)? – Chase Jan 28 '21 at 06:21
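Following the `re.sub` suggestion in the comment above, here is a minimal sketch of one way this could be done, both with and without a regular expression. It rests on a few assumptions: the allowed set is the RFC 3986 unreserved and reserved characters plus `%`, with `[` and `]` deliberately left out so that the result matches the desired output in the question (brackets are otherwise legal only around IPv6 hosts). The names `strip_invalid_url_chars` and `strip_invalid_url_chars_no_re` are made up for illustration, and the input string is rebuilt from the escaped output shown in the question.

```python
import re
import string

# Assumed whitelist: RFC 3986 unreserved + reserved characters and '%',
# minus '[' and ']' (left out on purpose to match the desired output).
ALLOWED_CHARS = string.ascii_letters + string.digits + "-._~:/?#@!$&'()*+,;=%"

# Regex version: match any single character that is NOT in the whitelist.
_INVALID = re.compile(r"[^A-Za-z0-9\-._~:/?#@!$&'()*+,;=%]")


def strip_invalid_url_chars(url: str) -> str:
    """Remove every character outside the whitelist using re.sub."""
    return _INVALID.sub("", url)


def strip_invalid_url_chars_no_re(url: str) -> str:
    """Same filtering without a regular expression."""
    allowed = set(ALLOWED_CHARS)
    return "".join(ch for ch in url if ch in allowed)


# Input rebuilt from the escaped output in the question:
# control characters (\x1d) and a combining mark (\u035c) mixed into the path.
dirty = "https://test.com/ABC]\x1dV\x1dR\u035c\x1d"

print(strip_invalid_url_chars(dirty))        # https://test.com/ABCVR
print(strip_invalid_url_chars_no_re(dirty))  # https://test.com/ABCVR
```

Note that this only drops characters; it does not check that what remains is a well-formed URL. If brackets or other characters should be kept, the whitelist would need to be adjusted accordingly.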
