0

Keep in mind this is within a loop.

How can I remove everything after the "?", so that "something_else_1" gets deleted?

Url_before = "https:www.something.com?something_else_1"


Url_wanted = "https:www.something.com?"

In practice it looks kinda like this:

find_href = driver.find_elements(By.CSS_SELECTOR, 'img.MosaicAsset-module__thumb___yvFP5')

with open("URLS/text_urls.txt", "a+") as textFile:
    for my_href in find_href:
        textFile.write(str(my_href.get_attribute("src"))+"#do_something_to_remove_part_after_?_in_find_href"+"\n")
AnxiousDino
  • Removing a part of a string is pretty well-documented (https://stackoverflow.com/questions/904746/how-to-remove-all-characters-after-a-specific-character-in-python). What did you try, and why did it fail to meet your requirements? Please read [ask] and the [question checklist](//meta.stackoverflow.com/q/260648/843953), and provide a [mre] – Pranav Hosangadi Jul 27 '22 at 17:15
  • Does this answer your question? [How to remove all characters after a specific character in python?](https://stackoverflow.com/questions/904746/how-to-remove-all-characters-after-a-specific-character-in-python) – Rory Jul 27 '22 at 17:20

2 Answers

4

Provided there's only one instance of "?" in the string and you want to remove everything after it, you could find the index of this character with

i = Url_before.index("?")

and then remove everything after it:

Url_wanted = Url_before[:i+1]
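
Note that `str.index` raises `ValueError` when the string contains no "?". As a minimal sketch of the same idea that tolerates that case, `str.partition` can be used instead; the helper name `keep_through_question_mark` is made up for illustration and is not from the question:

```python
def keep_through_question_mark(url: str) -> str:
    """Return url cut off just after the first "?"; unchanged if there is none."""
    # partition splits on the first "?" only and returns (head, sep, tail).
    # When "?" is absent, sep and tail are empty strings and head is the whole url.
    head, sep, _tail = url.partition("?")
    return head + sep


Url_before = "https:www.something.com?something_else_1"
Url_wanted = keep_through_question_mark(Url_before)
print(Url_wanted)  # https:www.something.com?
```

Unlike the slicing version, this leaves a string without "?" untouched rather than raising.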
Xavi
  • This works even if there are multiple `?`, provided you want to remove everything after the first instance – Pranav Hosangadi Jul 27 '22 at 17:14
  • You are right, yeah. I was saying that more to specify that it's not entirely clear what the OP would want to happen if there were multiple '?'. Maybe using regex *would* be the simpler option then. – Xavi Jul 27 '22 at 17:16
3

Use re:

import re
Url_before = "https://media.gettyimages.com/photos/grilled-halibut-with-spinach-leeks-and-pine-nuts-picture-id503337620?k=20&m=503337620&s=612x612&w=0&h=3G6G_9rzGuNYLOm9EG4yiZkGWNWS7yadVoAen2N80IQ="
re.sub(r'\?.*', '', Url_before) + "?"  # raw string; .* also covers a URL that already ends in "?"
'https://media.gettyimages.com/photos/grilled-halibut-with-spinach-leeks-and-pine-nuts-picture-id503337620?'

Alternatively you could split the string on ? and keep the first part:

Url_before.split("?")[0] + "?" # again adding the question mark
'https://media.gettyimages.com/photos/grilled-halibut-with-spinach-leeks-and-pine-nuts-picture-id503337620?'

EDIT: Added + "?" because I realised you wanted to keep it.
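
Since the question mentions this happens inside a loop, here is a rough sketch of how the split approach might slot into it. The function names `trim_after_question_mark` and `write_trimmed_urls` are hypothetical, and a plain list of strings stands in for the `src` values read from the Selenium elements. The `None` check is there on the assumption that `get_attribute("src")` can return `None` for an element without that attribute.

```python
def trim_after_question_mark(src):
    """Drop everything after the first "?" and keep a single trailing "?"."""
    if src is None:
        # Assumption: an element without a src attribute yields None; skip it.
        return None
    # maxsplit=1 stops after the first "?", so later "?" characters are ignored.
    return src.split("?", 1)[0] + "?"


def write_trimmed_urls(srcs, path):
    """Append one trimmed URL per line to the file at path."""
    with open(path, "a+") as text_file:
        for src in srcs:
            trimmed = trim_after_question_mark(src)
            if trimmed is not None:
                text_file.write(trimmed + "\n")


print(trim_after_question_mark("https:www.something.com?something_else_1"))
# https:www.something.com?
```

In the original loop this would be called as `write_trimmed_urls((el.get_attribute("src") for el in find_href), "URLS/text_urls.txt")`. Like the one-liner above, it appends "?" even when the source URL has none.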

SamR