I have the following URL stored in a MySQL database for a PHP application. Part of our system allows a user to edit a previous post containing these links and save it again. However, the URL gets encoded a second time when the user saves the edit, which breaks the URL as shown below.

Is there an easy way, or an existing PHP function, to determine whether the string has already been encoded, and to remove the unwanted characters so that it matches the expected output below?

Expected output URL: https://r5uy4lmtdqka6a1rzyexlusfl-902rjcrzfe6k93co7a644-tom.s3.eu-west-2.amazonaws.com/Carbon%20Monoxide/Summer%20CO%20Campaign/CO%20Summer%202022/CO%20Summer%20you%20can%20smell%20the%20BBQ%20-%20600x600.jpg

Actual output URL: https://r5uy4lmtdqka6a1rzyexlusfl-902rjcrzfe6k93co7a644-tom.s3.eu-west-2.amazonaws.com/Carbon%2520Monoxide/Summer%2520CO%2520Campaign/CO%2520Summer%25202022/CO%2520Summer%2520you%2520can%2520smell%2520the%2520BBQ%2520-%2520600x600.jpg

Your Common Sense
Zabs
  • any reason you're encoding it? – Lawrence Cherone Aug 03 '22 at 08:18
  • Yes. DO NOT double encode your URLs. Problem solved. As for your particular question: there is zero information in it that would let anyone answer, including why you're encoding part of the URL and how. – Your Common Sense Aug 03 '22 at 08:20
  • Does this answer your question? [How to find out if string has already been URL encoded?](https://stackoverflow.com/questions/2295223/how-to-find-out-if-string-has-already-been-url-encoded) - basically decode until it is the same as before, then encode it once – IT goldman Aug 03 '22 at 08:22
  • @ITgoldman *obviously* it doesn't. – Your Common Sense Aug 03 '22 at 08:26
  • @ITgoldman because it's an attempt to cure the symptom and not the disease? So instead of fixing the root cause, another ugly patch is going to be added to the code, making it illogical, more complicated and less maintainable? Not to mention that **URLs** never get urlencoded as a whole, only **values** in the **query string**. Here, some characters get encoded and some don't, and there is no way to tell which parts need to be encoded and which should be left alone – Your Common Sense Aug 03 '22 at 08:39
  • I thought because it was Java. Anyway, sometimes the disease is a given. – IT goldman Aug 03 '22 at 08:50

1 Answer

As suggested in the comments: decode repeatedly until the string stops changing, then encode once. Only the part after the host (the path) is re-encoded, not the whole URL.

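<?php
// Double-encoded input (the "actual output" from the question).
$double = "https://r5uy4lmtdqka6a1rzyexlusfl-902rjcrzfe6k93co7a644-tom.s3.eu-west-2.amazonaws.com/Carbon%2520Monoxide/Summer%2520CO%2520Campaign/CO%2520Summer%25202022/CO%2520Summer%2520you%2520can%2520smell%2520the%2520BBQ%2520-%2520600x600.jpg";
// Correctly encoded input; it should pass through unchanged.
$single = "https://r5uy4lmtdqka6a1rzyexlusfl-902rjcrzfe6k93co7a644-tom.s3.eu-west-2.amazonaws.com/Carbon%20Monoxide/Summer%20CO%20Campaign/CO%20Summer%202022/CO%20Summer%20you%20can%20smell%20the%20BBQ%20-%20600x600.jpg";

function fix_url($str)
{
    // "https://host/path" splits into ["https:", "", "host", "path"].
    $arr = explode('/', $str, 4);
    if (count($arr) < 4) {
        return $str; // no path component, nothing to fix
    }
    $path = $arr[3];

    // Decode repeatedly until the string stops changing.
    // Note: this also decodes any literal "%XX" text in a file name.
    while (true) {
        $decoded = rawurldecode($path);
        if ($decoded === $path) {
            break;
        }
        $path = $decoded;
    }

    // Encode each segment once; rawurlencode() gives %20 for spaces,
    // and splitting on "/" keeps the separators unencoded.
    $encoded = implode('/', array_map('rawurlencode', explode('/', $decoded)));
    return $arr[0] . '//' . $arr[2] . '/' . $encoded;
}

echo fix_url($double), "\n";
echo fix_url($single), "\n";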
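
The question also asks how to tell whether a string has already been encoded. A narrower alternative to decoding until the string stops changing is to look for the specific signature of double encoding: `%25` followed by two hex digits, which is what an encoded `%XX` looks like. The sketch below is only an illustration; the function names `looks_double_encoded` and `strip_one_encoding_layer` are made up for this example, and it assumes no file name legitimately contains literal text such as `%20`, which would be reported as a false positive.

```php
<?php
// Hypothetical helper: true if the URL contains "%25" followed by two hex
// digits, i.e. a percent-escape that has itself been percent-encoded.
function looks_double_encoded(string $url): bool
{
    return preg_match('/%25[0-9A-Fa-f]{2}/', $url) === 1;
}

// Hypothetical helper: collapse each "%25XX" back to "%XX", removing exactly
// one layer of encoding and leaving everything else in the URL untouched.
function strip_one_encoding_layer(string $url): string
{
    return preg_replace('/%25([0-9A-Fa-f]{2})/', '%$1', $url);
}

// Usage: peel layers until no double-encoded sequence remains.
$url = 'https://example.com/Carbon%2520Monoxide/CO%2520Summer.jpg';
while (looks_double_encoded($url)) {
    $url = strip_one_encoding_layer($url);
}
echo $url, "\n";
```

Because this only rewrites the `%25XX` sequences, it never has to guess which parts of the URL should be re-encoded, which was the objection raised in the comments.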
IT goldman
  • As suggested in other comments: Don't do this. Instead, remove the code that encodes the URL in the first place. Only exception: clean-up of messy data that came to life due to the broken code. – D. E. Aug 03 '22 at 10:04
  • Whilst most people were right that the double encoding should be removed, I have a weird, unique case, so the code above by goldman did actually work for me. Kudos to goldman – Zabs Aug 03 '22 at 13:54