I am trying to build a search engine. I need to echo user-submitted links on the search engine result pages. The problem is that different submitted URLs will have different formats.
Q1. How can I auto-detect the format?
Q2. My search engine will show an iframe of each user-submitted link to other users (keyword searchers). How will my PHP script automatically know which parts of the URL to run htmlentities() on and which parts to run urlencode() on? I can't manually check the URL format of the million links that my users submit every day.
If I were opening an iframe to my own link there would be no problem, as I know my own link's format. Example:
$url = 'http://localhost/test/pagination.php';
$search = $domain;
$tbl = 'linking_histories';
$col= 'domain';
$i = 1;
$limit = 2;
printf(
"<iframe src='%s?mysql_tbl=%s&mysql_column=%s&keyword_search=%s&result_limit_per_page=%d&page_number=%d'></iframe><br>",
htmlentities($url),
urlencode($tbl),
urlencode($col),
urlencode($search),
$limit, // %d placeholder forces an integer
$i      // %d placeholder forces an integer
);
I mean, a user might submit a normal static link like so:
A.
'http://localhost/test/pagination.php';
Or, a dynamic one, like these:
B.
'http://localhost/test/pagination.php?keyword=cars'; //%s (printf).
C.
'http://localhost/test/pagination.php?page=4'; //%d (printf).
D.
'http://localhost/test/pagination.php?keyword=cars&page=4';
//%s (printf) & %d (printf).
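A minimal sketch of how the shape of examples A to D could be detected at runtime, assuming PHP's built-in parse_url() and parse_str() are acceptable for this. The helper name describe_url() is hypothetical, and the "int"/"string" labels are only an illustration of the %d/%s distinction above:

```php
<?php
// Hypothetical helper: reports which query parameters a submitted URL carries
// and whether each value looks like an integer or a general string.
function describe_url(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false) {
        return ['valid' => false, 'path' => '', 'params' => []];
    }

    // Decode the query string (if any) into name => value pairs.
    $params = [];
    parse_str($parts['query'] ?? '', $params);

    $types = [];
    foreach ($params as $name => $value) {
        // Query values arrive as strings (or arrays); digits-only ones are labelled "int".
        $isInt = is_string($value) && preg_match('/^[0-9]+$/', $value) === 1;
        $types[$name] = $isInt ? 'int' : 'string';
    }

    return ['valid' => true, 'path' => $parts['path'] ?? '', 'params' => $types];
}

$examples = [
    'A' => 'http://localhost/test/pagination.php',
    'B' => 'http://localhost/test/pagination.php?keyword=cars',
    'C' => 'http://localhost/test/pagination.php?page=4',
    'D' => 'http://localhost/test/pagination.php?keyword=cars&page=4',
];

foreach ($examples as $label => $url) {
    echo $label, ': ', json_encode(describe_url($url)['params']), "\n";
}
// A: []
// B: {"keyword":"string"}
// C: {"page":"int"}
// D: {"keyword":"string","page":"int"}
```

With this kind of parsing there may be no need to pick a printf format per URL, since the parameters are discovered from the URL itself.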
For example A, this php code is ok to echo the url in the iFrame:
$url = 'http://localhost/test/pagination.php';
printf(
"<iframe src='%s'></iframe><br>",
htmlentities($url),
);
For the example B submitted link, this particular php code is fine to echo the url in the iFrame:
$url = 'http://localhost/test/pagination.php';
$search = $domain;
printf(
"<iframe src='%s?keyword_search=%s'></iframe><br>",
htmlentities($url),
urlencode($search),
);
For example C, this particular php code is correct to echo the url in the iFrame:
$url = 'http://localhost/test/pagination.php';
$i = 1;
printf(
"<iframe src='%s?page_number=%d'></iframe><br>",
htmlentities($url),
$i // %d placeholder forces an integer
);
For D, this particular php code is correct:
$url = 'http://localhost/test/pagination.php';
$search = $domain;
$i = 1;
printf(
"<iframe src='%s?keyword_search=%s&page_number=%d'></iframe><br>",
htmlentities($url),
urlencode($search),
$i // %d placeholder forces an integer
);
As you can see from the above, not all 4 links in the 4 iframes are in the same URL format. One uses just htmlentities() and no urlencode(), another uses htmlentities() plus ONE urlencode(), another uses htmlentities() and TWO urlencode() calls, and so on. Some links have an INT while others don't.
Since each user-submitted link will be different from the others, I can't use one printf call to echo all URL formats.
So how do I automatically detect the URL format, so that I can generate the right printf with the right placeholder types (e.g. '%s', '%d') for the particular URL the user submits?
Is there any function in PHP that can detect the URL type and tell me which functions (htmlentities(), urlencode(), %s, %d, etc.) to use on which part of the URL? You know how var_dump() tells you the data type. I am looking for something like that.
Care to show a code example of how to achieve my purpose? Remember, I need to secure the link outputs so nobody can inject anything into the iframes.
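One possible approach, sketched under assumptions rather than offered as a definitive answer: take the submitted URL apart with parse_url(), accept only http and https, rebuild the query with http_build_query() (which URL-encodes every name and value), and then HTML-escape the finished URL once for the attribute. The helper name iframe_for() is hypothetical. It uses htmlspecialchars() in place of htmlentities(), and it deliberately drops any user info and fragment from the URL:

```php
<?php
// Hypothetical helper: returns iframe markup for a submitted URL, or null if the URL is rejected.
function iframe_for(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null;
    }

    // Allow only http and https, so javascript: or data: URLs are rejected.
    $scheme = strtolower($parts['scheme']);
    if ($scheme !== 'http' && $scheme !== 'https') {
        return null;
    }

    // Decode whatever query parameters the URL happens to have.
    $params = [];
    parse_str($parts['query'] ?? '', $params);

    // Rebuild the URL from its parts.
    $clean = $scheme . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $clean .= ':' . $parts['port'];
    }
    $clean .= $parts['path'] ?? '/';
    if ($params !== []) {
        // http_build_query() URL-encodes every parameter name and value.
        $clean .= '?' . http_build_query($params, '', '&');
    }

    // One HTML-escaping pass over the whole URL, for the attribute context.
    return sprintf(
        "<iframe src='%s'></iframe><br>",
        htmlspecialchars($clean, ENT_QUOTES, 'UTF-8')
    );
}

echo iframe_for('http://localhost/test/pagination.php?keyword=cars&page=4'), "\n";
// <iframe src='http://localhost/test/pagination.php?keyword=cars&amp;page=4'></iframe><br>

var_dump(iframe_for('javascript:alert(1)'));
// NULL
```

The design choice here is that URL-encoding applies to the individual query values and HTML-escaping applies to the final string placed in the page, so the same code path covers URLs with zero, one, or many parameters.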
**EDIT: Do I use htmlentities() or urlencode() here? Or both? Imagine the URL is either this:
$url = 'http://localhost/test/pagination.php?tbl=links&col=domain&search=elvanja.com&page=1&limit=5';
Or, this:
$url = 'http://www.elvanja.com/contactus.php';
Example 1:
printf("<iframe src='%s'></iframe><br>",
htmlentities($url));
Example 2:
printf("<iframe src='%s'></iframe><br>",
urlencode($url));
Example 3:
printf("<iframe src='%s?tbl=%s&col=%s&search=%s&limit=%d&page=%d'></iframe><br>",
htmlentities($url),
urlencode($url));
I am going for EXAMPLE 3. What do you say?**
– studentprogrammer2020 Nov 19 '20 at 18:30
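To illustrate the difference between the two functions when each is applied to a whole URL, here is a small sketch using a shortened version of the first URL from the EDIT. It only shows what each function returns:

```php
<?php
$url = 'http://localhost/test/pagination.php?tbl=links&col=domain';

// htmlentities() only rewrites HTML-special characters, so the URL stays usable
// once the browser decodes the attribute value.
echo htmlentities($url), "\n";
// http://localhost/test/pagination.php?tbl=links&amp;col=domain

// urlencode() on the whole URL also encodes ':', '/', '?', '=' and '&',
// so the result is no longer an absolute URL.
echo urlencode($url), "\n";
// http%3A%2F%2Flocalhost%2Ftest%2Fpagination.php%3Ftbl%3Dlinks%26col%3Ddomain
```

This suggests urlencode() is meant for individual query values, while an HTML-escaping function is applied to the complete URL at the point where it is written into the page.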