On our website we used to have loads of links with 4 digits on the end of them. They looked like this:
www.example.com/example-1234.html.
The URLs with the numbers had replaced the normal URL, which would look like
www.example.com/example.html.
We got rid of them by truncating the core_url_rewrite
table, which turned the active URLs with numbers at the end of them into 404s.
Recently, however, I have noticed that these numbers have come back and I'm not really sure why. This time they are also affecting category URLs, so some of our category URLs look like:
www.example.com/main-category/sub-category-1234/product.html
I found this question on Stack Overflow, which was useful: Magento - Removing numbers in url key/product url.
However, I still don't understand why this is happening. I found that the function getUnusedPath
is what adds these numbers, and it looks like it only appends them to the end of the URL if
if ($rewrite && $rewrite->getId())
// and $rewrite is equal to
$rewrite = $this->getResource()->getRewriteByRequestPath($requestPath, $storeId);
Do you know where I can find out what
getResource()->getRewriteByRequestPath($requestPath, $storeId);
does? Why are these numbers appearing at the end of the URL? Do we have a setting turned on that does this? (The file that does this is located in
app/code/core/Mage/Catalog/Model/Url.php, around line 640.)
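To make the question concrete, here is my current understanding of that check as a self-contained sketch. It is not the Magento code: the array is a hypothetical stand-in for core_url_rewrite, sketchUnusedPath is a name I made up, and the id_path comparison and the counting from 1 are my assumptions about how it behaves.

```php
<?php
// Hypothetical stand-in for core_url_rewrite: request_path => id_path.
$existingRewrites = array(
    'example.html' => 'product/1',
);

/**
 * Simplified sketch (assumption, not Magento's implementation):
 * keep $requestPath if it is free or already belongs to $idPath,
 * otherwise append -1, -2, ... before the suffix until a free path is found.
 */
function sketchUnusedPath(array $existingRewrites, $requestPath, $idPath, $suffix = '.html')
{
    // Mirrors "if ($rewrite && $rewrite->getId())": does a rewrite already use this path?
    if (!isset($existingRewrites[$requestPath]) || $existingRewrites[$requestPath] === $idPath) {
        return $requestPath;
    }

    // Split off the suffix, if the path has one, so the number goes before it.
    $hasSuffix = $suffix !== '' && substr($requestPath, -strlen($suffix)) === $suffix;
    $base = $hasSuffix ? substr($requestPath, 0, -strlen($suffix)) : $requestPath;
    $tail = $hasSuffix ? $suffix : '';

    for ($i = 1; ; $i++) {
        $candidate = $base . '-' . $i . $tail;
        if (!isset($existingRewrites[$candidate])) {
            return $candidate;
        }
    }
}

// A second product asking for a path that product/1 already owns gets a number.
echo sketchUnusedPath($existingRewrites, 'example.html', 'product/2'), "\n"; // example-1.html
// The owner of the path keeps it unchanged.
echo sketchUnusedPath($existingRewrites, 'example.html', 'product/1'), "\n"; // example.html
```

If that reading is right, a number should only appear when another row in the table already claims the same request path.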
I tried every way of saving a product to see whether Magento updates the URL on save, but that doesn't work. I then tried reindexing the URL rewrites, and that didn't do anything either. We have 2 Magento websites, and on our second Magento website this isn't happening and the core_url_rewrite
table doesn't have any URLs with numbers in it. Why is it happening on one of our sites and not the other? How can we stop the URLs from having numbers added to them, and how can we find out why they are being generated?
I have also now found out that this happens every time the URL rewrites are reindexed, because that code runs on every reindex. We reindex the
url_rewrites
every time a product is saved. We use unique URLs for all our products, so I don't know why it is happening. I have also discovered that for the numbers to be added to the end of the URL, the path has to match this regex:
'#[0-9a-z/-]+?(-([0-9]+))?('.preg_quote($suffix).')?$#i'
Does anyone know what this means? I have tried a regex tester, but it doesn't help me make sense of which strings would match.
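For reference, this is the small script I have been using to poke at the pattern. It is only a sketch: I am assuming the suffix is .html, describeMatch is a helper name I made up, and the sample paths are the ones from this question.

```php
<?php
$suffix = '.html'; // assumption: the configured URL suffix
$pattern = '#[0-9a-z/-]+?(-([0-9]+))?(' . preg_quote($suffix) . ')?$#i';

/**
 * Return the trailing number and suffix the pattern captures for $path,
 * or null if the path does not match at all.
 */
function describeMatch($pattern, $path)
{
    if (!preg_match($pattern, $path, $m)) {
        return null;
    }
    return array(
        // Group 2 is the run of digits after the last hyphen, if any.
        'number' => (isset($m[2]) && $m[2] !== '') ? $m[2] : null,
        // Group 3 is the literal suffix, if present.
        'suffix' => (isset($m[3]) && $m[3] !== '') ? $m[3] : null,
    );
}

$paths = array(
    'example.html',
    'example-1234.html',
    'main-category/sub-category-1234/product.html',
);
foreach ($paths as $path) {
    echo $path, ' => ', var_export(describeMatch($pattern, $path), true), "\n";
}
```

From what I can see, the number group is only captured when the digits sit directly before the suffix (or at the very end of the path), so the 1234 in the middle of the category path above is not picked up as the trailing number.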