Before we get to the location in code where this happens, be advised you're entering a world of pain.
There's no simple rule as to how those numbers are generated. There are cases where it's the store ID, there are cases where it's the simple product ID, and there are cases where it's neither.
Even if there were, it's common for not-from-scratch Magento sites to contain custom functionality that changes this.
Ultimately, since Magento's human-readable/SEO-friendly URLs are stored in the core_url_rewrite table, it's possible for people to insert arbitrary text.
Warnings of doom aside, the model you're looking for is Mage::getSingleton('catalog/url'). This contains most of the logic for generating Magento catalog and product rewrites. All of these methods end by passing the request path through the getUnusedPath method.
#File: app/code/core/Mage/Catalog/Model/Url.php
public function getUnusedPath($storeId, $requestPath, $idPath)
{
//...
}
This method contains the logic for creating a unique number on the end of the URL. Tracing this in its entirety is beyond the scope of a Stack Overflow post, but this block in particular is enlightening/disheartening.
$lastRequestPath = $this->getResource()
    ->getLastUsedRewriteRequestIncrement($match[1], $match[4], $storeId);
if ($lastRequestPath) {
    $match[3] = $lastRequestPath;
}
return $match[1]
    . (isset($match[3]) ? ($match[3]+1) : '1')
    . $match[4];
In particular, these two lines:
$match[3] = $lastRequestPath;
//...
. (isset($match[3]) ? ($match[3]+1) : '1')
//...
In case it's not obvious, there are cases where Magento will automatically append a 1 to a URL, and then continue to increment it. This makes those URLs dependent on the state of the system at the time they were generated; there's no simple rule.
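To make that state dependence a little more concrete, here's a rough standalone sketch of the "append a 1, then keep incrementing" behavior. This is not Magento's code: the function name sketchNextRequestPath, the regular expression, and the $lastUsedIncrement parameter (standing in for whatever getLastUsedRewriteRequestIncrement returns) are all assumptions made for illustration.

```php
<?php
// Hypothetical helper, NOT part of Magento: imitates the increment logic quoted above.
function sketchNextRequestPath(string $requestPath, string $suffix, ?int $lastUsedIncrement): string
{
    // Assumed pattern: split the path into "base", optional "-number", optional suffix.
    $pattern = '#^([0-9a-z/-]+?)(-([0-9]+))?(' . preg_quote($suffix, '#') . ')?$#i';
    if (!preg_match($pattern, $requestPath, $match)) {
        return $requestPath; // nothing we know how to increment
    }

    // Number already on the path, if any (an unmatched group comes back as '').
    $number = (isset($match[3]) && $match[3] !== '') ? (int) $match[3] : null;

    // A number already stored for this base path wins over the one on the path.
    if ($lastUsedIncrement !== null) {
        $number = $lastUsedIncrement;
    }

    $tail = $match[4] ?? '';
    return $match[1] . '-' . ($number !== null ? $number + 1 : 1) . $tail;
}

echo sketchNextRequestPath('blue-shirt.html', '.html', null), PHP_EOL;   // blue-shirt-1.html
echo sketchNextRequestPath('blue-shirt-1.html', '.html', null), PHP_EOL; // blue-shirt-2.html
echo sketchNextRequestPath('blue-shirt.html', '.html', 7), PHP_EOL;      // blue-shirt-8.html
```

The third call is the point: the same input path produces a different result depending on what's already been stored, which is why you can't derive the number from the product or store alone.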
Other lines of interest in this file are:
if (strpos($idPath, 'product') !== false) {
    $suffix = $this->getProductUrlSuffix($storeId);
} else {
    $suffix = $this->getCategoryUrlSuffix($storeId);
}
This $suffix will be used on the end of the URL as well, so those methods are worth investigating.
If all you're trying to do is remove numbers from the URL, you might be better off with a regular expression or some explode/implode string jiggering.
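If you go the regular expression route, something along these lines might be a starting point. It's a sketch only: stripTrailingNumber is a made-up name, and the '.html' default suffix is an assumption (your store's configured suffix may differ or be empty).

```php
<?php
// Hypothetical helper: drops a trailing "-<digits>" that sits just before the
// optional URL suffix, e.g. "blue-shirt-12.html" becomes "blue-shirt.html".
function stripTrailingNumber(string $requestPath, string $suffix = '.html'): string
{
    $pattern = '#-[0-9]+(' . preg_quote($suffix, '#') . ')?$#';

    // '$1' puts the suffix back if it was present; fall back to the input on a regex error.
    return preg_replace($pattern, '$1', $requestPath) ?? $requestPath;
}

echo stripTrailingNumber('blue-shirt-12.html'), PHP_EOL;    // blue-shirt.html
echo stripTrailingNumber('blue-shirt.html'), PHP_EOL;       // blue-shirt.html
echo stripTrailingNumber('category/blue-shirt-3'), PHP_EOL; // category/blue-shirt
```

Be aware this can't tell a generated increment from a URL key that legitimately ends in a number (say, "iphone-5"), so it will strip both.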