0

The column (keyword) of a table (market_items) have keywords separating each keyword with a comma. I'm using this query to fetch keywords;

$cid = intval($cid);
$query = $db->query("
    SELECT keywords
    FROM ".TABLE_PREFIX."market_items
    WHERE cid = '{$cid}'
    GROUP BY keywords
    ORDER BY dateline DESC
");
$cat_tags = '';
while ($t = $db->fetch_array($query))
{
    $keywords = trim($t['keywords']);
    $keys = explode(",",$keywords);
    foreach ($keys AS $key)
    {
        $cat_tags .= '<span class="small_buttons_class"><a href="market.php?action=tag_items&keywords='.$key.'">'.$key.'</a></span> ';
    }
}

It fetches keywords fine but its showing similar keywords like this:

keyword1 keyword2 keyword1 keyword3 keyword4 keyword5 keyword3

"Notice the keyword number above"

How can I remove similar keywords and show it like this? e.g.

keyword1 keyword2 keyword3 keyword4 keyword5

Just to make it clear, in column "keywords" the keywords are stored like this:

keyword2, keyword4, keyword3, keyword1

Please help!

user2854563
  • 268
  • 1
  • 11
  • 25

5 Answers5

3

You're running into one of the known difficulties of storing lists in comma-separated strings: ensuring that the list uniqueness or sort order in the list.

See my answer to Is storing a delimited list in a database column really that bad? for more pitfalls of this approach.

To solve this in SQL, you should store a list of keywords not as a string, but store one keyword per row in a second table.

CREATE TABLE market_items_keywords (
  cid INT,
  keyword VARCHAR(20),
  PRIMARY KEY (cid, keyword)
);

The primary key in this table is defined so that you can have multiple keywords per cid, but for a given cid, each keyword appears only once, and in alphabetical order.

You get a lot of other benefits from storing the keywords in this way.

  • You can index the keyword column to look up which cid's have a given keyword more efficiently than using LIKE or regular expressions.
  • You can do reports like which keywords are most popular.
  • You can easily get a list of distinct keywords used across the whole project.
  • etc.

Re your comment:

You can still use one query to get your keywords, but with assurance the list is unique and pre-sorted:

$query = $db->query("
    SELECT keyword
    FROM ".TABLE_PREFIX."market_items_keywords
    WHERE cid = '{$cid}'
");

If you need them sorted by dateline:

$query = $db->query("
    SELECT k.keyword
    FROM ".TABLE_PREFIX."market_items_keywords AS k
    JOIN ".TABLE_PREFIX."market_items AS i USING (cid)
    WHERE k.cid = '{$cid}'
    ORDER BY i.dateline DESC
");

Doing it this way means you don't have to explode() the list of keywords, nor have a foreach loop inside your while loop. Just fetch the rows.

Community
  • 1
  • 1
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Thanks for your tip Karwin, but since I'm reluctant to use an additional query for fetching list of keywords that's why I'm using that. Your answer really helped me, thanks! – user2854563 Nov 11 '13 at 15:20
1
while ($t = $db->fetch_array($query))
{
    $keywords = trim($t['keywords']);
    $keys = explode(",",$keywords);
    foreach ($keys AS $key)
    {
        $tags[$key] = $key;        
    }
}
foreach ($tags as $tag) {
    $cat_tags .= '<span class="small_buttons_class"><a href="market.php?action=tag_items&keywords='.$tag.'">'.$tag.'</a></span> ';
}
Daart Kote
  • 81
  • 5
1

Try this;

$cid = intval($cid);
$query = $db->query("
    SELECT keywords
    FROM ".TABLE_PREFIX."market_items
    WHERE cid = '{$cid}'
    ORDER BY dateline DESC
");
$cat_tags = '';
while ($t = $db->fetch_array($query))
{
    $keywords = trim($t['keywords']);
    $keys = explode(",",$keywords);
    $keys = array_unique($keys);
    foreach ($keys AS $key)
    {
        $key = trim($key);
        $key_collect[$key] = $key;
    }
}

foreach ($key_collect as $k)
{
    $cat_tags .= '<span class="small_buttons_class"><a href="market.php?action=tag_items&keywords='.$k.'">'.$k.'</a></span> ';
}
Imran Omer
  • 701
  • 1
  • 8
  • 20
0

SELECT DISTINCT... {and the rest of your query}

and it seems to me that you don't need to "group by"

Guillermo Mansilla
  • 3,779
  • 2
  • 29
  • 34
0

See Porter's algorithm. I made an implementation a few years ago for Spanish language working on search engine. The english implementation is more easy. You can start from my code PHP Stemmer. This remove grammatic suffixes from words based on grammar rules.

rernesto
  • 554
  • 4
  • 11