EDIT UPDATE: Turns out I have version 5.7 so Window Functions are not an option in order to find a solution.
SHOW VARIABLES LIKE 'version';
+---------------+------------+
| Variable_name | Value |
+---------------+------------+
| version | 5.7.21-log |
+---------------+------------+
Description of the problem: I have a ternary relationship table between offers, skills and profiles. That ternary relationship has an attribute, ranking.
I have a skill table where I can see the skill's name. Up untill now I had to do two queries:
1) Give me the top 10 ranked skills per profile:
SELECT DISTINCT ternary.id_skill, skill.name_skill, ranking_skill
FROM ternary
INNER JOIN skill ON skill.id_skill=ternary.id_skill
WHERE ternary.id_perfil= #IntNumber#
GROUP BY ternary.id_skill
ORDER BY ternary.ranking_skill DESC
LIMIT 10;
2) For a list of ID skills, give me if they appear in any profile, and how many times they do appear.
SELECT DISTINCT ternary.id_profile, nombre_profile, COUNT(DISTINCT ternary.id_skill) AS matching
FROM ternary
INNER JOIN profile ON ternary.id_profile=profile.id_profile
WHERE ternary.id_skill= '858534430'
OR ternary.id_skill= '3213227'
OR ternary.id_skill= '3254818'
GROUP BY(ternary.id_profile)
ORDER BY matching DESC;
In that last query a problem has been identified: It "searches" for a skill appearing at any point for a profile. Since it is possible for a profile to have thousands of skills, it can be misleading, due to what we want to achieve I now need to only "search" when it's one of the top 10 skills of ANY profile. But only in top 10.
So far, basically I have been trying to mix both queries, with little success, because it seems I cannot make a partition over two columns, and even if I use only one, I get You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(PARTITION BY
:
SELECT *
FROM
(
SELECT DISTINCT ternary.id_skill,
skill.name_skill,
ternary.ranking_skill,
ternary.id_profile,
ROW_NUMBER() OVER(PARTITION BY id_profile, id_skill ORDER BY ternary.ranking_skill DESC) rn
FROM ternary
INNER JOIN skill ON skill.id_skill=ternary.id_skill
)
WHERE rn < 11;
I have learned this operation might be called groupwise maximum, and I have seen several answers looking for this. I have not been able to replicate any of them, and I need it specifically for mysql Ver 14.14 Distrib 5.5.60, for Linux (x86_64) using readline 5.
if it's of any help (I have tried answers that were perfect for some other, similar databases but would not work in mysql).
Tables definition:
CREATE TABLE `ternary` (
`id_offer` varchar(200) NOT NULL,
`id_skill` varchar(200) NOT NULL,
`id_profile` varchar(200) NOT NULL,
`ranking_skill` double NOT NULL,
PRIMARY KEY (`id_offer`,`id_skill`,`id_profile`),
KEY `id_skill` (`id_skill`),
KEY `id_profile` (`id_profile`),
CONSTRAINT `ternary_ibfk_1` FOREIGN KEY (`id_offer`) REFERENCES `offer` (`id_offer`),
CONSTRAINT `ternary_ibfk_2` FOREIGN KEY (`id_skill`) REFERENCES `skill` (`id_skill`),
CONSTRAINT `ternary_ibfk_3` FOREIGN KEY (`id_profile`) REFERENCES `profile` (`id_profile`)
)
CREATE TABLE `skill` (
`id_skill` varchar(200) NOT NULL,
`name_skill` varchar(200) DEFAULT NULL,
`date` date DEFAULT NULL,
PRIMARY KEY (`id_skill`)
)
Results of doing a
select * from ternay limit 10;
+------------+------------+-----------+----------------------+
| id_oferta | id_skill | id_perfil | ranking_skill |
+------------+------------+-----------+----------------------+
| 1004 | 107 | 679681082 | 0 |
| 1004 | 115 | 679681082 | 0.10846866454897801 |
| 1004 | 117 | 679681082 | 0.038003619695992294 |
| 1004 | 129 | 679681082 | 0.04987975085098989 |
| 1004 | 147 | 679681082 | 0.02771097269499438 |
| 1004 | 299 | 679681082 | 0.0522549770819894 |
| 1004 | 321 | 679681082 | 0.11955305362697576 |
| 1004 | 417 | 679681082 | 0.11321911701097703 |
| 1004 | 964 | 679681082 | 0.015043099462996949 |
| 1004 | 967 | 679681082 | 0.05304671915898924 |
+------------+------------+-----------+----------------------+
Result of query 1) describe above, which gives me top 10 for ONE profile
+------------+--------------+---------------------+
| id_skill | name_skill | ranking_skill |
+------------+--------------+---------------------+
| 109 | scala | 0.3089840175329823 |
| 122 | hadoop | 0.24164146109602963 |
| 9731 | python | 0.21470443852124863 |
| 325 | java | 0.18776741594646754 |
| 114 | sql | 0.14736188208429596 |
| 101 | kafka | 0.13389337079690544 |
| 301 | bbdd | 0.13389337079690544 |
| 927 | agile | 0.13389337079690544 |
| 320 | hive | 0.1204248595095149 |
| 109 | spark | 0.1204248595095149 |
+------------+--------------+---------------------+