0

I'm trying to write a single query that will pull from a table all records that are duplicate entries with different values. The purpose is to enable a user to mark which record in a set is the primary observation.

Table Fields:

idTable (Index), idPlant, idParameter, Value (tinytext), IsPrimary (Yes/No, default No)

I want to be able to show only records that have identical idPlant and idParameter where the Value is different but I want to list the actual records not the number of replicates (which is easy to do). The user can then for each (idPlant and idParameter) group select which idTable is the primary data to update the table and set IsPrimary to 'Yes'.

I'm sure this must be possible but the only way I can see to do it so far is a multi-step process of both MySQL and PHP processing.

I have not been able to get the answers to the similar question to work I think because I am grouping on two columns.

I tried using the SQL fiddle but could not get that to work either. So I've included the create SQL statement and some example data

I am trying to create a Query that would display the entire record for idPhenotype = 3,4,5 (because one of those does not match) and 7,8 (again because one does not match) but ignores idPhenotype 10,11 because the PhenotypeValue is identical

Table SQL

CREATE TABLE IF NOT EXISTS `phenotype` 
 (  `idPhenotype` int(10) unsigned NOT NULL AUTO_INCREMENT
 ,  `idPhenotypeDef` int(11) DEFAULT NULL
 ,  `idPlant` mediumint(9) NOT NULL,  `PhenotypeParameter` tinytext
 ,  `PhenotypeValue` text
 ,  `PhenotypeDescribedBy` tinytext
 ,  `NewestData` enum('Yes','No') NOT NULL DEFAULT 'Yes'
 ,  PRIMARY KEY (`idPhenotype`)
 ,  UNIQUE KEY `idPhenotype_UNIQUE` (`idPhenotype`)
 ) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=509165 ;

Data SQL

INSERT INTO `phenotype` (`idPhenotype`, `idPhenotypeDef`, `idPlant`
, `PhenotypeParameter`, `PhenotypeValue`, `PhenotypeDescribedBy`
, `NewestData`) VALUES
 (1, 1, 1, 'Front', 'Green', 'Historic Data', 'Yes')
,(2, 1, 2, 'Front', 'Green', 'Historic Data', 'Yes')
,(3, 1, 3, 'Front', 'Green', 'Historic Data', 'Yes')
,(4, 1, 3, 'Front', 'Dark Green', 'Historic Data', 'Yes')
,(5, 1, 3, 'Front', 'Green', 'Historic Data', 'Yes')
,(6, 1, 19, 'Front', 'Green', 'Historic Data', 'Yes')
,(7, 1, 20, 'Front', 'Green', 'Historic Data', 'Yes')
,(8, 1, 20, 'Front', 'Brown', 'Historic Data', 'Yes')
,(9, 1, 205, 'Front', 'Green', 'Historic Data', 'Yes')
,(10, 2, 1, 'Back', 'Green', 'Historic Data', 'Yes')
,(11, 2, 1, 'Back', 'Green', 'Historic Data', 'Yes')
,(12, 2, 12, 'Back', 'Green', 'Historic Data', 'Yes')
,(13, 2, 13, 'Back', 'Green', 'Historic Data', 'Yes')
,(14, 2, 14, 'Back', 'Green', 'Historic Data', 'Yes')
,(15, 2, 15, 'Back', 'Green', 'Historic Data', 'Yes');

http://sqlfiddle.com/#!9/346b4

wildplasser
  • 43,142
  • 8
  • 66
  • 109
rsphorler
  • 41
  • 6
  • would it be possible if you could make a [SQLFiddle](http://sqlfiddle.com) – glend Jun 22 '16 at 10:05
  • Possible duplicate of [Find duplicate records in MySQL](http://stackoverflow.com/questions/854128/find-duplicate-records-in-mysql) – Matt Raines Jun 22 '16 at 10:12
  • `Table Fields: ` is a kind of EAV table? WHAT is its relation to the `table SQL` ? why is it relevant to the question? BTW: I think you want to detect the duplicates *even* if the value happens to be the same. – wildplasser Jun 22 '16 at 16:19
  • Yes this would be a Entity, Attribute, Value table. The entity is a specific plant (idPlant), the attribute an observable trait and then the value of that trait. I'm not worried if the value is the same as that is to be expected (recurrent observations year on year) I'm interested if the value differs because that would mean we have a mixed population or a rogue plant. – rsphorler Jun 23 '16 at 10:30

1 Answers1

1

I believe you need "HAVING" contruction:

Try this:

SELECT * FROM phenotype
WHERE CONCAT(idPlant,'-',PhenotypeParameter) IN (
    SELECT CONCAT(idPlant,'-',PhenotypeParameter) 
    FROM phenotype 
    GROUP BY idPlant, PhenotypeParameter 
    HAVING COUNT(*) > 1
)
Sergio Bernardo
  • 368
  • 1
  • 8
  • I think this is approaching what I need but it only lists one row for each set. Although I'm not sure why as an idPlant of 3 and a Parameter of 'Front' should list 3 records. – rsphorler Jun 23 '16 at 10:26
  • Just tested with the table and data you provided... the query returns 7 rows and 3 of them are "idPlant = 3" and "PhenotypeParameter = 'Front'" – Sergio Bernardo Jun 23 '16 at 10:33
  • Interesting, I tried it on the actual table it failed so I created the dummy table above and it worked! So thanks Sergio looks like this should be working now to work out why on the real table its not. – rsphorler Jun 23 '16 at 10:42
  • Worked out why it appeared not to be working. The real table is much larger and I simply needed to include an Order by statement to get the records together. Now the next stage is to get it to ignore the records if the values are identical. – rsphorler Jun 23 '16 at 10:48