0

I've created a query that joins six tables:

SELECT a.accession, b.value, c.name, d.description, e.value, f.seqlen, f.residues
FROM chado.dbxref a inner join chado.dbxrefprop b on a.dbxref_id = b.dbxref_id
inner join chado.biomaterial d on b.dbxref_id = d.dbxref_id
inner join chado.feature f on d.dbxref_id = f.dbxref_id
inner join chado.biomaterialprop e on d.biomaterial_id = e.biomaterial_id
inner join chado.contact c on d.biosourceprovider_id = c.contact_id;

The output: enter image description here

I'm currently working with a PostgreSQL schema called Chado (http://gmod.org/wiki/Chado_Tables). My attempts to comply with the preexisting schema have led me to deposit multiple joined values within the same table (two different values within the dbxrefprop table, three different values within the biomaterialprop table). Querying the database results in a substantial amount of redundant output. Is there a way for me to reduce output redundancy by modifying my query statement? Ideally, I'd like the output to resemble the following:

test001 | GB0101 | source011 | Faaberg,K.; Lyoo,K.; Korol,D.M. | serum | T1 | Iowa, USA | 01 Jan 2005 | 1234 | AUGAACGCCUUGCAUUACUAUGACUAUGAUU
Ghoti
  • 737
  • 4
  • 19
  • 1
    I think what you are looking for is something like group_concat() in mysql, i'm not a big postgresql user but i think i found something very similar of what group_concat does in mysql. https://stackoverflow.com/questions/2560946/postgresql-group-concat-equivalent – Michael Tétreault Dec 15 '17 at 20:46
  • Thanks for the quick response Michael T. I'm relatively new to SQL and was struggling to articulate a question. An example was the best I could manage to demonstrate my problem. I agree, a function similar to group_concat() appears to be what I need. I'll give it a try. Thanks for again! – Ghoti Dec 15 '17 at 20:50
  • 1
    a better way to close off the question would be for @MIchaelT to answer it and get the rep points - or - answer it yourself then accept it, that way we don't get yet another unanswered question. – Paul Maxwell Dec 15 '17 at 23:24
  • As per your advice, I've changed my update to an answer to better indicate my problem has been resolved. Thanks. – Ghoti Dec 16 '17 at 00:08

1 Answers1

0

Working query statement:

SELECT a.accession, string_agg(distinct b.value, ' | ' ORDER BY b.value) AS bvalue_list, c.name, d.description, string_agg(distinct e.value, ' | ' ORDER BY e.value) AS evalue_list, f.seqlen, f.residues
FROM chado.dbxref a INNER JOIN chado.dbxrefprop b ON a.dbxref_id = b.dbxref_id
INNER JOIN chado.biomaterial d ON b.dbxref_id = d.dbxref_id
INNER JOIN chado.feature f ON d.dbxref_id = f.dbxref_id
INNER JOIN chado.biomaterialprop e ON d.biomaterial_id = e.biomaterial_id
INNER JOIN chado.contact c ON d.biosourceprovider_id = c.contact_id
GROUP BY a.accession, c.name, d.description, f.seqlen, f.residues;
Ghoti
  • 737
  • 4
  • 19