0

I have a table with two columns, one column (AffiliationCountry) shows the countries and the other column (ArtSubareaKeyword) shows the subject areas in related countries with comma-separated values.

I want to extract the subject area which is repeating for a country the same country one or more times and save it in a new column with the name "MostPopularSubjectArea".

Table with values:

As you can see in the table that a country is repeating and its values are also repeating.

AffiliationCountry ArtSubareaKeyword1 ArtSubareaKeyword1 ArtSubareaKeyword1
Spain Cell membranes Cell staining Coimmunoprecipitation
Kazakhstan Factor analysis Human performance Immunofluorescence
Japan Bone marrow Diagnostic medicine Genetic loci
Kazakhstan Drug research Factor analysis Human performance

Results that are required:

I want a SQL query that can store for that country a new column that stores the common subjects area which is occurring more.

AffiliationCountry MostPopularSubjectArea
Kazakhstan Human performance
CaveCoder
  • 791
  • 3
  • 17
  • 1
    Fix your data model! Do not store multiple values in a string. That is not the SQL way to store data. – Gordon Linoff Mar 28 '21 at 11:03
  • 1
    take a look at this discussion https://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad – nbk Mar 28 '21 at 11:05
  • Is it acceptable to separate values in different columns and do what I want? What is your opinion? I mean will it work for this question ? – Abdul Wahab Mar 28 '21 at 11:34
  • I have edited my Table you can see table, if any other changes are required please tell me. Thanks – Abdul Wahab Mar 28 '21 at 11:59
  • Any kind of help from you guys will be very precious for me :) Really. – Abdul Wahab Mar 28 '21 at 12:06
  • Spreading an array across columns is also a bad schema design. Make another table with 2 columns: AffiliationCountry and Keyword. – Rick James Mar 28 '21 at 15:21

1 Answers1

0

As per the table, you can select the pair of columns, union them and find the count using group by:

select 
t1.affiliation_country, t1.keyword, count(t1.keyword) as count_keyword
from
(
    select affiliation_country, lower(artsubareakeyword1) keyword from affliation_details
    union all
    select affiliation_country, lower(artsubareakeyword2) from affliation_details
    union all
    select affiliation_country, lower(artsubareakeyword3) from affliation_details
) t1
group by
t1.affiliation_country, t1.keyword
order by
count(t1.keyword) desc

Query Reference(Fiddle): https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=25bded45786a85f6740902699e633846

Updated Query:

with affiliation_details as
(
select 
t1.affiliation_country, t1.keyword, count(t1.keyword) as count_keyword
from
(
select affiliation_country, lower(artsubareakeyword1) keyword from affliation_details
union all
select affiliation_country, lower(artsubareakeyword2) from affliation_details
union all
select affiliation_country, lower(artsubareakeyword3) from affliation_details
) t1
group by
t1.affiliation_country, t1.keyword
order by
count(t1.keyword) desc
)

select 
  distinct affiliation_country
from
  affiliation_details
where
  count_keyword in (
    select 
      max(count_keyword) 
    from 
      affiliation_details
  )
Nimantha
  • 6,405
  • 6
  • 28
  • 69
Praveen
  • 415
  • 5
  • 9