0

I have OpenStreetMap data loaded to a PostgreSQL table. A hstore type column contains all of the tags. I would like to make a comparison chart to see how many records has name, name:en, name:bg tags for example. The result I would like to see is something like this:

enter image description here

I can achieve this manually using this query:

SELECT 1 AS id, '+'  AS name, NULL AS "name:en", NULL AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name')      UNION
SELECT 2 AS id, NULL AS name, '+'  AS "name:en", NULL AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name:en')   UNION
SELECT 3 AS id, NULL AS name, NULL AS "name:en", '+'  AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name:bg')   UNION
SELECT 4 AS id, '+'  AS name, '+'  AS "name:en", NULL AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name')      AND exist(tags,'name:en')       UNION
SELECT 5 AS id, '+'  AS name, NULL AS "name:en", '+'  AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name')      AND exist(tags,'name:bg')       UNION
SELECT 6 AS id, '+'  AS name, '-'  AS "name:en", NULL AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name')      AND NOT exist(tags,'name:en')   UNION
SELECT 7 AS id, '-'  AS name, '+'  AS "name:en", NULL AS "name:bg", count(*) FROM public.ways WHERE NOT exist(tags,'name')  AND exist(tags,'name:en')
ORDER  BY id

I consider this unnecessarily long and overcomplicated, plus I have to do it manually. I know there are some possibilities using the crosstab function, but I couldn't get it working. Based on the answer to this question I was able to create something like this:

SELECT * FROM crosstab(
    'SELECT tags::text~''"name"=>".*"'' as a, tags::text~''"name:en"=>".*"'' as b, tags::text~''"name_int"=>".*"'' as c FROM public.ways') 
AS ct (name boolean,"name:en" boolean, "name:bg" boolean)
GROUP BY name,"name:en","name:bg"

enter image description here

My problem is that I cannot seem to add a count column to this, and that it does not contain options where only one of the three condition is taken into account.

Any idea how could I solve this problem, or any direction where should I start?

Example data lines:

1 "name"=>"dm"
2 "name"=>"Ешекчи дере", "name:en"=>"Khatak Dere River"
3 "name:en"=>"Sushitsa"
4 "name"=>"Слънчева", "name:bg"=>"Слънчева", "name:en"=>"Slantcheva"
588chm
  • 215
  • 1
  • 8

1 Answers1

0

Hello look if its works for you , it is possible to generate a join from a emulated table from a select to group the values :

SELECT row_number() OVER() AS id ,COUNT(*) AS count , COALESCE(a.tags , '')||COALESCE(b.tags,'')||COALESCE(c.tags ,'') AS tagcombination, 
CASE WHEN  COALESCE(a.tags , '')||COALESCE(b.tags,'')||COALESCE(c.tags ,'')="name:en" THEN '+'
WHEN  COALESCE(a.tags , '')||COALESCE(b.tags,'')||COALESCE(c.tags ,'')  = 'name:en' THEN '+' END AS name 

FROM public.ways AS a 
LEFT JOIN (SELECT DISTINCT tags FROM public.ways WHERE tags = 'name' ) AS b ON a.tags = b.tags 
LEFT JOIN (SELECT DISTINCT tags FROM public.ways WHERE tags IN('name:en', 'name:bg' )  )  AS c ON a.tags = c.tags 
JOIN (SELECT generate_series )
GROUP BY tagcombination
--WHERE a.tags IS NOT NULL 
--ORDER BY name

The name column could be translated into numbers from the tagscombination and even be ordered later if it fits better your relatory. Need to do the test and use a predicate to filter if there is more values possibilities than you want to count in the table also.

mwalter
  • 102
  • 5