I am new to PySpark. I created a Spark DataFrame with a column "countries" that holds a list of countries in each row. How can I group the DataFrame by the individual countries that appear in those lists?

+-----------------+
|        countries|
+-----------------+
|  [Россия, Китай]|
| [Великобритания]|
|       [Норвегия]|
|         [Россия]|
|               []|
|            [США]|
|         [Россия]|
|            [США]|
|               []|
|         [Россия]|
|               []|
|               []|
|         [Италия]|
| [Россия, Грузия]|
|            [США]|
|               []|
|               []|
|               []|
|[Великобритания ]|
|       [Беларусь]|
+-----------------+
  • Welcome to Stack Overflow. Please add a [reproducible example](https://stackoverflow.com/questions/48427185/how-to-make-good-reproducible-apache-spark-examples) to your question. – cronoik Oct 09 '19 at 14:25

1 Answer

You can take a look at the official PySpark documentation. The groupBy function, which is part of the pyspark.sql module, lets you group your DataFrame. If you want to group by multiple columns, you can unpack a list of column names with *listname.

data_frame_name.groupBy("countries")
RacoonOnMoon
  • I want to group by the elements inside the lists – Oleg Zdanevich Oct 09 '19 at 14:38
  • Can you give a bit more detail? Which list? – RacoonOnMoon Oct 09 '19 at 14:40
  • Are you familiar with SQL? Do you want something like "group by colname"? Or can you show me the expected output? – RacoonOnMoon Oct 09 '19 at 14:42
  • Every row has a "countries" column containing a list (for example [Russia, China], [], [Italy], etc.), so I am trying to group by the elements inside those lists (for example, I need to group by Russia, China, Italy) – Oleg Zdanevich Oct 09 '19 at 14:46
  • Expected output: Россия-5, Китай-1, Великобритания-2, Норвегия-1, США-3, Италия-1, Грузия-1, Беларусь-1 – Oleg Zdanevich Oct 09 '19 at 14:48
  • Maybe you can take a look at this solution: https://stackoverflow.com/questions/43915762/how-to-group-by-common-element-in-array. Otherwise, it can't be done with a single command – RacoonOnMoon Oct 10 '19 at 08:16