
I am trying to make a list of column values of an existing dataframe named datadf.

datadf = sqlContext.createDataFrame(data[0:], ('Name', 'Date', 'Lat', 'Lon', 'Number'))
print(type(datadf))
datadf.printSchema()

Returns:

<class 'pyspark.sql.dataframe.DataFrame'>
root
 |-- Name: string (nullable = true)
 |-- Date: string (nullable = true)
 |-- Lat: double (nullable = true)
 |-- Lon: double (nullable = true)
 |-- Number: long (nullable = true)

datadf.show()

Returns:

+-----------------+----------+-----------+-----------+------+
|             Name|      Date|        Lat|        Lon|Number|
+-----------------+----------+-----------+-----------+------+
|Fallopia japonica|16/09/2016|    52.3792|      6.499|    10|
|Fallopia japonica|21/08/2015|     51.813|      5.784|     1|
|Fallopia japonica|25/08/2016|    50.9623|      5.723|     1|
|Fallopia japonica|27/06/2013|     50.844|      5.688|     1|
|Fallopia japonica|31/05/2015|    51.7267|      5.615|     1|
|Fallopia japonica|04/07/2015|    52.0883|      5.147|     1|
|Fallopia japonica|21/05/2016|    51.5757|      5.027|     1|
|Fallopia japonica|09/06/2015|    51.5734|      5.024|     1|
|Fallopia japonica|13/08/2015|       51.6|      4.981|   101|
|Fallopia japonica|16/07/2014|    51.5656|      4.752|  5001|
|Fallopia japonica|26/09/2016|    51.3021|      3.977|     1|
|Fallopia japonica|27/09/2015| 53.1802005| 7.19828113|     1|
|Fallopia japonica|10/07/2011|53.11105167| 7.19632833|     1|
|Fallopia japonica|11/06/2014|53.00800151|7.192501277|     1|
|Fallopia japonica|19/06/2016|53.00857768| 7.19225564|    51|
|Fallopia japonica|21/04/2015|53.16380117|7.186146926|     1|
|Fallopia japonica|21/04/2015|53.16380117|7.186146926|     1|
|Fallopia japonica|23/08/2003|53.09439231|7.178677324|     1|
|Fallopia japonica|02/09/2002| 53.0050096|7.145194014|     1|
|Fallopia japonica|04/08/2013|  52.962782|   7.144035|     1|
+-----------------+----------+-----------+-----------+------+
only showing top 20 rows

The dataframe has latitude and longitude values, and I want to make a Python list of each.

import pandas

latlist = datadf['Lat'].tolist()
latlist = datadf['Lat'].values.tolist()

Both raise the same error: 'Column' object is not callable

I suspect something is wrong with the dataframe values, as I have run into this error before. I have a basemap of the Netherlands, and I simply want to add these coordinates as points on the map.
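
For context on the error: `datadf['Lat']` on a PySpark DataFrame returns a `Column` expression rather than a pandas Series, so pandas methods such as `.tolist()` are not available on it. Attribute access on a `Column` yields another `Column`, and calling that produces the "not callable" message, so the values themselves are probably fine. Below is a minimal sketch of one way to get a plain Python list, assuming the column is small enough to collect to the driver. The helper name `column_to_list` is hypothetical and not part of PySpark.

```python
def column_to_list(df, column_name):
    """Return the values of one Spark DataFrame column as a Python list.

    Assumes `df` is a pyspark.sql.DataFrame. collect() pulls every row to
    the driver, so this only suits columns that fit in driver memory.
    """
    # select() narrows the DataFrame to one column; collect() returns Row objects.
    rows = df.select(column_name).collect()
    # A Row supports lookup by field name.
    return [row[column_name] for row in rows]


# Hypothetical usage with the datadf from the question:
# latlist = column_to_list(datadf, 'Lat')
# lonlist = column_to_list(datadf, 'Lon')
```

An alternative, under the same memory assumption and with pandas installed, would be `datadf.toPandas()['Lat'].tolist()`, which converts to a pandas DataFrame first so that the pandas methods from the original attempt apply.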

Braiam
Does this answer your question? [Convert spark DataFrame column to python list](https://stackoverflow.com/questions/38610559/convert-spark-dataframe-column-to-python-list) – vladsiv Nov 26 '21 at 12:13

0 Answers