I am trying to make a list of the column values of an existing PySpark DataFrame named datadf.
datadf = sqlContext.createDataFrame(data[0:], ('Name', 'Date', 'Lat', 'Lon', 'Number'))
print(type(datadf))
datadf.printSchema()
Returns:
<class 'pyspark.sql.dataframe.DataFrame'>
root
|-- Name: string (nullable = true)
|-- Date: string (nullable = true)
|-- Lat: double (nullable = true)
|-- Lon: double (nullable = true)
|-- Number: long (nullable = true)
datadf.show()
Returns:
+-----------------+----------+-----------+-----------+------+
|             Name|      Date|        Lat|        Lon|Number|
+-----------------+----------+-----------+-----------+------+
|Fallopia japonica|16/09/2016|    52.3792|      6.499|    10|
|Fallopia japonica|21/08/2015|     51.813|      5.784|     1|
|Fallopia japonica|25/08/2016|    50.9623|      5.723|     1|
|Fallopia japonica|27/06/2013|     50.844|      5.688|     1|
|Fallopia japonica|31/05/2015|    51.7267|      5.615|     1|
|Fallopia japonica|04/07/2015|    52.0883|      5.147|     1|
|Fallopia japonica|21/05/2016|    51.5757|      5.027|     1|
|Fallopia japonica|09/06/2015|    51.5734|      5.024|     1|
|Fallopia japonica|13/08/2015|       51.6|      4.981|   101|
|Fallopia japonica|16/07/2014|    51.5656|      4.752|  5001|
|Fallopia japonica|26/09/2016|    51.3021|      3.977|     1|
|Fallopia japonica|27/09/2015| 53.1802005| 7.19828113|     1|
|Fallopia japonica|10/07/2011|53.11105167| 7.19632833|     1|
|Fallopia japonica|11/06/2014|53.00800151|7.192501277|     1|
|Fallopia japonica|19/06/2016|53.00857768| 7.19225564|    51|
|Fallopia japonica|21/04/2015|53.16380117|7.186146926|     1|
|Fallopia japonica|21/04/2015|53.16380117|7.186146926|     1|
|Fallopia japonica|23/08/2003|53.09439231|7.178677324|     1|
|Fallopia japonica|02/09/2002| 53.0050096|7.145194014|     1|
|Fallopia japonica|04/08/2013|  52.962782|   7.144035|     1|
+-----------------+----------+-----------+-----------+------+
only showing top 20 rows
The DataFrame has latitude and longitude columns, and I want to make a Python list of each.
import pandas
latlist = datadf['Lat'].tolist()
latlist = datadf['Lat'].values.tolist()
Both raise: TypeError: 'Column' object is not callable
I suspect something is wrong with the DataFrame values, as I have run into this error before. I have a basemap of the Netherlands and simply want to add these coordinates as points on it.
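For context on the error: on a Spark DataFrame, datadf['Lat'] is a Column expression rather than a pandas Series, so pandas methods such as tolist() are not available on it. Below is a minimal sketch of one way to get the lists I am after, assuming the data is small enough to be collected onto the driver. The helper name column_to_list is hypothetical and not part of PySpark; it relies only on the DataFrame's select() and collect() methods.

```python
def column_to_list(df, column_name):
    """Collect one column of a Spark DataFrame into a plain Python list.

    Assumption: `df` behaves like a pyspark.sql.DataFrame. collect() pulls
    every row to the driver, so this only suits data that fits in memory.
    """
    # select() narrows the DataFrame to a single column, so each collected
    # row has exactly one field, which is read by position.
    return [row[0] for row in df.select(column_name).collect()]


# Intended usage with the DataFrame from the question
# (needs a running Spark session, so it is left commented out here):
# latlist = column_to_list(datadf, 'Lat')
# lonlist = column_to_list(datadf, 'Lon')
```

The resulting latlist and lonlist would be ordinary Python lists of floats, which is the form I need for plotting points on the basemap.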