I want to assign values to a DataFrame column from a list, based on a condition, but my code only works with hard-coded replacement values and not with dynamic ones taken from a list.
I also can't convert the list directly into a DataFrame column because it is much shorter than the column.
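from pyspark.sql.functions import when

# collect() returns a plain Python list of (row[1], count) tuples
no_connections = network_data.map(lambda row: (row[1], 1)).reduceByKey(lambda a, b: a + b).collect()
# Only the rows where NoUserConnections == 0 are replaced, using one fixed list element
network_data1 = network_data1\
    .withColumn("NoUserConnections", when(network_data1.NoUserConnections == 0, no_connections[0])
                .otherwise(network_data1.NoUserConnections))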
I can also get the values of no_connections from a DataFrame, like so:
network_data1.groupby('User').count().show()
My DataFrame looks like this:
+---+----+-----------+-----------------+
|_c0|User|Connections|NoUserConnections|
+---+----+-----------+-----------------+
|  0|   0|          1|                0|
|  1|   0|          2|                0|
|  2|   0|          3|                0|
|  3|   0|          4|                0|
|  4|   0|          5|                0|
|  5|   0|          6|                0|
|  6|   1|          7|                1|
|  7|   1|          8|                1|
|  8|   1|          9|                1|
|  9|   1|         10|                1|
+---+----+-----------+-----------------+
I want to set NoUserConnections in each row to the number of rows that share that row's User value, like this:
+---+----+-----------+-----------------+
|_c0|User|Connections|NoUserConnections|
+---+----+-----------+-----------------+
|  0|   0|          1|                6|
|  1|   0|          2|                6|
|  2|   0|          3|                6|
|  3|   0|          4|                6|
|  4|   0|          5|                6|
|  5|   0|          6|                6|
|  6|   1|          7|                4|
|  7|   1|          8|                4|
|  8|   1|          9|                4|
|  9|   1|         10|                4|
+---+----+-----------+-----------------+
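
To make the intended result unambiguous, here is roughly the logic I'm after, sketched in plain Python on the sample rows above. This is not Spark code, just an illustration of the per-User lookup; the names `rows`, `counts` and `result` are made up for this sketch.

```python
from collections import Counter

# Sample rows from the table above: (_c0, User, Connections, NoUserConnections)
rows = [
    (0, 0, 1, 0), (1, 0, 2, 0), (2, 0, 3, 0), (3, 0, 4, 0), (4, 0, 5, 0),
    (5, 0, 6, 0), (6, 1, 7, 1), (7, 1, 8, 1), (8, 1, 9, 1), (9, 1, 10, 1),
]

# Count how many rows each User has (the same information as groupby('User').count())
counts = Counter(user for _, user, _, _ in rows)

# Replace NoUserConnections in every row with the count for that row's User
result = [(c0, user, conn, counts[user]) for c0, user, conn, _ in rows]

for row in result:
    print(row)
```

So the replacement value depends on each row's User, rather than being one fixed element of the list. I assume the Spark equivalent would need some kind of per-User lookup (for example a join against the grouped counts), but I don't know how to express that.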