I have a dataframe and need to get the row number / index of each row. I would like to add a new column that combines the Letter with the row number/index, e.g. "A - 1", "B - 2".
# sample data
a = sqlContext.createDataFrame([("A", 20), ("B", 30), ("D", 80)], ["Letter", "distances"])
with output
+------+---------+
|Letter|distances|
+------+---------+
|     A|       20|
|     B|       30|
|     D|       80|
+------+---------+
I would like the new output to look something like this:
+------+---------+-----+
|Letter|distances|index|
+------+---------+-----+
|     A|       20|A - 1|
|     B|       30|B - 2|
|     D|       80|D - 3|
+------+---------+-----+
This is the function I have been working on:
def cate(letter):
    # The row number/index should be appended here; this is the part I am missing.
    return letter + " - "  # + index
a.withColumn("index", cate(a["Letter"])).show()
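For reference, here is a minimal sketch of one way the index column might be built. It is not the only approach: it assumes that going through the RDD with zipWithIndex() is acceptable, and that the current row order of the dataframe is the order to number by (Spark dataframes have no inherent row order, so the numbering follows partition order). The helper names add_label and with_index_label are made up for illustration.

```python
def add_label(pair):
    """Turn (row, zero_based_position) into a tuple with a "Letter - n" label appended.

    `pair` has the shape produced by RDD.zipWithIndex(); the Letter is assumed
    to be the first field of the row.
    """
    row, position = pair
    # position starts at 0, so add 1 to get "A - 1", "B - 2", ...
    return tuple(row) + ("{} - {}".format(row[0], position + 1),)


def with_index_label(df):
    """Return a new DataFrame with an extra "index" column (hypothetical helper)."""
    # zipWithIndex numbers rows by their current partition order
    labelled = df.rdd.zipWithIndex().map(add_label)
    return labelled.toDF(df.columns + ["index"])
```

With the sample data above this would be called as with_index_label(a).show(). If the numbering should follow a particular column, sorting first (for example a.orderBy("Letter")) before calling it is probably safer than relying on the incoming order.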