
In MySQL I number the rows of a result set with a user variable, like this:

SELECT *,@a:=@a+1 AS rn FROM PERSON,(SELECT(@a:=0)) AS A

I want to do the same thing in Spark SQL when querying a DataFrame, along these lines:

val strSQL=" SELECT *,@a:=@a+1 AS rn FROM PERSON,(SELECT(@a:=0)) AS A"
sqlContext.sql(strSQL)

My problem is that when I run this code, the console tells me that @a:=@a+1 is invalid. I don't know how to write this query in Spark SQL syntax.

jarlh
lee

1 Answer


If you want to generate a row number, one option is to zip the underlying RDD with a unique id and prepend it to each row:

val schema = df.schema // original DataFrame schema

// prepend a unique id to every row
val rows = df.rdd.zipWithUniqueId.map {
   case (r: Row, id: Long) => Row.fromSeq(id +: r.toSeq)}

Then create a DataFrame with the extended schema:

val dfWithPK = sqlContext.createDataFrame(
  rows, StructType(StructField("rowNumber", LongType, false) +: schema.fields))

You can also use the SQL row_number() window function to generate a row number:

sqlContext.sql("select row_number() over (order by columnName) as rowNumber from Person")
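As a rough illustration of what `row_number() over (order by columnName)` computes, here is the same operation on a local Scala collection (hypothetical data; real Spark evaluates the window distributedly):

```scala
val ages = Seq(42, 17, 23)
// sort by the ordering column, then number from 1 --
// this is the semantics of row_number() over (order by age)
val withRowNumber = ages.sorted.zipWithIndex.map { case (age, i) => (age, i + 1) }
// withRowNumber: Seq((17,1), (23,2), (42,3))
```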

Or you can use the monotonicallyIncreasingId function to generate unique row ids:

df.withColumn("RowNumber",monotonicallyIncreasingId)

Hope this helps!

koiralo
  • Point to note here, `monotonicallyIncreasingID` generates non-deterministic values in increasing order. Not exactly what the OP wants. – philantrovert May 22 '17 at 10:19