displayed column name is different from real column name

Question

I tried the following:

 1. read csv (columns: A, B, C.A1.T1, C.A1.T2, C.A2.T1, C.A2.T2, ...)
 2. add columns: A1, B1
 3. save parquet  
 4. read parquet  
 5. drop columns: A, B  
 6. rename column: A1 to A  
 7. rename column: B1 to B  
 8. select: A, B, C.A1.T1, C.A1.T2, C.A2.T1, C.A2.T2, ... 
 9. the select in step 8 fails with the error below

Error

pyspark.sql.utils.AnalysisException: cannot resolve '`C.A1.T1`' given input columns: [A, B, C.A1.T1, C.A1.T2, C.A2.T1, C.A2.T2, ...]  
  'Project [A#1, B#221, 'C.A1.T1, 'C.A1.T2, 'C.A2.T1, 'C.A2.T2, ... 46 more fields]  
    +- Project [A#1, B#4, C.A1.T1#5, C.A1.T2#6, ... 47 more fields]  
      +- Project [A#1, B#4, C.A1.T1#5, C.A1.T2#6, ... 47 more fields]  
        +- Project ...  
          ...  
            +- Relation[A#0,B#1,C.A1.T1#5,C.A1.T2#6,... 50 more fields] parquet

Why is there a single quotation mark to the left of the column name?
How can I fix it?

** There is no single quotation mark anywhere: not in the csv, printSchema(), or str(df)
** df.select("'C.A1.T1").show() -> cannot resolve '`'C.A1.T1`'...

Try using: ```select A, B, `C.A1.T1`, `C.A1.T2`, `C.A2.T1`, `C.A2.T2` ....``` — 过过招, Nov 22 '21 at 04:45
I found the problem and the answer: [https://stackoverflow.com/questions/44367019/column-name-with-dot-spark](https://stackoverflow.com/questions/44367019/column-name-with-dot-spark) — Travis, Nov 22 '21 at 05:06
If I had read your answer first, it would have saved me time. Thank you 过过招 — Travis, Nov 22 '21 at 05:14
Does this answer your question? [Column name with dot spark](https://stackoverflow.com/questions/44367019/column-name-with-dot-spark) — vladsiv, Nov 22 '21 at 08:02
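
Following the backtick suggestion in the comments, here is a minimal sketch of how the select in step 8 could be written. As far as I understand it, a leading `'` in Spark's plan output marks an attribute that has not been resolved yet, so it is not part of the column name. An unquoted dot is parsed as nested field access, which is why `C.A1.T1` cannot be resolved even though it appears in the input columns. `quote_column` and `select_literal` are hypothetical helper names, not PySpark APIs, and `df` is assumed to be the DataFrame read back from parquet.

```python
def quote_column(name: str) -> str:
    """Wrap a column name in backticks so dots are treated as literal characters."""
    # Assumption: a backtick inside a quoted identifier is escaped by doubling it.
    return "`" + name.replace("`", "``") + "`"


def select_literal(df, names):
    """Select columns by their exact names, dots included.

    `df` is assumed to be a pyspark.sql.DataFrame; only its select() method is used.
    """
    return df.select(*[quote_column(n) for n in names])
```

With the columns from the question this could be called as `select_literal(df, ["A", "B", "C.A1.T1", "C.A1.T2"])`. Quoting every name, including ones without dots, keeps the helper simple and should be harmless for plain names like `A`.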

0 Answers