Write a text file with Cedilla Delimiter (Ç) - non-keyboard printable character

Question

I have a program that writes a pipe-delimited file using PySpark. I want to write the file using Ç (cedilla) as the delimiter instead.

Sample code

from pyspark.sql import functions as F
from pyspark.sql.types import StringType
separator = '|'
concat_udf1 = F.udf(lambda cols: "".join([x+separator if x is not None else separator for x in cols]), StringType())

Current dataframe output

7|2020-03-31|xyz
7|2020-03-31|abc

New dataframe output

7Ç2020-03-31Çxyz
7Ç2020-03-31Çabc

If I change the separator to Ç (cedilla), I get the error below:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

Any help would be appreciated - TIA

What throws this error? It's a classic case of improper encoding usage (and that's what the error is telling you as well). — DaveIdito, Aug 03 '20 at 19:31
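To illustrate the encoding point in the comment: 0xc3 is the first byte of the UTF-8 encoding of Ç, so the message most likely comes from a UTF-8 byte string being decoded implicitly as ASCII (typical Python 2 behaviour when byte strings and unicode strings are mixed). The sketch below is plain Python with no Spark involved. The names SEPARATOR, decode_as_ascii and join_columns are hypothetical, and treating None as an empty field is an assumption, not something the question specifies.

```python
# -*- coding: utf-8 -*-
# Hypothetical names; a sketch of the encoding issue, not the asker's code.
SEPARATOR = u"\u00c7"  # LATIN CAPITAL LETTER C WITH CEDILLA, as a unicode string

# The UTF-8 encoding of the separator is two bytes; the first one is 0xc3.
utf8_bytes = SEPARATOR.encode("utf-8")


def decode_as_ascii(raw):
    """Return the error message from an ASCII decode, or None if it succeeds."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


def join_columns(cols, separator=SEPARATOR):
    """Join unicode values with the separator; None becomes an empty field (assumption)."""
    return separator.join(u"" if x is None else x for x in cols)


if __name__ == "__main__":
    print(repr(utf8_bytes))
    print(decode_as_ascii(utf8_bytes))
    print(join_columns([u"7", u"2020-03-31", u"xyz"]))
```

Keeping the separator as a unicode string (u"\u00c7") and only encoding to UTF-8 when the file is written avoids the implicit ASCII decode.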

score 0 · Answer 1 · edited Jul 26 '23 at 12:41

This command on the terminal will work as intended:

< cedilla-dataframe-txt tr '\u00c7' '|'

Or, instead of '\u00c7', you can paste the cedilla character itself.
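The same character-for-character substitution can also be sketched in Python as a post-processing step on an already written file. The function name replace_delimiter and its parameters are hypothetical, and the files are assumed to be UTF-8. The defaults go from pipe to cedilla, which is the direction the question asks for; swap old and new for the reverse.

```python
import io


def replace_delimiter(src_path, dst_path, old=u"|", new=u"\u00c7"):
    """Copy src_path to dst_path, replacing every `old` delimiter with `new`.

    Both files are read and written as UTF-8 (assumption), so the cedilla
    is never passed through an ASCII codec.
    """
    with io.open(src_path, "r", encoding="utf-8") as src, \
            io.open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line.replace(old, new))
```

Note that this replaces every occurrence of the old character, including any that appear inside field values, so it is only safe when the delimiter never occurs in the data.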

greybeard

answered Jul 20 '23 at 17:29
