I need some help checking the data type of a column in Spark.
I need to convert this PySpark functionality to Spark (Scala):
if dict(df.dtypes)['test_col'] == 'string':
    print("it is String type")
To check the data type of a column, use the schema
function.
Check the code below.
df
  .schema
  .filter(c => c.name == "test_col") // check your column
  .map(_.dataType.typeName)
  .headOption
  .getOrElse("")

val dtype = df
  .schema
  .filter(a => a.name == "c1")
  .map(_.dataType.typeName)
  .headOption
  .getOrElse("") // empty string when the column is missing

if (dtype == "string") println("it is String type")
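If you already know the column exists, a shorter variant (just a sketch, assuming the same df and column name as above) indexes the schema by name and compares the DataType directly:

import org.apache.spark.sql.types.StringType

// schema(name) returns the StructField; it throws if the column is missing
if (df.schema("test_col").dataType == StringType) println("it is String type")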
Alternatively, use the dtypes
function.
val dtype = df
  .dtypes
  .filter(_._1 == "c1")
  .map(_._2)
  .headOption
  .getOrElse("") // empty string when the column is missing

if (dtype == "StringType") println("it is String type")
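As a quick sanity check, here is a hypothetical DataFrame built inline just for illustration (assuming the usual spark-shell SparkSession named spark) that you can run either snippet against:

import spark.implicits._

// Hypothetical sample data: c1 is a string column, c2 an integer column
val df = Seq(("a", 1), ("b", 2)).toDF("c1", "c2")

df.dtypes.foreach(println) // prints (c1,StringType) and (c2,IntegerType)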
I have modified the answer in this post to include the data type in the schema.
Just pass the schema of your dataframe to the function flatten
and it will give you each column name and its data type.
from pyspark.sql.types import ArrayType, StructType

def flatten(schema, prefix=None):
    fields = []
    for field in schema.fields:
        name = prefix + '.' + field.name if prefix else field.name
        dtype = field.dataType
        if isinstance(dtype, ArrayType):
            dtype = dtype.elementType  # descend into the array's element type
        if isinstance(dtype, StructType):
            fields += flatten(dtype, prefix=name)  # recurse into nested structs
        else:
            fields.append((name, dtype))
    return fields
mySchema = flatten(df.schema)
print("Schema is here", mySchema)