
In Python I have a Spark DataFrame with nested columns. Given the path a.b.c, I want to check whether c has a nested column called d, i.e. whether a.b.c.d exists.

Simply checking df.columns['a']['b']['c']['d'] or df.columns['a.b.c.d'] doesn't work, so I turned to the df.schema attribute instead and walk down the path, e.g.:

y = df.schema['a'].dataType['b'].dataType['c'].dataType

and then I should be able to check whether d is in y.

What I do now is simply try y['d'], and if it fails, the column doesn't exist. But I don't think using try is the best way.

So I tried checking 'd' in y, but apparently this doesn't work, even though retrieving y['d'] works when d exists.

The type of y is StructType(List(StructField(d,StringType,true),...other columns))

So I don't really know how to properly check whether d is in y. Why can't I directly check 'd' in y when I can retrieve y['d']? Can anyone help? I'm also new to Python, and I can't find or think of another solution.

  • I think using `in` doesn't work because `schema` is a `StructType`, which according to the documentation contains a list of `StructField`. So you are checking whether the string 'd' is in a list of `StructField`. – LiMuBei Nov 15 '16 at 09:51
  • Possible duplicate of [how do I detect if a spark dataframe has a column](http://stackoverflow.com/questions/35904136/how-do-i-detect-if-a-spark-dataframe-has-a-column) –  Nov 15 '16 at 09:51
  • Yes, I thought of that, but I still don't understand why retrieving `y['d']` works then. So is there no simple way to check for this, except with `try`? The referenced post is not very helpful, because in Python there is no quick `Try` as an option function (as far as I know, and that is exactly what I am trying to avoid), and it has no solution for nested columns. – stackoverflowthebest Nov 15 '16 at 11:04
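
One way to avoid `try` is to compare against the field names rather than the fields themselves. As far as I can tell, `StructType` supports indexing by name (so `y['d']` works) but defines no containment check of its own, so `'d' in y` falls back to iterating over the `StructField` objects, none of which equals the string 'd'. The sketch below walks a dotted path using `fieldNames()`. The helper name `has_nested_column` is made up for illustration, and it assumes the path only passes through struct types (not arrays or maps) and that no field name itself contains a dot.

```python
def has_nested_column(schema, path):
    """Return True if the dotted `path` (e.g. "a.b.c.d") exists in `schema`.

    `schema` is expected to be a StructType such as df.schema.
    Hypothetical helper: duck-typed, so anything without fieldNames()
    is treated as a non-struct leaf that cannot have children.
    """
    current = schema
    for name in path.split("."):
        # Only struct types expose fieldNames(); a leaf type has no nested columns.
        if not hasattr(current, "fieldNames"):
            return False
        if name not in current.fieldNames():
            return False
        # Indexing by name returns the StructField; step into its data type.
        current = current[name].dataType
    return True
```

With the DataFrame from the question this would be called as `has_nested_column(df.schema, "a.b.c.d")`. For the single level already reached in `y`, the equivalent check is `'d' in y.fieldNames()`.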

1 Answer

df.schema.simpleString().find("column_name:") != -1  # find() returns -1 when there is no match

or

"column_name:" in df.schema.simpleString()
Abdul Mannan
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](https://stackoverflow.com/help/how-to-answer). – moken Jul 27 '23 at 12:27
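
A caveat on the substring test above: it matches any field whose name ends in `column_name`, at any nesting level (for example, `"d:"` is also found inside `"id:"`), so it cannot confirm a specific path such as a.b.c.d. A sketch of an exact alternative is to list every dotted path in the schema and test membership against that. The function name `flatten_field_paths` is hypothetical, and the sketch assumes nesting only through struct types, relying on the `fields`, `name` and `dataType` attributes of `StructType` and `StructField`.

```python
def flatten_field_paths(schema, prefix=""):
    """Yield the dotted path of every field in a struct schema, nested ones included.

    Hypothetical helper: `schema` is expected to be a StructType such as df.schema.
    """
    for field in schema.fields:
        path = prefix + field.name
        yield path
        # Nested structs carry a `fields` list of their own; recurse into them.
        if hasattr(field.dataType, "fields"):
            yield from flatten_field_paths(field.dataType, path + ".")
```

The check for the question's column would then read `"a.b.c.d" in set(flatten_field_paths(df.schema))`.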