I am new to PySpark. Can you please help me get the max age from JSON using PySpark?
I tried df.filter(df['employees.age'] > 22).show()
It throws this error:
org.apache.spark.sql.AnalysisException: cannot resolve '(employees.age > 22)' due to data type mismatch: differing types in '(employees.age > 22)' (array and int).;;
'Filter (employees#0.age > 22)
This is the JSON data:
{'employees': [{'age': '12', 'firstName': 'John', 'lastName': 'Doe'},
{'age': '14', 'firstName': 'Anna', 'lastName': 'Smith'},
{'age': '54', 'firstName': 'Peter1', 'lastName': 'Jones1'},
{'age': '44', 'firstName': 'Peter2', 'lastName': 'Jones2'},
{'age': '42', 'firstName': 'Peter3', 'lastName': 'Jones3'},
{'age': '62', 'firstName': 'Peter4', 'lastName': 'Jones4'},
{'age': '65', 'firstName': 'Peter5', 'lastName': 'Jones5'},
{'age': '23', 'firstName': 'Peter6', 'lastName': 'Jones6'},
{'age': '77', 'firstName': 'Pete7', 'lastName': 'Jones7'},
{'age': '82', 'firstName': 'Peter8', 'lastName': 'Jones8'},
{'age': '92', 'firstName': 'Peter9', 'lastName': 'Jones9'},
{'age': '78', 'firstName': 'Peter10', 'lastName': 'Jones10'}]}
I want to find the employees whose age is greater than 22.