I have the schema below:
root
|-- DataPartition: long (nullable = true)
|-- TimeStamp: string (nullable = true)
|-- _action: string (nullable = true)
|-- env:Data: struct (nullable = true)
| |-- _type: string (nullable = true)
| |-- al:FundamentalAnalytic: struct (nullable = true)
| | |-- _analyticItemInstanceKey: long (nullable = true)
| | |-- _financialPeriodEndDate: string (nullable = true)
| | |-- _financialPeriodType: string (nullable = true)
| | |-- _isYearToDate: boolean (nullable = true)
| | |-- _lineItemId: long (nullable = true)
| | |-- al:AnalyticConceptCode: string (nullable = true)
| | |-- al:AnalyticConceptId: long (nullable = true)
| | |-- al:AnalyticIsEstimated: boolean (nullable = true)
| | |-- al:AnalyticValue: struct (nullable = true)
| | | |-- _VALUE: double (nullable = true)
| | | |-- _currencyId: long (nullable = true)
| | |-- al:AuditID: string (nullable = true)
| | |-- al:FinancialPeriodTypeId: long (nullable = true)
| | |-- al:FundamentalSeriesId: struct (nullable = true)
| | | |-- _VALUE: long (nullable = true)
| | | |-- _objectType: string (nullable = true)
| | | |-- _objectTypeId: long (nullable = true)
| | |-- al:InstrumentId: long (nullable = true)
| | |-- al:IsAnnual: boolean (nullable = true)
| | |-- al:TaxonomyId: long (nullable = true)
This schema comes from XML files whose structure varies frequently. I want to process only the tags that contain env:Data.sr:Source.*. For that I have written the code below:
val dfType = dfContentItem
  .select(
    getDataPartition($"DataPartition").as("DataPartition"),
    $"TimeStamp".as("TimeStamp"),
    $"env:Data.sr:Source.*",
    getFFActionParent($"_action").as("FFAction|!|")
  )
  .filter($"env:Data.sr:Source._organizationId".isNotNull)
dfType.show(false)
But this works only when sr:Source is present in the schema; otherwise I get the exception below:
Exception in thread "main" org.apache.spark.sql.AnalysisException: No such struct field sr:Source in _type, cr:TRFCoraxData,fun:Fundamental, md:Identifier, md:Relationship;
To avoid that, I added a null check for sr:Source, but it does not work: the check itself fails with the same error.
Basically, what I need is this: if env:Data.sr:Source.* is missing (null), I want to skip processing for this tag and move on to processing the next tag.
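For reference, here is a minimal sketch of the kind of guard I have in mind. As far as I understand, Spark resolves column references while analysing the query, before any row is read, which would explain why a row-level isNotNull check on a missing field throws the same AnalysisException. The sketch therefore inspects the DataFrame schema instead. SchemaCheck and hasNestedField are hypothetical names of my own, not Spark APIs, and the sketch assumes every intermediate path segment is a plain struct (not an array of structs).

```scala
import org.apache.spark.sql.types.{StructField, StructType}

object SchemaCheck {
  // Hypothetical helper: returns true when every segment of `path`
  // resolves to a field, descending through nested structs.
  // Segments are passed separately, so names containing ':' need no escaping.
  def hasNestedField(schema: StructType, path: Seq[String]): Boolean = path match {
    case Seq() => true
    case head +: tail =>
      schema.fields.find(_.name == head) match {
        // Struct field: keep descending with the remaining segments.
        case Some(StructField(_, nested: StructType, _, _)) => hasNestedField(nested, tail)
        // Non-struct field: it only matches if it is the last segment.
        case Some(_) => tail.isEmpty
        case None => false
      }
  }
}

// Intended usage (dfContentItem is the DataFrame from the code above):
//
// if (SchemaCheck.hasNestedField(dfContentItem.schema, Seq("env:Data", "sr:Source"))) {
//   // build dfType with the select/filter shown above
// } else {
//   // skip this tag and continue with the next one
// }
```

The check only reads schema metadata, so it should not trigger a Spark job by itself.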