
I have two XML files. From the first, I want 'DisplayName':

<row Id="7" Reputation="1" CreationDate="2012-12-11T20:15:57.237" DisplayName="Les McCutcheon" LastAccessDate="2012-12-28T20:02:11.327"  AccountId="2136073" />

From the second, I want to count the rows where 'UserId' is identical to 'Id' from the first file:

<row Id="21" UserId="21" Name="Student" Date="2012-12-11T20:41:41.960" Class="3" TagBased="False" />

I want to display, e.g., "Les McCutcheon - 3", where 3 is the number of rows with his UserId in the second XML file.

My code, which shows two values from one XML file:

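// Keep only the lines that contain a <row .../> element.
Dataset<String> onlyRows = logData.filter((FilterFunction<String>) s -> s.contains("<row"));

// First split: take everything after the attribute name.
Dataset<Row> firstSplit = onlyRows.selectExpr("split(value, 'DisplayName=\"')[1] as User_Id", "split(value, ' Reputation=\"')[1] as Reputation");

// Second split: cut at the next attribute, then sort by reputation (needs a static import of org.apache.spark.sql.functions.desc).
Dataset<Row> repid = firstSplit.selectExpr("split(User_Id, '\" LastAccessDate=')[0] as User", "CAST(split(Reputation, '\" CreationDate')[0] as INT) as Reputation").orderBy(desc("Reputation"));

repid.show(10);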

Do I have to merge these files using XSLT, as in "Join two xml files based on common id value", or can I do it another way?
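For reference, the result described above is a count per UserId joined to the user rows on Id, which does not inherently require XSLT. Below is a minimal sketch of that logic in plain Java, without Spark, assuming each `<row .../>` sits on a single line as in the samples. The class name `UserRowCounts`, the helpers `attr` and `countPerUser`, and the sample rows in `main` are hypothetical and only for illustration. In Spark the same shape would presumably be a `groupBy("UserId").count()` on the second Dataset joined to the first on Id, but that is not shown or tested here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UserRowCounts {

    // Returns the value of attribute `name` in one <row .../> line, or null if absent.
    // The leading space keeps "Id" from matching inside "UserId" or "AccountId".
    static String attr(String line, String name) {
        Matcher m = Pattern.compile(" " + Pattern.quote(name) + "=\"([^\"]*)\"").matcher(line);
        return m.find() ? m.group(1) : null;
    }

    // Joins the user rows (first file) with the rows of the second file on
    // Id == UserId and returns lines of the form "DisplayName - count".
    static List<String> countPerUser(List<String> userLines, List<String> otherLines) {
        // Step 1: count rows per UserId in the second file.
        Map<String, Integer> counts = new HashMap<>();
        for (String line : otherLines) {
            String userId = attr(line, "UserId");
            if (userId != null) {
                counts.merge(userId, 1, Integer::sum);
            }
        }
        // Step 2: look up each user's Id in the counts (0 if the user has no rows).
        List<String> result = new ArrayList<>();
        for (String line : userLines) {
            String id = attr(line, "Id");
            String name = attr(line, "DisplayName");
            if (id != null && name != null) {
                result.add(name + " - " + counts.getOrDefault(id, 0));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows, shortened from the question's format.
        List<String> users = List.of(
            "<row Id=\"7\" Reputation=\"1\" DisplayName=\"Les McCutcheon\" AccountId=\"2136073\" />");
        List<String> other = List.of(
            "<row Id=\"21\" UserId=\"7\" Name=\"Student\" Class=\"3\" />",
            "<row Id=\"22\" UserId=\"7\" Name=\"Teacher\" Class=\"3\" />",
            "<row Id=\"23\" UserId=\"7\" Name=\"Editor\" Class=\"3\" />",
            "<row Id=\"24\" UserId=\"21\" Name=\"Student\" Class=\"3\" />");
        countPerUser(users, other).forEach(System.out::println); // prints "Les McCutcheon - 3"
    }
}
```

Counting first and looking up afterwards means each file is read only once. Users with no matching rows are reported with a count of 0 rather than dropped, which corresponds to a left join.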

moriarty007
jtrim