I am trying to query between 2 collections having a join condition between them, Say the 2 documents are:
Collection 1: Company (1M documents)
Company Documents:
<Company>
<CompanyName>XYZ</CompanyName>
<Description>ABC</Description>
.
.
</Company>
Collection 2: Employee (1M documents)
XML:
<Employee>
<EmpName>john</EmpName>
<CompanyName>XYZ</CompanyName>
.
.
</Employee>
The Query I am using is,
for $company in collection('Company'), $employee in collection('Employee')
where ( $company/Company/CompanyName eq $employee/Employee/CompanyName)
return document-uri($company)
when there are less documents in both collections, able to fetch the records, but when i do stress testing say, 1M records in both collections, in this case takes a lot of time and results in time limit exceeded exception. Is there a way to optimize this query?
I have tried the following methods, assuming there exists a range index for CompanyName,
for $company in collection('Company'), $employee in cts:values(cts:path-reference('/Employee/CompanyName'))
where ( $company/Company/CompanyName eq $employee)
return document-uri($company)
if companyName is unique then the above Query will not optimize much. I need a way which can work for both distinct and duplicate value of elements