Questions tagged [non-equi-join]

A non-equi join is a join that uses a binary relational operator other than equality, such as <, ≤, ≠, >, or ≥. It is also known as a non-equi theta-join. A theta-join combines tuples from different relations provided they satisfy the theta condition. The general form of a theta-join can use any comparison operator (<, ≤, =, ≠, >, ≥). Both the equi-join (a join using the = operator) and the non-equi join are special cases of the general theta-join.
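As a rough illustration of the definition above, here is a minimal Python sketch of a theta-join written as a filtered Cartesian product. The helper `theta_join` and the sample relations (`employees`, `grades`, `bonuses`) are hypothetical names made up for this example. Real engines and libraries such as data.table use far more efficient algorithms than this nested loop.

```python
def theta_join(left, right, theta):
    """Return every (left_row, right_row) pair for which theta(left_row, right_row) is true."""
    return [(lrow, rrow) for lrow in left for rrow in right if theta(lrow, rrow)]


# Hypothetical sample relations, stored as plain tuples.
employees = [("ann", 1200), ("bob", 3400), ("cy", 9000)]   # (name, salary)
grades = [("low", 0, 2000), ("mid", 2000, 5000)]           # (grade, lower, upper)
bonuses = [("ann", 100), ("cy", 300)]                      # (name, bonus)

# Non-equi join: the theta condition uses <= and <, not =.
non_equi = theta_join(employees, grades, lambda e, g: g[1] <= e[1] < g[2])

# Equi-join: the same helper with an equality condition on the name.
equi = theta_join(employees, bonuses, lambda e, b: e[0] == b[0])

print(non_equi)
print(equi)
```

Both calls go through the same function and differ only in the predicate, which is the sense in which equi-joins and non-equi joins are special cases of the theta-join. Note that "cy" matches no salary grade, so that row does not appear in `non_equi`; an outer join would be needed to keep it.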

More info about non-equijoin on w3resource.com

46 questions
5
votes
0 answers

non-equi join does not preserve original column values

I've found an odd behavior when running a non-equi join (from R's data.table library) and I can't figure out why this is happening. Why is it that, when running a non-equi join, if I want to preserve the original value of the left table, I need to…
Felipe D.
5
votes
1 answer

R: unequi join with merge function

I am working with data.table and I want to do a non-equi left join/merge. I have one table with car prices and another table to identify which car class each car belongs to: data_priceclass <- data.table() data_priceclass$price_from <- c(0, 0,…
Helen
3
votes
3 answers

data.table non-equi join with min/max of joined value

I'm trying to do a non-equi join in data.table and extract the min/max of the joined values in that join. set.seed(42) dtA <- data.table(id=rep(c("A","B"),each=3), start=rep(1:3, times=2), end=rep(2:4, times=2)) dtB <-…
r2evans
3
votes
2 answers

Non-equi join of two tables

I have 2 dataframes where I need to find how many times the entries in mock$num fall within the range of x-y specified by the range dataframe. id <- c(1:9) num <- c(99,101,199,250,999,1500,3000,4000,5000) mock <- data.frame(id, num) x <-…
Robin B
3
votes
1 answer

Efficient indexing / joining in data.table across multiple dependent conditions for stop detection algorithm

Edit: Real data set available here With thanks to Wang, Rui, Fanglin Chen, Zhenyu Chen, Tianxing Li, Gabriella Harari, Stefanie Tignor, Xia Zhou, Dror Ben-Zeev, and Andrew T. Campbell. "StudentLife: Assessing Mental Health, Academic Performance and…
Danielle McCool
3
votes
4 answers

Merge 2 dataframes using conditions on "hour" and "min" of df1 in datetimes of df2

I have a dataframe df.sample like this id <- c("A","A","A","A","A","A","A","A","A","A","A") date <- c("2018-11-12","2018-11-12","2018-11-12","2018-11-12","2018-11-12", "2018-11-12","2018-11-12","2018-11-14","2018-11-14","2018-11-14", …
Sharath
2
votes
2 answers

Values changed when joining in data.table

I have the following two dataframes (dput below): > df1 group date 1 A 2023-01-10 2 A 2023-01-15 3 B 2023-01-09 4 B 2023-01-12 > df2 group date1 date2 value 1 A 2023-01-09 2023-01-11 2 2 B 2023-01-11…
Quinten
2
votes
1 answer

non equi join returns non existing column names

I am unable to do a basic non equi join in two data.tables in R without the error: argument specifying columns specify non existing column(s): cols[2]='abs(x.val - i.val)' A min. example to show the error. library(data.table) set.seed(1); dt1 <-…
Lazarus Thurston
2
votes
1 answer

Error while compiling statement: FAILED: SemanticException line 0:undefined:-1

Does anyone know what the issue is here? I am running this in Hive: select * from a left join b on a.id=b.id and a.date between b.start_dte and b.end_dte Error while compiling statement: FAILED: SemanticException…
tom
2
votes
2 answers

Join by overlapping periods while operating for some of the values

I'm trying to join one database of periods like this one: id = c(rep(1,3), rep(2,3), rep(3,3)) start = as.Date(c("2014-07-01", "2015-03-12", "2016-08-13", "2014-07-01", "2015-03-12", "2016-08-13", "2014-07-01", "2015-03-12", "2016-08-13")) end =…
2
votes
1 answer

OR not supported currently in JOIN error in Hive

I am running a query in Hive, like the one below, that has an OR condition in the left join. When I run the select, it throws a couple of error messages. OR not supported currently in JOIN (I learned that OR works only for equi joins in Hive) Both…
jahan
2
votes
1 answer

Non-equi join of dates using data table

I have a data table of edits: library(data.table) edits <- data.table(proposal=c('A','A','A'), editField=c('probability','probability','probability'), startDate=as.POSIXct(c('2017-04-14 00:00:00','2019-09-06…
Slash
2
votes
1 answer

non-equi-joins in R with data.table - backticked column name trouble

I can't manage to do a non-equi-join with data.table when (backticked) column names include a space. I collect such names from our database at work, and our explicit policy is for everyone to use those same names to avoid confusion. I could of…
Kjetil
1
vote
2 answers

Is there a faster way to perform a non-equi join and find the max of the joined values in R?

I'm trying to speed up some R code. Due to the large volume of data (tens of millions of rows), it takes some time to process. Essentially, I have a small data.table called parameters with tax rates and thresholds and a large data.table called…
Marco
1
vote
1 answer

R data.table - Apply a function, which uses a subset of all other rows, to each row

I have a dataset with data about subscriptions. For each subscription there is the subscriber, the type of subscription, the subscription issue date, the subscription start date, and the subscription end date. One subscriber might have multiple…