0

I'm dealing with two tables - users with about 20 million rows, and data with about 2 billion rows.

I need to select all users rows where active=1, and then join that with their corresponding data row where users.username=data.username AND data.date='2017-11-30'.

The catch is that many of these users won't have a data row where date='2017-11-30', but I still need their record to be returned, just without any info for that date.

What would be the most resource-efficient way to accomplish this? I got a start with this, but it doesn't look quite right:

SELECT users.username FROM users INNER JOIN data ON data.username = users.username WHERE users.active = 1 AND data.date = ‘2017-11-30’

DannyF247
  • 628
  • 4
  • 14
  • 35

1 Answers1

1

It sounds like you can just do a left join:

SELECT * 
FROM users LEFT OUTER JOIN data 
  ON data.username = users.username AND data.date = '2017-11-30'
WHERE
users.active = 1
EdmCoff
  • 3,506
  • 1
  • 9
  • 9