1

basic table is similar to:

mysqlTable:
          visitorID   ,park          ,DateTimeStamp
          8369        ,Birmingham    ,12/27/2018 03:26:38 PM
          8369        ,Birmingham    ,12/28/2018 11:27:32 AM
          8828        ,Central       ,01/02/2019 10:01
          8828        ,Central       ,01/04/2019 9:50
          8825        ,Central       ,12/21/2018 09:47:27 AM
          8821        ,Central       ,12/26/2018 10:11:40 AM
          8821        ,Central       ,02/03/2019 10:00:59 AM
          8821        ,Central       ,01/02/2019 10:04
          88281       ,Central       ,01/04/2019 9:53

From this table i'm creating a new one where i'm counting visitor's ID in a specific park, then grouping the visits by the visitorID and the date of the visit in a specific period of time.

mysql query: 

SELECT COUNT(*)AS visits,dateTimeStamp,visitorID 
FROM parkVisits 
WHERE 
    dateTimeStamp BETWEEN '2019-01-01 00:00:01' AND '2019-01-04 23:59:59'
    AND park ='Central'
GROUP BY visitorID, CAST(dateTimeStamp AS DATE);

my result:

mysql table:

visits   ,dateTimeStamp     ,visitorID   
2        ,01/02/2019 10:01  ,8828      
1        ,01/02/2019 10:04  ,8821       
1        ,01/04/2019 9:53   ,88281      

I would like to have a column where a Boolean indicates if the visitor has visited anytime before the specified date of it's row. I was thinking about comparing the earliest dateTimeStamp on table to the earliest given date, but a situation can occur where the first visit and the returning visit occurred during the given period.

expected:

mysql table:

visits   ,dateTimeStamp   ,visitorID    ,returningVisitor
2        ,01/02/2019 10:01,  8828       ,TRUE
1        ,01/02/2019 10:04,  8821       ,FALSE
1        ,01/04/2019 9:53,   88281      ,FALSE

EDIT:

i'm using MySQL 5.6.40

GMB
  • 216,147
  • 25
  • 84
  • 135
Anatsu
  • 364
  • 1
  • 2
  • 13
  • 1
    Since you're grouping by ``visitorID``, that means ``dateTimeStamp`` be basically an arbitrary value from all of that user's rows. MySQL may happen to return the first/earliest (see https://stackoverflow.com/a/1646121/378779) but that is not guaranteed. – kmoser Jan 14 '19 at 20:26
  • You would need to better describe how to choose the `dateTimestamp` when there are several visits in the period. How about the ealier visit in the period ? Or tmaybe the last one ? – GMB Jan 14 '19 at 20:36
  • First one is better. EDIT: on second taught it doesn’t matter, the yy:mm:dd is what’s important. – Anatsu Jan 14 '19 at 20:39

1 Answers1

1

Assuming that you are running MySQL 8.0, you could use an inner query with window function FIRST_VALUE to pick up the timestamp of the first visit in the analyzis period. Then in the outer query, an EXISTS clause with a subquery can be used to check if the current visitor has ever visited the same park before.

SELECT 
    x.visits,
    x.dateTimeStamp,
    x.visitorID,
    EXISTS (
        SELECT 1 FROM parkVisits WHERE park = x.park AND visitorID = x.visitorID AND dateTimeStamp < x.dateTimeStamp
    ) returningVisitor
FROM (
    SELECT DISTINCT
        COUNT(*) OVER (PARTITION BY p.visitorID) visits,
        FIRST_VALUE(p.dateTimeStamp) OVER (PARTITION BY p.visitorID ORDER BY p.dateTimeStamp) dateTimeStamp,
        p.visitorID,
        p.park
    FROM parkVisits p
    WHERE
        p.dateTimeStamp BETWEEN '2019-01-01 00:00:01' AND '2019-01-04 23:59:59' 
        AND p.park ='Central'
) x
ORDER BY 1 desc, 2

In this db fiddle with your sample data, it returns :

| visits | dateTimeStamp       | visitorID | returningVisitor |
| ------ | ------------------- | --------- | ---------------- |
| 2      | 2019-01-02 10:01:00 | 8828      | 0                |
| 1      | 2019-01-02 10:04:00 | 8821      | 1                |
| 1      | 2019-01-04 09:53:00 | 88281     | 0                |

NB : I think the returning visitor is 8821, and not 8828 as shown in your question.

If you are running a lower version of MySQL, that does not support window functions, you can use a GROUP BY clause in the subquery, like :

SELECT 
    x.visits,
    x.dateTimeStamp,
    x.visitorID,
    EXISTS (
        SELECT 1 FROM parkVisits WHERE park = x.park AND visitorID = x.visitorID AND dateTimeStamp < x.dateTimeStamp
    ) returningVisitor
FROM (
    SELECT DISTINCT
        COUNT(*) visits,
        MIN(p.dateTimeStamp) dateTimeStamp,
        p.visitorID,
        p.park
    FROM parkVisits p
    WHERE
        p.dateTimeStamp BETWEEN '2019-01-01 00:00:01' AND '2019-01-04 23:59:59' 
        AND p.park ='Central'
    GROUP BY p.visitorID, p.park
) x
ORDER BY 1 desc, 2   

See this db fiddle.

GMB
  • 216,147
  • 25
  • 84
  • 135