0

There was a previous question on StackOverflow about this subject (can insert the link, I've got no privileges for the moment) entitled "MySQL date comparison filter", and this goes as an extension of that one.

I've got some authors on my WordPress blog and I would like to get their productivity through MySQL. The next query works pretty well under MySQL to get an author's post during certain time range only one day:

SELECT      SQL_CALC_FOUND_ROWS wp_posts.* 
FROM        wp_posts 
  JOIN      wp_postmeta 
  ON        (wp_posts.ID = wp_postmeta.post_id) 
WHERE       wp_posts.post_type = 'post' 
  AND       post_author = '50'
  AND       post_date
    BETWEEN STR_TO_DATE('2011-10-27 14:19:17','%Y-%m-%d %H:%i:%s')
      AND   STR_TO_DATE('2011-10-27 14:51:17','%Y-%m-%d %H:%i:%s')
GROUP BY    wp_posts.ID 
ORDER BY    wp_posts.post_date DESC 
LIMIT       0, 100

But it gives me just the posts of that day during that hour range. I'd like to get a table with with everyday data filled up for each day and each author. On each day and each author, there should be the number of posts published by that author on that day and on that hour range.

The output should be something like this:

October Auth1 Auth2  Auth3
1   0   0   0
2   0   0   0
3   0   1   0
4   0   2   0
5   1   0   0
6   0   2   0
7   0   0   0
8   3   0   0
9   0   0   0
10  5   1   0
11  1   0   0
...
31  2   1   1

So the date should be a variable, but I'd like to include all authors, so I'd remove the post_author AND line.

I'm no expert at MySQL but I wonder if this could be done more or less easily and export the query results (or, more exactly, some fields of the query results) as a table, like the one shown.

javipas
  • 281
  • 1
  • 6
  • 16

3 Answers3

1

I'm not going to rewrite your entire query, but here's how you'd do the data grouping:

SELECT ...
FROM ...
WHERE YEAR(post_date) = 2011 AND MONTH(post_date) = 10
GROUP BY DAY(post_date), HOUR(post_date)

Creating multiple columns for each author is not a good use of a query. That sort of transformation is better done in your wordpress code.

Note that this query will work on exact clock periods, 1am, 2am, 3am, etc... If you need arbitrary times (1:05am, 2:05am, 3:05am, etc...), this won't work and you'll need a more complicated grouping.

Marc B
  • 356,200
  • 43
  • 426
  • 500
  • I'm affraid that is not what I was looking for: this gives us a list of posts ordered by day and hour, but that's the same I get on Wordpress posts list, that can be filtered by author and month. – javipas Oct 27 '11 at 16:17
  • Like I said, I'm not going to rewrite your query. I once looked into the bowels of WP and have never been the same since, so I'm not going to look at its database again to figure out what the real query should be. Just showing how you do date/time grouping. – Marc B Oct 27 '11 at 16:20
  • Fair enough, but that grouping is not what I need. Thanks anyway! – javipas Oct 27 '11 at 16:43
1

I reckon you should create a date reference table, populate that table and then LEFT OUTER JOIN from that table in your query. The problem of 'How Do I Display Missing Dates?' is quite a common SO question but I'll go for it anyway.

Prelim Step

At the mysql prompt run:

use WordPress;

Step 1 - Create Date Reference Table

create table all_date 
(id int unsigned not null primary key auto_increment, 
a_date date not null,
last_modified timestamp not null default current_timestamp on update current_timestamp,
unique key `all_date_uidx1` (a_date));

Step 2 - Populate Date Reference Table

The idea of this table is to have one row for every date. Now you could achieve this by running insert statements ad nauseum but why not write a routine to populate it for you (you could event create a MySQL scheduled event to ensure that you always have a complete set of dates in the table. Here's a suggestion for that routine:

DELIMITER //


CREATE PROCEDURE populate_all_dates(IN from_date DATE, IN days_into_future INT)

BEGIN

 DECLARE v_date DATE;
 DECLARE ix int;


 SET ix := 0;
 SET v_date := from_date;


 WHILE v_date <= (from_date + interval days_into_future day) DO

  insert into all_date (a_date) values (v_date) 
  on duplicate key update last_modified = now();

  set ix := ix +1;

  set v_date := from_date + interval ix day;

 END WHILE;

END//

DELIMITER ;

You can now run:

call populate_all_dates('2011-10-01',30);

To populate all the dates for October (or just crank up the days_into_the_future parameter to whatever you want).

Now that you have a date reference table with all dates that you're interested in populated you can go ahead and run your query for October:

select day(a.a_date) as 'October',
IFNULL(t.a1,0) as 'Auth1',
IFNULL(t.a2,0) as 'Auth2',
IFNULL(t.a50,0) as 'Auth50'
from all_date a
LEFT OUTER JOIN
(
SELECT date(wp.post_date) as post_date,
sum(case when wp.post_author = '1' then 1 else 0 end) as a1,
sum(case when wp.post_author = '2' then 1 else 0 end) as a2,
sum(case when wp.post_author = '50' then 1 else 0 end) as a50,
count(*) as 'All Auths'
FROM wp_posts wp
WHERE  wp.post_type = 'post'
AND wp.post_date  between '2011-10-01' and '2011-10-31 23:59:59'
GROUP BY date(wp.post_date)
) t
ON a.a_date = t.post_date
where a.a_date between '2011-10-01' and '2011-10-31'
group by day(a.a_date);
Community
  • 1
  • 1
Tom Mac
  • 9,693
  • 3
  • 25
  • 35
  • Wow, that looks promising Tom Mac, thanks for the great effort. I'm affraid I'm no expert at MySQL so... How do I run this? Once I'm in mysql prompt, I just copy&paste those code snippets? Sorry :( – javipas Oct 28 '11 at 11:38
  • If you're at the prompt then yes, you should just be able to run them as code snippets. Before you run any of them though you'll need to choose a schema to run them into. I guess wp_posts and wp_postmeta are in the WordPress schema. So run 'use WordPress;' before running any of the above. I'll edit my answer to show you what I mean. – Tom Mac Oct 28 '11 at 12:56
  • Tom Mac, I've tested your code, and it seems to work... more or less. ---- I've changed 'use WordPress' by the name of my WordPress database (the one I'm using in the WordPress blog, of course), so after trying to use your code (the "WHERE p_posts.post_type = 'post'" should read "WHERE wp_posts.post_type = 'post'") I get some results, but not the expected. IN fact, the table shows me three colums with the authors but the post count is really high, it's impossible that one day Auth1 has published 94 articles and Auth50 has published 145. – javipas Oct 29 '11 at 11:18
  • I'm afraid this is a case of me not having the data in front of me! However, a little bit of investigation into the wp_postmeta table tells me that you do not need to join onto it at all. I suspect that this is why you were getting an inflated total of posts per author. I have removed the join onto this table (so only used wp_posts) and have amended my query. See how you go with the new one! – Tom Mac Oct 29 '11 at 13:53
  • Woo, thank you so much Tom Mac!! That's working ;) I've made a little change and I've added a "AND wp_post_status='publish'" to avoid drafts and trashed posts ;) The table shows almost all that I wanted, but would it be possible to divide each day in three hour ranges? The populate_all_dates function should this way create the table and, for each day, create three rows: 00:00-14:30, 14:31-18:15 and 18:16-23:59. That would allow me to analyce the post production on these ranges of the day. Is that possible? – javipas Oct 29 '11 at 22:22
  • You're welcome. It is possible. Though for the sake of clarity to other people who may read this can I suggest that you close this question off and ask your 'Three Hour Time Interval' in a separate post? – Tom Mac Oct 31 '11 at 07:29
  • Great, I'll close it and ask another question with that subject. Hope you can help there too, many thanks Tom Mac ;) – javipas Oct 31 '11 at 08:52
  • Done, the new question is at http://stackoverflow.com/questions/7951636/three-hour-time-intervals-in-mysql-query – javipas Oct 31 '11 at 09:09
0

The query below will group number of posts by author.

SELECT      DAY(post_date) as d, MONTH(post_date) as m, YEAR(post_date) as y, post_author, COUNT(id) as c
FROM        wp_posts 
  JOIN      wp_postmeta 
  ON        (wp_posts.ID = wp_postmeta.post_id) 
 WHERE       wp_posts.post_type = 'post' 
  AND       post_date
    BETWEEN STR_TO_DATE('2011-10-27 14:19:17','%Y-%m-%d %H:%i:%s')
      AND   STR_TO_DATE('2011-10-27 14:51:17','%Y-%m-%d %H:%i:%s')
GROUP BY    d, m, y, post_author
ORDER BY    y, m, d, post_author 
LIMIT       0, 100

This will output a table like:

d   m   y    post_author    c
------------------------------------
2   10  2011  50    23
2   10  2011  51    12
2   10  2011  52    6

meaning for line 1 for example: on 2 October 2011 author with id 50 has 23 posts. You would first fetch the authors in an array to form the header of your table. Then iterating this result with PHP you could generate a table like you want it accordingly placing 's and 's.

Nedret Recep
  • 726
  • 5
  • 8
  • Just to add a note: if in a certain day there are no posts by any author, that day will not be present in the result so you should cover this case with PHP too. – Nedret Recep Oct 27 '11 at 21:04
  • Nedret, thanks for the tip: I've just tested this query, but it is not giving the expected result. COuld the COUNT(id) parameter be wrong? I get strange values for the "c" column, as if the date range was not being taken care of. I think it shows all the posts published in October, not just the day and the hour range I try to specify. For example, on 27/10/2011 only author "50" published three posts from 14:30 to 18:30. When I run my query (the one on the question) the answer is OK, but if I run your query it shows me 113 posts published... Strange :( – javipas Oct 28 '11 at 11:55