4

I have a cron job to purge data older than a certain # of months (set by user). The cron job calls a purgeData() method that purges data from a postgres table. I manipulate java Calendar from the current date (via GregorianCalendar.getInstance) to determine the target date in which to delete data prior to.

My problem is that calendar manipulation and/or converting the newly manipulated calendar to a string for use in postgres fails occassionally, setting the target date to the current date which deletes everything prior to the current date, rather than data older than 1 (or #months to keep data) month old.

Here is my simple date format:

public static final SimpleDateFormat dateFormatter = new SimpleDateFormat(
            "yyyy-MM-dd HH:mm:ss.SSS");

Here is my method:

public String purgeData() throws ParseException {
    Connection con = null;
    String sqlString = "";
    PreparedStatement pst = null;
    String returnString = "";

    Calendar startDate = GregorianCalendar.getInstance();
    returnString += "# Months to keep data: " + getNumMonthsKeepData();
    startDate.add(Calendar.MONTH, getNumMonthsKeepData() * -1);
    String targetDate = dateFormatter.format(startDate.getTime());
    Calendar today = GregorianCalendar.getInstance();

    returnString +=" Target date (string): " + targetDate + " start date (Calendar): " + startDate.toString() + ", Start month: " + startDate.get(Calendar.MONTH) + ", Current month: " + today.get(Calendar.MONTH);

    if (startDate.get(Calendar.MONTH)!= today.get(Calendar.MONTH)) {

        String tableName = getPreviousMonthlyTable();
        try {
            con = getDBConnection();

            try {
                // Delete old data
                sqlString = "DELETE FROM \"" + tableName
                        + "\" WHERE  datetime < '" + targetDate + "'";

                pst = con.prepareStatement(sqlString);
                int rowsDeleted = pst.executeUpdate();
                returnString += "SUCCESS: Purged data prior to " + targetDate
                        + " # rows deleted: " + rowsDeleted
                        + "( # rows deleted last purge: "
                        + numRowsDeletedPreviously + " )\n";

            } catch (SQLException ex) {
                returnString += "FAILED to execute: " + sqlString;
            }

            try {
                if (pst != null) {
                    pst.close();
                }
                if (con != null) {
                    con.close();
                }

            } catch (SQLException ex) {
                return null;
            }
        } catch (SQLException ex) {
            returnString += "Delete from table fail: " + ex.getMessage();
        }
    } else {
        returnString += "FAIL:  Fail to delete data prior to: " + targetDate + ". Start month: " + startDate.get(Calendar.MONTH)
                + " equals current month: " + today.get(Calendar.MONTH);
    }
    return returnString;
}

The date failure seems random, as when it is successful on one deployment, it fails on another.

Fail output:

20150421-00:33:11.006 Postgres Notification - Purge: # Months to keep data: 1 Target date (string): 2015-04-21 00:00:00.001, Start month: 2, Current month: 3 SUCCESS: Purged data prior to 2015-04-21 00:00:00.001 # rows deleted: 7575704( # rows deleted last purge: 26608 )

NOTE: Target date should be 2015-03-21 00:00:30.000 (note that it is also 30 min off as the cron job runs every 4 hours starting at 00:30)

Older failure output (prior to addition of more logging): 20150414-20:37:53.347 Postgres Notification - Purge: SUCCESS: Purged data prior to 2015-04-14 19:00:00.004 # rows deleted: 12195291( # rows deleted last purge: 128570 )

NOTE: Purge data prior to should be 2015-03-14 20:30:00.000 (note that it is also 1 hour 30 min off as the cron job runs every 4 hours starting at 00:30)

Success output: 20150421-00:30:02.559 Postgres Notification - Purge: # Months to keep data: 1 Target date (string): 2015-03-21 00:30:00.003, Start month: 2, Current month: 3 SUCCESS: Purged data prior to 2015-03-21 00:30:00.003 # rows deleted: 139757( # rows deleted last purge: 33344 )

It appears that the date manipulation actually works, as shown in the output of start month and current month. In both failure cases, the integer values are different. However, the string conversion to SimpleDateFormat seems to be wrong.

I've read the javadocs in that when setting fields on Calendar, one must call get() for the time to be recalculated. However, add() is supposed to force recomputation.

Nathan Hughes
  • 94,330
  • 19
  • 181
  • 276
Renee
  • 41
  • 1

1 Answers1

5

Your dateformatter is not threadsafe, storing it in a class variable and letting multiple threads hammer on it concurrently is going to give you invalid results.

This is documented in the API documentation for SimpleDateFormat under the heading Synchronization:

Date formats are not synchronized. It is recommended to create separate format instances for each thread. If multiple threads access a format concurrently, it must be synchronized externally.

One fix is to have your method create its own SimpleDateFormatter instance. There are more options, listed in a related question here: Making DateFormat Threadsafe. What to use, synchronized or Thread local.

However, formatting a date is not necessary, instead you can pass the date to the PreparedStatement as an argument:

sqlString = "DELETE FROM \"" + tableName + "\" WHERE datetime < ?";
pst = con.prepareStatement(sqlString);
pst.setTimestamp(1, new Timestamp(startDate.getTime()));
Community
  • 1
  • 1
Nathan Hughes
  • 94,330
  • 19
  • 181
  • 276
  • +1. Another solution is to create thread-local formatters as described in this [related question](http://stackoverflow.com/questions/18589986/date-conversion-with-threadlocal). – DNA Apr 21 '15 at 19:56
  • @DNA: yes, that's correct. I was going to link to a question like that, but there are multiples and I was puzzling over which to choose. – Nathan Hughes Apr 21 '15 at 19:57
  • Thank you for your response! I will create a SimpleDateFormat instance within my method. – Renee Apr 21 '15 at 20:09
  • You could get rid of the SimpleDateFormat altogether by changing the SQL statement to use a bind variable and set the date to a java.sql.Timestamp value created from your calendar. – user3745362 Apr 21 '15 at 21:06
  • @user3745362: agreed, the formatting does seem unnecessary. – Nathan Hughes Apr 22 '15 at 01:53
  • Thanks, @NathanHughes and user3745362 creating a sql timestamp variable directly from the my manipulated calendar. That's an even better solution. Appreciate the feedback. – Renee Apr 22 '15 at 17:41