
I have a webpage with a label like this: "Table last updated on Thu Jul 27 10:57:10 CEST 2017 from OWNER"

I have to check if this date is later than 0h today. I'm getting the html code with:

Document doc = Jsoup.parse(driver.getPageSource());
String htmlcode = doc.body().text();

I thought about taking a substring of the text to get the date, but since the label value can vary in length, fixed offsets do not reliably capture the whole date. Any ideas on how to extract the date from the text so I can compare it?

Alexander Rumanovsk

2 Answers


tl;dr

ZonedDateTime.parse(                              // Parse string into a date + time-of-day + time zone.
    … ,                                           // Your input string.
    DateTimeFormatter.ofPattern( "EEE MMM d HH:mm:ss zzz uuuu" , Locale.US )  // Specify `Locale` to determine human language and cultural norms in parsing and translating the text.    
)
.toLocalDate()                                    // Extract the date-only portion of the `ZonedDateTime` object.
.isEqual( 
    LocalDate.now( ZoneId.of( "Africa/Tunis" ) )  // Get current date as seen by people of a certain region (time zone).
)

java.time

The Answer by aUserHimself is correct in suggesting the jsoup library. But its example code is ill-advised in other ways:

  • It uses the troublesome legacy date-time classes, which are now supplanted by the java.time classes.
  • It assumes the day starts at 00:00:00. That is not true for all dates in all time zones. Anomalies such as Daylight Saving Time (DST) mean the day may start at a time such as 01:00:00.
  • It ignores the issue of Locale, which determines the human language used in parsing the name of the month, the name of the day-of-week, and so on. The Locale also determines the expected punctuation and other cultural norms.
  • It ignores the crucial issue of time zone in determining the current date.
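Two of these points can be checked directly. The following is only a minimal sketch: the class name `DayBoundaries`, its helper methods, and the sample dates and zones are my own choices for illustration, not part of the question. It asks java.time for the first moment of a date in a zone where, as far as I know, clocks jumped forward at midnight, and then shows one moment falling on two different calendar dates.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class DayBoundaries {

    // First moment of the given date in the given zone, as determined by java.time.
    static ZonedDateTime startOfDay(LocalDate date, ZoneId zone) {
        return date.atStartOfDay(zone);
    }

    // Calendar date of a single moment as seen in the given zone.
    static LocalDate dateIn(Instant moment, ZoneId zone) {
        return moment.atZone(zone).toLocalDate();
    }

    public static void main(String[] args) {
        // Assumed sample: DST began in São Paulo at midnight on 2017-10-15,
        // so the hour from 00:00 to 01:00 did not exist on that date.
        System.out.println(startOfDay(LocalDate.of(2017, 10, 15), ZoneId.of("America/Sao_Paulo")));

        // Assumed sample moment: the same instant viewed from two zones.
        Instant moment = Instant.parse("2017-07-27T20:00:00Z");
        System.out.println(dateIn(moment, ZoneId.of("Asia/Kolkata")));
        System.out.println(dateIn(moment, ZoneId.of("America/Montreal")));
    }
}
```

With the time zone data I am aware of, the first line reports a start of day at 01:00 rather than 00:00, and the last two lines print different dates for the same moment.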

Example code.

String input = … ;
Locale locale = Locale.US ;
DateTimeFormatter f = DateTimeFormatter.ofPattern( "EEE MMM d HH:mm:ss zzz uuuu" , locale ) ;
ZonedDateTime zdt = ZonedDateTime.parse( input , f ) ;
LocalDate ld = zdt.toLocalDate() ; 

Compare to today's date. Must specify the expected/desired time zone. For any given moment, the date varies around the world by zone. A new day dawns earlier in India than in Canada, for example.

ZoneId z = ZoneId.of( "America/Montreal" ) ; 
LocalDate today = LocalDate.now( z ) ;

boolean isSameDate = ld.isEqual( today ) ;
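The Question asks whether the label is later than 0h today, and also how to pull the date text out of the page. Here is one possible sketch of both steps. It assumes the label always reads "... updated on <date> from <owner>"; the class name `LabelDateCheck`, the regular expression, the method names, and the `Europe/Paris` zone are hypothetical choices of mine. A `Clock` is passed in only so the comparison can be tested with a fixed "now".

```java
import java.time.Clock;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LabelDateCheck {

    // Assumption: the date sits between "updated on " and " from ".
    private static final Pattern LABEL = Pattern.compile("updated on (.+?) from ");

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("EEE MMM d HH:mm:ss zzz uuuu", Locale.US);

    // Returns only the date portion of the label, whatever its length.
    static String extractDateText(String pageText) {
        Matcher m = LABEL.matcher(pageText);
        if (!m.find()) {
            throw new IllegalArgumentException("No 'updated on ... from' label found");
        }
        return m.group(1);
    }

    // True if labelTime is at or after the first moment of today in the given zone.
    // Use isAfter(startOfToday) instead if the boundary moment itself should not count.
    static boolean isAtOrAfterStartOfToday(ZonedDateTime labelTime, ZoneId zone, Clock clock) {
        LocalDate today = LocalDate.now(clock.withZone(zone));
        ZonedDateTime startOfToday = today.atStartOfDay(zone); // not necessarily 00:00
        return !labelTime.isBefore(startOfToday);
    }

    public static void main(String[] args) {
        String pageText = "Table last updated on Thu Jul 27 10:57:10 CEST 2017 from OWNER";
        ZonedDateTime labelTime = ZonedDateTime.parse(extractDateText(pageText), FORMAT);
        ZoneId zone = ZoneId.of("Europe/Paris"); // assumed zone; use the one that applies to you
        System.out.println(isAtOrAfterStartOfToday(labelTime, zone, Clock.systemUTC()));
    }
}
```

In real use, `pageText` would be the string obtained from jsoup, as in the Question.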

About java.time

The java.time framework is built into Java 8 and later. These classes supplant the troublesome old legacy date-time classes such as java.util.Date, Calendar, & SimpleDateFormat.

The Joda-Time project, now in maintenance mode, advises migration to the java.time classes.

To learn more, see the Oracle Tutorial. And search Stack Overflow for many examples and explanations. Specification is JSR 310.

With a JDBC driver complying with JDBC 4.2 or later, you may exchange java.time objects directly with your database. No need for strings or java.sql.* classes.

Where to obtain the java.time classes?

The ThreeTen-Extra project extends java.time with additional classes. This project is a proving ground for possible future additions to java.time. You may find some useful classes here such as Interval, YearWeek, YearQuarter, and more.

Basil Bourque

Try something like this (prior to Java 8):

    // get the label content as text (assuming you only have 1 label)
    Document doc = Jsoup.parse(driver.getPageSource());
    Element label = doc.select("label").first();
    String labelText = label.text();

    // get the relevant part (the date) from label content (between " on " and " from ");
    // the surrounding spaces keep words such as "Mon" from being split
    String dateString = labelText.split(" on ", 2)[1].split(" from ")[0].trim();

    // parse date
    SimpleDateFormat simpleDateFormat = new SimpleDateFormat("EEE MMM d HH:mm:ss zzz yyyy", Locale.ENGLISH);
    java.util.Date date = simpleDateFormat.parse(dateString);

    // create calendar from label date
    Calendar calendarLabel = new GregorianCalendar();
    calendarLabel.setTime(date);

    // create calendar for beginning of today in the default time zone
    //Calendar calendarToday = Calendar.getInstance();
    //  or in a timezone of your choice
    Calendar calendarToday = Calendar.getInstance(TimeZone.getTimeZone("Europe/Athens"));
    calendarToday.set(Calendar.HOUR_OF_DAY, 0);
    calendarToday.set(Calendar.MINUTE, 0);
    calendarToday.set(Calendar.SECOND, 0);
    calendarToday.set(Calendar.MILLISECOND, 0);

    // find out if label date is later than 0h of today
    // (compareTo only guarantees a positive value, not exactly 1)
    System.out.println(calendarLabel.compareTo(calendarToday) > 0);

For a more succinct solution in Java 8, see Basil Bourque's answer.

aUserHimself
  • This code uses troublesome old date-time classes that are now legacy, supplanted by the java.time classes. Another problem: This code assumes the day starts at 00:00:00 which is *not* always true for all dates on all zones. – Basil Bourque Aug 09 '17 at 16:02
  • @Basil Bourque I agree I missed some details by assuming that all dates are in the same zone and that there are newer classes to use in `Java 8`. I have updated my answer, thanks for bringing up all these concerns! Also I upvoted your answer. – aUserHimself Aug 10 '17 at 07:07
  • Not to be picking on you, but 3-4 letter pseudo-zones like `CET` are *not* actual time zones. They are not standardized. They are not even unique! [Real time zone names](https://en.m.wikipedia.org/wiki/List_of_tz_database_time_zones) are in the format of `continent/region` such as `Asia/Kolkata` or `Pacific/Auckland` or [`Africa/Casablanca`](https://en.m.wikipedia.org/wiki/Africa/Casablanca). – Basil Bourque Aug 10 '17 at 07:22
  • No problem, I was never aware of it. I will update my answer accordingly. It is very confusing in any case, as `CEST` is also used in the html example above as a valid time zone. – aUserHimself Aug 10 '17 at 07:34