I am scraping a webpage that contains dates in this format: "8th November 2013". The scraped dates come back as an unordered array of strings. I want to convert these strings to a simple date format such as yyyy-MM-dd so that I can sort them and use them with the calendar. How can I do this?
0
- Does this thread get you started? http://stackoverflow.com/questions/4011075/how-do-you-format-the-day-of-the-month-to-say-11th-21st-or-23rd-in-java – EdgeCase Nov 08 '13 at 17:19
- If I did it this way, I would have to write an entire class to parse and format the date strings to my liking. That is very time consuming, but it seems it may be my only choice? – k-prog Nov 08 '13 at 17:34
4 Answers
1
How about something like this?
private String dateLongStringConvert(String dateLongString) {
    // split long date string into string array
    String[] dateArray = dateLongString.split(" ");
    // get day of month as an integer (strip out non-numeric chars)
    int dayOfMonth = Integer.parseInt(dateArray[0].replaceAll("\\D+", ""));
    // convert month name to a two-digit number; each case needs a break,
    // otherwise execution falls through and month always ends up as "12"
    String month = "";
    switch (dateArray[1]) {
        case "January":
            month = "01";
            break;
        case "February":
            month = "02";
            break;
        case "March":
            month = "03";
            break;
        case "April":
            month = "04";
            break;
        case "May":
            month = "05";
            break;
        case "June":
            month = "06";
            break;
        case "July":
            month = "07";
            break;
        case "August":
            month = "08";
            break;
        case "September":
            month = "09";
            break;
        case "October":
            month = "10";
            break;
        case "November":
            month = "11";
            break;
        case "December":
            month = "12";
            break;
    }
    // return formatted date string (yyyy-MM-dd)
    return dateArray[2] + "-" + month + "-" + String.format("%02d", dayOfMonth);
}

wyoskibum
- This worked, but I had to make a couple of changes. I had to remove the switch(String) because switching on String objects is a feature introduced in Java 1.7, and unfortunately Android requires version 1.6 or 1.5. Instead I used multiple if/else statements. Also, when splitting the string into an array, I had to use split("\\s+") to match the whitespace. Thanks very much for your help! – k-prog Nov 12 '13 at 15:37
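For anyone else stuck on a pre-1.7 toolchain, here is a rough sketch of how the two changes described in the comment above might look without a long if/else chain. It avoids switch-on-String by looking the month name up in a list, and it splits on "\\s+". The class name LongDateConverter and the method name toIsoDate are made up for illustration; they are not from the answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; the names here are illustrative only.
class LongDateConverter {

    // Month names in calendar order; list index + 1 is the month number.
    private static final List<String> MONTHS = Arrays.asList(
            "January", "February", "March", "April", "May", "June",
            "July", "August", "September", "October", "November", "December");

    // Converts e.g. "8th November 2013" to "2013-11-08".
    static String toIsoDate(String longDate) {
        // Split on any run of whitespace rather than a single space.
        String[] parts = longDate.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 'day month year': " + longDate);
        }
        // Strip the ordinal suffix (st, nd, rd, th) from the day.
        int day = Integer.parseInt(parts[0].replaceAll("\\D+", ""));
        // indexOf returns -1 for an unknown name, so month would be 0.
        int month = MONTHS.indexOf(parts[1]) + 1;
        if (month == 0) {
            throw new IllegalArgumentException("Unknown month: " + parts[1]);
        }
        int year = Integer.parseInt(parts[2]);
        // Locale.US keeps the digits ASCII regardless of the device locale.
        return String.format(Locale.US, "%04d-%02d-%02d", year, month, day);
    }
}
```

Calling LongDateConverter.toIsoDate("8th November 2013") should give "2013-11-08". The lookup is case-sensitive, so it assumes the page always spells month names in full with a leading capital.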
1
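String inputDate = "8th November 2013";
// remove the ordinal suffix (st, nd, rd, th) and an optional trailing dot,
// but only when it directly follows a digit
inputDate = inputDate.replaceAll("(?<=[0-9])(st|nd|rd|th)\\.?", "");
// parse the input date; parse() throws ParseException if the text does not match
Date date = new SimpleDateFormat("d MMMM yyyy", Locale.ENGLISH).parse(inputDate);
// format as the output date
String outputDate = new SimpleDateFormat("yyyy-MM-dd", Locale.ENGLISH).format(date);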

Squ1sh
- I tried this and it removes the "th" fine, but when it tries to parse the result I get the following exception: Unparseable date: "8 November 2013" – k-prog Nov 12 '13 at 14:49
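The cause of that exception is not established in this thread. One possibility, and it is only a guess, is that the scraped text contains whitespace that is not a plain space (for example a non-breaking space from the HTML), which SimpleDateFormat will not match against the space in the pattern. The sketch below normalizes whitespace before parsing and then sorts the converted strings, since yyyy-MM-dd strings with four-digit years sort chronologically as plain text. The class and method names are made up for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; the names here are illustrative only.
class ScrapedDateSorter {

    // Converts e.g. "8th November 2013" to "2013-11-08".
    static String toIsoDate(String scraped) {
        String cleaned = scraped
                .replace('\u00A0', ' ')                    // assumption: non-breaking spaces may be present
                .trim()
                .replaceAll("\\s+", " ")                   // collapse runs of whitespace
                .replaceAll("(?<=\\d)(st|nd|rd|th)", "");  // drop ordinal suffix after a digit
        SimpleDateFormat in = new SimpleDateFormat("d MMMM yyyy", Locale.ENGLISH);
        in.setLenient(false); // reject impossible dates such as "31 February"
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd", Locale.ENGLISH);
        try {
            return out.format(in.parse(cleaned));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + cleaned, e);
        }
    }

    // Converts every scraped string and returns the results in ascending order.
    static List<String> toSortedIsoDates(List<String> scraped) {
        List<String> result = new ArrayList<String>();
        for (String s : scraped) {
            result.add(toIsoDate(s));
        }
        Collections.sort(result); // lexicographic order equals date order here
        return result;
    }
}
```

If the exception still occurs after this cleanup, logging the character codes of the scraped string should show what is actually between the tokens.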
0
The proper way to do this kind of thing is to use a parser such as the Stanford Temporal Tagger and extract the dates from the text. The team provides a nice GUI (http://nlp.stanford.edu:8080/sutime/process) for evaluating the tool.

Karthik