I have a list of files:

  • xxx_05102019023601017.csv
  • xxx_05092019023601036.csv
  • xxx_05082019023600900.csv

Using Groovy (or Java), I need to extract the date from each file name and reformat it so that the year comes first, like so:

  • xxx_20190510023601017.csv
  • xxx_20190509023601036.csv
  • xxx_20190508023600900.csv

Is there a slick Groovy way to accomplish this?

user1781500
  • If you want a way using Groovy, then why did you tag java? Also, what have you tried? – Roddy of the Frozen Peas May 21 '19 at 17:20
  • First find out pattern, use SimpleDateFormat and parse the string after extracting from file name using Java. If it is related to Groovy, remove the tag as @RoddyoftheFrozenPeas suggested. – Sambit May 21 '19 at 17:22
  • Sorry, I removed the Java tag. I'm able to extract the name using regex, but am not familiar enough with date structures to alter it. – user1781500 May 21 '19 at 17:23
  • @Sambit I recommend you don’t use `SimpleDateFormat`. That class is notoriously troublesome and long outdated. Instead use `LocalDateTime` and `DateTimeFormatter`, both from [java.time, the modern Java date and time API](https://docs.oracle.com/javase/tutorial/datetime/). – Ole V.V. May 21 '19 at 18:51
  • @OleV.V., thanks for informing me about it. Good I am learning it. – Sambit May 21 '19 at 19:36

2 Answers

I don't see a need for date parsing or formatting here.

A regular expression is enough to swap the two parts:

def oldName = "xxx_05102019023601017.csv"
def newName = oldName.replaceAll(/^(\D+)(\d{4})(\d{4})/,'$1$3$2')

out:

xxx_20190510023601017.csv

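Regex explanation:

  • ^(\D+) captures the leading run of non-digit characters (xxx_) as group 1
  • (\d{4}) captures the next four digits (0510 in the example) as group 2
  • (\d{4}) captures the following four digits (2019, the year) as group 3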

https://regex101.com/r/X0u9wv/1

String.replaceAll( regex, replacement )

https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#replaceAll-java.lang.String-java.lang.String-

replacement - the string to be substituted for each match

$1, $2 and $3 refer to the capture groups (...) in the regex, in order, so the replacement '$1$3$2' simply swaps the second and third groups.
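Since the Question also allows Java, here is a minimal sketch of the same swap applied to the whole list. The class name RegexRename and the method name yearFirst are made up for illustration; the regex and replacement string are the ones from this answer, with backslashes doubled for a Java string literal.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class RegexRename {
    // Group 1: non-digit prefix; group 2: first four digits; group 3: next four digits.
    // The replacement emits group 3 before group 2, which moves the year to the front.
    static String yearFirst(String fileName) {
        return fileName.replaceAll("^(\\D+)(\\d{4})(\\d{4})", "$1$3$2");
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "xxx_05102019023601017.csv",
                "xxx_05092019023601036.csv",
                "xxx_05082019023600900.csv");

        List<String> renamed = names.stream()
                .map(RegexRename::yearFirst)
                .collect(Collectors.toList());

        // Prints the three names from the Question with the year leading.
        renamed.forEach(System.out::println);
    }
}
```

A name that does not match the pattern is returned unchanged, so the caller may want to check for that case separately.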

daggett
  • This is perfect! Can you break down how that Regex works? I've never seen the second argument given. – user1781500 May 21 '19 at 19:33
  • Nice regexp explanation. – Perplexabot May 21 '19 at 20:05
  • It’s a nice and nicely explained regular expression solution, thanks. For my part I would still use `LocalDateTime` and `DateTimeFormatter` because (1) they are specifically targeted towards dates and times (2) the code (although longer) will more directly tell the reader which parts (month, day, year) are swapped (3) the code will be easier and more natural to read (4) you will get stricter validation. – Ole V.V. May 22 '19 at 02:22
1

The Answer by daggett using regex is slick. If you are curious, here is the date-time way to handle it.

java.time

xxx_05102019023601017.csv

I am assuming the digits represent month, day-of-month, year, hour, minute, second, millisecond. That is the order that yields the output requested in the Question (05102019 becoming 20190510).

Input

Split your string on the underscore by calling String::split.

String input = "foobar_05102019023601017.csv" ;
String[] parts = input.split( "_" ) ;
String part1 = parts[0]; // foobar
String part2 = parts[1]; // 05102019023601017.csv

Define a formatter to match the second part.

DateTimeFormatter f = DateTimeFormatter.ofPattern( "MMdduuuuHHmmssSSS'.csv'" ) ;

Parse as a LocalDateTime object, since your input lacks any indicator of time zone or offset-from-UTC.

LocalDateTime ldt = LocalDateTime.parse( part2 , f ) ;
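As a commenter on the regex answer points out, one benefit of parsing is stricter validation. The following is a minimal sketch of that idea, not part of the original answer. The names ParseValidation and isValidStamp are hypothetical, and the pattern assumes the digits are month, day, year, hour, minute, second, millisecond. It also assumes Java 9 or later, where a run of adjacent digits ending in milliseconds parses as expected.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class ParseValidation {
    // Assumption: digits are month, day, year, hour, minute, second, millisecond.
    static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("MMdduuuuHHmmssSSS'.csv'");

    // Returns true when the text parses as a real date-time, false otherwise.
    static boolean isValidStamp(String text) {
        try {
            LocalDateTime.parse(text, INPUT);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidStamp("05102019023601017.csv")); // true
        System.out.println(isValidStamp("05102019253601017.csv")); // false: the hour field is 25
    }
}
```

A pure digit-swapping regex would rewrite the second name without complaint, whereas the parser rejects it.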

Output

Define a formatter for the output.

DateTimeFormatter formatterOutput = DateTimeFormatter.ofPattern( "uuuuMMddHHmmssSSS" ) ;

Generate output.

String datetimeOutput = ldt.format( formatterOutput ) ;
String prefix = part1 + "_" ;
String suffix = ".csv" ;
String output = prefix + datetimeOutput + suffix ;

Or, as a single chained expression, use a StringBuilder.

String output = new StringBuilder()
    .append( part1 ) 
    .append( "_" ) 
    .append( ldt.format( formatterOutput ) )
    .append( ".csv") 
    .toString() 
;
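Putting the input and output steps together, here is a minimal end-to-end sketch applied to the file names from the Question. The class name DateTimeRename and method name yearFirst are hypothetical. It assumes every name has the form prefix, underscore, 17 digits, ".csv", with the digits ordered month, day, year (the order that produces the requested output), and it assumes Java 9 or later.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;

class DateTimeRename {
    // Assumption: digits are month, day, year, hour, minute, second, millisecond.
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("MMdduuuuHHmmssSSS'.csv'");
    // Same fields with the year leading; the quoted '.csv' is emitted literally.
    private static final DateTimeFormatter OUTPUT =
            DateTimeFormatter.ofPattern("uuuuMMddHHmmssSSS'.csv'");

    static String yearFirst(String fileName) {
        // Split once on the first underscore: prefix, then the timestamp part.
        String[] parts = fileName.split("_", 2);
        LocalDateTime ldt = LocalDateTime.parse(parts[1], INPUT);
        return parts[0] + "_" + ldt.format(OUTPUT);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "xxx_05102019023601017.csv",
                "xxx_05092019023601036.csv",
                "xxx_05082019023600900.csv");
        for (String name : names) {
            System.out.println(name + " -> " + yearFirst(name));
        }
    }
}
```

A name without an underscore, or with an impossible date, makes yearFirst throw rather than return a wrong result, which is usually the safer failure mode for a batch rename.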

ISO 8601

Your format is close to the “basic” variation of standard ISO 8601 format. I suggest using these standard formats wherever feasible. To comply, insert a T between the year-month-day portion and the hour-minute-second portion.

To do so, change the DateTimeFormatter pattern. Insert the letter inside a pair of single-quotes: 'T'.

DateTimeFormatter formatterOutput = DateTimeFormatter.ofPattern( "uuuuMMdd'T'HHmmssSSS" ) ;

Zone/Offset

A date and a time without an assigned time zone or offset-from-UTC is ambiguous and therefore prone to misinterpretation. I suggest always including the zone or offset to communicate clearly.

If this date and time was meant to represent a moment in UTC (a good idea generally), simply append a Z. This letter means UTC, that is, an offset of zero hours-minutes-seconds. The letter is pronounced “Zulu”.

String output = new StringBuilder()
    .append( part1 ) 
    .append( "_" ) 
    .append( ldt.format( formatterOutput ) )
    .append( "Z" ) 
    .append( ".csv") 
    .toString() 
;
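As an alternative to appending a literal "Z", the offset can be attached to the value itself and printed by the formatter. This is a sketch under the assumption that the timestamps really are in UTC; the class name UtcStamp and method name utcName are hypothetical. The pattern letter X prints Z when the offset is zero.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class UtcStamp {
    // 'T' separates date from time; X prints "Z" for an offset of zero.
    static final DateTimeFormatter OUTPUT =
            DateTimeFormatter.ofPattern("uuuuMMdd'T'HHmmssSSSX");

    static String utcName(String prefix, LocalDateTime ldt) {
        // Assumption: the parsed date-time was recorded in UTC.
        OffsetDateTime odt = ldt.atOffset(ZoneOffset.UTC);
        return prefix + "_" + odt.format(OUTPUT) + ".csv";
    }

    public static void main(String[] args) {
        // 2019-05-10 02:36:01.017, built directly for the sake of a short example.
        LocalDateTime ldt = LocalDateTime.of(2019, 5, 10, 2, 36, 1, 17_000_000);
        System.out.println(utcName("foobar", ldt)); // foobar_20190510T023601017Z.csv
    }
}
```

Keeping the offset in an OffsetDateTime means later code can convert to another zone or compare moments without having to remember what the bare Z was meant to imply.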
Ole V.V.
Basil Bourque