1

This is more a data related question than programming, but I wasn't sure where else to post it. If it better fits elsewhere modders are free to move it.

I'm trying to import a bank exported CSV into GNUcash, but GNUcash only takes specific date formats such as YYYY-MM-DD (1999-05-16), whereas my bank only exports as YYYYMMDD (19990516).

I'm not sure what the easiest / most convenient way is to convert the file for use. Writing a script and/or using complex regexes for it seems excessive but might be necessary?

I tried loading it into LibreOffice Calc and MS Excel but couldn't find any conversion options; formatting the cells with a custom YYYYMMDD format broke the output. I was hoping there was an easier way to solve this. If there isn't I might just end up writing that script and putting it online somewhere.

Alex
  • 261
  • 1
  • 3
  • 12

3 Answers3

1

You could open your CSV with Notepad2, tick regular expression search and replace.

If the date is written between double quotes and commas :

,"([12][90][901]\d)([01]\d)([0123]\d)",

with

,"\1-\2-\3",

For anything more complex, Excel macros or Ruby could help you.

Eric Duminil
  • 52,989
  • 9
  • 71
  • 124
  • Unfortunately the file does contain IBAN numbers so I can't use this, but thanks anyway! – Alex Dec 23 '16 at 19:46
  • 1
    I updated the example. This should be much more robust, provided the date is between quotes or commas. – Eric Duminil Dec 23 '16 at 20:31
  • Neat! I used this and it worked, thanks! I succesfully imported it in gnucash as well.. just to hit another bump in the road, need to figure out how to make certain transactions count as deposits and others as withdrawals .. But at least the dates work now! – Alex Dec 24 '16 at 00:41
1

GNUcash is right -- ISO formats are better.

But just about any script tool or language has facilities for that. At the most basic level we have GNU date:

$ date --date=20161223 +%Y-%m-%d
2016-12-23
$ 

Whatever you end up doing, please do NOT use regular expression for dates. Computing with dates and times can be tricky, and if no minors are around, much saltier language may be called for. Use libraries which can parse and format dates. There are many to choose from.

And I second the suggestion for converting the csv. If you can / want to use R, the anytime package reads dates (and datetimes) without a format string, and the default format for a date is what you need here. Python has capable libraries too, and so do most other scripting (and compiled) languages.

Dirk Eddelbuettel
  • 360,940
  • 56
  • 644
  • 725
  • I love this place! Learn new things every day. I'll give that a try! Though I do think regexes also work in this case because the format is static enough, e.g 20160405 for 2016-05-05 instead of 201655 which would be more problematic to parse with regexes. – Alex Dec 23 '16 at 19:45
  • I am telling you: regexp are the devil. Of course you split a string into pieces, but it won't stop you from create 34-13-2016. A Date class will stop you. It may matter. – Dirk Eddelbuettel Dec 23 '16 at 21:07
  • I agree that a proper script with date validation should be used for anything more complex. But right now, the only problem is to find the date in the CSV, and UNIX `date` or your Date class don't help you in this case. If both the column and the date format are known, I don't see anything wrong in using a regex just to add two dashes. – Eric Duminil Dec 24 '16 at 00:49
0

you can open the file with LibreOffice Calc or MS Excel and then convert the date column with formulas.

let's say that in A1 you have 19990516, then convert in with the following formula =date(left(A1,4),mid(A1,5,2),right(A1,2))