4

I have the following data:

D:\toto\food\Cloture_49000ert1_10_01_2013.pdf
D:\toto\food\Cloture_856589_12_01_2013.pdf
D:\toto\food\Cloture_66rr5254_10_12_2012.pdf

How can I extract the date part? For example:

D:\toto\food\Cloture_49000ert1_10_01_2013.pdf --> 10_01_2013
D:\toto\food\Cloture_856589_12_01_2013.pdf --> 12_01_2013
D:\toto\food\Cloture_66rr5254_10_12_2012.pdf --> 10_12_2012

My idea is to use LastIndexOf(".pdf") and then count 10 characters backwards.

How can I solve this using substrings or another method?

Glorfindel
user1958628

7 Answers

6

Use Substring in this case.

Retrieves a substring from this instance. The substring starts at a specified character position and has a specified length.

Try it like this:

string s = "D:\\toto\\food\\Cloture_490001_10_01_2013.pdf";
string newstring = s.Substring(s.Length - 14, 10);
Console.WriteLine(newstring);

Soner Gönül
4

You do not need to find the index of .pdf:

path.Substring(path.Length - 14, 10)
Mehmet Ataş
3

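I'd do this with a Regex.

^[\w:\\]+Cloture_(\w+)_(\d{2}_\d{2}_\d{4})\.pdf$

This captures the date in the second group. The identifier group uses \w+ because the sample names contain letters as well as digits, and the dot before pdf is escaped so that it only matches a literal dot.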
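As a rough sketch, a pattern of this shape could be applied in C# as follows. It uses a variant that escapes the dot and lets the identifier contain letters, and it assumes the names always start with Cloture_ and end in a ##_##_#### date. The class and method names (ClotureRegex, ExtractDate) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class ClotureRegex
{
    // Group 1: the identifier after "Cloture_"; group 2: the date part.
    private static readonly Regex Pattern = new Regex(
        @"^[\w:\\]+Cloture_(\w+)_(\d{2}_\d{2}_\d{4})\.pdf$",
        RegexOptions.IgnoreCase);

    // Returns the date part, or null when the path does not match.
    public static string ExtractDate(string path)
    {
        Match m = Pattern.Match(path);
        return m.Success ? m.Groups[2].Value : null;
    }
}
```

Calling ClotureRegex.ExtractDate on one of the sample paths should give back only the date part, and a path that does not fit the pattern gives null instead of throwing.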

Echilon
2

If the filename is always in that format, you could do something crude like this:

string filename = @"D:\toto\food\Cloture_490001_10_01_2013.pdf";

string date = filename.Substring(filename.Length - 14, 10);

The substring starts 14 characters from the end, which is the length of 10_01_2013.pdf, and takes only the first 10 of those characters, leaving you with 10_01_2013.

If, however, the filename is in a different format and the date could appear anywhere within the name, you may want to consider a regular expression that matches ##_##_#### and pulls it out.
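A minimal sketch of that regular-expression idea, assuming the date is always two digits, two digits and four digits separated by underscores. DateFinder and FindDate are hypothetical names, and the lookarounds are an added guard so that digits at the end of the identifier are not mistaken for the start of the date.

```csharp
using System;
using System.Text.RegularExpressions;

static class DateFinder
{
    // ##_##_#### that is not directly preceded or followed by another digit.
    private static readonly Regex DatePattern =
        new Regex(@"(?<!\d)\d{2}_\d{2}_\d{4}(?!\d)");

    // Returns the first date-shaped run in the name, or null if there is none.
    public static string FindDate(string fileName)
    {
        Match m = DatePattern.Match(fileName);
        return m.Success ? m.Value : null;
    }
}
```

This only returns the first match, so a name that contains two date-shaped runs would need extra handling (for example, taking the last match instead).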

Rudi Visser
0

Try this approach:

string dateString = textString.Substring(textString.Length-14, 10);

See also: Extract only right most n letters from a string

Davide Piras
0

If you want to use LastIndexOf, then:

string str = @"D:\toto\food\Cloture_490001_10_01_2013.pdf";
string temp = str.Substring(str.LastIndexOf(".pdf") - 10, 10);

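And you can parse it like this (the format string assumes the month comes first; use "dd_MM_yyyy" if the day comes first):

DateTime dt;
// Requires using System.Globalization for CultureInfo and DateTimeStyles.
if (DateTime.TryParseExact(temp, "MM_dd_yyyy", CultureInfo.InvariantCulture, DateTimeStyles.None, out dt))
{
    // valid: dt holds the parsed date
}
else
{
    // invalid: temp is not in the expected format
}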
Bilal Hashmi
0

I'd go with your idea of using LastIndexOf(".pdf") and then counting backwards. Or use the Path.GetFileNameWithoutExtension method to get just the name and then take the last 10 characters.

These methods will both keep working if the path to the filenames ever changes (which it probably will) and don't rely on magic numbers (other than the one that defines the length of the substring we are interested in) to find the right place in the string.
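The Path.GetFileNameWithoutExtension variant might look like the sketch below. It assumes the date is always the last 10 characters of the name, and the FileNameDate class, the LastTenOfName method and the DateLength constant are names invented for this example.

```csharp
using System;
using System.IO;

static class FileNameDate
{
    // The one remaining magic number: the length of "10_01_2013".
    private const int DateLength = 10;

    // Strips the extension, then returns the last 10 characters of the name,
    // or null when the name is too short to contain a date.
    public static string LastTenOfName(string path)
    {
        string name = Path.GetFileNameWithoutExtension(path);
        return name.Length >= DateLength
            ? name.Substring(name.Length - DateLength)
            : null;
    }
}
```

Because the extension is removed by the framework rather than by counting characters, this keeps working if the extension length changes, for example from .pdf to .docx.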

ChrisF
  • If you think about it, it's still relying on magic numbers / positioning. The `Substring` solution doesn't rely on the filenames being the same length, though. – Rudi Visser Jan 15 '13 at 12:58
  • @RudiVisser - Well only the magic number of the length of the substring required. – ChrisF Jan 15 '13 at 12:59
  • True, but it's the same as `Substring`, except that we also assume a 3-char extension :) – Rudi Visser Jan 15 '13 at 13:00