
I'm trying to split a file name into its base name and its extension. I found this answer to this question, but it doesn't cover my case. I have two conditions that I need to add:

1. If a file name starts with "." and doesn't contain any other period, then the first period counts as part of the file's name, and there is no extension (the extension is ""). The answer in the link above,

fileName.split("\\.(?=[^\\.]+$)")

will return, for the file name ".MyFile",

["", "MyFile"]

but what I want it to return is

[".MyFile", ""]

Is there any way to do this using regex (by changing the pattern)?

2. I need two cells, no matter what the file name is. If the file name is "README", I still want two cells to be created, and the second one should contain an empty string.

I want:

["README", ""]

to be returned.

Is this possible?

Edit: solved, thanks to the help of Wiktor Stribiżew. I changed it to

// Requires: import java.util.Arrays;
String pat = "((?!^)\\.(?=[^.]*$|(?<=^\\.[^.]{0,1000})$))|$";
String fileName = "README.txt";
System.out.println(Arrays.toString(fileName.split(pat, 2)));
fileName = ".README";
System.out.println(Arrays.toString(fileName.split(pat, 2)));
fileName = "README";
System.out.println(Arrays.toString(fileName.split(pat, 2)));

Result

"README.txt" => [README, txt]
".README" => [.README, ]
"README" => [README, ]

as I wanted.

Dvir Itzko

2 Answers

Final solution:

String pat = "(?!^)\\.(?=[^.]*$)|(?<=^\\.[^.]{0,1000})$|$";

The pattern consists of 3 alternatives to split with:

  • (?!^)\\.(?=[^.]*$) - split at a dot that is not the first character in the string ((?!^)) and that has 0+ characters other than . to the right of it up to the end of the string ((?=[^.]*$))
  • (?<=^\\.[^.]{0,1000})$ - split at the end of the string if the string starts with a literal . followed by 0 to 1000 characters other than . (a range like {1,256} may be enough, but there are longer file names, so please adjust accordingly)
  • $ - split at the end of the string (replace with \\z if the string may end with \n and you do not want a split before that trailing \n)
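
The difference between $ and \\z in the last alternative only shows up when the name ends with a line terminator. A minimal sketch of that case, assuming a hypothetical class EndAnchorDemo and hypothetical method names that are not part of the answer:

```java
import java.util.Arrays;

class EndAnchorDemo {
    // Pattern from the answer: the last alternative is $.
    static final String WITH_DOLLAR = "(?!^)\\.(?=[^.]*$)|(?<=^\\.[^.]{0,1000})$|$";
    // Variant (assumption for illustration): only the last alternative is changed to \z.
    static final String WITH_Z = "(?!^)\\.(?=[^.]*$)|(?<=^\\.[^.]{0,1000})$|\\z";

    static String[] splitWithDollar(String name) {
        return name.split(WITH_DOLLAR, 2);
    }

    static String[] splitWithZ(String name) {
        return name.split(WITH_Z, 2);
    }

    public static void main(String[] args) {
        String name = "README\n";
        // $ can match just before the final line terminator, so the "\n" ends up in the second cell.
        System.out.println(Arrays.toString(splitWithDollar(name)).replace("\n", "\\n"));
        // \z matches only at the very end of the input, so the "\n" stays in the first cell.
        System.out.println(Arrays.toString(splitWithZ(name)).replace("\n", "\\n"));
    }
}
```

Which behavior is preferable depends on whether a trailing line break should be treated as part of the name; trimming the input beforehand avoids the question entirely.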

When you pass 2 as the limit argument to the split method, the resulting array has at most two elements; see the Java demo:

System.out.println(Arrays.toString(".MyFile".split(pat,2)));            // [.MyFile, ]
System.out.println(Arrays.toString("MyFile.ext".split(pat,2)));         // [MyFile, ext]
System.out.println(Arrays.toString("Another.MyFile.ext".split(pat,2))); // [Another.MyFile, ext]
System.out.println(Arrays.toString("MyFile.".split(pat,2)));            // [MyFile, ]
System.out.println(Arrays.toString("MyFile".split(pat,2)));             // [MyFile, ]
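
If the split is needed in several places, it could be wrapped in a small helper. This is only a sketch: the class name FileNameSplitter and the method splitName are hypothetical, and it assumes a non-empty file name (for an empty string, split returns a single-element array).

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class FileNameSplitter {
    // Same three alternatives as the pattern above, compiled once for reuse.
    private static final Pattern SPLIT_AT = Pattern.compile(
            "(?!^)\\.(?=[^.]*$)|(?<=^\\.[^.]{0,1000})$|$");

    // Hypothetical helper: returns {base, extension} for a non-empty file name.
    static String[] splitName(String fileName) {
        // A limit of 2 caps the result at two cells and keeps a trailing empty string.
        return SPLIT_AT.split(fileName, 2);
    }

    public static void main(String[] args) {
        for (String name : new String[] {".MyFile", "MyFile.ext", "Another.MyFile.ext", "MyFile.", "MyFile"}) {
            System.out.println(name + " => " + Arrays.toString(splitName(name)));
        }
    }
}
```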

Original answer

I believe you are looking for

(?!^)\\.(?=[^.]*$)|(?<=^\\.[^.]{0,1000})$

One note: this pattern relies on a constrained-width lookbehind, which assumes that the file name is no longer than 1000 characters. Increase the value as needed.

See the IDEONE demo:

String pat = "(?!^)\\.(?=[^.]*$)|(?<=^\\.[^.]{0,1000})$";
String s = ".MyFile";
System.out.println(Arrays.toString(s.split(pat,-1)));
s = "MyFile.ext";
System.out.println(Arrays.toString(s.split(pat,-1)));
s = "Another.MyFile.ext";
System.out.println(Arrays.toString(s.split(pat,-1)));
s = "MyFile.";
System.out.println(Arrays.toString(s.split(pat,-1)));

Results:

".MyFile"            => [.MyFile, ]
"MyFile.ext"         => [MyFile, ext]
"Another.MyFile.ext" => [Another.MyFile, ext]
"MyFile."            => [MyFile, ]
Wiktor Stribiżew
  • First, thank you! You solved the first problem I had. Can you maybe help me with the second one as well? I want the outcome for "README" to be [README, ]: if the original file name has no dot, I still want to treat it as having an empty extension (""). Also, could you explain the regex you wrote? Thanks! – Dvir Itzko May 13 '16 at 06:02
  • So I think I solved it, thanks to Wiktor Stribiżew. I added |$ at the end and limited the split to 2: `String pat = "((?!^)\\.(?=[^.]*$|(?<=^\\.[^.]{0,1000})$))|$";` with `myText2.split(pat, 2)`, and now I get the wanted result. – Dvir Itzko May 13 '16 at 06:43

You could convert the file name into a character array, then loop through the array to find the "." and take everything after it.

Here is some example code.

    String str = "cat.txt";
    char[] cArray = str.toCharArray();

And then loop through the array and find the "." and capture everything after it.
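
The loop described above could look roughly like the following. This is a sketch under assumptions rather than the answerer's code: the class ExtensionFinder and method splitName are hypothetical, the scan runs backwards so that the last dot wins, and a dot at index 0 is treated as part of the name to match the rules in the question.

```java
import java.util.Arrays;

class ExtensionFinder {
    // Hypothetical helper: returns {base, extension} without using a regex.
    static String[] splitName(String str) {
        char[] cArray = str.toCharArray();
        // Scan from the end; stop before index 0 so a leading dot is never a separator.
        for (int i = cArray.length - 1; i > 0; i--) {
            if (cArray[i] == '.') {
                String base = new String(cArray, 0, i);
                String ext = new String(cArray, i + 1, cArray.length - i - 1);
                return new String[] {base, ext};
            }
        }
        // No usable dot found: the whole name is the base and the extension is empty.
        return new String[] {str, ""};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitName("cat.txt")));
        System.out.println(Arrays.toString(splitName(".README")));
        System.out.println(Arrays.toString(splitName("README")));
    }
}
```

Unlike the regex approach, this version has no upper bound on the file name length, since it needs no lookbehind.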

DjangoBlockchain