30

How can I remove invalid characters from a String so that it can be used as a file name?
The invalid characters include ("\\/:*?\"<>|").

akash
curious
  • possible duplicate of [Java: remove all occurrences of char from string](http://stackoverflow.com/questions/4576352/java-remove-all-occurrences-of-char-from-string) – Soana Jul 22 '15 at 13:07
  • 2
    This is platform-specific. For example, Linux has no problem with many of these characters in filenames. – Qualia Jul 22 '15 at 13:16
  • Not a duplicate. This is a completely different question: the answer below is about the correct regular expression for the characters that have to be removed from a filename to avoid attacks. It has nothing to do with simply removing a character, which is what the other question answers. – Christian Aberger Nov 26 '19 at 06:17

4 Answers

43

You can try this:

String fileName = "\\/:*AAAAA?\"<>|3*7.pdf";
String invalidCharRemoved = fileName.replaceAll("[\\\\/:*?\"<>|]", "");
System.out.println(invalidCharRemoved);

OUTPUT

AAAAA37.pdf
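
If the same cleanup is needed in several places, the character class above can be precompiled once and wrapped in a small helper. This is only a sketch: the class name `FileNameSanitizer`, the method `sanitize`, and the `replacement` parameter are hypothetical names introduced for illustration, not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
final class FileNameSanitizer {

    // Same character class as in the answer: \ / : * ? " < > |
    private static final Pattern INVALID_CHARS =
            Pattern.compile("[\\\\/:*?\"<>|]");

    // Replaces every invalid character with the given replacement text.
    // Pass "" to drop the characters, or "_" to keep a visible placeholder.
    static String sanitize(String fileName, String replacement) {
        // quoteReplacement makes '$' and '\' in the replacement literal.
        return INVALID_CHARS.matcher(fileName)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }
}
```

Calling `FileNameSanitizer.sanitize(fileName, "")` on the string above should give the same result as the `replaceAll` call, while passing `"_"` keeps one underscore per removed character.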
akash
  • 4
    Should we replace them with an underscore "_" rather than an empty string? A user can give a file name that consists only of invalid characters, and in that case the result would be an empty string. – curious Jul 28 '15 at 10:43
  • 1
    What about CON and similar reserved names, with which a user cannot create a file? – Naik Ashwini Sep 18 '17 at 04:43
3

You can use a regex:

String s = string.replaceAll("[\\\\/:*?\"<>|]", "");
Sanjay Rabari
  • 2
    Please test your answer before posting it. You will notice that there is a problem with the ``\`` character. Also, what is the point of `^`? – Pshemo Jul 22 '15 at 13:15
  • I'm pretty sure the ^ character means the complement of the set, so everything not including the items in the character set. – Jeremy Fisher Jul 22 '15 at 13:19
  • Yes, `^` at the start of a character class `[^...]` negates the characters it describes, which means that your solution will remove everything that is not a special character. So what your code would do (assuming the syntax were corrected) is the opposite of what the OP wants. – Pshemo Jul 22 '15 at 13:24
  • There is still a problem with ``\`` in your regex. Remember that ``\`` inside `[]` is also special; for instance, it can be used as part of `\d`. That means it requires escaping (in the regex, and in the string). – Pshemo Jul 22 '15 at 13:28
  • Good, now you have ended up with a correct solution :) (If I were you I would probably remove my answer, since an answer with the same solution was already posted and this one doesn't add anything new; I would also up-vote the correct answer, but that is just me.) – Pshemo Jul 22 '15 at 13:32
  • @Pshemo I like to correct my mistakes, and I don't mind if people see them. If I were you and saw that the answer was OK, I would have moved on. – Sanjay Rabari Jul 22 '15 at 13:34
2

You should not try to second-guess the user. If the provided filename is invalid, just show an error message or throw an exception as appropriate.

Removing those invalid characters from a supplied filename in no way ensures that the new filename is valid.
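
A minimal sketch of the reject-instead-of-repair approach, assuming the same set of invalid characters as in the question. The class `FileNameValidator` and the method `requireValid` are hypothetical names, and, as noted above, passing this check alone does not guarantee that the name is valid on every platform.

```java
import java.util.regex.Pattern;

// Hypothetical validator; names are illustrative only.
final class FileNameValidator {

    // Characters listed in the question: \ / : * ? " < > |
    private static final Pattern INVALID_CHARS =
            Pattern.compile("[\\\\/:*?\"<>|]");

    // Returns the name unchanged if it passes, otherwise throws.
    static String requireValid(String fileName) {
        if (fileName == null || fileName.isEmpty()) {
            throw new IllegalArgumentException("File name must not be empty");
        }
        if (INVALID_CHARS.matcher(fileName).find()) {
            throw new IllegalArgumentException(
                    "File name contains an invalid character: " + fileName);
        }
        return fileName;
    }
}
```

The caller can catch the `IllegalArgumentException` and show its message to the user rather than silently changing the name.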

Anonymous Coward
2

You can replace characters by replaceAll():

@Test
public void testName() throws Exception
{
    assertEquals("", "\\/:*?\"<>|".replaceAll("[\\\\/:*?\"<>|]", ""));
}

However, note that:

  • . (current directory) and .. (parent directory) on their own are also invalid, although dots are allowed in general
  • for using the file with WebDAV, & is also a disallowed character (might be Microsoft specific)
  • COM1 is also an invalid file name, although it has legal characters only (also applies to PRN, LPT1 and similar) (might be Microsoft specific)
  • $MFT and similar are also invalid, although you can use $ in general (might be NTFS specific)
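
Some of the caveats above can be checked in code. The sketch below covers only `.`, `..`, and the Windows device names. The list of device names is an assumption based on the usual Windows naming conventions and may not be exhaustive, the class `ReservedNames` and the method `isReserved` are hypothetical, and NTFS names such as `$MFT` and the WebDAV restriction are not handled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical check; names are illustrative only.
final class ReservedNames {

    // Assumed list of Windows device names; may not be exhaustive.
    private static final Set<String> DEVICE_NAMES = new HashSet<>(Arrays.asList(
            "CON", "PRN", "AUX", "NUL",
            "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
            "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"));

    // True for ".", "..", and device names with or without an extension.
    static boolean isReserved(String fileName) {
        if (fileName.equals(".") || fileName.equals("..")) {
            return true;
        }
        // Compare only the part before the first dot, case-insensitively,
        // so that "com1.txt" is treated like "COM1".
        int dot = fileName.indexOf('.');
        String base = dot >= 0 ? fileName.substring(0, dot) : fileName;
        return DEVICE_NAMES.contains(base.toUpperCase(Locale.ROOT));
    }
}
```

A check like this would run in addition to the character filter, since names such as `COM1` contain only legal characters.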
Thomas Weller