
I have a method in my code that works with directory paths and file names. Some paths and file names occasionally contain '´' or 'ñ' characters.

The problem is that a directory whose path contains those special characters is not recognized as a directory; it is treated as a file. I also occasionally need to read the file extension, and when the file name contains those characters the code does not work and never gets to the extension.

public static void listarDirectorio(File f, String separador) {

    File[] ficheros = f.listFiles();
    File ficheroTratado = null;

    logM.escribeLog(separador + "Ruta listada: " + f.getName(), false);

    for (int x = 0; x < ficheros.length; x++) {

        ficheroTratado = null;
        ficheroTratado = ficheros[x];

        if (!ficheros[x].isDirectory()) {
            if (esBorrable(ficheroTratado.getName())) {
                //  logM.escribeLog(
                //      "Fichero borrado: " + ficheroTratado.getName(),
                //  true);
            }
        }

        if (ficheros[x].isDirectory()
                && !ficheros[x].getName().startsWith("@")) {

            String nuevo_separador;
            nuevo_separador = separador + " # ";
            listarDirectorio(ficheros[x], nuevo_separador);
        }
    }
}

public static boolean esBorrable(String nFichero) {
    boolean esBorrable = false;

    try {
        String extension = "";
        int extIndex = nFichero.lastIndexOf(".");
        String ruta = "";

        //logM.escribeLog("nombre de fichero: " + nFichero, false);
        extension = nFichero.substring(extIndex, extIndex + 4);
        //logM.escribeLog("extension que tengo: " + extension, false);

        for (int i = 0; i < instance.getArrayExtensiones().size(); i++) {
            ruta = instance.getArrayExtensiones().get(i);

            if (ruta.equalsIgnoreCase(extension)) {
                // logM.escribeLog("Este es borrable", false);
                esBorrable = true;
                // Stop at the first match so a later non-matching
                // extension cannot reset the flag to false.
                break;
            }
        }
    } catch (Exception e) {
        logM.escribeLog("Problema al tratar el fichero: " + nFichero, false);
        e.printStackTrace();
        return false;
    }

    return esBorrable;
}

I hope you can help me with that issue.

rrnieto
  • One thing I would note is that you can use an [enhanced for loop](https://blogs.oracle.com/CoreJavaTechTips/entry/using_enhanced_for_loops_with) to loop over the files: `for (final File ficheroTratado : f.listFiles())`. This will save you 4 lines of code and make the rest **much** more readable. – Boris the Spider Oct 05 '13 at 10:14
  • Might help: http://stackoverflow.com/questions/15642862/special-character-in-filename-are-not-supported-while-copying-using-uri – An SO User Oct 05 '13 at 10:58
  • I don't understand the question. Are you saying that files that contain `´` or `ñ` in their filename are incorrectly identified as directories? – Alastair McCormack Oct 07 '13 at 17:08
  • @AlastairMcCormack Yes, that is exactly it. The isDirectory() method always returns false. – rrnieto Oct 10 '13 at 06:48

1 Answer


OK, I've replicated your issue, but it took some doing! The issue occurs when the locale or file.encoding does not match the encoding of the filename. Remember that on Linux a filename is just a string of 8-bit bytes; the filesystem does not enforce any encoding.
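To see which settings are in play, a small diagnostic like the following can help. It is only a sketch: the class name `EncodingReport` is made up for illustration, and `sun.jnu.encoding` is an implementation-specific property that may be absent on some JVMs (it then shows up as `null`).

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: collects the locale and encoding settings the JVM sees.
public class EncodingReport {

    public static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("default locale", Locale.getDefault().toString());
        report.put("default charset", Charset.defaultCharset().name());
        report.put("file.encoding", String.valueOf(System.getProperty("file.encoding")));
        // Implementation-specific property; may be missing on non-HotSpot JVMs.
        report.put("sun.jnu.encoding", String.valueOf(System.getProperty("sun.jnu.encoding")));
        // Environment variables that usually determine the locale on Linux.
        report.put("LANG", String.valueOf(System.getenv("LANG")));
        report.put("LC_ALL", String.valueOf(System.getenv("LC_ALL")));
        return report;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : collect().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

If the values reported here do not match the encoding that was used when the directory was created, the mismatch described above is the likely culprit.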

To replicate:

  1. Use a Linux box, possibly with an ext2/ext3 filesystem. There is no issue on Windows 7 x64.
  2. Create a directory called "dirñ" using Windows-1252/ISO-8859-1/ISO-8859-15 encoding. This can be done by setting your terminal emulator (PuTTY, for example) to Windows-1252 translation and then running: mkdir dirñ
  3. Set your locale to `en_US.UTF-8`.
  4. Run your Java application.
  5. Directories with non-UTF-8 bytes in their names will be classed as files.
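The steps above come down to a lossy round trip, which can be sketched in plain Java without touching the filesystem. The assumption here (not verified against the JDK sources) is that the JVM decodes the raw name bytes with the locale's charset and encodes them again when it asks the OS about the entry; the class name `FilenameRoundTrip` is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo: does a raw filename survive decode + re-encode?
public class FilenameRoundTrip {

    // Decode the raw bytes with the given charset, encode the resulting
    // String again, and report whether the bytes are unchanged.
    public static boolean survivesRoundTrip(byte[] rawName, Charset charset) {
        String decoded = new String(rawName, charset);
        byte[] reencoded = decoded.getBytes(charset);
        return Arrays.equals(rawName, reencoded);
    }

    public static void main(String[] args) {
        // "dirñ" as written by a Latin-1 / Windows-1252 terminal: ñ is the single byte 0xF1.
        byte[] latin1Name = {'d', 'i', 'r', (byte) 0xF1};

        System.out.println("ISO-8859-1 round trip intact: "
                + survivesRoundTrip(latin1Name, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8 round trip intact: "
                + survivesRoundTrip(latin1Name, StandardCharsets.UTF_8));
    }
}
```

Under UTF-8 the lone byte 0xF1 is malformed, so it is replaced with U+FFFD during decoding and the re-encoded name no longer matches the bytes on disk. A lookup with that altered name would not find the directory, which would explain why `isDirectory()` returns false.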

Solution

It transpires that this is a known bug, and the only solution is to use Java 7's NIO.2 implementation (JSR 203): http://jcp.org/en/jsr/detail?id=203 I've tested this and it works as expected. With NIO.2, you could write a directory filter as detailed here: http://docs.oracle.com/javase/tutorial/essential/io/dirs.html#filter

The alternative solution is to get all your filenames into the same encoding, such as UTF-8, and ensure your locale matches. The trouble is that you can only convert to a new encoding if you know what the existing encoding was and it is consistent across your files.
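As a rough illustration of the NIO.2 approach, the recursive listing from the question could look something like this. It is a sketch rather than a drop-in replacement: the class and method names are invented, logging and the `esBorrable` check are left out, and the filter is an anonymous class so that it also compiles on Java 7.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical NIO.2 counterpart of listarDirectorio: collects sub-directories recursively.
public class NioDirectoryLister {

    // Filter that accepts only entries NIO.2 reports as directories.
    private static final DirectoryStream.Filter<Path> DIRECTORIES_ONLY =
            new DirectoryStream.Filter<Path>() {
                @Override
                public boolean accept(Path entry) {
                    return Files.isDirectory(entry);
                }
            };

    public static List<Path> listDirectories(Path root) throws IOException {
        List<Path> found = new ArrayList<>();
        collect(root, found);
        return found;
    }

    private static void collect(Path dir, List<Path> found) throws IOException {
        // try-with-resources closes the stream even if the walk fails.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, DIRECTORIES_ONLY)) {
            for (Path sub : stream) {
                // Same rule as the original code: skip names starting with "@".
                if (sub.getFileName().toString().startsWith("@")) {
                    continue;
                }
                found.add(sub);
                collect(sub, found);
            }
        }
    }
}
```

Calling `NioDirectoryLister.listDirectories(Paths.get("/some/root"))` (the path is a placeholder) returns every nested directory below the root, excluding the root itself.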

Alastair McCormack
  • Thank you. I solved the problem by changing the locale to es_ES.utf8 in the operating system. It only took a few extra lines in the .profile. – rrnieto Oct 13 '13 at 17:22