4

Assume the following structure in your resources folder:

resources
├─spec_A
| ├─AA
| | ├─file-aev
| | ├─file-oxa
| | ├─…
| | └─file-stl
| ├─BB
| | ├─file-hio
| | ├─file-nht
| | ├─…
| | └─file-22an
| └─…
├─spec_B
| ├─AA
| | ├─file-aev
| | ├─file-oxa
| | ├─…
| | └─file-stl
| ├─BB
| | ├─file-hio
| | ├─file-nht
| | ├─…
| | └─file-22an
| └─…
└─…

The task is to read all files for a given specification spec_X, one subfolder at a time. For obvious reasons we do not want to hard-code the exact names of hundreds of files as string literals to open with Source.fromResource("spec_A/AA/…").

Additionally, this solution should of course run inside the development environment, i.e. without being packaged into a jar.

Jan
  • 1,042
  • 8
  • 22

3 Answers

2

The only way I found to list files inside a resource folder is NIO's FileSystem concept, since it can load a jar file as a file system. But this comes with two major downsides:

  1. java.nio uses the Java Stream API, which I could not collect from inside Scala code: Collectors.toList() would not compile because the compiler could not determine the right type.
  2. The file system needs different base paths for OS file systems and jar-based file systems, so I have to distinguish manually between the two situations: running in tests and running from a jar.

First, lazily load the jar file system if needed:

  private static FileSystem jarFileSystem;

  static synchronized private FileSystem getJarFileAsFilesystem(String drg_file_root) throws URISyntaxException, IOException {
    if (jarFileSystem == null) {
      jarFileSystem = FileSystems.newFileSystem(ConfigFiles.class.getResource(drg_file_root).toURI(), Collections.emptyMap());
    }
    return jarFileSystem;
  }

Next, figure out whether we are running from inside the jar by checking the protocol of the URL, and return a Path. (Inside a jar file the protocol is jar:.)

  static Path getPathForResource(String resourceFolder, String filename) throws IOException, URISyntaxException {
    URL url = ConfigFiles.class.getResource(resourceFolder + "/" + filename);
    return "file".equals(url.getProtocol())
           ? Paths.get(url.toURI())
           : getJarFileAsFilesystem(resourceFolder).getPath(resourceFolder, filename);
  }
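The protocol check above can be tried without packaging anything. The following is only a sketch under assumptions: the class `ResourcePathSketch`, its `toPath` and `demo` methods, and the throwaway jar are hypothetical and not part of the linked project. It splits a jar: URI at the `!` separator and falls back to `FileSystems.getFileSystem` when the jar is already open, because `FileSystems.newFileSystem` throws `FileSystemAlreadyExistsException` in that case.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystemAlreadyExistsException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;

class ResourcePathSketch {

    // Maps a resource URI to a Path for both file: and jar: URIs.
    // Assumes a jar: URI always contains the "!" separator.
    static Path toPath(URI uri) throws IOException {
        if ("file".equals(uri.getScheme())) {
            return Paths.get(uri);
        }
        // jar:file:/path/app.jar!/spec_A/AA -> [jar:file:/path/app.jar, /spec_A/AA]
        String[] parts = uri.toString().split("!", 2);
        URI jarUri = URI.create(parts[0]);
        FileSystem fs;
        try {
            fs = FileSystems.newFileSystem(jarUri, Collections.<String, Object>emptyMap());
        } catch (FileSystemAlreadyExistsException e) {
            fs = FileSystems.getFileSystem(jarUri); // reuse the one that is already open
        }
        // The file system is left open on purpose; the returned Path depends on it.
        return fs.getPath(parts[1]);
    }

    // Hypothetical demo: writes one entry into a temporary jar and reads it back.
    static String demo() {
        try {
            Path jar = Files.createTempFile("resources-sketch", ".jar");
            Files.delete(jar); // the zip provider creates the file itself
            URI jarUri = URI.create("jar:" + jar.toUri());
            try (FileSystem fs = FileSystems.newFileSystem(
                    jarUri, Collections.singletonMap("create", "true"))) {
                Files.createDirectories(fs.getPath("/spec_A/AA"));
                Files.write(fs.getPath("/spec_A/AA/file-aev"),
                            "hello".getBytes(StandardCharsets.UTF_8));
            }
            Path inJar = toPath(URI.create(jarUri + "!/spec_A/AA/file-aev"));
            return new String(Files.readAllBytes(inJar), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Unlike a single lazily initialised static field, the fallback also covers the case where some other code has opened the same jar first.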

And finally list the files and collect them into a Java list:

  static List<Path> listPathsFromResource(String resourceFolder, String subFolder) throws IOException, URISyntaxException {
    return Files.list(getPathForResource(resourceFolder, subFolder))
      .filter(Files::isRegularFile)
      .sorted()
      .collect(toList());
  }
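One detail worth considering: the stream returned by `Files.list` holds an open directory handle and should be closed after use. A possible variant with try-with-resources is sketched below. The class `ClosingListSketch` and its `demo` method are made up for illustration, and the demo uses a temporary folder rather than a real resource folder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ClosingListSketch {

    // Lists the regular files directly inside dir, sorted by path.
    // The stream is closed as soon as the list has been collected.
    static List<Path> listRegularFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                .filter(Files::isRegularFile)
                .sorted()
                .collect(Collectors.toList());
        }
    }

    // Hypothetical demo: builds a throwaway folder and returns the file names found.
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("spec_A-AA");
            Files.createFile(dir.resolve("file-oxa"));
            Files.createFile(dir.resolve("file-aev"));
            Files.createDirectory(dir.resolve("nested")); // not a regular file, filtered out
            return listRegularFiles(dir).stream()
                .map(p -> p.getFileName().toString())
                .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The same method works for a Path obtained from a jar file system, since `Files.list` delegates to whichever provider owns the Path.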

Only then can we go back to Scala and read the files:

class SpecReader {
  def readSpecMessage(spec: String): String = {
    List("CN", "DO", "KF")
      .flatMap(ConfigFiles.listPathsFromResource(s"/spec_$spec", _).asScala.toSeq)
      .flatMap(path ⇒ Source.fromInputStream(Files.newInputStream(path), "UTF-8").getLines())
      .reduce(_ + " " + _)
  }
}

object Main {
  def main(args: Array[String]): Unit = {
    System.out.println(new SpecReader().readSpecMessage(args.head))
  }
}

I put a running mini project to prove it here: https://github.com/kurellajunior/list-files-from-resource-directory

But of course this is far from optimal. I want to eliminate the two downsides mentioned above, so that there are:

  1. Scala files only
  2. no extra testing code in my production library
Jan
2

Here's a function for reading all files from a resource folder. My use case is with small files. Inspired by Jan's answers, but without needing a user-defined collector or messing with Java.

// Helper for reading an individual file.
def readFile(path: Path): String =
  Source.fromInputStream(Files.newInputStream(path), "UTF-8").getLines.mkString("\n")


private var jarFS: FileSystem = null; // Static variable for storing a FileSystem. Will be loaded on the first call to getPath.
/**
 * Gets a Path object corresponding to an URL.
 * @param url The URL could follow the `file:` (usually used in dev) or `jar:` (usually used in prod) protocols.
 * @return A Path object.
 */
def getPath(url: URL): Path = {
  if (url.getProtocol == "file")
    Paths.get(url.toURI)
  else {
    // This hacky branch is to handle reading resource files from a jar (where url is jar:...).
    val strings = url.toString.split("!")
    if (jarFS == null) {
      jarFS = FileSystems.newFileSystem(URI.create(strings(0)), Map[String, String]().asJava)
    }
    jarFS.getPath(strings(1))
  }
}

/**
 * Given a folder (e.g. "A"), reads all files under the resource folder (e.g. "src/main/resources/A/**") as a Seq[String]. */
 * @param folder Relative path to a resource folder under src/main/resources.
 * @return A sequence of strings. Each element corresponds to the contents of a single file.
 */
def readFilesFromResource(folder: String): Seq[String] = {
  val url = Main.getClass.getResource("/" + folder)
  val path = getPath(url)
  val ls = Files.list(path)
  ls.collect(Collectors.toList()).asScala.map(readFile) // Magic!
}

(not catered to example in question)

Relevant imports:

import java.nio.file._
import scala.collection.JavaConverters._ // Needed for .asScala
import java.net.{URI, URL}
import java.util.stream._
import scala.io.Source
TrebledJ
  • 8,713
  • 7
  • 26
  • 48
  • Did you actually try this against both unpackaged and packaged program code? If I recall correctly, the last time I tried it this `Files.list(…)` method failed when the URI from `toURI` actually pointed to protocol `jar:…` – Jan Jul 20 '22 at 10:13
  • @Jan You were right, my original code didn't work for jar: protocol files. I tried using the jarFileSystem method in your answers but `getClass.getResource("/").toURI` kept returning `file:/opt/spark/conf/` (I'm using spark btw), which makes the FileSystem throw up. I managed to hack it together using this method: https://stackoverflow.com/a/32557217/10239789. A bit ugly, but it seems to work. – TrebledJ Aug 01 '22 at 03:47
  • OK, you avoid the self-written collector by using Java Collectors directly after the Files.list(). Neat. But the check for either jar or file remains the same, and you are using the jar file system too. I will merge those ideas to make it slimmer if that still works with the filtering. Thanks – Jan Aug 01 '22 at 10:43
0

Thanks to @TrebledJ ’s answer, this could be minimized to the following:

class ConfigFiles(val basePath: String) {
  lazy val jarFileSystem: FileSystem = FileSystems.newFileSystem(getClass.getResource(basePath).toURI, Map[String, String]().asJava);

  def listPathsFromResource(folder: String): List[Path] = {
    Files.list(getPathForResource(folder))
      .filter(p ⇒ Files.isRegularFile(p, Array[LinkOption](): _*))
      .sorted.toList.asScala.toList // from Stream to java List to Scala Buffer to scala List
  }

  private def getPathForResource(filename: String) = {
    val url = classOf[ConfigFiles].getResource(basePath + "/" + filename)
    if ("file" == url.getProtocol) Paths.get(url.toURI)
    else jarFileSystem.getPath(basePath, filename)
  }
}

Special attention was necessary for the empty settings maps.

Checking the URL protocol seems inevitable. The Git repository is updated, pull requests welcome: https://github.com/kurellajunior/list-files-from-resource-directory

Jan