I have a program that scrapes a webpage. I'm using JSoup and Selenium. To configure the user agent in the JSoup request, I have a userAgents.txt file containing a list of user agents. In each execution, I have a method that reads the .txt file, and returns a random user agent.

The program is working as expected when running in IntelliJ.

The problem happens when I try to build the .jar file, with mvn clean package. When running the .jar file, I get a FileNotFoundException, since the program can't find the userAgents.txt file.

If I remove this functionality, and hardcode the user agent, I have no problems.

The file is currently in src/main/resources. When executing the .jar, I get this exception:

java.io.FileNotFoundException: ./src/main/resources/userAgents.txt (No such file or directory)

I tried the maven-resources-plugin to copy the files into the target folder:

<plugin>
    <artifactId>maven-resources-plugin</artifactId>
    <version>3.3.0</version>
    <executions>
        <execution>
            <id>copy-resources</id>
            <phase>package</phase>
            <goals>
                <goal>copy-resources</goal>
            </goals>
            <configuration>
                <outputDirectory>${basedir}/target/extra-resources</outputDirectory>
                <includeEmptyDirs>true</includeEmptyDirs>
                <resources>
                    <resource>
                        <directory>${basedir}/src/main/resources</directory>
                        <filtering>false</filtering>
                    </resource>
                </resources>
            </configuration>
        </execution>
    </executions>
</plugin>

Even after changing the path inside the program (to open the file from target/extra-resources), the error persists.

I also added this <resources> block to the pom.xml, with no effect:

<resources>
    <resource>
        <directory>src/main/resources</directory>
        <includes>
            <include>**/*.txt</include>
            <include>**/*.csv</include>
        </includes>
    </resource>
</resources>

Inside the program, I'm reading the file using:

String filePath = "./src/main/resources/userAgents.txt";
File extUserAgentLst = new File(filePath);
Scanner usrAgentReader = new Scanner(extUserAgentLst);

So, my question is:

  • How can I make sure the userAgents.txt file is packaged inside the .jar file, so that when I run it, the program reads from this file and doesn't throw an exception?
Pexers
samsey8
  • To check that the file is actually inside the `jar` produced, you can use `jar tf file.jar` command to list the contents of it. – Evgeny Bovykin Oct 27 '22 at 10:19
  • Does this answer your question? [How to really read text file from classpath in Java](https://stackoverflow.com/questions/1464291/how-to-really-read-text-file-from-classpath-in-java) – xerx593 Oct 27 '22 at 10:33

1 Answer

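The path ./src/main/resources/userAgents.txt is resolved against the working directory. That works in IntelliJ, which typically runs the program from the project root, but the src folder does not exist where the .jar runs, and a File cannot point at an entry inside a .jar. Read the file from the classpath with getResourceAsStream instead, like so: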

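import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

public class MyClass {

  public static void main(String[] args) throws IOException {
    // Resolve the file from the classpath (the root of the jar), not from the file system.
    InputStream inStream = MyClass.class.getClassLoader().getResourceAsStream("userAgents.txt");
    if (inStream == null) {
      throw new IOException("userAgents.txt not found on the classpath");
    }
    // try-with-resources closes the reader and the underlying stream.
    // UTF-8 is an assumption about how the file is encoded.
    try (BufferedReader reader =
        new BufferedReader(new InputStreamReader(inStream, StandardCharsets.UTF_8))) {
      // Join with newlines so each user agent stays on its own line.
      String usersTxt = reader.lines().collect(Collectors.joining("\n"));
      System.out.println(usersTxt);
    }
  }

}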

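It shouldn't be necessary to specify the <resources> tag in the pom.xml file. By default, Maven copies everything under src/main/resources into target/classes and from there into the root of the .jar, so you just need to place your file inside src/main/resources before running the mvn package command to build the project.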
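For the use case in the question (returning one random user agent per run), the same idea could be sketched roughly as below. This is only an illustration: the class name UserAgentPicker and its method names are hypothetical, and it assumes userAgents.txt is UTF-8 encoded with one user agent per line. The main method uses in-memory sample data so the sketch runs without the file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

// Hypothetical helper class; the names are illustrative, not from the question.
public class UserAgentPicker {

  // Reads one user agent per line, trimming whitespace and skipping blank lines.
  public static List<String> readUserAgents(InputStream in) {
    try (BufferedReader reader =
        new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
      return reader.lines()
          .map(String::trim)
          .filter(line -> !line.isEmpty())
          .collect(Collectors.toList());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Loads the list from the classpath, so it works both in the IDE and from the jar.
  public static List<String> loadFromClasspath(String resourceName) {
    InputStream in = UserAgentPicker.class.getClassLoader().getResourceAsStream(resourceName);
    if (in == null) {
      throw new IllegalStateException("Resource not found on classpath: " + resourceName);
    }
    return readUserAgents(in);
  }

  // Picks one entry at random; the Random is passed in so callers can seed it.
  public static String pickRandom(List<String> agents, Random random) {
    if (agents.isEmpty()) {
      throw new IllegalStateException("No user agents available");
    }
    return agents.get(random.nextInt(agents.size()));
  }

  public static void main(String[] args) {
    // Sample data standing in for the real file contents.
    String sample = "ExampleAgent/1.0\n\nExampleAgent/2.0\n";
    List<String> agents =
        readUserAgents(new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8)));
    System.out.println(pickRandom(agents, new Random()));
  }
}
```

In the real program, the call would be something like pickRandom(loadFromClasspath("userAgents.txt"), new Random()), with the resource name given relative to src/main/resources.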

Pexers
  • I had to make some adjustments to my particular situation - but overall, this solved my problem. Thank you! – samsey8 Oct 27 '22 at 10:56