I have this CSV file:

id,name,mark
20203923380,Lisa Hatfield,62
20200705173,Jessica Johnson,59
20205415333,Adam Harper,41
20203326467,Logan Nolan,77

And I'm trying to process it with this code:

try (Stream<String> stream = Files.lines(Paths.get(String.valueOf(csvPath)))) {
    DoubleSummaryStatistics statistics = stream
            .map(s -> s.split(",")[index]).skip(1)
            .mapToDouble(Double::valueOf)
            .summaryStatistics();
} catch (IOException e) // more code

I want to get the column by its name.

I guess I need to validate the index to be the index of the column the user enters as an integer, like this:

int index = Arrays.stream(stream).indexOf(columnNS);

But it doesn't work.

The stream is supposed to have the following values, for example:

Column: "mark"

62, 59, 41, 77

Alexander Ivanchenko
abdelsh
  • I'd suggest you look at using a third-party library such as Apache commons-csv – D-Dᴙum Apr 24 '22 at 08:51
  • For this project I can't use any library outside the Java SE library – abdelsh Apr 24 '22 at 08:51
  • You know what the columns are, so why don't you build a `Map` so you can do something like `.map(s -> s.split(",")[Mark.getColumn("id")])`? – g00se Apr 24 '22 at 10:07

1 Answer

I need to validate the index to be the index of the column the user enters as an integer ... But it doesn't work.

Arrays.stream(stream).indexOf(columnNS)

There is no method indexOf in the Stream API, and Arrays.stream() accepts an array, not a Stream. I'm not sure what you meant by stream(stream), but this approach is wrong.

To obtain a valid index, you need the name of the column, and you have to look it up in the very first line read from the file. In your example with the column name "mark", you need to check whether this name is present in the first row and, if so, what its index is.
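
As a minimal sketch of that lookup, assuming the header is a plain comma-separated line with no quoted fields (the class and method names here are hypothetical, chosen only for illustration):

```java
import java.util.Arrays;
import java.util.List;

class HeaderIndexSketch {

    // Returns the position of columnName in the header line, or -1 if it is absent.
    // Assumes a simple comma-separated header without quoted fields.
    static int columnIndex(String headerLine, String columnName) {
        List<String> headers = Arrays.asList(headerLine.split(","));
        return headers.indexOf(columnName);
    }

    public static void main(String[] args) {
        String header = "id,name,mark"; // first line of the CSV from the question
        System.out.println(columnIndex(header, "mark"));  // prints 2
        System.out.println(columnIndex(header, "grade")); // prints -1
    }
}
```

A result of -1 is the signal that the user entered a column name that does not exist in the file.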

What I want is to get the column by its name ... The stream is supposed ...

Stream operations are intended to be stateless. Streams were introduced in Java to provide an expressive and clear way of structuring code. Even if you manage to cram stateful conditional logic into a stream, you'll lose this advantage and end up with convoluted code that is less clear and less performant than a plain loop (reminder: an iterative solution often performs better).

So if you want to keep your code clean, you can either solve this problem with an iterative approach or give up the requirement to determine the index of the column dynamically inside the stream.
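
The second option could look roughly like the sketch below: the header is read once, outside the stream, and only the remaining lines are streamed. This is an illustration under stated assumptions, not the only way to do it. The class and method names are hypothetical, the input is assumed to be simple comma-separated text without quoted fields, and a StringReader stands in for the file (in practice the reader could come from Files.newBufferedReader(path)).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;

class ColumnStatsSketch {

    // Resolves the column index from the header first, then streams the remaining lines.
    static DoubleSummaryStatistics columnStats(BufferedReader reader, String columnName) {
        String header;
        try {
            header = reader.readLine(); // consumes the header line
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (header == null) {
            throw new IllegalArgumentException("Input is empty");
        }
        int index = Arrays.asList(header.split(",")).indexOf(columnName);
        if (index == -1) {
            throw new IllegalArgumentException("Column '" + columnName + "' wasn't found");
        }
        // lines() continues from the current position, so the header is not seen again
        return reader.lines()
                .map(line -> line.split(",")[index])
                .mapToDouble(Double::parseDouble)
                .summaryStatistics();
    }

    public static void main(String[] args) throws IOException {
        String csv = "id,name,mark\n"
                + "20203923380,Lisa Hatfield,62\n"
                + "20200705173,Jessica Johnson,59\n"
                + "20205415333,Adam Harper,41\n"
                + "20203326467,Logan Nolan,77\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(csv))) {
            DoubleSummaryStatistics stats = columnStats(reader, "mark");
            // prints 41.0 77.0 59.75
            System.out.println(stats.getMin() + " " + stats.getMax() + " " + stats.getAverage());
        }
    }
}
```

The stream itself stays stateless here because the index is fixed before the pipeline starts.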

Here is how you can read the file data dynamically based on the column name using loops:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;

public class Main {

    // reads the values of the given column, skipping the header line
    public static List<String> readFile(Path path, String columnName) {
        List<String> result = new ArrayList<>();
        try (var reader = Files.newBufferedReader(path)) {
            int index = -1;
            String line;
            while ((line = reader.readLine()) != null) {
                // split on commas only, so values such as "62.5" or "O'Neil" stay intact
                String[] arr = line.split(",");
                if (index == -1) {
                    index = getIndex(arr, columnName);
                    continue; // skipping the first line
                }
                result.add(arr[index]);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return result;
    }

    // validation logic resides here
    public static int getIndex(String[] arr, String columnName) {
        int index = Arrays.asList(arr).indexOf(columnName);
        if (index == -1) {
            throw new IllegalArgumentException("Given column name '" + columnName + "' wasn't found");
        }
        return index;
    }

    // extracting statistics from the file data
    public static DoubleSummaryStatistics getStat(List<String> list) {
        return list.stream()
            .mapToDouble(Double::parseDouble)
            .summaryStatistics();
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics stat = getStat(readFile(Path.of("test.txt"), "mark"));
        System.out.println(stat);
    }
}
Alexander Ivanchenko
  • I don't want to get the values in the main method; I want to get them in another method. So how can I use ``` DoubleSummaryStatistics stat = getStat(readFile(csvPath, columnNs)); ``` with ```.getMin()``` ```getMax()``` ```getAverage()``` in getters and setters? ```//code public static DoubleSummaryStatistics getStat(List list) { return list.stream() .mapToDouble(Double::valueOf) .summaryStatistics(); } public void setMean(BigDecimal mean) { this.mean = mean; } //more setters ``` – abdelsh Apr 24 '22 at 21:03
  • @abdelsh If you want to construct a *custom object* based on the data from the `DoubleSummaryStatistics`, you can create a separate method responsible for that. I'd suggest using a constructor or the Builder pattern instead of setters. If you have an issue with implementing this logic, I recommend opening a *new question*, since it isn't directly related to the topic of reading data from a file. – Alexander Ivanchenko Apr 24 '22 at 23:51
  • That worked out really well, but now I have a problem with a calculation: what is the best way to sort the stream and get its middle value or, if there are two middle values, get both and compute the median from them? – abdelsh Apr 24 '22 at 23:56
  • @abdelsh [`DoubleSummaryStatistics`](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/DoubleSummaryStatistics.html#method-summary) can provide you with the average, min and max. To calculate a *median*, create a `double[]` array from the list returned by `readFile()` like this: `readFile().stream().mapToDouble(Double::valueOf).sorted().toArray();`. That will give you a sorted array. For the next step, see [here](https://stackoverflow.com/questions/11955728/how-to-calculate-the-median-of-an-array) or this [tutorial](https://javatutoring.com/java-calculate-median/) – Alexander Ivanchenko Apr 25 '22 at 00:08
  • How can I implement a writeFile method to create a new file and implement them? What I'm doing is trying to create the file inside the fileReader, which is the following code: ```String[] arr = String.valueOf(csvPath).split("\\\\"); int i = 0; for (i=0; i< arr.length; i++){ } String csvFile = arr[i]; File newFile = new File(csvFile); try { newFile.createNewFile(); } catch (IOException e) { e.printStackTrace(); } ``` – abdelsh Apr 25 '22 at 20:25
  • The getStats method keeps throwing a NullPointerException when the selected column contains strings. How do I make it report that the selected column is not calculable? This is [the task I have to do](https://stackoverflow.com/questions/72006151/task-calculations-csv-file-using-java) – abdelsh Apr 25 '22 at 22:05
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/244199/discussion-between-alexander-ivanchenko-and-abdelsh). – Alexander Ivanchenko Apr 25 '22 at 22:33
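
The two follow-up questions in the comments (computing a median, and coping with a column that is not numeric) could be approached along the lines of the sketch below. It is only an illustration under assumptions: the class and method names are hypothetical, the list is assumed to contain no null elements, and "not calculable" is modelled as an empty Optional. Note that Double.parseDouble throws NumberFormatException for text it cannot parse, and NullPointerException only for a null argument.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class MedianSketch {

    // Parses every value, or returns an empty Optional as soon as one value
    // cannot be parsed by Double.parseDouble (for example a name column).
    static Optional<double[]> parseColumn(List<String> values) {
        double[] parsed = new double[values.size()];
        for (int i = 0; i < parsed.length; i++) {
            try {
                parsed[i] = Double.parseDouble(values.get(i).trim());
            } catch (NumberFormatException e) {
                return Optional.empty();
            }
        }
        return Optional.of(parsed);
    }

    // Median of a non-empty array: the middle element of the sorted values,
    // or the mean of the two middle elements when the length is even.
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("No values");
        }
        double[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        Optional<double[]> marks = parseColumn(List.of("62", "59", "41", "77"));
        System.out.println(marks.map(MedianSketch::median).orElse(Double.NaN)); // prints 60.5

        Optional<double[]> names = parseColumn(List.of("Lisa Hatfield", "Adam Harper"));
        System.out.println(names.isPresent()); // prints false
    }
}
```

The caller can then decide how to report an empty result, for example with a message that the selected column is not numeric.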