0

I am not able to use multiple aggregate function with multiple group by attribute using java 8 stream apis. Yet I am able to use single aggregate function with multiple group by attribute using java stream but I have a requirement where I need multiple min or max or aggregate function to use modification on list.

I need the same result that can get it from SQL like below mentioned but I have an employee list and I have to do this with java only.

Note: Employee ID is not unique its duplicate in employee table in my example.

SELECT emp_id,
       emp_name,
       year_of_joining,
       gender,
       department,
       Min(salary),
       Min(age)
FROM   employee
GROUP  BY emp_id,
          emp_name,
          year_of_joining,
          gender,
          department 

Below is the list of Employee Object

List<Employee> employeeList = new ArrayList<Employee>();
         
employeeList.add(new Employee(111, "Jiya Brein", 32, "Female", "HR", 2011, 25000.0));
employeeList.add(new Employee(111, "Jiya Brein", 32, "Female", "HR", 2011, 21000.0));
employeeList.add(new Employee(111, "Jiya Brein", 32, "Female", "HR", 2011, 29000.0));
employeeList.add(new Employee(111, "Jiya Brein", 15, "Female", "HR", 2011, 25000.0));

Below is the Employee Object

class Employee
{
    int id;
     
    String name;
     
    int age;
     
    String gender;
     
    String department;
     
    int yearOfJoining;
     
    double salary;
     
    public Employee(int id, String name, int age, String gender, String department, int yearOfJoining, double salary) 
    {
        this.id = id;
        this.name = name;
        this.age = age;
        this.gender = gender;
        this.department = department;
        this.yearOfJoining = yearOfJoining;
        this.salary = salary;
    }
     
    public int getId() 
    {
        return id;
    }
     
    public String getName() 
    {
        return name;
    }
     
    public int getAge() 
    {
        return age;
    }
     
    public String getGender() 
    {
        return gender;
    }
     
    public String getDepartment() 
    {
        return department;
    }
     
    public int getYearOfJoining() 
    {
        return yearOfJoining;
    }
     
    public double getSalary() 
    {
        return salary;
    }
     
    @Override
    public String toString() 
    {
        return "Id : "+id
                +", Name : "+name
                +", age : "+age
                +", Gender : "+gender
                +", Department : "+department
                +", Year Of Joining : "+yearOfJoining
                +", Salary : "+salary;
    }
}

I have tried lot but I was not able to get the exact result with Employee Object. Any Help would be really appreciated. I don't know how to use multiple aggregate function with multiple group by statements so that in last I can get list of Employee object. I have tried to do this with below mentioned code but able to do with single aggregate. How can i do with multiple aggregate.


    Function<Employee, List<Object>> compositeKey = personRecord -> Arrays.<Object>asList(personRecord.getId(),
        personRecord.getName(), personRecord.getGender(), personRecord.getDepartment(),
        personRecord.getYearOfJoining());

    Map<List<Object>, Optional<Employee>> collect = employeeList.stream().collect(
        Collectors.groupingBy(compositeKey, Collectors.minBy(Comparator.comparingDouble(Employee::getSalary))));
neeraj bharti
  • 361
  • 1
  • 3
  • 20
  • have you tried anything? show your effort – bananas Dec 20 '21 at 06:34
  • Yes i have tried using collectors.teeing and using collectors.collect but was able to get the single aggregate function but not able to to do this with multiple aggrgate. What i have tried i will update in question. – neeraj bharti Dec 20 '21 at 07:43
  • @bananas I have updated my efforts in post kindly look and suggest if i need to add something. – neeraj bharti Dec 20 '21 at 07:54
  • @neerajbharti Ideally in the employee table, employee id should be unique. Does your employee records contain duplicate entries for employee id? If it is unique, then I believe that grouping by `emp_id, emp_name, year_of_joining, gender, department` is exactly same as just grouping by `emp_id` – Gautham M Dec 20 '21 at 08:08
  • @Gautham M Yes in my case employee id is duplicate in employee table. And the result would not be same if we use group by statement. – neeraj bharti Dec 20 '21 at 08:25

3 Answers3

1

Here is the solution.

Function<Employee, List<Object>> compositeKey = personRecord -> Arrays.<Object>asList(personRecord.getId(),
        personRecord.getName(), personRecord.getGender(), personRecord.getDepartment(),
        personRecord.getYearOfJoining());

 List<Employee> groupedEmployees = employeeList.stream().collect(Collectors.groupingBy(compositeKey)).values()
            .stream()
            .map(e -> new Employee(e.get(0).getId(), e.get(0).getName(),
                e.stream().mapToInt(Employee::getAge).min().orElse(0), e.get(0).getGender(), e.get(0).getDepartment(),
                e.get(0).getYearOfJoining(), e.stream().mapToDouble(Employee::getSalary).min().orElse(0)))
            .collect(Collectors.toList());
neeraj bharti
  • 361
  • 1
  • 3
  • 20
0

Here a dangerous solution

override Employee equals and HashCode methods like below(you can add null check if u need)

 @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Employee employee = (Employee) o;
        return id == employee.id && yearOfJoining == employee.yearOfJoining && name.equals(employee.name) && gender.equals(employee.gender) && department.equals(employee.department);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, name, gender, department, yearOfJoining);
    }

now u can group like that

final Map<Employee, List<Double>> groupedEmployees = employeeList.stream()
                .collect(Collectors.groupingBy(e -> e,
                        Collectors.collectingAndThen(Collectors.toList(), list -> {
                            final double minSalary = list.stream().mapToDouble(Employee::getSalary).min().orElse(0);
                            final double minAge = list.stream().mapToDouble(Employee::getAge).min().orElse(0);
                            return List.of(minSalary, minAge);
                        })));

for test add more data coz your data doesn't contain same group (do it before mapping :) )

employeeList.add(new Employee(111, "Jiya Brein", 32, "Female", "HR", 2011, 25000.0));
employeeList.add(new Employee(111, "Jiya Brein", 32, "Female", "HR", 2011, 21000.0));
employeeList.add(new Employee(111, "Jiya Brein", 32, "Female", "HR", 2011, 29000.0));
employeeList.add(new Employee(111, "Jiya Brein", 15, "Female", "HR", 2011, 25000.0));

groupedEmployees.forEach((k, v) -> System.out.println(k + " -> " + v));
divilipir
  • 882
  • 6
  • 17
  • Thanks for your help and suggestion. Is there any other work around so that I could not override Employee Object equals and hashcode and the same thing would be done by stream api as I also write one code for single aggregate function which is working fine so what I need to change or update in my code so that it will work and see the same answer. – neeraj bharti Dec 20 '21 at 08:16
  • Thanks for your help as I dont want to override equals and hashcode method so not marking your answer as correct but its gives me the way to solve it. Thanks its really very appreciated. – neeraj bharti Dec 20 '21 at 12:01
0

This task is resolved by introducing separate objects/containers to store "groupBy" fields and "aggregated" fields. In the simplest form, raw lists could be used negore Java 14 but currently they can be replaced with record (in Java 14+).

ThenCollectors.toMap with merge function to implement actual aggregation should be used as shown below:

public record GroupBy(int id, String name, String gender, String department, int yearOfJoining) {}
public record Aggregate(int age, double salary) {}

List<Employee> groupedBy = employeeList.stream()
    .collect(Collectors.toMap(
        emp -> new GroupBy(emp.getId(), emp.getName(), emp.getGender(), 
            emp.getDepartment(), emp.getYearOfJoining()
        ), // grouped by fields
        emp -> new Aggregate(emp.getAge(), emp.getSalary()), // aggregated fields
        (ag1, ag2) -> new Aggregate(
            Math.min(ag1.age(), ag2.age()),
            Math.min(ag1.salary(), ag2.salary())
        )
        , LinkedHashMap::new // keep insertion order same as in the input
    )) // Map<GroupBy, Aggregate>
    .entrySet()
    .stream()
    .map(e -> new Employee(
        e.getKey().id(), e.getKey().name(), e.getValue().age(),
        e.getKey().gender(), e.getKey().department(), 
        e.getKey().yearOfJoining(), e.getValue().salary()
    ))    
    .collect(Collectors.toList());

Prior to Java 14, similar classes should be created and used, and for GroupBy class the methods hashCode and equals have to be implemented.

Assuming Lombok is used to generate all the boilerplate code, the same can be achieved:

@Accessors(fluent = true) // get rid of "get" prefixes
@Data
class GroupBy {
    private final int id;
    private final String name;
    private final String gender;
    private final String department;
    private final int yearOfJoining;
}

@Accessors(fluent = true)
@Data
@AllArgsConstructor
class Aggregate {
    private int age;
    private double salary;

    public Aggregate min(Aggregate other) {
        this.age = Math.min(this.age, other.age);
        this.salary = Math.min(this.salary, other.salary);
        return this;
    }
}

Here the merge function is implemented in Aggregate class itself.

List<Employee> groupedBy = employeeList.stream()
    .collect(Collectors.toMap(
        emp -> new GroupBy(emp.getId(), emp.getName(), emp.getGender(), 
            emp.getDepartment(), emp.getYearOfJoining()
        ), // grouped by fields
        emp -> new Aggregate(emp.getAge(), emp.getSalary()), // aggregated fields
        (ag1, ag2) -> Aggregate::min // use merge function
        , LinkedHashMap::new // keep insertion order same as in the input
    )) // Map<GroupBy, Aggregate>
    .entrySet()
    .stream()
    .map(e -> new Employee(
        e.getKey().id(), e.getKey().name(), e.getValue().age(),
        e.getKey().gender(), e.getKey().department(), 
        e.getKey().yearOfJoining(), e.getValue().salary()
    ))    
    .collect(Collectors.toList());
Nowhere Man
  • 19,170
  • 9
  • 17
  • 42
  • I have one doubt suppose if my age and salary is String and i have to find min for both and the result would be the same as form the oracle sql then what i need to change here. new Aggregate( Math.min(ag1.age(), ag2.age()), Math.min(ag1.salary(), ag2.salary()) )``` – neeraj bharti Dec 23 '21 at 10:42
  • If these fields are returned as String from the database, they need to be converted to appropriate number (`int` or `double`) when putting to `Aggregate` object/record, because Strings are compared in alphabetical order and `"10"` is "less" than `"9"`. Then, when processing is done, these values may be converted back to String using `String.valueOf`. – Nowhere Man Dec 23 '21 at 10:55
  • You are not getting my point if these values are of String like 'X','Y','A' then min should give A and max should give X the how can we handle this. – neeraj bharti Dec 23 '21 at 12:24
  • You are asking different things: how age or salary could be `A`,`X`? (btw, floating-point [should NEVER be used for money/currency](https://stackoverflow.com/questions/3730019/why-not-use-double-or-float-to-represent-currency)) Or you're asking _now_ how to get min and max at the same time? – Nowhere Man Dec 23 '21 at 12:36
  • your answer is correct for age and salary but suppose I want min and max for any attribute like name or department or any other String attribute then how can I handle this condition this is my question. i am happy with your answer and its working fine but if i change the condition then how can i handle it. – neeraj bharti Dec 23 '21 at 12:48
  • The approach is the same: create a record/object for the new "groupBy" (as it's definitely changed), and implement another record/object for the aggregated part -- it may store both min and max values as you need; update the logic in `merge` function according to your needs. – Nowhere Man Dec 23 '21 at 13:23
  • Also, if there are such huge changes in your requirements, you should ask another question. – Nowhere Man Dec 23 '21 at 13:32