0

I am trying to resolve a problem which i am currently facing. I am trying to find out which is the fastest way to retrieve information from a database.

So a managers name is passed in, what i need to do is return all the users who's manager is passed in, but if any of these users happen to be a manager, i have to return all the user names (List<String>) they manage also... this will repeat until i return everyone

here is a small example of what i need to return

manager --> manager --> manager --> manager --> employee
        --> manager --> employee
        --> manager --> manager --> manager --> employee
        --> employee

so in the example above the code would be returning 12 names

i know i could do this a number of different ways but i do not know which would be the best way (recursive for loop, for loop recursively calling method, SQL statement, HQL statement ... etc.)

As this list can be any size depending on the manager passed in, i need to find which would be the quickest way to retrieve this as i have coded this to use recursive for loops and it takes 1 minute 20 seconds which is WAY too long

Any ideas?

Hip Hip Array
  • 4,665
  • 11
  • 49
  • 80
  • maybe you should try to do it with one query, if possible. How does the table structure looks like ? – Olha Jun 09 '15 at 14:48
  • i have a User object which just has a manager name which is another User object but is not linked with a foreign key or anything so have to search using the name which is awful (not my design, this was the way it was when i got the code) – Hip Hip Array Jun 09 '15 at 14:52
  • How many users are there in the database? If you can load them all in memory you can perform the query quickly by searching the object hierarchy. – Adriaan Koster Jun 09 '15 at 15:11
  • 1 minute 20 seconds to return a list of which size? Can you change the DB design or not? – gvo Jun 09 '15 at 15:13
  • @AdriaanKoster : Currently where is just over 13000 users in the database and this will be constantly growing – Hip Hip Array Jun 09 '15 at 15:17
  • @gvo : I can change the database structure but i only want to be doing that as a last resort\ – Hip Hip Array Jun 09 '15 at 15:18
  • Could you show us your code? I think it might help – gvo Jun 09 '15 at 15:18
  • 13k users is not that much, you could load all of them, especially if you create stub objects (just the id and relations). – Adriaan Koster Jun 09 '15 at 15:20

2 Answers2

1

What you need is to perform a performance analysis, in order to know if the latency comes from the database, the application server, or the network (latency due to many loops). According to your figures, I think you are doing too many queries, but this is an hypothesis you should verify.

In my company, which is a big company, there is no more than 15 levels between the Big Boss and any employee. Therefore, you shouldn't have to do more than 15 request.

I suspect that you do one loop for each people, in order to know it's employee. Get where manager = name What you could do is doing one HQL request to get all the employees on a list "get where manager IN (list of manager)". It should dramatically reduce the time spent, because you would do 15 request recursively instead of 13k for the big boss.

Request 1    Request 2   Request 3 ...
manager   --> manager --> manager --> manager --> employee
          --> manager --> employee
          --> manager --> manager --> manager --> employee
          --> employee

Otherwise, if you want to use an SQL statement, you might conisder using the keyword WITH.

gvo
  • 845
  • 1
  • 14
  • 22
  • yes the latency is coming from the many loops (e.g. one manager can have 20 sub managers) – Hip Hip Array Jun 09 '15 at 15:31
  • Then, instead of getting the employees for each manager, fetch all the employees for every manager at the same hierarchical level. It only changes your request, you cannot do less modifications. ;) – gvo Jun 09 '15 at 15:33
  • yes if i use the 'IN' operator then this should reduce the amount of calls being made straight away. I will try this now and see does this improve performance much – Hip Hip Array Jun 09 '15 at 15:37
  • sorry meant to mark this as correct earlier... i done a few other changes to my code but using the IN clause was the biggest increase in performance... cheers – Hip Hip Array Jun 17 '15 at 09:00
-1

See this similar question.

Also see this presentation about database antipatterns by Bill Karwin. From slide 48 onwards the 'Naive Tree' antipattern is discussed.

Solution #1: Path enumeration. Store the path of ancestors (managers in your case):

ID, PATH, NAME
1, 1/, george
2, 2/, peter
3, 1/3, harry
4, 1/3/4, bertrand

Easy to query descendants of george:

SELECT * from Employee WHERE path LIKE '1/%';

The presentation also mentions other solutions which I think are less useful in your case:

  • Solution #2: Nested sets
  • Solution #3: Closure table

EDIT: here's another idea mixing two database queries with a recursive in-memory search.

    public static class Employee {
        private Long id;
        private boolean isManager;
        private Employee manager;

        public Long getId() {
            return id;
        }

        public void setId(Long id) {
            this.id = id;
        }

        public boolean isManager() {
            return isManager;
        }

        public void setIsManager(boolean isManager) {
            this.isManager = isManager;
        }

        public Employee getManager() {
            return manager;
        }

        public void setManager(Employee manager) {
            this.manager = manager;
        }
    }

First get all managers in the database. If the total number of Employees is 16k then the number of managers should be manageable (no pun intended).

    // gets all existing managers from the database
    private static List<Employee> getAllManagers() {
        //  SELECT * FROM Employee WHERE isManager = true;
        return new ArrayList<>();
    }

Then iterate over all managers and recursively determine which managers work under the query manager.

    private static Set<Employee> findSubordinateManagers(Employee queryManager) {
        List<Employee> allManagers = getAllManagers();
        Set<Employee> subordinateManagers = new HashSet<>();
        for (Employee employee : allManagers) {
            if (isSubordinateTo(employee, queryManager)) {
                subordinateManagers.add(employee);
            }
        }
        return subordinateManagers;
    }    

    // determines if the given employee is subordinate to the given manager
    private static boolean isSubordinateTo(Employee employee, Employee manager) {
        if (employee.getManager() == null) {
            return false;
        }
        if (employee.getManager().getId().equals(manager.getId())) {
            return true;
        }
        return isSubordinateTo(employee, employee.getManager());
    }

Then do a second SQL query to get all the employees directly managed by the set of selected managers:

    // finds all employees directly subordinate to the given set of managers
    private static Set<Employee> findEmployees(Set<Employee> managers) {
        // SELECT * from Employee WHERE manager IN (managers);
        return new HashSet<>();
    }    
Community
  • 1
  • 1
Adriaan Koster
  • 15,870
  • 5
  • 45
  • 60
  • In this book, every solution is related to using the right design in the DB, which he doeesn't really want to change. And 13k records, with a depth of 20 max, he doesn't really need to. – gvo Jun 09 '15 at 15:28