1

My problem is that even if I take ping etc. into account, the following API request takes 10 times longer in my production environment(remote database(AWS), docker, etc.) with the same amount of data. It takes around 150ms when I run it on localhost and 1.5-2s when I run it on production. What might cause that huge difference? Like I said its the same amount of data, the ping, and the traffic transmitted(only 5kb) shouldn't cause that. I also only have two SQL queries I manually call. A lot of more SQL queries are automatically called because of the OneToMany relation you can see below. But I can't think of why that would cause any difference in performance between localhost and production.

Thats the API method i call:

@GetMapping
public ResponseEntity<List<Project>> getProjects(@RequestParam(required = false) boolean additionalInfo) {
  return ResponseEntity.ok(projectService.getProjects(additionalInfo));
}

Inside the service:

  private List<Project> getProjects(boolean additionalInfo) {
    List<Project> projects = (List<Project>) projectRepository.findAll();
    if (additionalInfo) {
      for (Project project : projects) {
        setAdditionalInfo(project);
      }
    }
    return projects;
  }

  private void setAdditionalInfo(Project project) {
    Integer notificationCount = 0;
    Duration duration = Duration.ZERO;
    LocalDate lastDailyEntryDate = LocalDate.MIN;

    for (DailyEntry dailyEntry : dailyEntryRepository.findByProjectId(project.getId())) {
      if (dailyEntry.getStatus().equals(EntryStatus.OPEN)) {
        notificationCount++;
      }
      duration = duration.plus(Duration.between(dailyEntry.getStartTime(), dailyEntry.getEndTime()));
      if (lastDailyEntryDate.isBefore(dailyEntry.getDate())) {
        lastDailyEntryDate = dailyEntry.getDate();
      }
    }

    int inactiveSince = calculateInactiveSince(lastDailyEntryDate, project);
    if (inactiveSince >= 3 && !project.getIsArchived()) {
      project.setInactiveSince(inactiveSince);
    }
    project.setNotificationCount(notificationCount);
    project.setUsedBudget(duration.toHours() * 100);
  }

  private Integer calculateInactiveSince(LocalDate lastDailyEntryDate, Project project) {
    int inactiveSince;
    if (lastDailyEntryDate != LocalDate.MIN) {
      Period period = Period.between(lastDailyEntryDate, LocalDate.now());
      inactiveSince = period.getYears() * 12 + period.getMonths();
    } else {
      Period period = Period.between(project.getCreationDate(), LocalDate.now());
      inactiveSince = period.getYears() * 12 + period.getMonths();
    }
    return inactiveSince;
  }

Thats how the entites look like:

Project:

@Data
@Entity
@ToString(exclude = {"employees"})
@JsonInclude(JsonInclude.Include.NON_NULL)
public class Project {

  @Id
  @GeneratedValue(strategy = GenerationType.AUTO)
  private long id;

  @Column(unique = true)
  private String name;

  private Integer budget;

  private String description;

  private Boolean isArchived;

  private LocalDate archivedDate;

  private LocalDate creationDate;

  @NotNull
  @ManyToOne
  private Customer customer;

  @ManyToMany
  @JoinTable(
          name = "employee_projects",
          joinColumns = @JoinColumn(name = "project_id"),
          inverseJoinColumns = @JoinColumn(name = "employee_id")
  )
  private List<Employee> employees;

  @Transient
  private Long usedBudget;

  @Transient
  private Integer notificationCount;

  @Transient
  private Boolean isInactive;

  @Transient
  private Integer inactiveSince;
}

DailyEntry:

@Data
@Entity
@JsonInclude(JsonInclude.Include.NON_NULL)
public class DailyEntry {

  @Id
  @GeneratedValue(strategy = GenerationType.AUTO)
  private long id;

  private LocalDate date;
  private LocalTime startTime;
  private LocalTime endTime;
  private Duration breaks;
  private String performanceRecord;
  private EntryStatus status;

  @ManyToOne
  private Project project;

  @ManyToOne
  private Employee employee;
}
nomadSK25
  • 2,350
  • 3
  • 25
  • 36
M.Dietz
  • 900
  • 10
  • 29
  • have you measured where the time is spent? if not, why not? – eis Dec 10 '19 at 15:07
  • i didnt yet because i dont have acces to the production backend infrastructure yet, will check whether im going to be able to get more information the following days. Im new to the project, was just wondering whether theres an obvious reason someone here could see immediately, but seems like its a deeper problem which will need more analysis. – M.Dietz Dec 10 '19 at 20:21

3 Answers3

1

By default Hibernate FetchType is EAGER (it fetched the child entities along with a parent).

If you really need child data too you can either use: Cache or Pagination or use EntityGraph

Or you can just Set FetchType.Eager.

for example:

@NotNull @ManyToOne(fetch = FetchType.Lazy) private Customer customer;

Sagar Saud
  • 124
  • 2
  • 11
  • But why does that lead to a longer request time in production? Shouldnt that have the same impact on both localhost and production(remote)? – M.Dietz Dec 10 '19 at 14:35
  • Assuming there is a huge chunk of data on your PROD Environment, It WILL take more require time for your service to return the value. Also if there might be a Latency issue or your DB issue. You could use the Actuator to find the health of your system. – Sagar Saud Dec 11 '19 at 08:46
  • But like i said, i have the same amount of data on localhost as i have on production(copied the production data over to localhost). Latency shouldnt be an issue because other simple requests work fine(they would be slow too if latency from frontend to backend was the issue). So i guess it has to be somehow related to the DB. – M.Dietz Dec 11 '19 at 09:44
1

This kind of issues can be solved by elimination of the parts that contribute to the slowness:

  1. Network
  2. Self Service processing time (your server code)
  3. Database calls

First of all lets eliminate the network (maybe you're trying to reach the server that is in another end of the World)

So you can use spring boot actuator that has an endpoint that shows the information about last 50 requests, including the execution time. This time is measured from the point the request reaches the server till the time that response is sent.

Now, if you see "1.5-2 seconds" there, its on server, otherwise check the network.

The self-processing should be the same unless you're processing a different amount of data which is not true as you state, so I move on.

Regarding the Database calls. This is a very broad topic, in general there can be many reasons:

  • The DB server works much busier in production, because it has to serve much more requests (coming from your flow or other flows, or maybe other applications configured to work with this DB).
  • The hardware that runs the DB is slow/weak

  • The tables lack some index and the requests are slow.

One way or another this has to be thoroughly checked.

I believe the best way is to measure the query execution time and place this data to some log. There are many ways to implement such a log, starting with "manually" measuring the time, and ending with wrapping the Repository with a logging proxy like stated in this SO thread

Mark Bramnik
  • 39,963
  • 4
  • 57
  • 97
0

projectRepository.findAll() is the root cause. You should never do this, rather invoke through pagination:

projectRepository.findAll(Pageable p);

Ctorres
  • 468
  • 6
  • 19
Kashif
  • 1
  • 1
  • I never worked with pagination yet, whats the difference between invoking it through pagination? I want to get all projects anyways, isnt pagination used if i wanted to only get a few projects instead of all? – M.Dietz Dec 10 '19 at 13:58
  • //below code will process all projects in batches of 1000 projects... Pageable pageable = PageRequest.of(0, 1000); while(true) { Page projectPage = projectRepository.findAll(pageable); List< Project > users = projectPage().map(project -> { Project u = new Project(); //do additional work return u; }).collect(Collectors.toList()); if (!projectPage hasNext()) break; pageable = projectPage.nextPageable(); } – Kashif Dec 11 '19 at 17:18