1

So I have a normal UserEntity class along with a normal UserRepository interface. All with proper annotation (I hope).

I am curious as to what would be the time complexity of these queries:

userRepository.findById(id)

and

userRepository.findByPublicId(publicId)

As you can see the first one is using the primary key id and the second one is using the column publicId.

I have a database with 44,356 fake UserEnities and somehow the speed for both of these is generally the same, even after fresh runs. Of course, this is good but how is this possible when one is supposed to be o(log(n)) and the other is o(n)

Here are the details of the objects:

UserEntity:

    @Data
    @AllArgsConstructor
    @NoArgsConstructor
    @Builder
    @Entity
    public class UserEntity {
    
        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;
    
        @Email
        @NotEmpty(message = "Email is required")
        private String email;
    
        @NotBlank(message = "Username is required")
        private String username;
    
        @NotBlank(message = "Profile name is required")
        private String profileName;
    
    }

UserRepository:

@Repository
public interface UserRepository extends JpaRepository<UserEntity, Long> {
    Optional<UserEntity> findByUsername(String username);
    Optional<UserEntity> findByPublicId(String publicId);
    Optional<UserEntity> findById(String id);
}
  • Are you actually referring to little o? Because for this, I don't think the premiss is true. https://stackoverflow.com/questions/1364444/difference-between-big-o-and-little-o-notation – Jens Schauder Nov 13 '20 at 10:42
  • It may depends on underlying db index implementation. For example, in Postgresql index can have several type, one of them is btree. "PostgreSQL provides several index types: B-tree, Hash, GiST, SP-GiST, GIN and BRIN. Each index type uses a different algorithm that is best suited to different types of queries. By default, the CREATE INDEX command creates B-tree indexes, which fit the most common situations." – WesternGun Nov 13 '20 at 11:20

1 Answers1

1

The main problem with your test is that your database is probably way to small to demonstrate the Big O behaviour. The time required for the method call of a repository method consists of several parts:

  1. The time required to construct a SQL statement, bind parameters to it
  2. Sending the statement to the database
  3. The database executing the statement
  4. Sending the data back to the client
  5. Constructing java objects out of the result.

For a query returning a single result based on a simple predicate from a single small table this is probably dominated by 2. and 4. which are both constant with respect to the table size.

The difference between the index lookup which should be O(log(n)) and a full table scan which should be O(n) is likely completely irrelevant.

This will probably change when you have a couple of million rows, but the exact point where it becomes relevant depends on many things like networks speed and latency, size of the result/table/row, speed of the various computers involved, phase of the moon and your preferred dessert.

Jens Schauder
  • 77,657
  • 34
  • 181
  • 348