Many web pages actually use a video player such as JW Player, dash.js or Bitmovin along with the HTML video tag, which can complicate the picture as these players may have their own seeking logic or optimisations.
For simple HTTP streaming, the player downloads the video in chunks using HTTP range requests, as Ivo mentions.
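As a rough illustration of what a range request looks like, here is a minimal sketch that builds the `Range` header value for each chunk of a file. The function name `range_header` and the chunk and file sizes are made up for the example; a real browser or player chooses its own ranges based on buffering and seek position.

```python
def range_header(chunk_index: int, chunk_size: int, total_size: int) -> str:
    """Build the HTTP Range header value for one chunk of a file."""
    start = chunk_index * chunk_size
    if chunk_index < 0 or start >= total_size:
        raise ValueError("chunk index is outside the file")
    # Byte ranges are inclusive at both ends, hence the -1.
    end = min(start + chunk_size, total_size) - 1
    return f"bytes={start}-{end}"


# A 2,500,000-byte file fetched in 1,000,000-byte chunks (illustrative sizes).
headers = [range_header(i, 1_000_000, 2_500_000) for i in range(3)]
```

Each value would be sent as `Range: bytes=...`, and a server that supports range requests answers with `206 Partial Content` carrying just those bytes.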
For more complex scenarios where the video is streamed using a streaming protocol such as HLS or DASH, the video is again downloaded in chunks but the chunks are requested as part of the streaming protocol implementation.
DASH and HLS are adaptive streaming protocols that provide multiple bit rate versions of each chunk of a video, allowing the player to choose the best one for the current network conditions, device resolution etc - see here for how you can see the different bit rates on YouTube as an example: https://stackoverflow.com/a/42365034/334402
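To make the "choose the best one" step concrete, here is a simplified sketch of bit rate selection. It is not the algorithm of any particular player: the function name `pick_bitrate`, the bit rate ladder and the 0.8 safety factor are all assumptions for illustration, and real players usually also take buffer level and screen size into account.

```python
from typing import Sequence


def pick_bitrate(available_bps: Sequence[int], measured_bps: float,
                 safety: float = 0.8) -> int:
    """Pick the highest bit rate that fits within a fraction of the bandwidth."""
    # Leave some headroom so a small dip in throughput does not stall playback.
    budget = measured_bps * safety
    affordable = [b for b in available_bps if b <= budget]
    # If nothing fits, fall back to the lowest version rather than not playing.
    return max(affordable) if affordable else min(available_bps)


# Hypothetical ladder of versions advertised in the manifest, in bits per second.
ladder = [400_000, 1_200_000, 3_000_000, 6_000_000]
choice = pick_bitrate(ladder, measured_bps=4_000_000)
```

With a measured 4 Mbit/s and the 0.8 factor, the budget is 3.2 Mbit/s, so the 3 Mbit/s version is chosen for the next chunk. The choice is re-evaluated chunk by chunk as the measurement changes.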
Seeking is actually a little bit complex if you want to provide a good user experience.
Many players will support a separate thumbnail stream provided by the server - this allows the player to display thumbnails of scenes from various points along the timeline. This is essentially a set of images from the video at regular intervals, making it much quicker to display a thumbnail as the player does not have to download a whole section of video and decode it just to show the point you are hovering over in the timeline.
When you actually click at that point, only then will it request that section of the video, decode it and play it back.
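The hover and click steps can be sketched as two small lookups. This assumes thumbnails taken at a fixed interval and video chunks of a fixed duration, which is a simplification (real manifests can list variable chunk durations), and the names `thumbnail_index` and `segment_index` are hypothetical.

```python
def thumbnail_index(hover_seconds: float, interval_seconds: float,
                    thumbnail_count: int) -> int:
    """Which pre-generated thumbnail to show for a hover position."""
    index = int(hover_seconds // interval_seconds)
    # Clamp so positions before the start or past the end still map to an image.
    return max(0, min(index, thumbnail_count - 1))


def segment_index(seek_seconds: float, segment_seconds: float) -> int:
    """Which video chunk contains the clicked position (fixed-length chunks)."""
    return int(seek_seconds // segment_seconds)


# Hovering at 95 s with one thumbnail every 10 s: no video is downloaded.
thumb = thumbnail_index(95.0, 10.0, thumbnail_count=60)
# Clicking at 95 s with 4 s chunks: only now is a video chunk requested.
segment = segment_index(95.0, 4.0)
```

Here the hover maps to thumbnail 9 and the click maps to chunk 23, which starts at 92 s, so the player decodes from the start of that chunk and begins playback at or near the 95 s mark.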