When your data is of variable length, it is typically framed within another container. That is to say, there's a header preceding the actual data block that tells the receiver how much data it should accept.
For example, HTTP uses line breaks (CRLF) to delimit its header lines. If there's a variable-length message body, the header typically includes a "Content-Length:" field that indicates exactly how many bytes to read once the entire header is received (the header stops at the first empty line, i.e. when you read 2 consecutive line breaks).
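To make the HTTP case concrete, here is a rough sketch of a reader that collects bytes until the empty line, pulls out Content-Length, and then reads exactly that many body bytes. The function name read_http_message is made up for illustration, and the sketch assumes a single message per connection with a Content-Length header; it ignores chunked transfer encoding and everything else a real HTTP parser has to handle.

```python
import socket


def read_http_message(sock: socket.socket):
    """Hypothetical helper: return (header_bytes, body_bytes) for one message."""
    buf = b""
    # Keep reading until the blank line that terminates the header shows up.
    while b"\r\n\r\n" not in buf:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("connection closed before end of header")
        buf += chunk

    header, _, body = buf.partition(b"\r\n\r\n")

    # Look for Content-Length (field names are case-insensitive).
    length = 0
    for line in header.split(b"\r\n")[1:]:  # skip the start line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip().decode("ascii"))

    # Part of the body may already have arrived with the header; read the rest.
    while len(body) < length:
        chunk = sock.recv(length - len(body))
        if not chunk:
            raise ConnectionError("connection closed before end of body")
        body += chunk

    # Assumption: one message per connection, so extra bytes are dropped.
    return header, body[:length]
```

Note that nothing here waits on a timer: the header tells the reader when it is done.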
It is perfectly fine to read 4 bytes from the socket, get how much data follows, then do another receive and read the rest. Just be careful: when you ask for 4 bytes, the socket might give you anywhere between 1 and 4 bytes, so anything less than 4 means you need to go back and ask for the remaining bytes. The same goes for the data that follows the length. This is a very common mistake. In a dev environment you will almost always get 4 bytes when asking for 4, but once you deploy your app, somewhere on some machine you will get random crashes because the network behaves differently there.
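A minimal sketch of that "keep asking until you have it all" loop, assuming a 4-byte big-endian unsigned length prefix (the byte order and the names recv_exact, recv_message and send_message are my own choices for illustration, not something your protocol necessarily uses):

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty result means the peer closed the connection.
            raise ConnectionError("closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Read one length-prefixed message: 4-byte header, then the payload."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def send_message(sock, payload):
    """Write the 4-byte length followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)
```

Usage is just send_message(conn, b"hello") on one side and recv_message(conn) on the other. In real code you would probably also want to cap the length you are willing to accept, so a corrupt or hostile header cannot make you wait for gigabytes.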
Generally, it is a bad approach to rely on timeouts to determine when you have reached the end of the data. With a timeout, you might get things "reliably" working in a well-controlled dev environment, but it is a very flaky solution. Any CPU/disk/network hiccup might cause your app to stop receiving prematurely. You are also limiting your data throughput and responsiveness, since your app is sleeping for some time interval instead of doing work.