Well, the use of O(N) nomenclature abstracts away some important details that only become insignificant as N approaches infinity. At finite values of N, those details can be, and often are, the most significant factors. To help explain, consider that a bound of O(N^x) specifies only the most significant term in N. In reality, the performance could be characterized as:
aN^x + bN^(x-1) + cN^(x-2) + ... + K
So as N approaches infinity, the N^x term dominates, but at finite values of N one of the lesser terms can easily be the dominant one. Looking at your examples, you give two algorithms. Let's call the one that provides O(N) performance algorithm A, and the one that provides O(log N) performance algorithm B. In reality, these two algorithms have performance characteristics as follows:
Performance A = aN + b(log N) + c
Performance B = x(log N) + y
If your constant values are a = 0.001 and x = 99,999, you can see how A provides better performance than B until N grows very large. You also mention that one algorithm increases the likelihood of a cache miss, and that this likelihood depends on the size of the data. You'll need to work out the likelihood of a cache miss as a function of the data size and use it as a factor when estimating the real performance of the overall algorithm. For example:
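A quick numeric sketch makes that crossover concrete. The constants a = 0.001 and x = 99,999 come from above; dropping b, c, and y (setting them to zero) is my own simplification to keep the model minimal:

```python
import math

# Hypothetical constants from the discussion above; b, c, and y are
# assumed to be zero purely to keep the sketch simple.
A_COEFF = 0.001   # a: per-element cost of the O(N) algorithm
X_COEFF = 99_999  # x: per-step cost of the O(log N) algorithm

def perf_a(n):
    """Modeled cost of algorithm A: a*N (b*log N + c omitted)."""
    return A_COEFF * n

def perf_b(n):
    """Modeled cost of algorithm B: x*log N (y omitted)."""
    return X_COEFF * math.log2(n)

# A's linear term doesn't overtake B's huge constant until N is
# in the billions.
for n in (10**3, 10**6, 10**9, 10**10):
    faster = "A" if perf_a(n) < perf_b(n) else "B"
    print(f"N={n:>14,}  A={perf_a(n):>16,.0f}  B={perf_b(n):>16,.0f}  faster: {faster}")
```

With these particular constants, the asymptotically worse algorithm A stays faster until N passes roughly three billion.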
If the cost of a cache miss is CM (we'll assume it's constant) and the miss likelihood for a working set of size S is F(S), then for algorithm A the overall cache cost per access is F(N)*CM. If that cost is incurred in the dominant loop of algorithm A (the O(N) part), then the real performance characteristic of algorithm A is O(F(N)*N). For algorithm B, which only touches on the order of log N elements, the cache cost per access would be F(log N)*CM, and if the misses manifest during its dominant loop (the O(log N) part), then the real performance of algorithm B is O(F(log N)*log N). As long as you can determine F(), you can then compare algorithms A and B directly.
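To make that concrete, here's a sketch with a hypothetical F(): a simple model where the miss likelihood grows with the working set until it outgrows the cache, at which point essentially every access misses. The cache size, miss cost, and the exact shape of F() are all illustrative assumptions, not measurements:

```python
import math

CACHE_SIZE = 32_768  # elements that fit in cache (assumed)
MISS_COST = 100      # CM: cost of a cache miss, assumed constant
HIT_COST = 1         # cost of a cache hit (assumed)

def miss_rate(working_set):
    """F(): likelihood of a cache miss as a function of data touched.
    Hypothetical model: ramps linearly up to 1.0 at the cache size."""
    return min(1.0, working_set / CACHE_SIZE)

def cost_a(n):
    """Algorithm A touches all N elements: N iterations, each paying
    F(N)*CM on average -> O(F(N)*N)."""
    return n * (HIT_COST + miss_rate(n) * MISS_COST)

def cost_b(n):
    """Algorithm B touches ~log N elements: log N iterations, each
    paying F(log N)*CM on average -> O(F(log N)*log N)."""
    steps = math.log2(n)
    return steps * (HIT_COST + miss_rate(steps) * MISS_COST)

n = 10**6
print(f"A: {cost_a(n):,.0f}   B: {cost_b(n):,.0f}")
```

Under this particular F(), algorithm B's small working set stays cache-resident while A pays the full miss cost on every element, so the cache factor widens B's advantage rather than erasing it. With a different F() the comparison could go the other way, which is exactly why you need to determine F() for your workload.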