To echo the others: The Stopwatch class is the best way to do this.
To answer your question about measuring only clock cycles: the fact that you're running on a multi-tasking OS on a modern processor makes measuring clock cycles almost useless. A context switch has a good chance of evicting your code and data from the processor's cache, and the OS might decide to swap your working set out in the meantime.
The processor could also decide to reorder your instructions based on cache waits or memory accesses, and execute what it can while it's waiting. Or it may not, if the data it needs is already in the cache.
So, in short, performing multiple runs and averaging them is really the only way to go.
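As a rough illustration of the multiple-runs approach, here is a minimal sketch using `Stopwatch`. The `Benchmark.AverageMilliseconds` helper, its parameters, and the square-root workload are made up for this example, and the untimed warm-up run is my own assumption (so that JIT compilation and cold caches don't skew the first measurement), not something the question asked for.

```csharp
using System;
using System.Diagnostics;

static class Benchmark
{
    // Hypothetical helper: runs the action several times and returns
    // the mean elapsed time per run, in milliseconds.
    public static double AverageMilliseconds(Action action, int runs, int warmupRuns = 1)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (runs <= 0) throw new ArgumentOutOfRangeException(nameof(runs));

        // Untimed warm-up runs (an assumption of this sketch).
        for (int i = 0; i < warmupRuns; i++) action();

        var stopwatch = new Stopwatch();
        double totalMs = 0;
        for (int i = 0; i < runs; i++)
        {
            stopwatch.Restart();   // reset to zero and start timing
            action();
            stopwatch.Stop();
            totalMs += stopwatch.Elapsed.TotalMilliseconds;
        }
        return totalMs / runs;
    }
}

static class Program
{
    static void Main()
    {
        double sink = 0; // keeps the result of the placeholder workload in use

        double average = Benchmark.AverageMilliseconds(() =>
        {
            double sum = 0;
            for (int i = 0; i < 1_000_000; i++) sum += Math.Sqrt(i);
            sink = sum;
        }, runs: 20);

        Console.WriteLine($"Average: {average:F3} ms over 20 runs (sink = {sink})");
    }
}
```

The individual timings will still vary from run to run for the reasons given above; averaging only smooths that noise out, so more runs generally give a steadier number.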
To get less jitter in the timings, you could elevate the priority of the thread or process, but this can result in a slew of other issues. If you bump to real-time priority and get stuck in a long loop, you will essentially stop all other processing; if a bug leaves you in an infinite loop, your only choice is the reset button. It is not recommended at all, especially on a user's computer or in a production environment. And since you can't do that where it matters, any benchmarks you run on your own machine with priority modifications are invalid.