I think that, in (1), you've nailed one of the most important factors that can negatively influence the performance of multithreaded apps. In particular, Google 'false sharing'.
(2), however, only affects a subset of multithreaded apps - those that run CPU-bound threads in parallel. If an app uses many threads that are I/O-bound, (2) does not matter much.
Looking at my box here, it has 100 processes and 1403 threads, with CPU use at 3%. Only 7 out of the 100 processes are single-threaded. Most of the apps, therefore, are multithreaded but spend their time waiting on I/O.
My box would work reasonably well, at the moment, if it had only one core. Sure, clicking a link that makes my browser bring up a complex page would probably be a bit slower, but not by much.
In the commonest case then, where apps are multithreaded to take advantage of the high I/O performance of preemptive multitaskers, apps scale very well indeed, even on a single-core CPU.
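To make that concrete, here's a minimal sketch in Python. It's an illustration under assumptions, not a benchmark: `fake_io`, `run_sequential` and `run_threaded` are names I made up, and `time.sleep` stands in for a blocking read or network call. A sleeping thread uses no CPU, so the waits overlap no matter how many cores there are.

```python
import threading
import time


def fake_io(delay):
    # Hypothetical stand-in for a blocking I/O call: the thread just waits
    # and burns no CPU while doing so.
    time.sleep(delay)


def run_sequential(n, delay):
    # One thread performs all n waits, one after the other.
    start = time.perf_counter()
    for _ in range(n):
        fake_io(delay)
    return time.perf_counter() - start


def run_threaded(n, delay):
    # n threads each perform one wait; the waits overlap.
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n, delay = 20, 0.1
    print(f"sequential: {run_sequential(n, delay):.2f}s")
    print(f"threaded:   {run_threaded(n, delay):.2f}s")
```

The sequential run should take roughly n times the delay, while the threaded run should take roughly one delay plus some thread start-up overhead. Real I/O will not be this tidy, but the shape is the same: threads that are waiting cost next to nothing in CPU.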
Try not to fall into the trap of thinking that preemptive multitasking OSes are all about 'doing CPU-bound tasks in parallel' - they actually make this difficult by forcing the need for locking, synchronization, signalling etc. It's much more about high-performance I/O, something that a cooperative scheduler is spectacularly bad at.
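As a small illustration of the locking burden mentioned above, here's a sketch (again Python, with made-up names `add_many` and `run_counter`) of several threads updating one shared counter. Because a thread can be preempted in the middle of the read-modify-write, the update has to be wrapped in a lock to be safe.

```python
import threading


def add_many(counter, lock, times):
    # Each increment is a read-modify-write on shared state, so it is done
    # while holding the lock; otherwise a preempted thread could lose updates.
    for _ in range(times):
        with lock:
            counter["value"] += 1


def run_counter(n_threads, times):
    counter = {"value": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=add_many, args=(counter, lock, times))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print(run_counter(4, 10000))
```

With the lock in place the total is always n_threads * times. The price is that the threads now serialize on that lock, which is exactly the kind of overhead that makes CPU-bound parallel work harder than it first looks.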