I made two posts on the subject (here and there), to summarize, yes, the 64bit compiler uses SSE2 (double precision), but it doesn't use SSE (single precision). Everything is converted to double precision floats, and computed using SSE2 (edit: however there is an option to control that)
This means f.i. that if Maths on double precision floats is fast, maths on single precision is slow (lots of redundant conversions between single and double precisions are thrown in), "Extended" is aliased to "Double", and intermediate computations precision is limited to double precision.
Edit: There was an undocumented (at the time) directive that controls SSE code generation, {$EXCESSPRECISION OFF} activates SSE code generation, which brings back performance within expectations.