CAS latency is specified in memory-bus clock cycles. The memory clock is always half the transfers-per-second number: e.g. DDR3-1600 has a memory clock of 800 MHz, doing 1600M transfers per second (during a burst transfer).
DDR2, DDR3, and DDR4 still use a double-pumped 64-bit memory bus (transferring data on the rising and falling edges of the clock signal), not quad-pumped. This is why they're still called Double Data-Rate (DDR) SDRAM.
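For a rough sense of what that double-pumped 64-bit bus means for throughput, here's a minimal sketch of peak theoretical burst bandwidth (ignoring refresh, command overhead, and bank conflicts; the module speeds are just examples):

    # Peak theoretical burst bandwidth of one 64-bit (8-byte) DDR channel.
    # One 8-byte transfer happens per "transfer", i.e. on each clock edge.
    def peak_bandwidth_gb_s(mega_transfers_per_s):
        bytes_per_transfer = 8                                   # 64-bit data bus
        return mega_transfers_per_s * bytes_per_transfer / 1000  # MB/s -> GB/s

    print(peak_bandwidth_gb_s(1600))   # DDR3-1600: 12.8 GB/s per channel
    print(peak_bandwidth_gb_s(3200))   # DDR4-3200: 25.6 GB/s per channel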
The FSB speed has nothing to do with it.
On old CPUs without integrated memory controllers, i.e. systems that actually have an FSB, its frequency is often configurable (in the BIOS) separately from the memory speed. See Front Side Bus and RAM speed; on even older systems, the FSB and memory clocks were synchronous.
Normally systems were designed with a fast enough FSB to keep up with the memory controller. Running the FSB at the same clock speed as the memory can reduce latency by avoiding buffering between clock domains.
So yes, the CAS latency in seconds is cycle_count / frequency, or, more like your formula, 1000ns/us * CL / RAMspeed * 2 transfers/clock, where RAMspeed is in mega-transfers per second.
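Plugging numbers in, a minimal worked example (the CL and speed values are just illustrative, not specific to any module):

    def cas_latency_ns(cl_cycles, mega_transfers_per_s):
        # The memory clock in MHz is half the transfer rate, so
        # cycles / MHz gives microseconds; * 1000 converts to ns.
        memory_clock_mhz = mega_transfers_per_s / 2
        return 1000 * cl_cycles / memory_clock_mhz   # == 1000 * CL / RAMspeed * 2

    print(cas_latency_ns(9, 1600))    # DDR3-1600 CL9 -> 11.25 ns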
Higher CL numbers at a higher memory frequency often work out to a similar absolute latency (in seconds). In other words, modern RAM has higher CAS latency timing numbers because more clock cycles happen in the same amount of time.
Bandwidth has vastly improved, while latency has stayed nearly constant, according to these graphs from Crucial which explain CL vs. frequency.
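You can see that effect numerically with the same formula; the timings below are typical JEDEC-style values chosen for illustration, not a claim about any particular module:

    # Absolute CAS latency in ns = 1000 * CL / (MT/s / 2).
    for name, cl, mts in [("DDR3-1600 CL11", 11, 1600),
                          ("DDR4-3200 CL22", 22, 3200)]:
        print(name, 1000 * cl / (mts / 2), "ns")
    # Both come out to 13.75 ns: CL doubled, but so did the clock frequency.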
Of course this is not "the memory latency", or the "true" memory latency.
It's the CAS latency of the DRAM itself, and is the most important factor in latency between the memory controller and the DRAM, but is only a part of the latency between a CPU core and memory. There is non-negligible latency inside the CPU between the core and uncore (L3 and memory controller). Uncore is Intel terminology; IDK what AMD calls the parts of the memory hierarchy in their various microarchitectures.
Many-core Xeon CPUs especially have significant latency to L3 / the memory controller, because of the large ring bus(es) connecting all the cores. A many-core Xeon has worse L3 and memory latency than an otherwise-similar dual- or quad-core chip with the same memory and CPU clock frequencies.
This extra latency actually limits single-thread / single-core bandwidth on a big Xeon to worse than on a laptop CPU, because a single core can't keep enough requests in flight to fill the memory pipeline with that much latency. See Why is Skylake so much better than Broadwell-E for single-threaded memory throughput?
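The limit behaves roughly like Little's law: sustained bandwidth ≈ outstanding cache-line requests × line size / latency. A back-of-the-envelope sketch (the buffer count and latency figures are illustrative assumptions, not measurements of any specific CPU):

    def single_core_bandwidth_gb_s(outstanding_lines, latency_ns, line_bytes=64):
        # One core can only keep a limited number of demand misses in flight
        # (line-fill buffers), so throughput = concurrency * line_size / latency.
        return outstanding_lines * line_bytes / latency_ns   # bytes/ns == GB/s

    print(single_core_bandwidth_gb_s(10, 60))   # client CPU, ~60 ns -> ~10.7 GB/s
    print(single_core_bandwidth_gb_s(10, 90))   # big Xeon,   ~90 ns -> ~7.1 GB/s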