The hardware provides a Memory Management Unit. It is a piece of circuitry which is able to intercept and alter any memory access. Whenever the processor accesses the RAM, e.g. to read the next instruction to execute, or as a data access triggered by an instruction, it does so at some address which is, roughly speaking, a 32-bit value. A 32-bit word can have a bit more than 4 billions distinct values, so there is an address space of 4 GB: that's the number of bytes which could have a unique address.
So the processor sends out the request to its memory subsystem, as "fetch the byte at address x and give it back to me". The request goes through the MMU, which decides what to do with the request. The MMU virtually splits the 4 GB space into pages; page size depends on the hardware you use, but typical sizes are 4 and 8 kB. The MMU uses tables which tell it what to do with accesses for each page: either the access is granted with a rewritten address (the page entry says: "yes, the page containing address x exists, it is in physical RAM at address y") or rejected, at which point the kernel is invoked to handle things further. The kernel may decide to kill the offending process, or to do some work and alter the MMU tables so that the access may be tried again, this time successfully.
This is the basis for virtual memory: from the point of view, the process has some RAM, but the kernel has moved it to the hard disk, in "swap space". The corresponding table is marked as "absent" in the MMU tables. When the process accesses his data, the MMU invokes the kernel, which fetches the data from the swap, puts it back at some free space in physical RAM, and alters the MMU tables to point at that space. The kernel then jumps back to the process code, right at the instruction which triggered the whole thing. The process code sees nothing of the whole business, except that the memory access took quite some time.
The MMU also handles access rights, which prevents a process from reading or writing data which belongs to other processes, or to the kernel. Each process has its own set of MMU tables, and the kernel manage those tables. Thus, each process has its own address space, as if it was alone on a machine with 4 GB of RAM -- except that the process had better not access memory that it did not allocate rightfully from the kernel, because the corresponding pages are marked as absent or forbidden.
When the kernel is invoked through a system call from some process, the kernel code must run within the address space of the process; so the kernel code must be somewhere in the address space of each process (but protected: the MMU tables prevent access to the kernel memory from unprivileged user code). Since code can contain hardcoded addresses, the kernel had better be at the same address for all processes; conventionally, in Linux, that address is 0xC0000000. The MMU tables for each process map that part of the address space to whatever physical RAM blocks the kernel was actually loaded upon boot. Note that the kernel memory is never swapped out (if the code which can read back data from swap space was itself swapped out, things would turn sour quite fast).
On a PC, things can be a bit more complicated, because there are 32-bit and 64-bit modes, and segment registers, and PAE (which acts as a kind of second-level MMU with huge pages). The basic concept remains the same: each process gets its own view of a virtual 4 GB address space, and the kernel uses the MMU to map each virtual page to an appropriate physical position in RAM, or nowhere at all.