1) On a 32-bit system each process has 4 GB of virtual address space, but it need not be backed by physical memory all at once. The operating system tells the MMU, through the process's page tables, which parts of the virtual space are mapped to physical memory and which are not mapped at all. An access to an unmapped part causes the processor to fault, and the operating system usually responds by delivering a segfault to the process. There is also a "not present" marker, which tells the processor that a page is not in physical memory; the operating system uses it for pages that are in swap space, so the processor faults, the operating system swaps the page back into physical memory, and the process resumes where it left off. Describing a mostly empty page table takes very little memory, so 100 processes do not use much memory until they actually request it.
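The "not allocated all at once" behaviour can be observed from user space. The following is a minimal sketch, assuming a POSIX system that provides mmap() with MAP_ANONYMOUS (Linux, the BSDs, macOS); the function name reserve_and_touch is made up for illustration. It reserves a range of virtual addresses and touches only a few pages of it, so only those pages need physical memory behind them.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: reserve `bytes` of virtual address space and touch
 * only a few pages of it. Returns 0 on success, -1 on failure. */
int reserve_and_touch(size_t bytes)
{
    assert(bytes >= 3);          /* first, middle and last byte must differ */

    /* Ask for address space only; no physical page is needed yet. */
    unsigned char *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* The first access to each page faults; the kernel then maps a
     * zero-filled page at that address and the access is retried. */
    p[0] = 1;
    p[bytes - 1] = 2;

    /* A page that was never written reads back as zero. */
    int ok = (p[0] == 1 && p[bytes - 1] == 2 && p[bytes / 2] == 0);

    munmap(p, bytes);
    return ok ? 0 : -1;
}
```

Calling reserve_and_touch((size_t)64 << 20) asks for 64 MB of address space but touches only three pages of it; the rest is never backed by physical memory.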
2) There are many memory allocation algorithms. The operating system usually hands out memory only in large blocks, so a call to malloc() only sometimes results in a call to the operating system. Most of the time the C standard library implementation handles the micromanagement itself, carving small allocations out of blocks it already holds. There is no guarantee that an out-of-bounds array access will produce a segfault: the address may belong to a different array that was malloc'ed earlier, or to free space that the standard library is keeping track of for future allocations. In either case the memory is mapped, so the access does not fault. Debugging tools like valgrind will detect such errors, however.
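To make the "large block from the system, small pieces from the library" idea concrete, here is a toy bump allocator. It is only a sketch and not how any real malloc() works; the names arena, arena_init, arena_alloc and arena_destroy are invented for this example, and a single malloc() call stands in for the one large request to the operating system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump allocator: one big block, many small allocations. */
typedef struct {
    unsigned char *base;   /* the large block, obtained once */
    size_t size;           /* total bytes in the block */
    size_t used;           /* bytes handed out so far */
} arena;

int arena_init(arena *a, size_t size)
{
    assert(size % 16 == 0);      /* keeps every allocation 16-byte aligned */
    a->base = malloc(size);      /* stand-in for one large OS request */
    a->size = size;
    a->used = 0;
    return a->base ? 0 : -1;
}

void *arena_alloc(arena *a, size_t n)
{
    size_t left = a->size - a->used;
    if (n > left)
        return NULL;             /* a real allocator would request more here */

    /* Round up to 16 bytes; this cannot exceed `left`, which is itself
     * a multiple of 16. */
    size_t rounded = (n + 15) & ~(size_t)15;
    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

void arena_destroy(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}
```

Two consecutive 16-byte allocations from this arena are adjacent in memory, so writing one element past the end of the first lands inside the second. The address is mapped and belongs to the process, so the processor has no reason to fault.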
3) The details of where each segment is located are operating system dependent, but for code that is general and portable, there is no need to know.
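If you are curious anyway, a program can print where its own pieces ended up. This is a rough sketch; print_layout and the two globals are names invented for the example, the labels describe where such objects typically live rather than a guarantee, and the printed addresses differ between operating systems and usually between runs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static int global_initialized = 42;   /* typically in the data segment */
static int global_zeroed;             /* typically in the bss segment */

/* Hypothetical helper: print one address from each region.
 * Returns 0 on success, -1 if the heap allocation failed. */
int print_layout(FILE *out)
{
    assert(out != NULL);

    int local = 0;                        /* lives on the stack */
    int *heap = malloc(sizeof *heap);     /* lives on the heap */
    if (heap == NULL)
        return -1;

    /* Converting a function pointer to an integer is implementation-defined,
     * but works on mainstream platforms. */
    fprintf(out, "code : %p\n", (void *)(uintptr_t)print_layout);
    fprintf(out, "data : %p\n", (void *)&global_initialized);
    fprintf(out, "bss  : %p\n", (void *)&global_zeroed);
    fprintf(out, "heap : %p\n", (void *)heap);
    fprintf(out, "stack: %p\n", (void *)&local);

    free(heap);
    return 0;
}
```

Calling print_layout(stdout) a few times on different systems shows how little can be assumed about the layout, which is the reason portable code should not depend on it.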
For more information on all of these topics, consult the OSDev wiki, specifically the parts on paging and memory allocation.