  Case 1                Case 2
+-------+              +-------+
|-------|              |-------|
||10 kb||              ||25 kb||
+-------+              +-------+
|xxxxxxx|              |xxxxxxx|
|xxxxxxx|              |xxxxxxx|
|xxxxxxx+------------->+xxxxxxx|
+-------+              |xxxxxxx|
||10 kb||              |xxxxxxx|
+-------+              |xxxxxxx|
|xxxxxxx|              |xxxxxxx|
|xxxxxxx|              |xxxxxxx|
+-------+              |xxxxxxx|
||05 kb||              |xxxxxxx|
+-------+              +-------+
Look at the representation above. Let us assume that xxxxxxx represents the occupied space on the disk, while the numbers represent the empty space available.

Both scenarios have 25 kb of vacant space. But in case 1, if you have to insert (or perform an operation) that requires a contiguous allocation of, say, 15 kb, you won't be able to. Although 25 kb is free in total, it isn't contiguous, so you might still get a Memory / Disk Full error, and the space will either go to waste or only be usable by tasks with very small memory requirements.

In case 2, a contiguous block of memory is available, so a task requiring ~25 kb of memory can easily be executed.
This isn't specific to Redshift or any DBMS; it holds true for anything that remotely involves memory management, including operating systems.
What causes such gaps in memory (called fragmentation)?
Fragmentation is caused by repeatedly creating and deleting (or modifying) files on disk. When a file is removed, it leaves a hole in the space it occupied. Only a file smaller than that hole can reuse the space; otherwise the space goes to waste.
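On Redshift specifically, you can get a rough sense of which tables are carrying a lot of unsorted or wasted space before deciding what to vacuum. A minimal sketch using the SVV_TABLE_INFO system view (the 10% threshold is an arbitrary cut-off, not a recommendation):

    -- Tables with a large share of unsorted rows are usually
    -- the first candidates for a VACUUM.
    SELECT "schema", "table", size AS size_mb, pct_used, unsorted, stats_off
    FROM svv_table_info
    WHERE unsorted > 10      -- percent of rows that are unsorted
    ORDER BY unsorted DESC;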
What should be done?
Defragment! In your specific case, Amazon Redshift provides the VACUUM command to do this for individual tables or for all tables in the current database. You may have enough total disk space, but not enough contiguous space that the engine can allocate to the task you are running.
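As a sketch of what that looks like (the table name below is hypothetical), a typical sequence is to vacuum the affected table and then refresh the planner statistics:

    -- Reclaim space from deleted rows and re-sort the table.
    -- FULL is the default behaviour if no option is given.
    VACUUM FULL my_schema.my_table;

    -- If you only need to reclaim space without re-sorting,
    -- DELETE ONLY is cheaper:
    -- VACUUM DELETE ONLY my_schema.my_table;

    -- Refresh statistics so the query planner sees the new layout.
    ANALYZE my_schema.my_table;

    -- Running VACUUM with no table name vacuums every table
    -- in the current database:
    -- VACUUM;

VACUUM can be slow and I/O-heavy on large tables, so it is usually run during low-traffic windows.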