This is mostly about caching. For example, imagine we have 4 vertices and 4 colors. You can provide the data as two separate arrays, like this (the function names are from memory and the other arguments are omitted):
glVertexPointer(..., vertex);
glColorPointer(..., colors);
Internally, it reads vertex[0] and applies colors[0], then reads vertex[1] and applies colors[1], and so on. As you can see, if the vertex array is, say, 20 MB long, vertex[0] and colors[0] will typically be at least 20 MB apart in memory.
On the other hand, if you provide an interleaved structure like { vertex0, color0, vertex1, color1, ... }, you get far more cache hits, because vertex0 and color0 sit next to each other in memory, and so do vertex1 and color1.
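To put rough numbers on that, here is a small sketch in plain C (no OpenGL calls) comparing the two layouts. The Vertex struct, the 3-float position / 4-float color split, and both helper functions are my own assumptions for illustration, and the "separate" case assumes the color array is stored right after the position array, which is only one possible arrangement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interleaved layout: each vertex carries its own color. */
typedef struct {
    float position[3];
    float color[4];
} Vertex;

/* Bytes between a vertex's position and its color when interleaved. */
size_t interleaved_gap(void) {
    return offsetof(Vertex, color) - offsetof(Vertex, position);
}

/* Bytes between position i and color i when all positions are stored
   first and all colors follow immediately after (an assumed layout). */
size_t separate_gap(size_t vertex_count, size_t i) {
    assert(i < vertex_count);
    size_t position_addr = i * 3 * sizeof(float);
    size_t color_addr = vertex_count * 3 * sizeof(float) + i * 4 * sizeof(float);
    return color_addr - position_addr;
}
```

With the interleaved struct the gap stays constant no matter how many vertices there are, while with separate arrays it grows with the size of the position array.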
Hope this helps answer the question.
edit: on second read, I may not have answered the question. You may be wondering how OpenGL knows which values to read from that structure. With a layout such as { vertex, color, vertex, color }, you tell OpenGL that vertices start at position 0 with a stride of 2 (so the next one is at position 2, then 4, and so on), and that colors start at position 1, also with a stride of 2 (so positions 1, 3, and so on). In the actual API, the starting offset and the stride are given in bytes rather than in element positions.
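The offset-plus-stride lookup can be sketched in plain C like this. It is only an imitation of what the driver does, not OpenGL's actual implementation; fetch_attribute and the Vertex struct are hypothetical names I made up for the example. The comment at the bottom shows how the same byte offset and byte stride would be passed to the real calls for a client-side array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interleaved vertex: 3 position floats, then 4 color floats. */
typedef struct {
    float position[3];
    float color[4];
} Vertex;

/* Copy `count` floats of one attribute for vertex `index`.
   `offset` is where the attribute starts inside the first vertex (bytes),
   `stride` is the distance from one vertex to the next (bytes). */
void fetch_attribute(const void *base, size_t offset, size_t stride,
                     size_t index, size_t count, float *out) {
    assert(base != NULL && out != NULL);
    const unsigned char *p = (const unsigned char *)base + offset + index * stride;
    memcpy(out, p, count * sizeof(float));
}

/* Roughly equivalent setup with client-side arrays, for `Vertex *buf`:
     glVertexPointer(3, GL_FLOAT, sizeof(Vertex),
                     (const char *)buf + offsetof(Vertex, position));
     glColorPointer(4, GL_FLOAT, sizeof(Vertex),
                    (const char *)buf + offsetof(Vertex, color));        */
```

Both attributes share the same stride, sizeof(Vertex), and differ only in their starting offset, which is the "position 0 / position 1" idea from above expressed in bytes.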
addition: In case you want a more practical example, look at this link: http://www.lwjgl.org/wiki/index.php?title=Using_Vertex_Buffer_Objects_(VBO). You can see there how the buffer is supplied only once and offsets into it are then used to render efficiently.