The latter case is correct. GLSL is designed to keep execution as simple and straightforward as possible. You cannot allocate memory in a shader, and iterating is expensive. A well-written GLSL or HLSL shader takes in a fixed set of data and outputs a fixed set of data, which makes it very fast to execute in parallel. This is my skeletal vertex shader:
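#version 110
// One matrix per bone; 32 is the most bones this shader can address.
uniform mat4 transform[ 32 ];
// GLSL 1.10 has no "in" storage qualifier and no integer attributes,
// so the bone index is passed as a float attribute.
attribute float boneID;
void main()
{
    gl_FrontColor = gl_Color;
    // Convert the index to int and transform the vertex by its bone's matrix.
    gl_Position = transform[ int( boneID ) ] * gl_Vertex;
}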
On the C/C++ side, when calling glUniformMatrix4fv, pass it a pointer to a contiguous array of matrices and set the count argument to the number of matrices. They arrive in GLSL as consecutive elements of the transform array.
Please note that my code is built on an old version of GLSL, and uses many now-deprecated elements such as gl_Vertex.
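As a rough sketch of the C++ side of that upload: the matrices need to sit back to back in memory, 16 floats each, in column-major order if you pass GL_FALSE for the transpose argument. The names BonePalette, makeIdentityPalette and setBoneTranslation are made up for illustration and are not part of OpenGL. The actual GL calls are shown only as a comment, since they need a live context and a linked program.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Must match "uniform mat4 transform[ 32 ]" in the shader.
constexpr std::size_t kBoneCount = 32;
constexpr std::size_t kFloatsPerMat4 = 16;

// Hypothetical type: all bone matrices stored contiguously, column-major.
using BonePalette = std::array<float, kBoneCount * kFloatsPerMat4>;

// Hypothetical helper: start every bone at the identity matrix.
BonePalette makeIdentityPalette()
{
    BonePalette palette{}; // zero-initialised
    for (std::size_t bone = 0; bone < kBoneCount; ++bone) {
        float* m = palette.data() + bone * kFloatsPerMat4;
        m[0] = m[5] = m[10] = m[15] = 1.0f; // diagonal
    }
    return palette;
}

// Hypothetical helper: in column-major layout the translation
// occupies elements 12, 13 and 14 of each matrix.
void setBoneTranslation(BonePalette& palette, std::size_t bone,
                        float x, float y, float z)
{
    assert(bone < kBoneCount);
    float* m = palette.data() + bone * kFloatsPerMat4;
    m[12] = x;
    m[13] = y;
    m[14] = z;
}

// With a current GL context and the program bound via glUseProgram,
// the whole array goes up in one call (assuming "program" is your
// linked program object):
//
//   GLint loc = glGetUniformLocation(program, "transform");
//   glUniformMatrix4fv(loc, 32, GL_FALSE, palette.data());
```

The count argument (32 here) is the number of matrices, not the number of floats. If your math library stores matrices row-major, passing GL_TRUE for the transpose argument is one way to handle that on desktop OpenGL.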