1

Okay really confused about something here. Two questions.

First question is according to my compiler, string is always a size of 40 bytes. How is this possible when we can have more than 40 characters in a string and each character should be taking up 1 byte?

Second question: If I have a struct containing a string (40bytes) and an integer (4bytes), why is my resulting structure size 48 instead of 44? I can't figure out what is going on here.

Thanks if anyone knows/understands why I'm getting this behavior.

    #include <iostream>
    #include <string>
    using namespace std;

    struct Employee // This struct is size 48 for some reason?
    {
        string name; // string takes up 40 bytes
        int ID;      // int takes up 4 bytes
    };

    struct Size8Struct // This struct is size 8 as expected
    {
        int ID;
        int filler;
    };

    int main() {
        cout << sizeof(Size8Struct) << endl; // prints 8 as expected
        Employee Jim; Jim.ID=1; Jim.name="Jim";
        cout << sizeof(Jim) << endl; // prints 48, why?

        // How is it possible for this string to hold over 40 chars if it is only 40 bytes long?
        string test = "123456789012345678901234567890123456789012345678901234567890 test";
        cout << sizeof(test) << endl;
    }

user3503712

6 Answers

3

First question is according to my compiler, string is always a size of 40 bytes. How is this possible when we can have more than 40 characters in a string and each character should be taking up 1 byte?

The sizeof operator tells you the size of an object in bytes, including padding, not the number of elements a user-defined type holds. The size of any given type is fixed at compile time.

In the case of string, the number of elements can be obtained using the size() member function:

    cout << test.size() << endl;

Second question: If I have a struct containing a string (40bytes) and an integer (4bytes), why is my resulting structure size 48 instead of 44? I can't figure out what is going on here.

This is due to padding. The compiler can add "empty space" between data members, or after the last one, to make them align in a way that is more efficient for a given platform. See "Why isn't sizeof for a struct equal to the sum of sizeof of each member?".

juanchopanza
3

First of all, every object of a given C++ class has the same fixed size, so you cannot expect a string object's size to depend on how long the stored string is.

Typically the string object contains (among other things) a pointer to the actual location where the characters are stored, though some clever implementations store the characters either on the heap or inside the object itself, depending on the string's length (the "small string optimization"). More about this in this excellent talk.

The difference between the size of the elements and the total simply comes from padding.

Karoly Horvath
1

The problem here is alignment.

There is padding on the first struct because in memory it actually looks like this:

    struct Employee // This struct is size 48 for some reason?
    {
        string name; // string takes up 40 bytes
        int ID;      // int takes up 4 bytes
        // 4 padding bytes inserted here by the compiler
    };

Read this: http://en.wikipedia.org/wiki/Data_structure_alignment

Salgar
0

The sizeof operator works at compile time. The size of a type is static and known at compile time.

The std::string class internally stores some pointers that are assigned at runtime to point to dynamically allocated storage holding the data (the characters). You can get the size of this data by calling the member function size() (or length()).

The difference between the sum of the sizes of the individual members and the size of the whole class is due to padding bytes inserted by the compiler. See Data structure alignment.

jrok
0

The sizeof operator gives you the static size of an object or type. In C++, all objects have a fixed size. In order to support growing or shrinking, a class must externalize some of its innards.

Consider e.g. this partial std::string implementation (ignoring that std::string is only a type alias):

    class string {
        char *buffer_;
        size_t len_, capacity_;
    };

Objects of this class will always have a fixed size once compiled; the real storage is not within itself, but somewhere external, as indicated by the pointer member.

A std::string implementation may also choose to "inline" the storage for small strings and only use the external memory for larger strings:

    class string {
        char *buffer_;
        char small_string_buffer_[16];
        size_t len_, capacity_;
    };

You will have noticed len_ and capacity_, which are additional size factors.

Another size factor is padding, where the string pads up its memory such that it is properly aligned in order to increase performance. Let's assume 32 bytes is the system's optimum:

    class string {
        union {
           char *buffer_;
           char small_string_buffer_[16];
        }; // 16 bytes in size
        size_t len_, capacity_; // let's say 4 bytes each, bringing the total to 24

        char pad_[8]; // now at 32 bytes
    };

Sebastian Mach
0

The size of any C / C++ data member is fixed at compile time. Any variable space requirements are met by using pointers to dynamic memory, though sometimes members pull double duty, allowing small buffers to be stored in the object itself for efficiency.

Also, the spec allows any amount of padding after any member of a struct or class. In practice, compilers insert only the minimum necessary to guarantee proper alignment of all data members, and of a second struct of the same type placed directly behind the first, for efficiency.

You are probably on a 64-bit system, which means pointers are 8 bytes in size with a natural alignment of 8 bytes. Since string contains pointers, the whole struct must be 8-byte aligned, so 4 padding bytes follow your int member.

Deduplicator