
I'm writing some code which stores some data structures in a special named binary section. These are all instances of the same struct which are scattered across many C files and are not within scope of each other. By placing them all in the named section I can iterate over all of them.

In GCC, I use __attribute__((section(...))) plus some specially named extern pointers which are filled in automatically by the linker. Here's a trivial example:

#include <stdio.h>

extern int __start___mysection[];
extern int __stop___mysection[];

static int x __attribute__((section("__mysection"))) = 4;
static int y __attribute__((section("__mysection"))) = 10;
static int z __attribute__((section("__mysection"))) = 22;

#define SECTION_SIZE(sect) \
    ((size_t)((__stop_##sect - __start_##sect)))

int main(void)
{
    size_t sz = SECTION_SIZE(__mysection);
    size_t i;

    printf("Section size is %zu\n", sz);

    for (i = 0; i < sz; i++) {
        printf("%d\n", __start___mysection[i]);
    }

    return 0;
}

I'm trying to figure out how to do this in MSVC but I'm drawing a blank. I see from the compiler documentation that I can declare the section using __pragma(section(...)) and declare data to be in that section with __declspec(allocate(...)) but I can't see how I can get a pointer to the start and end of the section at runtime.

I've seen some examples on the web of emulating __attribute__((constructor)) in MSVC, but they look like hacks specific to the CRT rather than a general way to get a pointer to the beginning and end of a section. Does anyone have any ideas?

PeeHaa
Andrew B.
  • May I ask why you want to control binary section naming in the first place? – Reinderien Sep 27 '10 at 21:42
  • It's for a high-performance instrumentation framework. Imagine a printf(format, args...) invocation, where all the format strings were stored in the binary section, and the only thing that gets logged is the arguments plus a lookup value. The argument substitution takes place in post-processing. – Andrew B. Sep 27 '10 at 21:57
  • A better example of this is a program that allows you to add modules by relinking rather than recompiling (and possibly regenerating some code). If you can treat the entire section as an array of some struct then you can iterate over it and perform some action on/for each entry, such as call `cur_entry[i]->init(&cur_entry)`. You can also use special knowledge about memory usage patterns to optimize for paging and cache locality by doing this. Not usually Windows related (that I know of) but this can also be required for Harvard architecture processors. – nategoose Sep 27 '10 at 23:16
  • Yes, my log mechanism works similarly. The special section is a large array of structs containing all the metadata for the instrumentation points. I want to be able to iterate over this array to dump out all the metadata for later use in post-processing. – Andrew B. Sep 28 '10 at 13:29
  • I'm also using this construction for running unit tests. This way I don't need a 'main' function which knows all the modules in my system; instead, each module declares unit test structs which are stored in a special section. The main function loops through all these structs in the same way the main() function in Andrew B.'s example does. – Bart Apr 20 '11 at 06:42
  • Trying your code gives `undefined reference to '__stop___mysection'` and `undefined reference to '__start___mysection'`. Will this require some ld script magic to work? – speakman Apr 27 '11 at 10:07
  • It would be far better to do this using language features. – David Heffernan May 07 '11 at 19:07

4 Answers


There is also a way to do this without using an assembly file.

#pragma section(".init$a")
#pragma section(".init$u")
#pragma section(".init$z")

__declspec(allocate(".init$a")) int InitSectionStart = 0;
__declspec(allocate(".init$z")) int InitSectionEnd   = 0;

__declspec(allocate(".init$u")) int token1 = 0xdeadbeef;
__declspec(allocate(".init$u")) int token2 = 0xdeadc0de;

The first three lines define the sections and take the place of the assembly file. Unlike the data_seg pragma, the section pragma only creates the section; it does not place anything in it. The __declspec(allocate()) lines tell the compiler to put each item in the named section.

From the Microsoft page: the order here is important. Section names must be 8 characters or fewer. Sections with the same name before the $ are merged into one section, and the order in which they are merged is determined by sorting the characters after the $.

Another important point to remember is that sections are zero-padded to 256 bytes, so the START and END pointers will NOT sit directly before and after your data as you might expect.

If you set up your table to hold pointers to functions or other non-NULL values, it should be easy to skip the NULL entries before and after the table that result from the section padding.

See this MSDN page for more details.
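As a rough sketch of the NULL-skipping idea above, assuming the table holds function pointers: the names init_fn, run_init_table, sample_init, demo_padded_table, init_table_start and init_table_end are all hypothetical, and the MSVC-only part has not been verified against any particular compiler version. The demo function simulates padding with a local array so the walking logic can be checked anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry type: each registered item is a pointer to an init function. */
typedef void (*init_fn)(void);

/* Call every non-NULL entry in [begin, end) and return how many were called.
 * NULL slots are assumed to be linker padding and are skipped. */
static size_t run_init_table(const init_fn *begin, const init_fn *end)
{
    size_t called = 0;
    assert(begin <= end);
    for (const init_fn *p = begin; p < end; ++p) {
        if (*p != NULL) {
            (*p)();
            ++called;
        }
    }
    return called;
}

/* Hypothetical entry used only for the demonstration below. */
static int g_hits = 0;
static void sample_init(void) { ++g_hits; }

/* Simulates a padded section: NULL slots before, between and after real entries. */
static size_t demo_padded_table(void)
{
    const init_fn table[] = { NULL, NULL, sample_init, NULL, sample_init, NULL };
    return run_init_table(table, table + sizeof table / sizeof table[0]);
}

#if defined(_MSC_VER)
/* MSVC only: sentinels following the $a/$u/$z scheme from the answer. */
#pragma section(".init$a", read)
#pragma section(".init$u", read)
#pragma section(".init$z", read)
__declspec(allocate(".init$a")) const init_fn init_table_start = NULL;
__declspec(allocate(".init$z")) const init_fn init_table_end   = NULL;
/* An entry would be registered like this:
 *   __declspec(allocate(".init$u")) const init_fn my_entry = sample_init;
 * and the real table walked with:
 *   run_init_table(&init_table_start + 1, &init_table_end);
 */
#endif
```

Because the sentinels themselves are NULL, it does not matter much whether the walk starts at the sentinel or one slot past it; the loop treats both as padding.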

shimpossible
  • Thanks a lot! I did read about this over @ https://devblogs.microsoft.com/oldnewthing/20181107-00/?p=100155 but already forgot about it. – Trass3r Jul 10 '22 at 21:22

First of all, you'll need to create an ASM file declaring all the sections you are interested in (for example, section.asm):

.686
.model flat

PUBLIC C __InitSectionStart
PUBLIC C __InitSectionEnd

INIT$A SEGMENT DWORD PUBLIC FLAT alias(".init$a")
        __InitSectionStart EQU $
INIT$A ENDS

INIT$Z SEGMENT DWORD PUBLIC FLAT alias(".init$z")
        __InitSectionEnd EQU $
INIT$Z ENDS

END

Next, in your code you can use the following:

#pragma data_seg(".init$u")
int token1 = 0xdeadbeef;
int token2 = 0xdeadc0de;
#pragma data_seg()

This produces the following MAP file:

 Start         Length     Name                   Class
 0003:00000000 00000000H .init$a                 DATA
 0003:00000000 00000008H .init$u                 DATA
 0003:00000008 00000000H .init$z                 DATA

  Address         Publics by Value              Rva+Base       Lib:Object
 0003:00000000       ?token1@@3HA               10005000     dllmain.obj
 0003:00000000       ___InitSectionStart        10005000     section.obj
 0003:00000004       ?token2@@3HA               10005004     dllmain.obj
 0003:00000008       ___InitSectionEnd          10005008     section.obj

As you can see, the .init$u section is placed between .init$a and .init$z, so the __InitSectionStart symbol gives you a pointer to the beginning of the data and the __InitSectionEnd symbol gives you a pointer to its end.
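One way the two symbols might be consumed from C is sketched below. The helper names sum_tokens, demo_token_sum and demo_token_count are hypothetical, and the extern declarations in the comment are an assumption about how the assembly labels would be declared (a C++ translation unit would presumably need extern "C" as well). The demo functions use a local array in place of the real section so the loop can be checked without the assembly object.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the int tokens stored in [begin, end); *count receives how many there are. */
static long sum_tokens(const int *begin, const int *end, size_t *count)
{
    long total = 0;
    assert(begin <= end);
    *count = (size_t)(end - begin);
    for (const int *p = begin; p != end; ++p)
        total += *p;
    return total;
}

/* With the assembly file from the answer linked in, the call might look like:
 *
 *   extern int __InitSectionStart;   // assumed declaration of the asm label
 *   extern int __InitSectionEnd;     // assumed declaration of the asm label
 *   size_t n;
 *   long s = sum_tokens(&__InitSectionStart, &__InitSectionEnd, &n);
 */

/* Stand-ins for the section contents, for demonstration only. */
static long demo_token_sum(void)
{
    const int tokens[] = { 4, 10, 22 };
    size_t n = 0;
    return sum_tokens(tokens, tokens + 3, &n);
}

static size_t demo_token_count(void)
{
    const int tokens[] = { 4, 10, 22 };
    size_t n = 0;
    (void)sum_tokens(tokens, tokens + 3, &n);
    return n;
}
```

Taking the address of each symbol, rather than reading its value, is the key point: the labels mark positions in the section and carry no data of their own.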

Ilya Matveychikov

I experimented with the version without an assembly file but struggled with the unpredictable number of padding bytes between the sections. The padding makes it almost impossible to find the start of the .init$u part unless the content is just pointers or other simple items that can be checked against NULL or some other known pattern. Whether padding is inserted seems to correlate with the /Zi debug option: with it, padding is inserted; without it, all sections are laid out exactly as one would like.
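For content that is not a plain pointer, one possible form of the "known pattern" idea is to give every entry a magic tag as its first field and scan for it. This is only a sketch under stated assumptions: the log_entry layout, ENTRY_MAGIC value, 4-byte scan step and function names are all hypothetical, and a tag value that happens to occur inside padding or unrelated data would produce a false positive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENTRY_MAGIC 0x4C4F4745u   /* arbitrary tag value, hypothetical */

/* Hypothetical record: the first field is always ENTRY_MAGIC for a real entry. */
typedef struct {
    uint32_t magic;
    uint32_t id;
} log_entry;

/* Count the tagged records found in the byte range [begin, end).
 * Zero padding never matches the tag, so it is stepped over 4 bytes at a time
 * (the assumed alignment of log_entry). */
static size_t count_entries(const unsigned char *begin, const unsigned char *end)
{
    size_t found = 0;
    const unsigned char *p = begin;
    assert(begin <= end);
    while ((size_t)(end - p) >= sizeof(log_entry)) {
        log_entry e;
        memcpy(&e, p, sizeof e);          /* copy out to avoid aliasing issues */
        if (e.magic == ENTRY_MAGIC) {
            ++found;
            p += sizeof(log_entry);       /* jump over the whole record */
        } else {
            p += sizeof(uint32_t);        /* padding: advance one alignment unit */
        }
    }
    return found;
}

/* Builds a zero-filled buffer with three records at irregular offsets. */
static size_t demo_count_entries(void)
{
    unsigned char buf[64] = { 0 };
    const log_entry a = { ENTRY_MAGIC, 1 };
    const log_entry b = { ENTRY_MAGIC, 2 };
    const log_entry c = { ENTRY_MAGIC, 3 };
    memcpy(buf + 12, &a, sizeof a);
    memcpy(buf + 20, &b, sizeof b);
    memcpy(buf + 40, &c, sizeof c);
    return count_entries(buf, buf + sizeof buf);
}
```

In a real build the range would run from the start sentinel to the end sentinel, and a callback could replace the counter to visit each record.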

don

ML64 lets you cut out a lot of the assembly noise:

public foo_start
public foo_stop

.code foo$a
foo_start:

.code foo$z
foo_stop:

end
diapir