I have two C source files:

/* w1.c */
#include <stdio.h>
__attribute__((weak)) void fw(void) { printf("FW1\n"); }
int main(int argc, char **argv) {
  (void)argc; (void)argv;
  fw();
  return 0;
}   
/* w2.c */
#include <stdio.h>
void fw(void) { printf("FW2\n"); }

If I compile and run them with gcc, FW1 or FW2 is printed, depending on whether w2.c is used:

$ gcc -s -O2 -o prog1 w1.c
$ ./prog1
FW1
$ gcc -s -O2 -o prog12 w1.c w2.c
$ ./prog12
FW2

This works because the symbol fw is weak in w1.c, so it gets ignored iff another file (i.e. w2.c) defines a non-weak symbol with the same name.

Thus the mere presence of w2.c can modify the behavior of w1.c, through a weak symbol defined in w1.c.

Is there something like this for the OpenWatcom C compiler? What is the syntax instead of __attribute__((weak))? Simply omitting __attribute__((weak)) won't work, because linking will fail due to the duplicate symbol. I want the compiler commands and programs (./prog1 and ./prog12) above to work with owcc instead of gcc, possibly with some added flags.

I need this because I want to implement a cleanup mechanism (calling function fw) which should be replaced automatically with something more sophisticated if additional source (or object) files are also used. Thus I'd need two implementations of the cleanup mechanism (simple fw in w1.c, sophisticated fw in w2.c).


I also considered using common symbols (i.e. those which can be defined in multiple files as long as at most one file has a nonzero definition). Here are my source files, with the common symbol being fwptr:

/* w1o.c */
#include <stdio.h>
void (*fwptr)(void);       
static void fw1(void) { printf("FW1\n"); }
int main(int argc, char **argv) {
  (void)argc; (void)argv;
  (fwptr ? fwptr : fw1)();
  return 0;
}
/* w2o.c */
#include <stdio.h>
static void fw2(void) { printf("FW2\n"); }
void (*fwptr)(void) = fw2;

As expected, if I compile and run them with gcc, FW1 or FW2 is printed, depending on whether w2o.c is used:

$ gcc -s -O2 -o prog1o w1o.c
$ ./prog1o
FW1
$ gcc -s -O2 -o prog12o w1o.c w2o.c
$ ./prog12o
FW2

However, OpenWatcom doesn't allow multiple definitions of a symbol:

$ owcc -s -O2 -o prog1o w1o.c
$ ./prog1o
FW1
$ owcc -s -O2 -o prog12o w1o.c w2o.c
Warning! W1027: file w2o.o(/tmp/w2o.c): redefinition of _fwptr ignored
$ ./prog12o
FW1

Here is what's going on with GCC:

/* symbol.c */
int sym0 = 0;
int sym1 = 1;
int sym2;
extern int sym3;
int func() { return sym0 + sym1 + sym2 + sym3; }
$ gcc -fno-pic -m32 -Os -c -o symbol.o symbol.c
$ readelf -a symbol.o
...
   Num:    Value  Size Type    Bind   Vis      Ndx Name
...
     8: 00000000    28 FUNC    GLOBAL DEFAULT    1 func
     9: 00000000     4 OBJECT  GLOBAL DEFAULT    3 sym1
    10: 00000000     4 OBJECT  GLOBAL DEFAULT    4 sym0
    11: 00000004     4 OBJECT  GLOBAL DEFAULT  COM sym2
    12: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND sym3

As seen above, sym2 (the common one, with Ndx COM) is different from the others, and at link time we use this to provide an alternative definition.

When compiling the same code with OpenWatcom owcc, and dumping the .obj file with OpenWatcom dmpobj, I get:

  • sym0: EXTDEF (type 0), PUBDEF386 (type 0), LEDATA386: 4 bytes in segment _DATA
  • sym1: EXTDEF (type 0), PUBDEF386 (type 0), LEDATA386: 4 bytes in segment _DATA
  • sym2: EXTDEF (type 0), PUBDEF386 (type 0), 4 bytes in segment _BSS
  • sym3: EXTDEF (type 0), no PUBDEF386, no definition in _DATA or _BSS

I have asked a new question about common symbols with OpenWatcom: Support for common symbols in OpenWatcom C compiler

pts

1 Answer

Here are a few ways to get a weak symbol without owcc knowing about it:

  1. Generate and modify a .s file
  2. Use a utility to modify the ELF .o file

Just so we're on the same page: we need a way to identify which symbols we want to be weak.

We can scan the source files, looking for a "directive" that will be ignored by owcc.

Special comments are often good for this:

// .weak fw
void
fw(void)
{
}

Or, we could use a "phony" macro:

#define WEAK /**/

WEAK
void
fw(void)
{
}

Or, we could have a weak.txt file that lists the weak symbols.

Either way, now the converter knows what symbols to change ...
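
As a rough illustration of the "special comment" variant, here is a minimal sketch of how a converter could recognize such a line. The function name parse_weak_directive and the exact comment format ("// .weak NAME") are assumptions made for this example; they are not something owcc or any existing tool defines.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if `line` is a comment of the form "// .weak NAME",
 * copy NAME into `out` (capacity `cap`) and return 1; otherwise return 0. */
int parse_weak_directive(const char *line, char *out, size_t cap)
{
    static const char prefix[] = "//";
    static const char keyword[] = ".weak";
    size_t n = 0;

    while (isspace((unsigned char)*line)) line++;
    if (strncmp(line, prefix, sizeof prefix - 1) != 0) return 0;
    line += sizeof prefix - 1;
    while (isspace((unsigned char)*line)) line++;
    if (strncmp(line, keyword, sizeof keyword - 1) != 0) return 0;
    line += sizeof keyword - 1;
    /* Require at least one blank between the keyword and the symbol name. */
    if (!isspace((unsigned char)*line)) return 0;
    while (isspace((unsigned char)*line)) line++;
    /* The name must look like a C identifier. */
    if (!(isalpha((unsigned char)*line) || *line == '_')) return 0;
    while (isalnum((unsigned char)*line) || *line == '_') {
        if (n + 1 >= cap) return 0;   /* name does not fit in `out` */
        out[n++] = *line++;
    }
    out[n] = '\0';
    return 1;
}
```

The converter would call this on every source line and collect the names it returns; the weak.txt variant would skip the parsing and read the names directly.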


Change the generated assembler to add a .weak directive

I pulled the watcom source from the github repo. AFAICT, the watcom assembler is relatively simple.

What assembler does owcc use? Can it use the GNU as that comes with gcc et al.?

I'm guessing that it can because weak symbols are [generally] an ELF feature. I'm also assuming that the native assembler for owcc doesn't support them.

If we can use GNU as, there is a way. Take a look at the .s from the build of your w1.c under gcc.

Normally, we get:

.globl fw

But, notice that with __attribute__((weak)) we get this instead:

.weak fw

We can tell owcc to generate the assembler source. Then, edit it and change the directives for any symbols that we want to be weak. Then, build from the .s file.

These steps can be hidden under a script for convenience.
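
The edit step could look roughly like the sketch below. It assumes the input is in GNU as syntax (a ".globl SYM" or ".global SYM" line per exported symbol), and the name weaken_globl_line is made up for this example. A real script would apply it to every line of the .s file, for every symbol in the weak list.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if `line` is ".globl SYM" or ".global SYM" and SYM
 * equals `weak_sym`, write "\t.weak\tSYM\n" to `out` and return 1.
 * Otherwise copy `line` to `out` unchanged and return 0. */
int weaken_globl_line(const char *line, const char *weak_sym,
                      char *out, size_t cap)
{
    const char *p = line;
    size_t len = strlen(weak_sym);

    while (*p == ' ' || *p == '\t') p++;
    if (strncmp(p, ".globl", 6) == 0) p += 6;
    else if (strncmp(p, ".global", 7) == 0) p += 7;
    else p = NULL;

    if (p != NULL && len > 0 && (*p == ' ' || *p == '\t')) {
        while (*p == ' ' || *p == '\t') p++;
        if (strncmp(p, weak_sym, len) == 0) {
            /* Only trailing whitespace may follow, so "fw" does not match "fw2". */
            const char *end = p + len;
            while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
                end++;
            if (*end == '\0') {
                snprintf(out, cap, "\t.weak\t%s\n", weak_sym);
                return 1;
            }
        }
    }
    snprintf(out, cap, "%s", line);
    return 0;
}
```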


Use a utility to change the symbol binding in the ELF relocatable

Again, I assume we have an ELF relocatable.

Here is readelf -s from your source without the __attribute__((weak)):


Symbol table '.symtab' contains 13 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS s1.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    9
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT   10
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
    10: 0000000000000000    10 FUNC    GLOBAL DEFAULT    1 fw
    11: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
    12: 0000000000000000    21 FUNC    GLOBAL DEFAULT    6 main

Notice that fw has a binding of GLOBAL

Here is readelf -s from your source with the __attribute__((weak)):


Symbol table '.symtab' contains 13 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS w1.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    9
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT   10
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
    10: 0000000000000000    10 FUNC    WEAK   DEFAULT    1 fw
    11: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
    12: 0000000000000000    16 FUNC    GLOBAL DEFAULT    6 main

Notice that the binding for fw is WEAK

There are various ELF utilities that allow manipulation of .o files, and one of them should be able to change the binding of a symbol to WEAK. GNU objcopy, for example, has a --weaken-symbol=NAME option (short form: -W NAME).

Or, look at man 5 elf and libelf. We can write our own utility. There should be many examples on the web that can be [easily] modified.

Here's a modified version of the source (e.g. com.c):

/* com.c */
#if 0
#include <stdio.h>
#endif

#if WEAK
#define BIND    __attribute__((weak))
#else
#define BIND    /**/
#endif

BIND
void
fw(void)
{
#if 0
    printf("FW1\n");
#endif
}

#if 0
int
main(int argc, char **argv)
{
    (void) argc;
    (void) argv;
    fw();
    return 0;
}
#endif

Here are some commands I ran:

cc -O2 -DWEAK=0 com.c -o st.s -S
cc -O2 -DWEAK=0 com.c -o st.o -c
cc -O2 -DWEAK=1 com.c -o wk.s -S
cc -O2 -DWEAK=1 com.c -o wk.o -c

I then did a diff of hex dumps of st.o and wk.o:

--- sdcmp0
+++ sdcmp1
@@ -20,7 +20,7 @@
 00000130: 00000000 03000600 00000000 00000000  ................
 00000140: 00000000 00000000 00000000 03000400  ................
 00000150: 00000000 00000000 00000000 00000000  ................
-00000160: 07000000 12000100 00000000 00000000  ................
+00000160: 07000000 22000100 00000000 00000000  ...."...........
 00000170: 01000000 00000000 00636F6D 2E630066  .........com.c.f
 00000180: 77000000 00000000 20000000 00000000  w....... .......
 00000190: 02000000 02000000 00000000 00000000  ................

So, the difference between GLOBAL and WEAK is a single byte: the st_info field of the symbol table entry for fw is 0x12 when the symbol is global and 0x22 when it is weak.

We can make sense of this by looking at a portion of /usr/include/elf.h:

/* How to extract and insert information held in the st_info field.  */

#define ELF32_ST_BIND(val)      (((unsigned char) (val)) >> 4)
#define ELF32_ST_TYPE(val)      ((val) & 0xf)
#define ELF32_ST_INFO(bind, type)   (((bind) << 4) + ((type) & 0xf))

/* Both Elf32_Sym and Elf64_Sym use the same one-byte st_info field.  */
#define ELF64_ST_BIND(val)      ELF32_ST_BIND (val)
#define ELF64_ST_TYPE(val)      ELF32_ST_TYPE (val)
#define ELF64_ST_INFO(bind, type)   ELF32_ST_INFO ((bind), (type))

/* Legal values for ST_BIND subfield of st_info (symbol binding).  */

#define STB_LOCAL   0       /* Local symbol */
#define STB_GLOBAL  1       /* Global symbol */
#define STB_WEAK    2       /* Weak symbol */
#define STB_NUM     3       /* Number of defined types.  */
#define STB_LOOS    10      /* Start of OS-specific */
#define STB_GNU_UNIQUE  10      /* Unique symbol.  */
#define STB_HIOS    12      /* End of OS-specific */
#define STB_LOPROC  13      /* Start of processor-specific */
#define STB_HIPROC  15      /* End of processor-specific */

/* Legal values for ST_TYPE subfield of st_info (symbol type).  */

#define STT_NOTYPE  0       /* Symbol type is unspecified */
#define STT_OBJECT  1       /* Symbol is a data object */
#define STT_FUNC    2       /* Symbol is a code object */
#define STT_SECTION 3       /* Symbol associated with a section */
#define STT_FILE    4       /* Symbol's name is file name */
#define STT_COMMON  5       /* Symbol is a common data object */
#define STT_TLS     6       /* Symbol is thread-local data object*/
#define STT_NUM     7       /* Number of defined types.  */
#define STT_LOOS    10      /* Start of OS-specific */
#define STT_GNU_IFUNC   10      /* Symbol is indirect code object */
#define STT_HIOS    12      /* End of OS-specific */
#define STT_LOPROC  13      /* Start of processor-specific */
#define STT_HIPROC  15      /* End of processor-specific */
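
Putting the macros above to work, a patching utility would only need to rewrite that one st_info byte. This is a minimal sketch of the byte-level change, not a complete tool: the name weaken_st_info is invented for the example, and the macros are local copies of the elf.h ones so that the snippet compiles without that header. Locating the symbol table entry in the .o file is left out.

```c
/* Local copies of the elf.h definitions quoted above. */
#define ST_BIND(val)        (((unsigned char)(val)) >> 4)
#define ST_TYPE(val)        ((val) & 0xf)
#define ST_INFO(bind, type) (((bind) << 4) + ((type) & 0xf))
#define BIND_GLOBAL 1       /* same value as STB_GLOBAL */
#define BIND_WEAK   2       /* same value as STB_WEAK */

/* Hypothetical helper: return `st_info` with a GLOBAL binding changed to
 * WEAK. The type nibble is preserved; other bindings are left alone. */
unsigned char weaken_st_info(unsigned char st_info)
{
    if (ST_BIND(st_info) != BIND_GLOBAL)
        return st_info;
    return (unsigned char)ST_INFO(BIND_WEAK, ST_TYPE(st_info));
}
```

For the fw entry in the hex dump, weaken_st_info(0x12) gives 0x22, which is exactly the difference between st.o and wk.o.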
Craig Estey
  • I was considering a similar ELF-patching workaround, thank you for providing all the details! However, please note that OpenWatcom doesn't support ELF (it produces OMF .obj output instead), and it doesn't support assembly output in the GNU as(1) syntax either. So I'd have to write or use a converter. It's doable, but a lot of work, especially to handle the corner cases. – pts May 20 '23 at 09:18
  • Please note that this answer doesn't say what useful facilities OpenWatcom is already providing as an alternative to weak symbols. I'm still very curious about that. – pts May 20 '23 at 09:22