I need to develop a milter for Sendmail and have thought for a long time about which language / framework I should use. Finally, I have decided to write it in plain C, using the milter API directly.

I have studied the milter API documentation and think I have grasped the concepts. But there is one thing which worries me a lot. From this section:

A single filter process may handle any number of connections simultaneously. All filtering callbacks must therefore be reentrant, and use some appropriate external synchronization methods to access global data [...].

While I very well understand why the callbacks must be thread-safe, I cannot understand why they must be re-entrant. I cannot imagine that those callbacks could be called from an interrupt or a signal handler (perhaps except the abort callback, I'll have to re-read this).

The problem with the requirement to be re-entrant is that a re-entrant function must not call non-re-entrant code. Therefore, if the callbacks really had to be re-entrant, I couldn't use malloc() and most other library functions in there; from man 3 malloc:

To avoid corruption in multithreaded applications, mutexes are used internally to protect the memory-management data structures employed by these functions [...]

This for sure means that malloc() is thread-safe, and it probably means that malloc() is not re-entrant, and thus neither is any function which uses it.

So I have got two questions:

1) Do the milter callbacks really need to be re-entrant, or is this actually a very odd wording for "need to be thread-safe" in the milter API documentation?

2) If they really need to be re-entrant, how can I circumvent the problems mentioned above? Due to the characteristics of the milter API, I can hardly imagine how to do something reasonable in the callbacks without using malloc() and other non-re-entrant library functions.

Binarus
  • "This for sure means that malloc() is thread-safe, and it probably means that malloc() is not re-entrant." Something which is thread-safe may also be re-entrant but not necessarily vice versa. Imagine a single threaded application which passes function pointers to functions (e.g. signal handlers in a signal-slot concept). This may cause situations where re-entrance becomes an issue though thread-safety is not one. IMHO, if `malloc()` is granted to be thread-safe I would bet it's re-entrant as well. – Scheff's Cat Nov 01 '17 at 12:04
  • Found a nice Q&A about this: [SO: Threadsafe vs re-entrant](https://stackoverflow.com/q/856823/7478597). – Scheff's Cat Nov 01 '17 at 12:05
  • @Scheff The reason why I believe that `malloc()` is thread-safe is that they are using mutexes (i.e. a sort of locks). The reason why I believe that `malloc()` is not re-entrant is that they are using mutexes (i.e. a sort of locks) ... – Binarus Nov 01 '17 at 14:26
  • Well, I followed the link I provided above and read it again. The (currently) [last answer](https://stackoverflow.com/a/33445858/7478597) makes a clear distinction between thread-safe and re-entrant. Haven't seen it this clear before. Thus, I must admit `malloc()` might be thread-safe but not necessarily re-entrant (if not stated in the manual). – Scheff's Cat Nov 01 '17 at 14:43

1 Answer

In the context of the milter API (whose documentation also stresses that it uses POSIX threads internally), a 'reentrant' function is one that can safely be executed from multiple threads concurrently, that is, without causing a race condition.
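To make that concrete, here is a minimal sketch of what callbacks in this sense can look like. It deliberately does not use the libmilter headers: `conn_state`, `on_helo()`, `on_rcpt()`, `on_close()` and `total_recipients()` are hypothetical stand-ins for the real callbacks. In an actual milter the per-connection pointer would be stored and retrieved with `smfi_setpriv()` and `smfi_getpriv()`. The point is that calling `malloc()` is fine, per-connection data needs no lock, and only truly global data needs a mutex.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-connection state. A real milter would attach one of
 * these to each connection with smfi_setpriv() and fetch it again with
 * smfi_getpriv(). */
struct conn_state {
    char  *helo;        /* heap copy of the HELO name */
    size_t rcpt_count;  /* recipients seen on this connection */
};

/* Global data shared by all connections: this is what needs
 * "external synchronization". */
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long total_rcpts = 0;

/* Stand-in for a HELO callback: allocates private state.
 * malloc() is thread-safe, so calling it here is not a problem. */
struct conn_state *on_helo(const char *helo)
{
    struct conn_state *st = malloc(sizeof *st);
    if (st == NULL)
        return NULL;
    st->helo = malloc(strlen(helo) + 1);
    if (st->helo == NULL) {
        free(st);
        return NULL;
    }
    strcpy(st->helo, helo);
    st->rcpt_count = 0;
    return st;
}

/* Stand-in for an envelope-recipient callback. */
void on_rcpt(struct conn_state *st)
{
    st->rcpt_count++;                  /* private to this connection: no lock */

    pthread_mutex_lock(&total_lock);   /* shared by all threads: lock */
    total_rcpts++;
    pthread_mutex_unlock(&total_lock);
}

/* Read the shared counter under the same lock. */
unsigned long total_recipients(void)
{
    pthread_mutex_lock(&total_lock);
    unsigned long n = total_rcpts;
    pthread_mutex_unlock(&total_lock);
    return n;
}

/* Stand-in for a close callback: releases the private state. */
void on_close(struct conn_state *st)
{
    if (st != NULL) {
        free(st->helo);
        free(st);
    }
}
```

Because each connection only touches its own `conn_state`, calls for different connections can be interleaved (or run in different threads) without interfering with each other; only `total_rcpts` is contended, and that is guarded by `total_lock`.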

This usage of the term 'reentrant' is consistent with other UNIX/Linux API documentation, such as the man page for strtok():

The strtok_r() function is a reentrant version of strtok(). [..]

┌───────────┬───────────────┬───────────────────────┐
│Interface  │ Attribute     │ Value                 │
├───────────┼───────────────┼───────────────────────┤
│strtok()   │ Thread safety │ MT-Unsafe race:strtok │
├───────────┼───────────────┼───────────────────────┤
│strtok_r() │ Thread safety │ MT-Safe               │
└───────────┴───────────────┴───────────────────────┘

See also other *_r() functions such as ctime_r(), gmtime_r() etc.
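As a small illustration of why the `_r` variant is safe, the sketch below keeps all tokenizer state in caller-owned `saveptr` variables instead of in hidden static storage. The function name `count_pairs()` and the `key=value;key=value` input format are made up for this example. Two tokenizations are even nested here, which plain `strtok()` cannot do correctly.

```c
/* Request the POSIX declarations in case the compiler runs in strict ISO mode. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <stddef.h>
#include <string.h>

/* Count the "key=value" items in a ';'-separated list.
 * The buffer is modified in place, as with strtok(). */
size_t count_pairs(char *buf)
{
    char *outer = NULL;   /* state of the ';' tokenizer, owned by this call */
    size_t pairs = 0;

    for (char *item = strtok_r(buf, ";", &outer);
         item != NULL;
         item = strtok_r(NULL, ";", &outer)) {
        char *inner = NULL;   /* independent state of the '=' tokenizer */
        char *key = strtok_r(item, "=", &inner);
        char *val = strtok_r(NULL, "=", &inner);

        if (key != NULL && val != NULL)
            pairs++;
    }
    return pairs;
}
```

Since the state lives in local variables, two threads (or two milter callbacks running concurrently) can each call `count_pairs()` on their own buffer without racing on anything shared.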

maxschlepzig