Configurable sensitive data masking via log4net

Question

I'm looking at using log4net as my logging framework of choice for a new project starting shortly. One issue that I've run into during prototyping that I can't find a definitive answer for is how you can clean or mask message content in a configurable and tidy way.

Hypothetically let's say I want several cleaners to be put in action but I also want to follow the single responsibility principle. Some cleaner examples:

Cardnumber/PAN cleaner
Password cleaner
Private data cleaner

I know that you should never be logging this sort of information in plain text and the code executing the logs will never knowingly be doing this. I want to have a last level of protection however in case data becomes malformed and sensitive data somehow slips into somewhere it shouldn't; logs being the worst case scenario.

Option 1:

I've found this StackOverflow article which details a possible solution however it involves the use of reflection. This is not desirable for performance but it also seems hacky to manipulate internal storage mechanisms. Editing-log4net-messages-before-they-reach-the-appenders

Option 2:

The suggested answer on the same question suggests the use of a PatternLayoutConverter. This is fine for a single cleaner operation but you are unable to use multiple operations such as the below:

public class CardNumberCleanerLayoutConverter : PatternLayoutConverter
{
   protected override void Convert(TextWriter writer, LoggingEvent loggingEvent)
   {
      string message = loggingEvent.RenderedMessage;

      // TODO: Replace with real card number detection and masking.
      writer.Write(message.Replace("9", "*"));
   }
}

<layout type="log4net.Layout.PatternLayout">
   <converter>
      <name value="cleanedMessage" />
      <type value="Log4NetPrototype.CardNumberCleanerLayoutConverter, Log4NetPrototype" />
   </converter>
   <converter>
      <name value="cleanedMessage" />
      <type value="Log4NetPrototype.PasswordCleanerLayoutConverter, Log4NetPrototype" />
   </converter>
   <conversionPattern value="%cleanedMessage" />
</layout>

In the case of a naming collision as demonstrated above, the converter loaded last will be the one which is actioned. Using the above example, this means that passwords will be cleaned but not card numbers.

Option 3:

A third option which I've tried is the use of chained ForwarderAppender instances but this quickly complicates the configuration and I wouldn't consider it an ideal solution. Because the LoggingEvent class has an immutable RenderedMessage property we are unable to change it without creating a new instance of the LoggingEvent class and passing it through as demonstrated below:

public class CardNumberCleanerForwarder : ForwardingAppender
{
   protected override void Append(LoggingEvent loggingEvent)
   {
      // TODO: Replace this with real card number detection and masking.
      string newMessage = loggingEvent.RenderedMessage.Replace("9", "*");

      // What context data are we losing by doing this?
      LoggingEventData eventData = new LoggingEventData()
      {
         Domain = loggingEvent.Domain,
         Identity = loggingEvent.Identity,
         Level = loggingEvent.Level,
         LocationInfo = loggingEvent.LocationInformation,
         LoggerName = loggingEvent.LoggerName,
         ExceptionString = loggingEvent.GetExceptionString(),
         TimeStamp = loggingEvent.TimeStamp,
         Message = newMessage,
         Properties = loggingEvent.Properties,
         ThreadName = loggingEvent.ThreadName,
         UserName = loggingEvent.UserName
      };

      base.Append(new LoggingEvent(eventData));
   }
}

public class PasswordCleanerForwarder : ForwardingAppender
{
   protected override void Append(LoggingEvent loggingEvent)
   {
      // TODO: Replace this with real password detection and masking.
      string newMessage = loggingEvent.RenderedMessage.Replace("4", "*");

      // What context data are we losing by doing this?
      LoggingEventData eventData = new LoggingEventData()
      {
         Domain = loggingEvent.Domain,
         Identity = loggingEvent.Identity,
         Level = loggingEvent.Level,
         LocationInfo = loggingEvent.LocationInformation,
         LoggerName = loggingEvent.LoggerName,
         ExceptionString = loggingEvent.GetExceptionString(),
         TimeStamp = loggingEvent.TimeStamp,
         Message = newMessage,
         Properties = loggingEvent.Properties,
         ThreadName = loggingEvent.ThreadName,
         UserName = loggingEvent.UserName
      };

      base.Append(new LoggingEvent(eventData));
   }
}

Matching configuration (very hard to follow):

<log4net>
   <appender name="LocatedAsyncForwardingAppender" type="Log4NetPrototype.LocatedAsyncForwardingAppender, Log4NetPrototype">
      <appender-ref ref="CardNumberCleanerForwarder" />
   </appender>
   <appender name="CardNumberCleanerForwarder" type="Log4NetPrototype.CardNumberCleanerForwarder, Log4NetPrototype">
      <appender-ref ref="PasswordCleanerForwarder" />
   </appender>
   <appender name="PasswordCleanerForwarder" type="Log4NetPrototype.PasswordCleanerForwarder, Log4NetPrototype">
      <appender-ref ref="LogFileAppender" />
   </appender>
   <appender name="LogFileAppender" type="Log4NetPrototype.LogFileAppender, Log4NetPrototype">
      <layout type="log4net.Layout.PatternLayout">
         <conversionPattern value="%m" />
      </layout>
   </appender>
   <root>
      <level value="DEBUG" />
      <appender-ref ref="LocatedAsyncForwardingAppender" />
   </root>
</log4net>

Does anyone have another suggestion for how this could be implemented where theoretically n number of cleaners could be configured at the cost of performance?

score -3 · Answer 1 · answered May 08 '15 at 07:36

-3

In your question you are already saying that you should go to the cause and not logging any sensitive data. This can be enforced by a fourth option of using code reviews, and look at the data being logged. Your logging statements should never log any sensitive data, because the cause a security risk. Trusting on any code with filters the sensitive data will probably fail if you make changes to your project. Your QA process has to be really good to catch this kind of mistakes (I've never seen a tester going through all logs). So I would go for option 4 which is making sure you do not log this kind of information in the first place.

answered May 08 '15 at 07:36

Peter

27,590
8
64
84

I completely agree with you that all precautions should be taken to ensure this sort of data is not logged in the first place. We deal with large quantities of data from our clients which we then put through fairly complex rule engines. If a debug log states for example that account number xyz failed due to rule abc but for some reason we have been sent sensitive data in the account number field, we want to be able to detect that. Having detection logic at every point where a diagnostic log may be created would cause massive code duplication. I would prefer a singular point, hence my question. – Justin Cook May 08 '15 at 23:14
Also great point about the code reviews. We have a strict code review policy where every piece of code that will end up in production must be reviewed by at least two other developers. Unfortunately even with three sets of eyes on a block of code, things can be missed and I want to make sure we have all the defenses we can muster. – Justin Cook May 08 '15 at 23:19
Nope this was never solved as far as I can remember. I do recall that we picked Nlog over log4net in the end. – Justin Cook May 02 '21 at 17:25

Configurable sensitive data masking via log4net

1 Answers1