3

I am using AI Platform training to run ML training job using python 3.7.6. I am using the abseil module for logging messages with absl-py 0.9.0. I look at the instruction on how to direct python logging messages to stackdriver configuration medium article. I am using google-cloud-logging 1.15.0. I did some very basic code to understand the issue with my configuration.

from absl import logging
from absl import flags
from absl import app
import logging as logger
import google.cloud.logging

import sys
import os

FLAGS = flags.FLAGS

def main(argv):

    logging.get_absl_handler().python_handler.stream = sys.stdout

    # Instantiates a client
    client = google.cloud.logging.Client()

    # Connects the logger to the root logging handler; by default this captures
    # all logs at INFO level and higher
    client.setup_logging()

    fmt = "[%(levelname)s %(asctime)s %(filename)s:%(lineno)s] %(message)s"
    formatter = logger.Formatter(fmt)
    logging.get_absl_handler().setFormatter(formatter)

    # set level of verbosity
    logging.set_verbosity(logging.DEBUG)

    print(' 0 print --- ')
    logging.info(' 1 logging:')
    logging.info(' 2 logging:')

    print(' 3 print --- ')
    logging.debug(' 4 logging-test-debug')
    logging.info(' 5 logging-test-info')
    logging.warning(' 6 logging-test-warning')
    logging.error(' 7 logging test-error')
    print(' 8 print --- ')
    print(' 9 print --- ')

if __name__ == '__main__':
    app.run(main)

First abseil send all logs to the stderr. Note sure if this is expected or not. In the screenshot below we see:

  1. print messages using print are display (later in the logfile from Stackdriver)
  2. Abseil logging messages appear 2 times. One with the right label in stack driver (DEBUG, INFO, WARNING or ERROR) and one more time with the special formatting [%(levelname)s %(asctime)s %(filename)s:%(lineno)s] %(message)s but always with the ERROR label in Stackdriver.
  3. When I run the code locally I don't see duplicate.

screenshot of the output in GCP stackdriver

Any idea how to have this setup properly to see the logging messages (using abseil) once with the proper "label" in Stackdriver ?

----- EDIT --------

  1. I am seeing the issue locally and not only when running on GCP.
  2. The duplicate log messages appear when I add this line: client.setup_logging(). Before, I have no duplicate and all log messages are in the stdout stream
  3. If I look at the logger logger.root.manager.loggerDict.keys(), I see a lot of them:
dict_keys(['absl', 'google.auth.transport._http_client', 'google.auth.transport', 'google.auth', 'google','google.auth._default', 'grpc._cython.cygrpc', 'grpc._cython', 'grpc', 'google.api_core.retry', 'google.api_core', 'google.auth.transport._mtls_helper', 'google.auth.transport.grpc', 'urllib3.util.retry', 'urllib3.util', 'urllib3', 'urllib3.connection', 'urllib3.response', 'urllib3.connectionpool', 'urllib3.poolmanager', 'urllib3.contrib.pyopenssl', 'urllib3.contrib', 'socks', 'requests', 'google.auth.transport.requests', 'grpc._common', 'grpc._channel', 'google.cloud.logging.handlers.transports.background_thread', 'google.cloud.logging.handlers.transports', 'google.cloud.logging.handlers', 'google.cloud.logging', 'google.cloud', 'google_auth_httplib2'])
  1. If I look at:
root_logger = logger.getLogger()

for handler in root_logger.handlers:
    print("handler ", handler)

I see:

handler  <ABSLHandler (NOTSET)>
handler  <CloudLoggingHandler <stderr> (NOTSET)>
handler  <StreamHandler <stderr> (NOTSET)>

and we can see that the stream is stderr and not stdout. I didn''t managed to change it.

I saw this discussion stackoverflow thread and I tried the last solution by @Andy Carlson but then all my logging message are gone.

Dr. Fabien Tarrade
  • 1,556
  • 6
  • 23
  • 49
  • 1- what is the content of the duplicate logs? 2- Could you test with the [standard python library](https://cloud.google.com/logging/docs/setup/python#using_the_python_root_logger) in the AI Platform training and localy to isolate the issue? – Khalid K May 19 '20 at 03:53
  • @KhalidK it seems that everything that goes to logger got duplicated. If I run the same pice of code "client.setup_logging()" then everytihng work fine. Or logging output are see oncee and only in the stdout stream. So "client.setup_logging()" duplicate logging messages that then will appear in both stdout and stderr. – Dr. Fabien Tarrade May 19 '20 at 18:40
  • You mean logs are not duplicated when you run the code without the "client.setup_logging()". I recommend to open a [bug](https://github.com/googleapis/google-cloud-python/issues) – Khalid K May 21 '20 at 03:51
  • Just did it [issue](https://github.com/googleapis/python-logging/issues/40). The situation start to be even more confusion when importing Tensorflow and Transformers libs.I guess this is clashes betweem logging, absl.logging and google.cloud.logging. I guess my issue is that I don't really get how they aare orking and suppose to interact between each other. – Dr. Fabien Tarrade May 22 '20 at 11:24

0 Answers0