Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core][Data] Ray Core / Ray Data logging configuration leads to unexpected behavior #48732

Open
omatthew98 opened this issue Nov 13, 2024 · 4 comments
Assignees
Labels
bug Something that is supposed to be working; but isn't core Issues that should be addressed in Ray Core data Ray Data-related issues P0 Issues that should be fixed in short order

Comments

@omatthew98
Copy link
Contributor

omatthew98 commented Nov 13, 2024

What happened + What you expected to happen

We use logging.config.dictConfig(config) to configure the ray data logger (here), but this is also how ray core configures the ray logger (here).

For both of these logging configs, we use disable_existing_loggers: False. The behavior for this is described as (logging docs):

disable_existing_loggers - whether any existing non-root loggers are to be disabled. This setting mirrors the parameter of the same name in fileConfig(). If absent, this parameter defaults to True. This value is ignored if incremental is True.

This description makes it seem like these two logging calls are commutative (regardless of ordering they will produce the same result), but that is not exactly how the python logging module works. If we configure the ray module logger then the ray.data module logger, the results are expected and both are configured. If we instead configure the ray.data module then configure the ray module logger, then the ray.data logging configuration is clobbered. This happens because when configuring the parent logger of a module (e.g. ray module logger is the parent logger of the ray.data module logger), the various handlers associated with the child logger are not guaranteed to be preserved.

Our end goal should be a state where the call order of the logging configurations should not affect the logging behavior.

Versions / Dependencies

ray==2.39.0

Reproduction script

import ray
import logging

def report_logger(logger):
    # Collect this logger and its parents
    loggers = []
    current_logger = logger
    while current_logger:
        loggers.append(current_logger)
        if not current_logger.parent or current_logger.parent == current_logger:
            break
        current_logger = current_logger.parent
    
    # Report the configuration of each logger in the hierarchy
    print(f"Logging configuration for '{logger.name}' and its hierarchy:")
    for log in reversed(loggers):  # Start from the root and go down to the given logger
        print(f"\nLogger: {log.name or 'root'} (Level: {logging.getLevelName(log.level)})")
        if log.handlers:
            print("  Handlers:")
            for handler in log.handlers:
                print(f"    - {handler.__class__.__name__} (Level: {logging.getLevelName(handler.level)})")
        else:
            print("  No handlers configured")

print("BEFORE")
report_logger(logging.getLogger("ray.data"))
print()

import ray.data
ray.init(logging_config=ray.LoggingConfig(encoding="JSON", log_level="INFO"))

print("AFTER:")
report_logger(logging.getLogger("ray.data"))

Issue Severity

High: It blocks me from completing my task.

@omatthew98 omatthew98 added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Nov 13, 2024
@jcotant1 jcotant1 added core Issues that should be addressed in Ray Core data Ray Data-related issues labels Nov 13, 2024
@omatthew98
Copy link
Contributor Author

omatthew98 commented Nov 13, 2024

To be clear about the expected behavior some examples / modifications of the repro:

1. Ray Data Import then Ray Core Init [UNEXPECTED BEHAVIOR]

Script:

print("BEFORE")
report_logger(logging.getLogger("ray.data"))
print()

import ray.data
ray.init(logging_config=ray.LoggingConfig(encoding="JSON", log_level="INFO"))

print("AFTER:")
report_logger(logging.getLogger("ray.data"))

Output:

BEFORE
Logging configuration for 'ray.data' and its hierarchy:

Logger: root (Level: WARNING)
  No handlers configured

Logger: ray (Level: INFO)
  Handlers:
    - PlainRayHandler (Level: NOTSET)

Logger: ray.data (Level: NOTSET)
  No handlers configured

AFTER:
Logging configuration for 'ray.data' and its hierarchy:

Logger: root (Level: INFO)
  Handlers:
    - StreamHandler (Level: INFO)

Logger: ray (Level: INFO)
  Handlers:
    - StreamHandler (Level: INFO)

Logger: ray.data (Level: NOTSET)
  No handlers configured

Notes:
Ray Data configuration ignored, only Ray Core initialization respected.

2. Ray Core Init then Ray Data Import [EXPECTED BEHAVIOR]

Script:

print("BEFORE")
report_logger(logging.getLogger("ray.data"))
print()

ray.init(logging_config=ray.LoggingConfig(encoding="JSON", log_level="INFO"))
import ray.data

print("AFTER:")
report_logger(logging.getLogger("ray.data"))

Output:

BEFORE
Logging configuration for 'ray.data' and its hierarchy:

Logger: root (Level: WARNING)
  No handlers configured

Logger: ray (Level: INFO)
  Handlers:
    - PlainRayHandler (Level: NOTSET)

Logger: ray.data (Level: NOTSET)
  No handlers configured

AFTER:
Logging configuration for 'ray.data' and its hierarchy:

Logger: root (Level: INFO)
  Handlers:
    - StreamHandler (Level: INFO)

Logger: ray (Level: INFO)
  Handlers:
    - StreamHandler (Level: INFO)

Logger: ray.data (Level: DEBUG)
  Handlers:
    - SessionFileHandler (Level: NOTSET)
    - PlainRayHandler (Level: INFO)
    - SessionFileHandler (Level: ERROR)

Notes:
Ray Core configuration is done first, Ray Data configuration is done second, both are respected.

3. Only Ray Data Import [EXPECTED BEHAVIOR]

Script:

print("BEFORE")
report_logger(logging.getLogger("ray.data"))
print()

import ray.data

print("AFTER:")
report_logger(logging.getLogger("ray.data"))

Output:

BEFORE
Logging configuration for 'ray.data' and its hierarchy:

Logger: root (Level: WARNING)
  No handlers configured

Logger: ray (Level: INFO)
  Handlers:
    - PlainRayHandler (Level: NOTSET)

Logger: ray.data (Level: NOTSET)
  No handlers configured

AFTER:
Logging configuration for 'ray.data' and its hierarchy:

Logger: root (Level: WARNING)
  No handlers configured

Logger: ray (Level: INFO)
  Handlers:
    - PlainRayHandler (Level: NOTSET)

Logger: ray.data (Level: DEBUG)
  Handlers:
    - SessionFileHandler (Level: NOTSET)
    - PlainRayHandler (Level: INFO)
    - SessionFileHandler (Level: ERROR)

Notes:
Correctly configures the Ray Data logger.

4. Only Ray Core Init [EXPECTED BEHAVIOR]

Script:

print("BEFORE")
report_logger(logging.getLogger("ray.data"))
print()

ray.init(logging_config=ray.LoggingConfig(encoding="JSON", log_level="INFO"))

print("AFTER:")
report_logger(logging.getLogger("ray.data"))

Output:

BEFORE
Logging configuration for 'ray.data' and its hierarchy:

Logger: root (Level: WARNING)
  No handlers configured

Logger: ray (Level: INFO)
  Handlers:
    - PlainRayHandler (Level: NOTSET)

Logger: ray.data (Level: NOTSET)
  No handlers configured

AFTER:
Logging configuration for 'ray.data' and its hierarchy:

Logger: root (Level: INFO)
  Handlers:
    - StreamHandler (Level: INFO)

Logger: ray (Level: INFO)
  Handlers:
    - StreamHandler (Level: INFO)

Logger: ray.data (Level: NOTSET)
  No handlers configured

Notes:
Correctly configures the Ray Core logger.

@omatthew98 omatthew98 added P0 Issues that should be fixed in short order and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Nov 13, 2024
@alexeykudinkin
Copy link
Contributor

@omatthew98 why don't we just do

  1. Take the current config
  2. Merge in provided one
  3. Set the config back

@omatthew98
Copy link
Contributor Author

Update from some more offline discussion with @alexeykudinkin:
For now, ensuring that the import / happens after the initialization (e.g. ensuring that the ray module logger is configured before the ray.data module logger), should be an immediate workaround. We should still try to improve the behavior as this is a footgun that will result in unexpected results and missing logs that will hinder debugging efforts for users. Ideally we would do what was suggested above, but unfortunately python's logging module does not provide a native way to retrieve the existing logging configuration as a dictionary, so we would have to manually inspect and reconstruct the dict before merging the provided config which would be somewhat hacky.

An alternative to this would be to just wrap this configuration at the ray core level by:

  1. Persist configuration as root level field of the core logging module
  2. Create util configureLogging and have it be called from both core and other libraries
  3. Within the configureLogging function, merge the provided dictionary with the existing dictionary configuration and then update the logging config with the merged config.

@hongpeng-guo
Copy link
Contributor

I am trying to understand the structured logging behavior as well. One of my question is
#1. Ray Data Import then Ray Core Init [UNEXPECTED BEHAVIOR] will wipe out the ray.data logger configuration; but why
#2. Ray Core Init then Ray Data Import [EXPECTED BEHAVIOR] maintains the expected logger configurations of both logger?
It seems by going in order #2, we got the correct logger config for all three loggers: i.e., root by ray.init, ray by ray.init, and finally ray.data by import ray data

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something that is supposed to be working; but isn't core Issues that should be addressed in Ray Core data Ray Data-related issues P0 Issues that should be fixed in short order
Projects
None yet
Development

No branches or pull requests

4 participants