Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional arg encoding = None for ignite.utils.setup_logger #3239

Closed
sjiang95 opened this issue Apr 23, 2024 · 5 comments · Fixed by #3240
Closed

Add optional arg encoding = None for ignite.utils.setup_logger #3239

sjiang95 opened this issue Apr 23, 2024 · 5 comments · Fixed by #3240

Comments

@sjiang95
Copy link
Contributor

sjiang95 commented Apr 23, 2024

🚀 Feature

Add optional arg encoding = None for ignite.utils.setup_logger

Motivation

I encountered garbled text when loggering messages that contains CJK characters using logger.info(). For example,

from ignite.utils import setup_logger
logger = setup_logger(name="theName", filepath='filename.log')

logger.info('你好') # garbled text in the written .log file

In the .log file, garbled text is printed.

�й���׼ʱ��.

Solution

This can be addressed by simply passing encoding = "utf-8" to the filehandler

fh = logging.FileHandler(filepath)

def setup_logger(
    name: Optional[str] = "ignite",
    level: int = logging.INFO,
    stream: Optional[TextIO] = None,
    format: str = "%(asctime)s %(name)s %(levelname)s: %(message)s",
    filepath: Optional[str] = None,
	encoding: Optional[str] = None, # add this optional arg
    distributed_rank: Optional[int] = None,
    reset: bool = False,
) -> logging.Logger:


        if filepath is not None:
            fh = logging.FileHandler(filepath, encoding = encoding) # pass encoding to the file handler
            fh.setLevel(level)
            fh.setFormatter(formatter)
            logger.addHandler(fh)

Looking forward to a discussion about this before I create a PR.

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Apr 23, 2024

Thanks for the suggestion of improvement, @sjiang95 !
If you would like to add this feature and a test in a draft PR, it would be awesome!

Few notes in case you would like to send a PR:

  • to keep BC in terms of function signature, let's set encoding: Optional[str] = None, # add this optional arg as the last argument
def setup_logger(
    name: Optional[str] = "ignite",
    level: int = logging.INFO,
    stream: Optional[TextIO] = None,
    format: str = "%(asctime)s %(name)s %(levelname)s: %(message)s",
    filepath: Optional[str] = None,
    distributed_rank: Optional[int] = None,
    reset: bool = False,
    encoding: Optional[str] = None, # add this optional arg
) -> logging.Logger:
  • maybe, we can set encoding="utf-8" by default, such that users should not think about that and get correct logging by default ?

@sjiang95
Copy link
Contributor Author

Hello @vfdev-5 ,

thanks for your response.

I also want to set encoding="utf-8" by default.

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Apr 23, 2024

@sjiang95 by the way, what is your OS, windows ?

By default, logging FileHandler is using default encoding as in open: https://docs.python.org/3/library/logging.handlers.html#filehandler
which is

In text mode, if encoding is not specified the encoding used is platform-dependent: locale.getencoding() is called to get the current locale encoding.

Source: https://docs.python.org/3/library/functions.html#open

and

On Android and VxWorks, return "utf-8".
On Unix, return the encoding of the current LC_CTYPE locale. Return "utf-8" if nl_langinfo(CODESET) returns an empty string: for example, if the current LC_CTYPE locale is not supported.
On Windows, return the ANSI code page.

Testing on the linux your PR, even with encoding=None we have already utf-8 encoding:

    fp = dirname / "log"
    logger = setup_logger(name="logger", filepath=fp, encoding=None)
    logger.info("你好")  # ni hao

    with open(fp, "r") as h:
        data = h.readlines()

    assert "你好" in data[0]

so, your PR will fix this for other OS and setups...

@sjiang95
Copy link
Contributor Author

sjiang95 commented Apr 23, 2024

@vfdev-5 yes I encountered this problem on windows.

How do you think we should deal with this? How about importing platform and using platform.system() to help limiting the behaviour on only windows?

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Apr 23, 2024

We can use encoding="utf-8" by default and in the tests check the platform as you are suggesting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants