Inconsistent localization in country_holidays due to LANG dependency #2168

Open
pmarkoo opened this issue Dec 12, 2024 · 6 comments

pmarkoo commented Dec 12, 2024

Bug Report

Expected Behavior

When using the country_holidays function without specifying the language parameter (i.e., setting it to None or omitting it), the holiday names should consistently be returned in the country's original language as per the documentation.

For example, executing the following code:

import holidays
de_holidays = holidays.country_holidays("DE")
print(de_holidays.get("2024-12-25"))

Should consistently output:
Erster Weihnachtstag

Actual Behavior

When the language parameter is not set, the country_holidays function behaves inconsistently depending on the LANG environment variable:

Local Environment:

LANGUAGE: None
LC_ALL: None
LC_MESSAGES: None
LANG: None

Output:
Erster Weihnachtstag

Remote Server Environment:

LANGUAGE: None
LC_ALL: None
LC_MESSAGES: None
LANG: C.UTF-8

Output:
Christmas Day

Steps to Reproduce the Problem

Easy to reproduce:

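import os

import holidays

# With LANG empty, the holiday name comes back in German.
os.environ['LANG'] = ''
de_holidays = holidays.country_holidays("DE")
print(de_holidays.get("2024-12-25"))

# With LANG set to C.UTF-8, the same call returns the English name.
os.environ['LANG'] = 'C.UTF-8'
de_holidays = holidays.country_holidays("DE")
print(de_holidays.get("2024-12-25"))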

Output
Erster Weihnachtstag
Christmas Day

Environment

  • I suppose any OS or Python version will have the same behaviour
  • holidays version: 0.62

pmarkoo changed the title from "Inconsistent Localization in country_holidays Due to LANG Dependency When language Is Not Specified" to "Inconsistent localization in country_holidays due to LANG dependency" on Dec 12, 2024
arkid15r self-assigned this on Dec 13, 2024

arkid15r (Collaborator) commented

Hi @pmarkoo, thanks for filing this!

As far as I remember, it was our decision back in 2022 to have English as a fallback.

Even though there is no technical difficulty in changing the behavior, I doubt we'll do it for v0.
However, in my opinion it makes total sense to revisit the implementation for v1.

> When using the country_holidays function without specifying the language parameter (i.e., setting it to None or omitting it), the holiday names should consistently be returned in the country's original language as per the documentation.

Could you add a link to the documentation you mentioned in your post?

Thank you!

pmarkoo commented Dec 19, 2024

Hello @arkid15r, thank you for considering this!

When I mentioned the documentation, I was specifically referring to the docstring of the country_holidays function in the code itself: https://github.com/vacanza/holidays/blob/dev/holidays/utils.py#L72

English as a fallback is not really a problem. My main concern is that reliance on the LANG environment variable when the language parameter is unset is largely unknown unless one digs into the code. This implicit behavior leads to inconsistencies across environments and may confuse users. Sorry if this is just my own ignorance or lack of experience with locale-related engineering.
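
To make the environment dependence concrete, here is a minimal stdlib-only sketch. It assumes the library resolves translations through Python's gettext module, which is not confirmed in this thread. The "demo" domain, the temporary directory layout, and the find_catalog helper are hypothetical and exist only for illustration.

```python
import gettext
import os
import tempfile
from pathlib import Path
from unittest import mock

# Variables gettext consults, in this order, when no languages are passed.
LOCALE_VARS = ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG")


def find_catalog(localedir, lang_value):
    """Return the catalogue gettext picks when only LANG is set to lang_value."""
    env = {k: v for k, v in os.environ.items() if k not in LOCALE_VARS}
    if lang_value:
        env["LANG"] = lang_value
    # Temporarily replace the process environment; it is restored on exit.
    with mock.patch.dict(os.environ, env, clear=True):
        return gettext.find("demo", localedir=localedir)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical layout: a single en_US catalogue and nothing else.
        mo_file = Path(tmp) / "en_US" / "LC_MESSAGES" / "demo.mo"
        mo_file.parent.mkdir(parents=True)
        mo_file.touch()  # gettext.find() only checks that the file exists
        return {
            lang: find_catalog(tmp, lang) is not None
            for lang in ("", "en_US.UTF-8", "de_DE.UTF-8")
        }


results = demo()
print(results)
```

In this sketch only the en_US.UTF-8 case finds a catalogue; with LANG empty or set to a language without a catalogue, nothing is found and a caller would keep the untranslated strings. C.UTF-8 is deliberately left out because its result depends on how locale.normalize() maps that value.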

arkid15r (Collaborator) commented

No, this is a valid point. I believe we need to update the docs while keeping the English translation as a fallback.
I'm open to considering alternative opinions for v1.

fedemolina commented Jan 10, 2025

In version 0.60, the language parameter does not work as expected in Jupyter.

For example, running the following code in Jupyter, I got:

from holidays import country_holidays

min_year = 2021
max_year = 2022
country = "PE"

country_code = country
years = [min_year, max_year]
country_holidays_dict = country_holidays(country_code, years=years, language="en")
country_holidays_dict

# Pair each holiday date (as a string) with its name.
holidays_data = [
    (str(date), name) for date, name in country_holidays_dict.items()
]

# Collect the distinct holiday names.
holiday_names = {holiday_name: holiday_name for _, holiday_name in holidays_data}

print(holiday_names)

returns:

{'Año Nuevo': 'Año Nuevo', 'Jueves Santo': 'Jueves Santo', 'Viernes Santo': 'Viernes Santo', 'Domingo de Resurrección': 'Domingo de Resurrección', 'Día del Trabajo': 'Día del Trabajo', 'San Pedro y San Pablo': 'San Pedro y San Pablo', 'Día de la Independencia': 'Día de la Independencia', 'Día de la Gran Parada Militar': 'Día de la Gran Parada Militar', 'Santa Rosa de Lima': 'Santa Rosa de Lima', 'Combate de Angamos': 'Combate de Angamos', 'Todos Los Santos': 'Todos Los Santos', 'Inmaculada Concepción': 'Inmaculada Concepción', 'Navidad del Señor': 'Navidad del Señor', 'Batalla de Junín': 'Batalla de Junín', 'Batalla de Ayacucho': 'Batalla de Ayacucho'}

However, if I run the same code as a script from the terminal, I get:

{"New Year's Day": "New Year's Day", 'Maundy Thursday': 'Maundy Thursday', 'Good Friday': 'Good Friday', 'Easter Sunday': 'Easter Sunday', 'Labor Day': 'Labor Day', 'Saint Peter and Saint Paul': 'Saint Peter and Saint Paul', 'Independence Day': 'Independence Day', 'Great Military Parade Day': 'Great Military Parade Day', 'Rose of Lima Day': 'Rose of Lima Day', 'Battle of Angamos Day': 'Battle of Angamos Day', "All Saints' Day": "All Saints' Day", 'Immaculate Conception Day': 'Immaculate Conception Day', 'Christmas Day': 'Christmas Day', 'Battle of Junín Day': 'Battle of Junín Day', 'Battle of Ayacucho Day': 'Battle of Ayacucho Day'}

Changing the language to Spanish in the same script, I get:

{'Año Nuevo': 'Año Nuevo', 'Jueves Santo': 'Jueves Santo', 'Viernes Santo': 'Viernes Santo', 'Domingo de Resurrección': 'Domingo de Resurrección', 'Día del Trabajo': 'Día del Trabajo', 'San Pedro y San Pablo': 'San Pedro y San Pablo', 'Día de la Independencia': 'Día de la Independencia', 'Día de la Gran Parada Militar': 'Día de la Gran Parada Militar', 'Santa Rosa de Lima': 'Santa Rosa de Lima', 'Combate de Angamos': 'Combate de Angamos', 'Todos Los Santos': 'Todos Los Santos', 'Inmaculada Concepción': 'Inmaculada Concepción', 'Navidad del Señor': 'Navidad del Señor', 'Batalla de Junín': 'Batalla de Junín', 'Batalla de Ayacucho': 'Batalla de Ayacucho'}

So it looks like the problem arises only in Jupyter.

Running locale in the terminal, I get:

LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"

Running locale in Jupyter, I get:

LANG=""
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

KJhellico (Collaborator) commented

> country_holidays_dict = country_holidays(country_code, years=years, language="en")

The correct language value is en_US.

fedemolina commented

> country_holidays_dict = country_holidays(country_code, years=years, language="en")
>
> Correct language value is en_US.

Now it works as expected.

But the documentation says:

:param language: The language which the returned holiday names will be translated into. It must be an ISO 639-1 (2-letter) language code. If the language translation is not supported the original holiday names will be used.
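
The results reported above (Spanish in Jupyter, English in the terminal, both with language="en") are consistent with a lookup order along the following lines. This is a hypothetical model inferred from the reports in this thread, not the library's actual code: resolve_languages is an invented helper, and the supported-language tuple for Peru is an assumption made for illustration.

```python
# Assumed, for illustration only: the language codes Peru has translations for.
PE_SUPPORTED = ("en_US", "es", "uk")


def resolve_languages(requested, supported, environ):
    """Hypothetical model of which languages end up being looked up."""
    if requested in supported:
        # An explicit, supported code wins; the environment is ignored.
        return [requested]
    # Unsupported or unset: the usual locale variables decide instead.
    for var in ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG"):
        value = environ.get(var)
        if value:
            return value.split(":")
    # Nothing set: no translation is applied and original names are kept.
    return []


terminal = resolve_languages("en", PE_SUPPORTED, {"LANG": "en_US.UTF-8"})
jupyter = resolve_languages("en", PE_SUPPORTED, {"LANG": ""})
explicit = resolve_languages("en_US", PE_SUPPORTED, {"LANG": ""})
print(terminal, jupyter, explicit)
```

Under this model, "en" is not a supported code, so the call silently behaves as if no language had been given, and the outcome depends on the environment. Passing "en_US" takes the first branch and gives the same result everywhere.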
