ENH: Add to_numeric_br() function to convert Brazilian-formatted numbers #60998
Comments
I've encountered a similar need to handle Brazilian number formatting and created a function that might be helpful.

import pandas as pd
import numpy as np

def to_numeric_br(series, errors="raise"):
    """
    Converts Brazilian-style numeric strings (1.234,56) into float.

    Parameters
    ----------
    series : pandas.Series
        Data to be converted.
    errors : str, default 'raise'
        - 'raise' : Throws an error for invalid values.
        - 'coerce' : Converts invalid values to NaN.
        - 'ignore' : Leaves invalid values unchanged.

    Returns
    -------
    pandas.Series with numeric values.
    """
    # Validate the mode up front so a bad value fails even on clean data.
    if errors not in ("raise", "coerce", "ignore"):
        raise ValueError("Invalid error value")

    def converter(x):
        if pd.isna(x):
            return x
        try:
            # Drop thousands separators, then use "." as the decimal mark.
            return float(x.replace(".", "").replace(",", "."))
        except ValueError:
            if errors == "raise":
                raise
            if errors == "coerce":
                return np.nan
            return x  # errors == "ignore"

    return series.apply(converter)

Example usage:
df = pd.DataFrame({"values": ["1.234,56", "5.600,75", "100,50", "invalid"]})
df["converted_coerce"] = to_numeric_br(df["values"], errors="coerce")
df["converted_ignore"] = to_numeric_br(df["values"], errors="ignore")
print(df)

try:
    df["converted_raise"] = to_numeric_br(df["values"], errors="raise")
except ValueError as e:
    print(f"Caught exception as expected: {e}")

This function handles NaN values gracefully and provides flexibility in how errors are managed. While integrating this directly into pd.to_numeric with a locale option would be ideal, this standalone function could be a useful workaround in the meantime. I hope this contributes to the discussion!
Thanks for pointing that out, @Liam3851. If #4674 and #56934 already added support for specifying decimal and thousands separators in [...], could you confirm whether this feature is already fully implemented in the latest pandas release? If so, users in Brazil could simply use [...]. If there are any remaining gaps, I'd be happy to adjust my proposal accordingly.
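For reference, pandas already exposes decimal and thousands arguments at parse time in pd.read_csv. Whether the issues referenced above concern that function or pd.to_numeric is not stated here, so the following is only a minimal sketch of the existing read_csv behaviour; the inline CSV text and the column names id and valor are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated data using Brazilian number formatting.
csv_text = "id;valor\n1;1.234,56\n2;5.600,75\n3;100,50\n"

# read_csv accepts the decimal mark and the thousands separator directly,
# so the column is parsed as float without any string preprocessing.
df = pd.read_csv(
    io.StringIO(csv_text),
    sep=";",
    decimal=",",
    thousands=".",
)

print(df.dtypes)
print(df)
```

This only helps when the data is read from a delimited file; strings that are already in a Series still need a separate conversion step.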
Thank you, @itayg2341, this looks like a great solution.
@Veras-D I'd suggest re-opening this, as I don't believe #56934 was ever merged. cc: @mroeschke
Thanks for the clarification @Liam3851! I've reopened the issue. Let me know if there's anything I can do to help move this forward.
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
I wish I could use Pandas to easily convert numbers formatted in the Brazilian style (1.234,56) into numeric types.
Currently, pd.to_numeric() does not support this format, and users have to manually apply .str.replace(".", "").str.replace(",", "."), which is not intuitive.
This feature would simplify data handling for users in Brazil and other countries with similar numerical formats.
Feature Description
Add a new function to_numeric_br() to automatically convert strings with the Brazilian numeric format into floats.
Proposed Implementation (Pseudocode)
Expected Behavior
Expected Output:
Alternatively, instead of a standalone function, this could be implemented as an enhancement to pd.to_numeric(), adding a locale="br" parameter.
Alternative Solutions
Currently, users must manually apply string replacements before using pd.to_numeric(). While this works, it is not user-friendly, especially for beginners.
Another alternative is using third-party packages like babel, but this requires additional dependencies and is not built into Pandas.
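The manual string-replacement workaround mentioned above can be sketched roughly as follows. This is an illustration rather than a recommended implementation: the sample values are borrowed from the example elsewhere in this thread, and the variable names raw, cleaned, and converted are arbitrary.

```python
import pandas as pd

# Sample strings in Brazilian format, plus one invalid entry.
raw = pd.Series(["1.234,56", "5.600,75", "100,50", "invalid"])

# Step 1: remove the "." thousands separators.
# Step 2: turn the "," decimal mark into ".".
# regex=False makes both replacements literal, so "." is not treated
# as a regular-expression wildcard on older pandas versions.
cleaned = (
    raw.str.replace(".", "", regex=False)
       .str.replace(",", ".", regex=False)
)

# errors="coerce" turns anything that still fails to parse into NaN.
converted = pd.to_numeric(cleaned, errors="coerce")
print(converted)
```

The order of the two replacements matters: swapping them would first create new "." characters from the decimal commas and then delete them along with the thousands separators.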
Additional Context
[...] (to_numeric_br()) or a locale parameter in pd.to_numeric()?