Current Behaviour
With ydata-profiling I can generate a profile report for a Spark DataFrame, but calling the compare method on two such reports raises an AttributeError:
AttributeError Traceback (most recent call last)
Cell In[56], line 1
----> 1 spark_comparison_report = spark_train_report.compare(spark_test_report)
File ~/opt/anaconda3/envs/p39/lib/python3.9/site-packages/ydata_profiling/profile_report.py:560, in ProfileReport.compare(self, other, config)
544 """Compare this report with another ProfileReport
545 Alias for:
546 ```
(...)
556 Comparison ProfileReport
557 """
558 from ydata_profiling.compare_reports import compare
--> 560 return compare([self, other], config if config is not None else self.config)
File ~/opt/anaconda3/envs/p39/lib/python3.9/site-packages/ydata_profiling/compare_reports.py:302, in compare(reports, config, compute)
300 for report in reports[1:]:
301 cols_2_compare = [col for col in base_features if col in report.df.columns] # type: ignore
--> 302 report.df = report.df.loc[:, cols_2_compare] # type: ignore
303 reports = [r for r in reports if not r.df.empty] # type: ignore
304 if len(reports) == 1:
File ~/opt/anaconda3/envs/p39/lib/python3.9/site-packages/pyspark/sql/dataframe.py:1988, in DataFrame.__getattr__(self, name)
1978 """Returns the :class:`Column` denoted by ``name``.
1979
1980 .. versionadded:: 1.3.0
(...)
1985 [Row(age=2), Row(age=5)]
1986 """
1987 if name not in self.columns:
-> 1988 raise AttributeError(
1989 "'%s' object has no attribute '%s'" % (self.__class__.__name__, name)
1990 )
1991 jc = self._jdf.apply(name)
1992 return Column(jc)
AttributeError: 'DataFrame' object has no attribute 'loc'
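Reading the traceback, the failure seems to come from compare_reports.py using the pandas-only `.loc` indexer on `report.df`. On a Spark DataFrame, `.loc` falls through to `__getattr__`, which treats it as a column name and raises. Below is a minimal sketch of that mechanism and of a backend-agnostic column selection. `FakeSparkFrame` and `select_columns` are hypothetical stand-ins written for illustration; they are not part of pyspark or ydata-profiling, and this is not a proposed patch, only an assumption about what a fix might look like.

```python
class FakeSparkFrame:
    """Hypothetical stand-in for pyspark.sql.DataFrame (illustration only)."""

    def __init__(self, columns):
        self.columns = list(columns)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        # Mirrors the check shown in the traceback: unknown names are
        # treated as column names, so "loc" ends up raising AttributeError.
        if name not in self.columns:
            raise AttributeError(
                "'%s' object has no attribute '%s'" % (self.__class__.__name__, name)
            )
        return f"Column<{name}>"

    def select(self, *cols):
        # Spark-style column selection; returns a new frame.
        return FakeSparkFrame(cols)


def select_columns(df, cols):
    """Select columns with .loc when available, else fall back to select()."""
    try:
        indexer = df.loc  # pandas-style frames expose this
    except AttributeError:
        return df.select(*cols)  # Spark-style frames do not
    return indexer[:, cols]


train = FakeSparkFrame(["age", "name", "label"])
subset = select_columns(train, ["age", "name"])
print(subset.columns)
```

The try/except is used here instead of an isinstance check so the sketch does not need pandas or pyspark installed; a real fix would more likely dispatch on the DataFrame type.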
Expected Behaviour
I expect to be able to compare reports generated from Spark DataFrames.
Data Description
Code that reproduces the bug
pandas-profiling version
4.6.4
Dependencies
OS
Mac OS
Checklist