Hi. I'm using the library to find association rules in a dataset. To do that, I'm passing the output of the three algorithms to the association_rules() function. The documentation says these are equivalent in terms of parameters and output, but I'm getting the following error only with the output from fpmax():
KeyError: 'frozenset({120})You are likely getting this error because the DataFrame is missing antecedent and/or consequent information. You can try using the `support_only=True` option'
A minimal code example of my implementation:
from mlxtend.frequent_patterns import association_rules
from mlxtend.frequent_patterns import fpgrowth
from mlxtend.frequent_patterns import fpmax
# Assume baskets_matrix is an ad hoc pandas DataFrame.
# Both mining calls work OK
freq_items_1 = fpgrowth(baskets_matrix, min_support=0.1)
freq_items_2 = fpmax(baskets_matrix, min_support=0.1)
# This also works OK
AR_1 = association_rules(freq_items_1, metric="confidence", min_threshold=0.5)
# This raises the error
AR_2 = association_rules(freq_items_2, metric="confidence", min_threshold=0.5)
Since all other factors are the same, I have to assume that there is a difference in the output of fpgrowth and fpmax which is not clearly documented.
I also noticed that the documentation refers to the association_rules() function as generate_rules(), which leads to further confusion.
Suggest a potential improvement or addition
I would like to ask whether the outputs of the different algorithms are indeed different, or whether there is another issue here.
Also, I think it would be useful for anyone using the library to have these remarks added to the documentation.
Thanks in advance!
As per the documentation "FP-Max is a variant of FP-Growth, which focuses on obtaining maximal itemsets. An itemset X is said to maximal if X is frequent and there exists no frequent super-pattern containing X. In other words, a frequent pattern X cannot be sub-pattern of larger frequent pattern to qualify for the definition maximal itemset."
That being said, I am getting the error too when using FP-Max.
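The quoted definition may already explain the KeyError: if only maximal itemsets are returned, the smaller itemsets that serve as rule antecedents and consequents are absent, so their supports cannot be looked up. The following is a minimal pure-Python sketch of that idea, not mlxtend's implementation; the toy transactions, the threshold, and all names are made up for illustration.

```python
from itertools import combinations

# Toy transactions (hypothetical data, not taken from this thread).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a", "b", "c"},
]
min_support = 0.5


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


items = sorted(set().union(*transactions))

# Brute-force enumeration of all frequent itemsets (fine for a toy example).
frequent = {
    frozenset(c): support(frozenset(c))
    for k in range(1, len(items) + 1)
    for c in combinations(items, k)
    if support(frozenset(c)) >= min_support
}

# Maximal itemsets: frequent itemsets with no frequent proper superset.
maximal = {
    s: sup
    for s, sup in frequent.items()
    if not any(s < other for other in frequent)
}

antecedent = frozenset({"a"})
print(antecedent in frequent)  # the singleton is frequent
print(antecedent in maximal)   # but it is not maximal, so its support is missing
```

With this data every subset of {a, b, c} is frequent, but only {a, b, c} itself is maximal. Computing the confidence of a rule such as a -> b, c needs support({a}), which is no longer available once the output is restricted to maximal itemsets.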
Same here. When mining frequent itemsets with fp-growth it works fine, but when using fp-max I get the same error. An example of my code:
# Assume negated is a one-hot encoded dataframe
max = fpmax(negated, min_support=0.3, use_colnames=True, max_len=5)
max
rules = association_rules(max, metric="confidence", min_threshold=0.85)  # Error appears here
# Works well
max = fpgrowth(negated, min_support=0.3, use_colnames=True, max_len=5)
max
rules = association_rules(max, metric="confidence", min_threshold=0.85)
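Besides using the fpgrowth output, which works as shown above, one possible workaround is to recount the support of each antecedent directly from the one-hot data and derive confidence from that. The sketch below is only an illustration under stated assumptions: it uses plain Python instead of mlxtend, and the rows, item names, and the helper `rules_from_itemset` are all hypothetical.

```python
from itertools import combinations

# Hypothetical one-hot rows: each dict maps an item name to True/False.
rows = [
    {"milk": True, "bread": True, "eggs": False},
    {"milk": True, "bread": True, "eggs": True},
    {"milk": True, "bread": False, "eggs": False},
    {"milk": False, "bread": True, "eggs": True},
]


def support(itemset):
    """Fraction of rows in which every item of `itemset` is present."""
    return sum(all(row[i] for i in itemset) for row in rows) / len(rows)


def rules_from_itemset(itemset, min_confidence):
    """Yield (antecedent, consequent, confidence) for one frequent itemset.

    Assumes the itemset occurs in at least one row, so every antecedent
    has non-zero support and the division below is safe.
    """
    itemset = frozenset(itemset)
    joint = support(itemset)
    for k in range(1, len(itemset)):
        for ante in combinations(sorted(itemset), k):
            ante = frozenset(ante)
            # confidence(A -> C) = support(A and C) / support(A)
            conf = joint / support(ante)
            if conf >= min_confidence:
                yield ante, itemset - ante, conf


rules = list(rules_from_itemset({"milk", "bread"}, min_confidence=0.6))
for ante, cons, conf in rules:
    print(sorted(ante), "->", sorted(cons), round(conf, 3))
```

The point of the sketch is that antecedent supports come from the raw data rather than from the itemset table, so it does not matter that the maximal-itemset output omits them.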