Intermittent hallucinations on candidate words #25

Open
jimthompson5802 opened this issue Nov 23, 2024 · 0 comments

In the run below, the recommendation LLM_RECOMMENDER: RECOMMENDED WORDS ['astute', 'clever', 'smart', 'wise'] contains "astute" and "clever", which are not part of the puzzle words.

python src/agent/app_embedvec.py 
Enter 'file' to read words from a file or 'image' to read words from an image: image
Please enter the image file location: data/connection_puzzle_2024_10_27.png
Puzzle Words: ['fresh', 'prince', 'bel', 'air', 'quality', 'bar', 'cute', 'mermaid', 'lux', 'wise', 'tramp', 'mood', 'feeling', 'smart', 'mole', 'rascals']

Generating vocabulary for the words...this may take about a minute

Generating embeddings for the definitions

ENTERED EMBEDVEC RECOMMENDATION
(101, 101)
(101, 101)
candidate_lists size: 78

EMBEDVEC_RECOMMENDER: RECOMMENDED WORDS ['bel', 'lux', 'prince', 'wise'] with connection Names and titles; the words are related to names or titles often associated with nobility or wisdom.
Is the recommendation accepted? (y/g/b/p/o/n): n
Recommendation ['bel', 'lux', 'prince', 'wise'] is incorrect
Changing the recommender from 'embedvec_recommender' to 'llm_recommender'
attempt_count: 1
words_remaining: ['rascals', 'mole', 'smart', 'feeling', 'mood', 'tramp', 'wise', 'lux', 'mermaid', 'cute', 'bar', 'quality', 'air', 'bel', 'prince', 'fresh']

LLM_RECOMMENDER: RECOMMENDED WORDS ['mermaid', 'mole', 'prince', 'tramp'] with connection Characters in stories
Is the recommendation accepted? (y/g/b/p/o/n): o
Recommendation ['mole', 'tramp', 'mermaid', 'prince'] is incorrect, one away from correct

>>>Number of single topic groups: 1

>>>One-away group recommendations:
Recommended Group: ['tramp', 'mermaid', 'prince', 'rascals']
Connection Description: The word 'rascals' is most connected to the 'anchor_words' because it can represent mischievous or adventurous characters often found in fairy tales and stories. In many tales, characters that are rascals add to the narrative by causing trouble or embarking on adventures, similar to how a tramp might wander or how unique characters like mermaids and princes play specific roles within stories. Thus, 'rascals' fits well within the context of fairy tales and folklore, which is the common connection among the anchor words.
using one_away_group_recommendation

LLM_RECOMMENDER: RECOMMENDED WORDS ['mermaid', 'prince', 'rascals', 'tramp'] with connection The word 'rascals' is most connected to the 'anchor_words' because it can represent mischievous or adventurous characters often found in fairy tales and stories. In many tales, characters that are rascals add to the narrative by causing trouble or embarking on adventures, similar to how a tramp might wander or how unique characters like mermaids and princes play specific roles within stories. Thus, 'rascals' fits well within the context of fairy tales and folklore, which is the common connection among the anchor words.
Is the recommendation accepted? (y/g/b/p/o/n): p
Recommendation ['tramp', 'mermaid', 'prince', 'rascals'] is correct
attempt_count: 1
words_remaining: ['fresh', 'bel', 'air', 'quality', 'bar', 'cute', 'lux', 'wise', 'mood', 'feeling', 'smart', 'mole']

LLM_RECOMMENDER: RECOMMENDED WORDS ['air', 'bar', 'fresh', 'quality'] with connection Words associated with 'air'
Is the recommendation accepted? (y/g/b/p/o/n): n
Recommendation ['fresh', 'air', 'quality', 'bar'] is incorrect
attempt_count: 1
words_remaining: ['mole', 'smart', 'feeling', 'mood', 'wise', 'lux', 'cute', 'bar', 'quality', 'air', 'bel', 'fresh']

LLM_RECOMMENDER: RECOMMENDED WORDS ['astute', 'clever', 'smart', 'wise'] with connection Intelligence-related adjectives
Is the recommendation accepted? (y/g/b/p/o/n): n
FAILED TO SOLVE THE CONNECTION PUZZLE TOO MANY MISTAKES!!!


FINAL PUZZLE STATE:
{   'found_count': 1,
    'found_purple': True,
    'invalid_connections': [   (   'a160fd0972eb055d4297ff2fda91b9d7',
                                   ['bel', 'lux', 'prince', 'wise']),
                               (   '81f8c0d767052e323af50a7d3a5d0e50',
                                   ['mole', 'tramp', 'mermaid', 'prince']),
                               (   'e5058f5c947906f66aae7f9222cf7916',
                                   ['fresh', 'air', 'quality', 'bar']),
                               (   '82c7fbf27a2f9beff2bb0f38dc7579f7',
                                   ['smart', 'wise', 'clever', 'astute'])],
    'llm_retry_count': 0,
    'llm_temperature': 0.7,
    'mistake_count': 4,
    'puzzle_recommender': 'llm_recommender',
    'puzzle_status': 'initialized',
    'puzzle_step': 'puzzle_completed',
    'recommendation_count': 5,
    'recommended_connection': '',
    'recommended_correct': False,
    'recommended_words': [],
    'tool_to_use': 'END',
    'vocabulary_df':         word  ...                                          embedding
0      fresh  ...  [-0.0008035926730372012, -0.013773292303085327...
1      fresh  ...  [0.04081568121910095, -0.011401723138988018, -...
2      fresh  ...  [-0.02652222104370594, -0.0336439311504364, -0...
3      fresh  ...  [0.025376707315444946, 0.008187086321413517, -...
4      fresh  ...  [0.04619362950325012, -0.05760391056537628, -0...
..       ...  ...                                                ...
96      mole  ...  [0.00973152369260788, 0.0035174943041056395, -...
97   rascals  ...  [0.059400442987680435, -0.024503977969288826, ...
98   rascals  ...  [0.027869636192917824, -0.028265511617064476, ...
99   rascals  ...  [0.0577474981546402, 0.0037206350825726986, -0...
100  rascals  ...  [0.031364284455776215, 0.008480755612254143, -...

[101 rows x 3 columns],
    'words_remaining': [   'mole',
                           'smart',
                           'feeling',
                           'mood',
                           'wise',
                           'lux',
                           'cute',
                           'bar',
                           'quality',
                           'air',
                           'bel',
                           'fresh'],
    'workflow_instructions': '**Instructions**\n'
                             '\n'
                             'use "setup_puzzle" tool to initialize the puzzle '
                             'if the "puzzle_status" is not initialized.\n'
                             '\n'
                             'if "puzzle_step" is "puzzle_completed" then use '
                             '"END" tool.\n'
                             '\n'
                             'Use the table to select the appropriate tool.\n'
                             '\n'
                             '|puzzle_recommender| puzzle_step | tool |\n'
                             '| --- | --- | --- |\n'
                             '|embedvec_recommender| next_recommendation | '
                             'get_embedvec_recommendation |\n'
                             '|embedvec_recommender| have_recommendation | '
                             'apply_recommendation |\n'
                             '|llm_recommender| next_recommendation | '
                             'get_recommendation |\n'
                             '|llm_recommender| have_recommendation | '
                             'apply_recommendation |\n'
                             '\n'
                             'If no tool is selected, use "ABORT" tool.\n'}
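
One possible guard, sketched here as an assumption rather than as the project's actual code: before a recommendation from the LLM is shown to the user, check that every recommended word is still in words_remaining. The helper name find_hallucinated_words is hypothetical; the word lists are copied from the log above. How a rejected recommendation should then be handled (for example, re-prompting and incrementing llm_retry_count) is left open, since the issue does not show that part of the workflow.

```python
from typing import Iterable, List


def find_hallucinated_words(
    recommended: Iterable[str], words_remaining: Iterable[str]
) -> List[str]:
    """Return the recommended words that are not in words_remaining.

    Hypothetical helper: comparison is case-insensitive and ignores
    surrounding whitespace, which is an assumption about how the
    puzzle words are normalized.
    """
    allowed = {w.strip().lower() for w in words_remaining}
    return [w for w in recommended if w.strip().lower() not in allowed]


# Values taken from the failing step in the log above.
words_remaining = ['mole', 'smart', 'feeling', 'mood', 'wise', 'lux',
                   'cute', 'bar', 'quality', 'air', 'bel', 'fresh']
recommended = ['astute', 'clever', 'smart', 'wise']

invalid = find_hallucinated_words(recommended, words_remaining)
if invalid:
    # A real workflow might ask the LLM again instead of counting a mistake.
    print(f"Rejecting recommendation, words not in puzzle: {invalid}")
```

With a check like this, the ['astute', 'clever', 'smart', 'wise'] recommendation would be caught before it is submitted as a guess, so it would not need to count toward mistake_count.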