Give a similarity function between questionnaires #41

Open
woodthom2 opened this issue May 31, 2024 · 1 comment
Labels: enhancement (New feature or request)

Comments

@woodthom2
Contributor

Description

Eve Cheng has done some experiments with the Word Mover's Distance (WMD) algorithm, which gives the distance between two sequences of sentence embeddings.

Can Harmony use this to say that the GAD-7 is e.g. 60% similar to the PHQ-9?

See Colab notebook:
https://github.com/harmonydata/experiments/blob/main/harmony_wmd_experiment.ipynb

We also have a demo of Harmony integrated with external data sources: https://harmonycataloguelookup.azurewebsites.net/

Source code is at: https://github.com/harmonydata/harmony_catalogue_lookup_dash

See mockup:
https://github.com/harmonydata/hackathon/blob/main/instrument_level.png

Rationale

The use case would be:

As a research psychologist, I've got one small study here and one small study there. Individually they don't give enough statistical power, but would they if pooled? So can we combine Study A and Study B to get enough statistical power for my research question?

Word Mover's Distance is a candidate, but it's not necessarily how we solve this. It might be too slow, for example.
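If exact Word Mover's Distance does turn out to be too slow, one cheaper option is the relaxed lower bound proposed alongside WMD, where each question simply sends all of its weight to its nearest neighbour in the other instrument. Below is a minimal sketch of that relaxation, not Harmony's implementation and not the exact WMD: the function names, the use of cosine distance as the cost, the uniform weight per question, and the toy vectors are all assumptions for illustration.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def relaxed_wmd(embeddings_a, embeddings_b):
    """Relaxed WMD lower bound between two lists of question embeddings.

    Assumes every question carries equal weight within its instrument.
    """
    def one_sided(source, target):
        # Each source question moves all of its weight to its closest target.
        nearest = [min(cosine_distance(s, t) for t in target) for s in source]
        return sum(nearest) / len(source)

    # Taking the larger of the two one-sided bounds gives the tighter bound.
    return max(one_sided(embeddings_a, embeddings_b),
               one_sided(embeddings_b, embeddings_a))


# Hypothetical toy embeddings; real ones would come from a sentence encoder.
instrument_a = [[1.0, 0.0], [0.0, 1.0]]
instrument_b = [[1.0, 0.0]]
distance = relaxed_wmd(instrument_a, instrument_b)
```

This costs one pass over all question pairs rather than solving a transport problem. Turning the resulting distance into a figure such as "60% similar" would still need an agreed convention, which this sketch does not settle.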

A simpler option may be to set a similarity threshold and report the number of questions in Instrument A that match a question in Instrument B at that threshold.
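The threshold idea could look roughly like the sketch below. It assumes each question is already represented by an embedding vector and that "matching" means cosine similarity at or above the threshold; the function name `count_matches`, the default threshold of 0.7, and the toy vectors are hypothetical and would need tuning against real instruments.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def count_matches(embeddings_a, embeddings_b, threshold=0.7):
    """Count questions in A whose best match in B reaches the threshold.

    Returns the count and the fraction of Instrument A that it represents.
    """
    matched = 0
    for a in embeddings_a:
        best = max(cosine_similarity(a, b) for b in embeddings_b)
        if best >= threshold:
            matched += 1
    return matched, matched / len(embeddings_a)


# Hypothetical toy embeddings for two short instruments.
instrument_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
instrument_b = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]]
matched, fraction = count_matches(instrument_a, instrument_b)
```

Note that the result is not symmetric: the fraction of Instrument A matched in B can differ from the fraction of B matched in A, so both directions may be worth reporting.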

@woodthom2 woodthom2 added the enhancement New feature or request label May 31, 2024
@woodthom2
Contributor Author

Message from @eoinmcelroy

They have done some work where they have catalogued the most common measures of employee wellbeing used in research involving companies. They have 100+ measures, and wanted to know if Harmony could be used to map the semantic overlap, to see if there are core dimensions of workplace wellbeing. 

I think this would be challenging at the item level, but I was wondering if it could be done at the scale level. This is something we had talked about before. How feasible would it be, and how much work, to add a scale-level analysis feature to Harmony? If that is not feasible, do you think Harmony could handle 100 or so measures? We could trim some out based on certain rules (e.g. not commonly used), but this would still leave us with around 30 scales.

@woodthom2 woodthom2 self-assigned this Sep 26, 2024