Give a similarity function between questionnaires #41

woodthom2 · 2024-05-31T16:07:19Z

Description

Eve Cheng has run some experiments with the Word Mover's Distance algorithm, which gives a distance between two sequences of sentence embeddings.

Can Harmony use this to say that, for example, the GAD-7 is 60% similar to the PHQ-9?

See Colab notebook:
https://github.com/harmonydata/experiments/blob/main/harmony_wmd_experiment.ipynb

We also have a demo of Harmony integrated with external data sources: https://harmonycataloguelookup.azurewebsites.net/

Source code is at: https://github.com/harmonydata/harmony_catalogue_lookup_dash

See mockup

https://github.com/harmonydata/hackathon/blob/main/instrument_level.png

Rationale

The use case would be:

As a research psychologist, I’ve got one small study here and one small study there. Individually they don’t give enough statistical power, but could they together? Can we combine Study A and Study B to get enough statistical power for my research question?

Word Mover's Distance is a candidate, but it's not necessarily how we solve this. It might be too slow, for example.
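On the speed concern: one cheaper option is the relaxed lower bound on Word Mover's Distance, where each question simply sends all of its weight to its nearest neighbour in the other instrument, so no optimal-transport problem has to be solved. The sketch below is not the full WMD and is not necessarily what the notebook does. It assumes uniform weights per question and Euclidean distance, and the function names (`relaxed_wmd`, `one_sided_cost`) are hypothetical, not part of Harmony.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_sided_cost(source, target):
    """Average distance from each source embedding to its nearest target embedding."""
    return sum(min(euclidean(u, v) for v in target) for u in source) / len(source)


def relaxed_wmd(embeddings_a, embeddings_b):
    """Relaxed WMD (hypothetical helper): the larger of the two one-sided costs.

    Assumes every question in an instrument carries equal weight.
    """
    if not embeddings_a or not embeddings_b:
        raise ValueError("both instruments need at least one question embedding")
    return max(
        one_sided_cost(embeddings_a, embeddings_b),
        one_sided_cost(embeddings_b, embeddings_a),
    )


# Toy 2-dimensional "embeddings"; real sentence embeddings would be much longer.
toy_a = [[0.0, 0.0], [1.0, 0.0]]
toy_b = [[0.0, 0.0], [3.0, 0.0]]
distance = relaxed_wmd(toy_a, toy_b)
```

This costs one distance computation per pair of questions, so it scales with the product of the two instrument lengths. Turning the distance into a percentage similarity would still need a normalisation that this sketch does not choose.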

Maybe a simple solution is just to set a similarity threshold and report the number of questions in Instrument A that match questions in Instrument B at that threshold.
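A minimal sketch of that threshold idea, assuming each question is already represented by an embedding vector and that "match" means cosine similarity at or above the threshold. The function name `count_matches_at_threshold` and the default threshold of 0.7 are illustrative assumptions, not existing Harmony settings.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def count_matches_at_threshold(embeddings_a, embeddings_b, threshold=0.7):
    """Count questions in A whose best match in B reaches the threshold.

    Hypothetical helper: the threshold value is an assumption to be tuned.
    """
    matched = 0
    for u in embeddings_a:
        best = max((cosine_similarity(u, v) for v in embeddings_b), default=0.0)
        if best >= threshold:
            matched += 1
    return matched


# Toy 2-dimensional "embeddings" standing in for real sentence embeddings.
instrument_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
instrument_b = [[0.9, 0.1], [-1.0, 0.0]]

n_matched = count_matches_at_threshold(instrument_a, instrument_b, threshold=0.7)
share_matched = n_matched / len(instrument_a)
```

Note that the count is not symmetric: the number of questions in A with a match in B can differ from the number in B with a match in A, so a reported percentage would need to say which instrument is the reference.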


woodthom2 · 2024-09-26T15:18:21Z

Message from @eoinmcelroy

They have done some work where they have catalogued the most common measures of employee wellbeing used in research involving companies. They have 100+ measures, and wanted to know if Harmony could be used to map the semantic overlap, to see if there are core dimensions of workplace wellbeing. 

I think this would be challenging at the item level, but I was wondering if this could be done at the scale level. This is something we had talked about before. How feasible would it be, and how much work, to add a scale-level analysis feature to Harmony? If not feasible, do you think Harmony could handle 100 or so measures? We could trim some out based on certain rules (e.g. not commonly used), but this would still leave us with around 30 scales.

woodthom2 added the enhancement New feature or request label May 31, 2024

woodthom2 self-assigned this Sep 26, 2024
