Understanding models results #17

sherifelsabbagh · 2023-12-07T12:33:56Z

Hi,

I have a question about the results of model building.

In the statistics file, I can see a line like this:

cdk8 t1_f7_p0 36 11 94 131 0.766 0.383 0.084 0.511 0.426 0.638 0.65 1.833 4 8.883 aaAAHH

I understand that f7 refers to 7 features, but I can see only six features in aaAAHH (2 aromatic, 2 acceptor and 2 hydrophobic). So where is the 7th feature?

Also, it says there are 4 unique features. Shouldn't it be 3 (A, a and H)?

Lastly, when I download the xyz file, how can I view it and relate it to the features above? When I open it in PyMOL I only see 3 spheres.

DrrDom · 2023-12-09T12:31:03Z

You are right, f7 is expected to mean that there are 7 features in the pharmacophore. This looks like a bug, but after a quick investigation I could not find the source of the error. I'll label this issue as a bug to fix in the future. It should not affect the output models.

Unique features are features with distinct coordinates. In your case I expect that the aromatic and hydrophobic features have the same coordinates, so each a/H pair is counted as a single feature. The two acceptors have different coordinates. Overall there are two acceptors and two a/H pairs at different coordinates, which gives 4 unique features. The name may be confusing. The reason for counting this way is to better discriminate the spatial complexity of pharmacophore models.
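As a rough illustration of this counting rule, the sketch below counts distinct coordinates for a six-feature model labelled aaAAHH. The coordinates are invented for the example (they are not taken from the cdk8 model), and `count_unique_positions` is a hypothetical helper, not a function from the tool.

```python
# Hypothetical features as (label, (x, y, z)); all coordinates are invented.
features = [
    ("a", (0.0, 0.0, 0.0)),
    ("H", (0.0, 0.0, 0.0)),  # shares its position with the first aromatic
    ("a", (5.1, 1.2, 0.3)),
    ("H", (5.1, 1.2, 0.3)),  # shares its position with the second aromatic
    ("A", (2.4, 3.3, 1.0)),
    ("A", (7.9, 0.5, 2.2)),
]


def count_unique_positions(feats, ndigits=3):
    """Count distinct coordinates, rounding to absorb floating-point noise."""
    return len({tuple(round(c, ndigits) for c in xyz) for _, xyz in feats})


print(len(features), count_unique_positions(features))  # 6 4
```

Six labelled features collapse to four positions because each a/H pair sits on one point.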

To see all features in PyMOL, you can force them to be shown as spheres. Alternatively, you can use a PyMOL script, see #15.
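To relate the spheres to feature labels outside PyMOL, one option is to read the file directly. The sketch below assumes the common XYZ layout (a count line, a comment line, then one `label x y z` row per feature); the layout of the exported file may differ. The sample text and the helper name `labels_by_position` are made up for illustration.

```python
from collections import defaultdict

# Invented sample in the common XYZ layout; not a real exported model.
SAMPLE_XYZ = """\
6
hypothetical model
a 0.000 0.000 0.000
H 0.000 0.000 0.000
a 5.100 1.200 0.300
H 5.100 1.200 0.300
A 2.400 3.300 1.000
A 7.900 0.500 2.200
"""


def labels_by_position(xyz_text):
    """Map each (x, y, z) position to the list of feature labels found there."""
    groups = defaultdict(list)
    # Skip the count line and the comment line.
    for row in xyz_text.strip().splitlines()[2:]:
        parts = row.split()
        if len(parts) < 4:
            continue  # ignore blank or malformed rows
        label, x, y, z = parts[0], parts[1], parts[2], parts[3]
        groups[(float(x), float(y), float(z))].append(label)
    return dict(groups)


for position, labels in labels_by_position(SAMPLE_XYZ).items():
    print(position, "".join(labels))
```

Positions that list more than one label are the ones where spheres overlap in the viewer.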

julianaamorim · 2024-01-21T23:11:14Z

Hi again,

I would like to understand what the criteria are for selecting the best model, since there is no alignment. Recall > precision > FPR, etc.?
Isn't screening a database more limited when different features (a and H) have the same coordinates in the same model? Or not?

Thanks......

DrrDom · 2024-01-22T14:29:10Z

> I would like to understand what the criteria are for selecting the best model since there is no alignment. Recall > precision > FPR, etc.?

If you are asking about selecting the final model to use for virtual screening, this is entirely your choice, as in any other case; alignment will not help with that. You may choose the model with the highest precision to retrieve actives with higher probability (a conservative strategy), or you may choose a model with higher recall to increase the chances of retrieving diverse hits.
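A minimal sketch of the two strategies on a toy table follows. Only the first row uses values from the statistics line quoted earlier in this thread; the other two models and their numbers are invented, and the dictionary keys are assumed names rather than the tool's column headers.

```python
# Toy model table; "example_b" and "example_c" are invented for illustration.
models = [
    {"model": "t1_f7_p0", "precision": 0.766, "recall": 0.383},
    {"model": "example_b", "precision": 0.520, "recall": 0.710},
    {"model": "example_c", "precision": 0.900, "recall": 0.150},
]

# Conservative strategy: hits are more likely to be true actives.
conservative = max(models, key=lambda m: m["precision"])

# Exploratory strategy: more of the known actives are retrieved.
exploratory = max(models, key=lambda m: m["recall"])

print(conservative["model"], exploratory["model"])  # example_c example_b
```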

If you are asking how models are selected internally on each iteration, there is a function strategy_extract_trainset in gen_pharm_models.py. It is also described in the paper. Different modeling strategies use different criteria.

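if clust_strategy == 2:
    # Rank models by recall, then F2, then F05 (best first).
    df = df.sort_values(by=['recall', 'F2', 'F05'], ascending=False).reset_index(drop=True)
    if df['F2'].iloc[0] == 1.0:
        # Top-ranked model has F2 == 1: keep only models with recall and F2 both equal to 1.
        df = df[(df['recall'] == 1.0) & (df['F2'] == 1.0)]
    elif df[df['F2'] >= 0.8].shape[0] <= 100:
        # At most 100 models have F2 >= 0.8: keep those that also have recall == 1.
        df = df[(df['recall'] == 1) & (df['F2'] >= 0.8)]
    else:
        # Otherwise use the F2 of the row at index 100 as the cutoff, still requiring recall == 1.
        df = df[(df['recall'] == 1) & (df['F2'] >= df['F2'].loc[100])]
elif clust_strategy == 1:
    # Rank models by recall, then F05, then F2 (best first).
    df = df.sort_values(by=['recall', 'F05', 'F2'], ascending=False).reset_index(drop=True)
    # Keep F05 >= 0.8 if at most 100 models pass; otherwise cut at the F05 of the row at index 100.
    df = df[df['F05'] >= 0.8] if df[df['F05'] >= 0.8].shape[0] <= 100 else df[df['F05'] >= df['F05'].loc[100]]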
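For readers unfamiliar with the F2 and F05 columns: assuming they are the standard F-beta scores with beta = 2 and beta = 0.5, they can be checked against the statistics line quoted at the start of this thread. Reading 36, 11 and 94 in that line as true positives, false positives and the number of actives is an inference from the numbers, not something documented here; `f_beta` is a hypothetical helper.

```python
def f_beta(tp, fp, fn, beta):
    """Standard F-beta score computed from confusion-matrix counts."""
    b2 = beta * beta
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)


# Counts as inferred from the quoted line (an assumption, see the note above).
tp, fp, actives = 36, 11, 94
fn = actives - tp  # actives that the model did not retrieve

print(round(f_beta(tp, fp, fn, 1.0), 3))  # 0.511
print(round(f_beta(tp, fp, fn, 2.0), 3))  # 0.426, weights recall more heavily
print(round(f_beta(tp, fp, fn, 0.5), 3))  # 0.638, weights precision more heavily
```

Under that reading the three values match the 0.511, 0.426 and 0.638 entries in the quoted line. Since F2 leans towards recall and F05 towards precision, this is consistent with the two strategies ranking on different scores.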

> Isn't the screening of a database more limited with the same coordinate for different features (a and H) in a same model? Or not

Yes, it is more limited: if the a and H features have the same coordinates, such a model can match only aromatic groups. An H feature alone also matches saturated carbocycles and alkyl groups.
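One way to picture this is as a set intersection: a point carrying both labels can only be matched by a group that satisfies both. The group names below are just the ones mentioned in this reply, not a complete feature definition.

```python
# Groups each feature type can match, limited to those named in this reply.
matches = {
    "a": {"aromatic ring"},
    "H": {"aromatic ring", "saturated carbocycle", "alkyl group"},
}

# A co-located a/H point must satisfy both feature types at once.
co_located = matches["a"] & matches["H"]

print(co_located)  # {'aromatic ring'}
```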

Hope this will help.

DrrDom added the bug label on Dec 9, 2023
