method for missing? #108

Open
pdimens opened this issue Oct 2, 2019 · 7 comments

Comments


pdimens commented Oct 2, 2019

Is there a simple(ish?) method to perform the correction but skip missing values, and output the corrected array with missing respecting their original indices (but not used in the calculations)?

Reading that back to myself, it doesn't feel like it's worded too clearly, so maybe an example:

julia> pvals = [0.001, 0.01, missing, 0.03, 0.5];

julia> adjust(pvals, Bonferroni())
5-element Array{Union{Missing,Float64},1}:
 0.004
 0.04
 missing
 0.12
 1.0
Author

pdimens commented Oct 2, 2019

Thinking about it some more, maybe something like this: use findall to get an array of the indices of the missing values, drop the missing values with skipmissing(array) |> collect, calculate the correction, and finally re-add missing to the output array at the original indices with insert!?

Author

pdimens commented Oct 2, 2019

Here is how I handled the situation in my own code. I don't know if it would merit adding to your package:

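        # make a copy without the missing values
        p_no_miss = collect(skipmissing(P_array))

        # get the indices of the original missing values (in ascending order)
        miss_idx = findall(ismissing, P_array)

        # do the correction, widening the element type so missing can be inserted
        correct = Vector{Union{Missing,Float64}}(adjust(p_no_miss, correctionmethod))

        # re-add missing at the original positions; inserting in ascending
        # order keeps the later indices aligned with the original array
        for i in miss_idx
            insert!(correct, i, missing)
        end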
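
For comparison, the same idea can be sketched without `insert!` by preallocating an output filled with `missing` and writing the corrected values back through a Boolean mask. This is only an illustration under stated assumptions: `adjust_skipmissing` is a hypothetical helper name, not part of the package, and `bonferroni` is a stand-in written here so the snippet runs without MultipleTesting.jl installed.

```julia
# Stand-in for adjust(pvals, Bonferroni()) so the sketch runs without the package.
bonferroni(p::AbstractVector{<:Real}) = min.(p .* length(p), 1.0)

# Hypothetical helper: apply `f` to the non-missing entries only and leave
# missing values at their original indices.
function adjust_skipmissing(f, pvals::AbstractVector)
    keep = .!ismissing.(pvals)                                    # mask of non-missing entries
    out = Vector{Union{Missing,Float64}}(missing, length(pvals))  # start with all missing
    out[keep] = f(Float64.(pvals[keep]))                          # correct only the observed p-values
    return out
end

pvals = [0.001, 0.01, missing, 0.03, 0.5]
adjusted = adjust_skipmissing(bonferroni, pvals)
```

With the package loaded, the call would presumably look like `adjust_skipmissing(p -> adjust(p, Bonferroni()), pvals)`. The mask-based version avoids shifting elements on every `insert!` and does not depend on the order in which the missing indices are visited.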

@juliangehring
Owner

Thanks for bringing up the handling of missing values. Your approach looks good to me; I am not sure whether a more elegant way of removing and reinserting missing values exists. It is definitely worth exploring whether missing values should be handled by the adjust methods themselves.

Owner

juliangehring commented Oct 6, 2019

@pdimens Just to understand your case a bit better: How did you generate the original p-values and why are some values missing?

Author

pdimens commented Oct 6, 2019

That's a pretty fair question. The p-values were generated with a chi-squared test. When performed on all the data, it works ok, but if the data is partitioned by group, some groups have a particular locus (genetics work) entirely missing, which I hadn't realized could happen.

The actual code is here: https://github.com/pdimens/PopGen.jl/blob/master/src/HardyWeinberg.jl if the specific implementation matters.

@juliangehring
Owner

Okay, thanks for the details - that is interesting to see.

Author

pdimens commented Jul 20, 2020

Having learned quite a bit since opening this issue, the PR submitted performs this a lot more elegantly than the code suggested above.
