
Provide locus, chromosome, and haplotype components of genetic, breeding, ... values #215

Open
gregorgorjanc opened this issue Dec 23, 2024 · 3 comments

Comments

@gregorgorjanc
Contributor

gregorgorjanc commented Dec 23, 2024

For pedagogical and some research reasons, it would be neat to show how the whole-genome genetic value is composed of locus-specific values, chromosome-specific values, or haplotype values.

This would clearly not be memory-efficient, but all the calculations are already done at the C++ level.

I am not fully clear on what the API would look like. Say, something like:

  • whole-genome value (default): gv(pop) returning a matrix of nInd * nTraits
  • chromosome-specific value: gv(pop, return="chr") returning an array of nInd * nTraits * nChr
  • locus-specific value: gv(pop, return="loc") returning an array of nInd * nTraits * nLoc
  • haplotype-specific value:
    • gv(pop, return="chr", returnHap=TRUE) returning an array of nInd * nTraits * nChr * ploidy
    • gv(pop, return="loc", returnHap=TRUE) returning an array of nInd * nTraits * nLoc * ploidy

So, the function signature would be gv(pop, return="whole-genome-or-similar", returnHap=FALSE).
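
To make the proposed return shapes concrete, here is a minimal base-R sketch for a single, purely additive trait. It does not use AlphaSimR at all: the toy haplotype array, the effect vector `a`, the chromosome map `chr`, and all object names are hypothetical, and there is no intercept or centering. It only illustrates that the locus, chromosome, and haplotype components add up to the whole-genome value.

```r
# Hypothetical toy data (not AlphaSimR objects): 4 diploid individuals,
# 6 loci spread over 2 chromosomes, 1 additive trait.
set.seed(1)
nInd <- 4
ploidy <- 2
nLoc <- 6
chr <- c(1, 1, 1, 2, 2, 2)   # assumed chromosome of each locus
a <- rnorm(nLoc)             # assumed additive effect of each locus

# Haplotypes as an nInd x nLoc x ploidy array of 0/1 alleles
hap <- array(rbinom(nInd * nLoc * ploidy, 1, 0.5),
             dim = c(nInd, nLoc, ploidy))

# Locus- and haplotype-specific values: nInd x nLoc x ploidy
locHap <- sweep(hap, 2, a, `*`)

# Locus-specific values (summed over haplotypes): nInd x nLoc
loc <- apply(locHap, c(1, 2), sum)

# Chromosome-specific values (summed over loci within chromosome): nInd x nChr
chrVal <- t(rowsum(t(loc), group = chr))

# Whole-genome value: uncentered and without an intercept
gvWhole <- rowSums(loc)

# The components are consistent with each other
stopifnot(isTRUE(all.equal(unname(rowSums(chrVal)), gvWhole)))
```

With several traits, each of these objects would gain an nTraits dimension, which is where the memory cost mentioned above comes from.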

Similarly for other values (bv, dd, aa). Obviously, not all of the above would make sense for every one of them.

Since everything is already calculated at C++ level, we would mostly have to think how to return these results to the R level ...

Thoughts?

@gaynorr
Owner

gaynorr commented Dec 24, 2024

Epistasis will be an issue. Centering of the values is also a little tricky. What do you do with the intercept?
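
These concerns can be written out as a sketch in generic notation. The symbols are assumptions made here for illustration, not AlphaSimR's internal parameterisation: $\mu$ is the intercept, $x_{il}$ the genotype coding of individual $i$ at locus $l$, $a_l$ an additive effect, $e_{ll'}$ a pairwise epistatic effect, and $L_c$ the set of loci on chromosome $c$.

```latex
g_i = \mu
    + \sum_{c} \underbrace{\sum_{l \in L_c} a_l \, x_{il}}_{g_{ic}}
    + \sum_{l < l'} e_{ll'} \, x_{il} \, x_{il'}
```

Only the additive part $g_{ic}$ splits cleanly by locus and chromosome. The intercept $\mu$ belongs to no particular locus, and an epistatic term whose loci $l$ and $l'$ sit on different chromosomes belongs to no single chromosome. Centering interacts with this: shifting the coding $x_{il}$ by a constant moves value between the locus terms and $\mu$ without changing $g_i$.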

Otherwise, genetic values are pretty straightforward to calculate in R code. The breeding values are fairly easy too with the allele substitution effect being exported.

I generally prefer to avoid putting too much functionality into existing functions. I find that a large share of users try to read and understand every argument of a function. For example, I often see code where every argument to a function is explicitly written out, even though the intent in AlphaSimR has been to set sensible defaults so that users do not need to worry about them in most cases. Thus, I think a design with a larger number of functions might be preferable to fewer functions with more arguments.

@gregorgorjanc
Contributor Author

I agree that gv and bv might be the best targets, though eventually someone will ask for dd, while aa is not straightforward/possible.

I would follow the same centering as elsewhere in the package for consistency.

I know what you mean about arguments. Pros and cons, as always. I prefer fewer functions, if possible, but also dislike too many arguments. I think the above proposal is not bad.

@gregorgorjanc
Contributor Author

Just to add that I always appreciate good defaults in AlphaSimR! I learned a lot from AlphaSimR ;)


2 participants