Skip to content

Latest commit

 

History

History
136 lines (108 loc) · 6.38 KB

TODO

File metadata and controls

136 lines (108 loc) · 6.38 KB

##-*- mode: org -*- Emacs org: use [Tab] “all the time”

NAMESPACE and “internal” functions:

Look at ./NAMESPACE and replace ALL the Imports() by ImportsFrom(.)

What do we export / what not? <–> file:man/pcalg-internal.Rd

Currently, there is ./man/pcalg-internal.Rd with man \alias{}es and only few functions in “Usage”. The concept of such “internal” functions really predates the use of NAMESPACEs and is now obsolete.

Many of these should be removed for the .Rdany of these should be *removed for the .Rd page *and from NAMESPACE. The others (few?) would be kept, but then also “well” documented, i.e., on a different *.Rd file.

Goal: Get rid of ./man/pcalg-internal.Rd entirely.

Remarks on specific functions and issues:

skeleton(): keep back compatibility update= c(“stable”,”original”) <<—MM

in code

but MM’s tests show NOT QUITE back compatibility

pag2mag() return adj.matrix instead of object -> named pag2magAM()

whereas dag2pag() does return more than just an adj.matrix. Change BEFORE release

pc(), probably fci() etc: loses variable labels and works with “1”, “2”,…???

–> file:/u/maechler/R/MM/Pkg-ex/pcalg/Borsboom_mail.R is an example where he explicitly calls an other package just to get sensible labeled plots.

fci() et al : sparse matrices [Matrix] ?

  • returns the adjacency matrix “@ amat” (among other things), a simple matrix with entries in {0,1,2,3}. It would be nice to allow sparse matrices here, e.g., by using the “Matrix” class from the Matrix package.

    This makes sense mostly if it’s realistic to have quite sparse and relatively large sets of variables.

gAlgo-class: consider using setMethod(., “gAlgo”)

instead of all methods (plot, summary) for both pcAlgo and fciAlgo

myCItest() in Vignette vignettes/pcalgDoc.Rnw

instead of lm() twice, use lm.fit once (multivariate regression). This will probably be much faster.

ida(): argument ‘all.dags’ is never used, i.e., never tested.

dsepTest(): gibt 0 / 1 zurück; wieso nicht FALSE/TRUE wie dsep()? A:”P-value”

I’ve introduced ‘max.chordal = 10’ to ‘backdoor()’

which was hidden in the code previously. Have you ever tried larger/smaller?

gSquareBin(), gSquareDis():

  • returns a P-value but not the test statistic. Should really return an object of (standard S3 class) “htest” (which contains P-value, test statistic, …). But they have been documented to do what they do, and so we keep them.

pc(, verbose=TRUE) for a “large” example with 18 vars: *much output;

and the 10 rules at the very end. Better: verbose in { 0,1,2 (, 3) } and verbose=1 should give much less than TRUE now

Parallelize option

Allow ‘tiers’ (as in Tetrad), and ‘background knowledge’ (about orientations etc).

?? Where is all the robustness gone that pcAlgo() had?

Maybe pcAlgo() should not be deprecated because it allows robust correlation methods? —> indepTest argument: Qn ? –> Martin


Style: am[which(am != 0)] is faster than am[am != 0]

for a sparse am of dim 50 x 50 ==> Using the more ugly notation is ok!

find.unsh.triple: remove arg ‘p’

allDags() bekommt noch keine Hilfeseite -> moechte bald eine grosse Aenderung der Funktion machen (inkl. Parameter)

Letztest Bsp in fci(): p-1 statt p im Argument, weil eine (latente) Var gelöscht wurde

Make a fast dsep function for testing (using a lot of precomputation; e.g. ancestors, parents that are not married, so that finding moral graph becomes trivial; we then only need to check separation; could also tsort all nodes and precompute one moral graph for every element in the tsorted node sequence (~p graphs); then, the test boils down to a mere separation test

Boost C++ library needed for Alain’s GIES

does need a correct ./configure, somewhat analogous to Martin’s Rmpfr.

Data: Mit Markus gesprochen (3.Sept.2010):

  • Die data/test_*.rda Daten welche nur in tests/ gebraucht werden, sollen nicht “exponiert” werden.
  • Die andern sind momentan “falsch” dokumentiert, sowohl formal (-> R CMD check), als auch inhaltlich {die Namen strings sind “dokumentiert”; sonst kein Inhalt}.
  • Auch hat es z.T. mehrere Objekte pro *.rda die inhaltlich zusammengehören; MM denkt, die sollten wohl zusammen in eine Liste (mit kurzem Namen!).

Wo gebraucht:

pcalg$ grep-r binaryData ./man/pc.Rd:data(binaryData) ./man/skeleton.Rd:data(binaryData) pcalg$ grep-r discreteData ./inst/doc/pcalgDoc.Snw:data set \code{discreteData} (which consists of $p=5$ discrete ./inst/doc/pcalgDoc.Snw:data(discreteData) ./man/pc.Rd:data(discreteData) ./man/skeleton.Rd:data(discreteData) pcalg$ grep-r gaussianData ./inst/doc/pcalgDoc.Snw: data(gaussianData) ./inst/doc/pcalgDoc.Snw:\code{gaussianData} (which consists of $p=8$ variables and $n=5000$ ./inst/doc/pcalgDoc.Snw:data(gaussianData) ./inst/doc/pcalgDoc.Snw:data(gaussianData) ./inst/doc/pcalgDoc.Snw:data(gaussianData) ./inst/doc/pcalgDoc.Snw:data(gaussianData) ## contains data matrix datG pcalg$ grep-r idaData ./inst/doc/pcalgDoc.Snw:data(idaData) pcalg$ grep-r latentVarGraph ./inst/doc/pcalgDoc.Snw:data(latentVarGraph)

## Here, you get their contents:

for(f in list.files(“~/R/Pkgs/pcalg/data/”, full.names=TRUE)) { bf <- basename(f) nam <- sub(“\.rda$”, ”, bf) cat(“\n”, bf, “:\n-----------------\n”, sep=”“) attach(f) print(ls.str(pos=2)) detach() }

DESCRIPTION:

Depends: too many; I do not believe they are all “dependently” needed.

Remove more ‘Depends:’: vcd, RBGL.

  • vcd: I’ve eliminated it, and all the ‘R CMD check’ still run
  • RBGL: .. ok, only a few needed
  • graph: import quite a few; now found all examples – is this (not attaching ‘graph’) acceptable to the current pcalg users ?

: ‘abind’ and ‘corpcor’, as all packages now DO have a namespace!

  • abind: abind() needed in gSquareBin() & *Dis() – once only, each
  • corpcor: pseudoinverse() [ -> fast.svd() .. non-optimal “A %% diag(.)” ]

Change convention: If something strange happens with a test, KEEP edge

Make up some clever way to deal with NAs in the continuous case (phps also in other cases) and prepare test functions for users