Support at least one binary open standard out of the box #315
Comments
To understand this correctly: this is about moving
rang::resolve("readODS")
#> resolved: 1 package(s). Unresolved package(s): 0
#> $`cran::readODS`
#> The latest version of `readODS` [cran] at 2023-08-14 was 2.0.0, which has 30 unique dependencies (18 with no dependencies.)
rang::resolve("arrow")
#> resolved: 1 package(s). Unresolved package(s): 0
#> $`cran::arrow`
#> The latest version of `arrow` [cran] at 2023-08-14 was 12.0.1.1, which has 14 unique dependencies (9 with no dependencies.)
Created on 2023-09-13 with reprex v2.0.2 |
@schochastics Yes, it is about moving either There was a time when all packages were in |
I think this is a better comparison: the additional packages that need to be installed by introducing it to Imports.
original_deps <- c("tools", "stats", "utils", "foreign", "haven", "curl", "data.table", "readxl", "tibble", "stringi", "writexl", "lifecycle", "R.utils")
ori <- rang::resolve(original_deps, snapshot_date = Sys.Date())
#> Warning: Some package(s) can't be resolved: cran::tools, cran::stats,
#> cran::utils
nrow(rang:::.generate_installation_order(ori))
#> [1] 38
arrow <- rang::resolve(c(original_deps, "arrow"), snapshot_date = Sys.Date())
#> Warning: Some package(s) can't be resolved: cran::tools, cran::stats,
#> cran::utils
nrow(rang:::.generate_installation_order(arrow))
#> [1] 41
readODS <- rang::resolve(c(original_deps, "readODS"), snapshot_date = Sys.Date())
#> Warning: Some package(s) can't be resolved: cran::tools, cran::stats,
#> cran::utils
nrow(rang:::.generate_installation_order(readODS))
#> [1] 40
Created on 2023-09-13 with reprex v2.0.2 |
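The comparison above counts how many extra packages end up in the installation order when a candidate is added to Imports. As a rough base-R illustration of the same idea, the sketch below computes the difference between two dependency closures. The dependency map and all package names in it (`pkgA`, `candidate`, and so on) are hypothetical placeholders, not real CRAN metadata, and `install_closure()` is an illustrative helper, not part of `rang`.

```r
# Hypothetical dependency map: each name is a package, each value is the
# vector of its direct hard dependencies. Placeholder names only.
deps <- list(
  pkgA      = c("pkgB", "pkgC"),
  pkgB      = "pkgD",
  pkgC      = character(0),
  pkgD      = character(0),
  candidate = c("pkgC", "pkgE"),
  pkgE      = "pkgF",
  pkgF      = character(0)
)

# Collect every package reachable from `roots`, including the roots themselves.
install_closure <- function(roots, deps) {
  seen <- character(0)
  queue <- unique(roots)
  while (length(queue) > 0) {
    pkg <- queue[1]
    queue <- queue[-1]
    if (pkg %in% seen) next
    seen <- c(seen, pkg)
    # Unknown packages yield NULL here and simply contribute nothing.
    queue <- c(queue, setdiff(deps[[pkg]], seen))
  }
  sort(seen)
}

baseline       <- install_closure("pkgA", deps)
with_candidate <- install_closure(c("pkgA", "candidate"), deps)

# Packages that would be newly installed because of the candidate.
added <- setdiff(with_candidate, baseline)
```

Here `pkgC` is shared between the baseline and the candidate, so it does not count as an added package; only the candidate itself and the dependencies unique to it do. With real data, the map would have to be built from CRAN metadata at a fixed snapshot date, as the comparison above does.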
The disadvantage is no desktop software support.
|
Hmm, tough decision, but I think my vote is on arrow, given its importance for data science. |
Let's go with |
arrow
to Imports
TODOs
|
|
Regarding #315 (comment), another disadvantage of I noticed |
@wlandau Thank you for the input. Unfortunately, your input came at a very bad time, where Of course, I don't want you to have a bad experience using I believe in Agile and we can make mistakes too. Because you represent the user community to give us feedback, we will listen to it. I am willing to remove In these few days, please, if you really don't want |
arrow
to Imports
@wlandau I just wanted to let you know that |
@chainsawriot, thank you very much for accommodating. This change really helps my team develop our infrastructure and tools. |
This decision has kind of sat wrong with me for a long time. I get the concern about If That said, I think depending on So another option might be to work upstream on reverse dependencies that don't seem appropriate to remove their reliance on |
In fact, as of this writing,
> tools::package_dependencies(packages = "rio", reverse = TRUE)
$rio
[1] "allMT" "boxr" "bruceR"
[4] "childfree" "cloudstoR" "datamods"
[7] "dataquieR" "DistPlotter" "dpmr"
[10] "editData" "epiCleanr" "estadistica"
[13] "ExPanDaR" "framecleaner" "genogeographer"
[16] "gesisdata" "heterogen" "IGoRRR"
[19] "importinegi" "ISRaD" "kibior"
[22] "metaConvert" "mmstat4" "NormalityAssessment"
[25] "normfluodbf" "octopus" "pewdata"
[28] "PRISMA2020" "psData" "ropercenter"
[31] "tfrmtbuilder" "varsExplore" "welo" |
@jsonbecker Thank you for the feedback. I agree with you that the package was meant to be an easy, unified wrapper for interactive usage. But as things naturally evolved, we also need to adapt to the (new) reality that R package developers also use With this reality, it increases the complexity for adjusting the supported formats in the "Default" and "Suggest" tier. Increasing the default formats is nice (like @schochastics and I did for rio v1.0.0 to support Having said that, please keep this discourse going. Maybe we can find a good solution to this. Footnotes
|
Just a slight update: To understand the packages that use
tools::package_dependencies(packages = "rio", reverse = TRUE, recursive = TRUE)
#> $rio
#> [1] "allMT" "boxr" "bruceR"
#> [4] "childfree" "cloudstoR" "datamods"
#> [7] "dataquieR" "DistPlotter" "dpmr"
#> [10] "editData" "epiCleanr" "estadistica"
#> [13] "ExPanDaR" "framecleaner" "genogeographer"
#> [16] "gesisdata" "heterogen" "IGoRRR"
#> [19] "importinegi" "ISRaD" "kibior"
#> [22] "metaConvert" "mmstat4" "NormalityAssessment"
#> [25] "normfluodbf" "octopus" "pewdata"
#> [28] "PRISMA2020" "psData" "ropercenter"
#> [31] "tfrmtbuilder" "varsExplore" "welo"
#> [34] "ChineseNames" "PsychWordVec" "TestAnaAPP"
#> [37] "esquisse" "moreparty" "safetyGraphics"
#> [40] "vvdoctor" "ggplotAssist" "rrtable"
#> [43] "SemNetCleaner" "presenter" "tidybins"
#> [46] "validata" "shinyrecipes" "FMAT"
#> [49] "webr" "scicomptools"
Created on 2024-05-14 with reprex v2.1.0 |
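The recursive listing above has 50 entries, against 33 in the earlier non-recursive listing, so some packages are affected only indirectly, through another package that depends on `rio`. The following is a minimal base-R sketch of how direct and indirect reverse dependencies can be told apart. The dependency map and its package names are hypothetical placeholders, and `direct_revdeps()` and `recursive_revdeps()` are illustrative helpers, not functions from `tools`.

```r
# Hypothetical map of package -> direct hard dependencies. Placeholder names.
deps <- list(
  appA  = "core",
  appB  = "core",
  uiC   = "appA",
  uiD   = c("uiC", "other"),
  core  = character(0),
  other = character(0)
)

# Packages that list `target` among their direct dependencies.
direct_revdeps <- function(target, deps) {
  names(deps)[vapply(deps, function(d) target %in% d, logical(1))]
}

# Direct reverse dependencies plus everything that depends on them in turn.
recursive_revdeps <- function(target, deps) {
  found <- character(0)
  frontier <- direct_revdeps(target, deps)
  while (length(frontier) > 0) {
    found <- union(found, frontier)
    nxt <- unlist(lapply(frontier, direct_revdeps, deps = deps))
    frontier <- setdiff(nxt, found)
  }
  found
}

direct   <- direct_revdeps("core", deps)
all_rev  <- recursive_revdeps("core", deps)

# Packages affected only through an intermediate package.
indirect <- setdiff(all_rev, direct)
```

In this toy map, `uiC` and `uiD` never mention `core` themselves, yet a heavier `core` still lengthens their installation, which is the situation the recursive listing above makes visible.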
Keep an eye on this |
According to cransay, |
https://cran.r-project.org/web//packages/nanoparquet/index.html Min R version is 4.0.0. |
@wlandau I am thinking about adding back parquet support. But this time, I would like to try it with I don't want to repeat the same thing like v1.0.0, i.e. you only noticed the added Thank you very much! |
Maybe I should reach out to the datamods team, e.g. @pvictor . |
I'm not sure I understand the problem, is it because datamods (and esquisse) depends on rio? |
@pvictor Now, with the release of
Thank you very much! Footnotes
|
@wlandau @pvictor I just wanted to let you know that I have produced a branch that adds back the default support for parquet using It would be super nice if you could give it a test and see if it has any impact on your use cases. From my testing, it increases the compile time by 21 seconds on a blank-slate Rocker container. I was wondering if this level of increase in compile time is acceptable. At least it is not "several minutes" as mentioned here. I will consider your comments / evaluations before merging it to |
It's been a while since I looked at this thread, and the stuff I maintain no longer strongly depends on |
Great work @chainsawriot, it's great that {rio} supports parquet files! |
|
@chainsawriot To be explicit, of course I do not expect you to fix anything for big-endian in P. S. Otherwise any solution from my side will be ugly: either I need to peg |
@barracuda156 Thank you for chipping in. I actually don't mind moving I will give you an update. I really hope that I can finish it before the CRAN summer break. |
@barracuda156 I rolled back for now and it should be on CRAN soon. But I really hope that you can help @gaborcsardi to make |
@chainsawriot Thank you, update merged in macports/macports-ports@97b00a2 |
Apart from the plain-text formats, all binary formats supported by this package out of the box are proprietary (Excel, SAS, Stata, SPSS), provided by openxlsx, haven, and readxl. These formats are popular and I agree that they should remain the default. However, my proposal is to support at least one open binary format as well, which would make it 3 vs 1. I believe that is fairer. It also allows one to convert proprietary formats to a fast but open binary format out of the box.
From our list, there are Apache Parquet, feather, fst, and OASIS ODS. I think Parquet is the ideal candidate because it is fast and popular. One drawback is that desktop applications for opening Parquet files are not ubiquitous. ODS, on the other hand, is much slower but has the edge that Excel, LibreOffice, and Google Sheets all support it.
Disclosures of Possible Conflicts of Interest: I am also the maintainer of readODS.