what to do to update to the latest dataverse 4.6.2 #8
Comments
Dataverse 4.7 is out: https://github.com/IQSS/dataverse/releases/tag/v4.7 I'm one of the developers. How can I help? |
Also, the Dataverse team could use some help getting up to speed with Docker. Please see IQSS/dataverse#3938 about how we'd like to start by attempting to use Docker in development environments. Production use would potentially follow. |
@pdurbin we want to modify the docker file in this repo to deploy from the latest version (4.7) |
@pdurbin @moayadnajd Sorry, didn't see this issue until it was called out in the Dataverse repo. I'll see what needs to be done to upgrade to 4.6.x and 4.7. @moayadnajd In IQSS/dataverse#3938 you mentioned that you need to use handles. Could you let me know what you need and what you tried? |
@craig-willis we are working on a platform that sends datasets through the API, and that works well. However, we have a handle.net account, so we need to disable DOI and use the Handle support in 4.6.X or 4.7.
I am not fully familiar with Dataverse, so I don't know how to debug that process. Thanks for your help. |
@moayadnajd can you please make a pull request with your change to that line? You'll need to change the "unzip" line below it as well. |
@pdurbin yes, I did change the unzip line. As I said, the process is done, and I will make a pull request. |
@moayadnajd One of the bigger changes from the usual Dataverse deploy process is that we pre-generate the database DDL. This needs to be done for each version. I don't know if this is still the case, but in previous versions Dataverse uses EclipseLink to create the database schema during initial startup. @pdurbin You don't happen to have the generated DDL/schema handy for 4.6.x and 4.7? I usually generate it by changing the ddl-generation.output-mode to script during startup. |
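For readers following along, here is a minimal sketch of the kind of persistence.xml change described above. It assumes the EclipseLink property `eclipselink.ddl-generation.output-mode` with the value `sql-script` (the comment above only says "script"), and it works on a tiny stand-in document rather than the real Dataverse persistence.xml, which is namespaced and has more content. The helper name `set_property` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Minimal stand-in for a persistence.xml; the real Dataverse file has more content.
SAMPLE_PERSISTENCE_XML = """<persistence>
  <persistence-unit name="example-unit">
    <properties>
      <property name="eclipselink.ddl-generation" value="create-tables"/>
    </properties>
  </persistence-unit>
</persistence>"""


def set_property(root, name, value):
    """Set (or add) a <property name=... value=...> in every <properties> element."""
    for props in root.iter("properties"):
        for prop in props.findall("property"):
            if prop.get("name") == name:
                prop.set("value", value)
                break
        else:
            # No existing property with this name, so append a new one.
            ET.SubElement(props, "property", {"name": name, "value": value})


root = ET.fromstring(SAMPLE_PERSISTENCE_XML)
# Assumed value: ask EclipseLink to write the DDL to a script instead of the database.
set_property(root, "eclipselink.ddl-generation.output-mode", "sql-script")
updated_xml = ET.tostring(root, encoding="unicode")
```

Calling `set_property(root, "eclipselink.ddl-generation", "none")` in the same way would switch off table creation once a static schema is in place.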
@craig-willis how can I change the DDL? Is it easy to do? |
@craig-willis I saw this link: https://datasets.socialhistory.org/dataset.xhtml?persistentId=hdl:10622/HPIC74 . They are using Handle instead of DOI in v4.3. |
@moayadnajd I need to document the DDL generation process, it's not very straightforward. I also expect other things have changed in the install process between 4.2 and 4.7. I'm not sure about handle support in 4.3. The documentation seems to suggest that handle support is incomplete in 4.3 (http://guides.dataverse.org/en/4.3/installation/config.html). |
@craig-willis oh, so we will wait until the Docker image is updated. When do you expect it to be ready for the latest version? |
@moayadnajd I'll try to have something in the next day or two. @pdurbin Disregard my previous question, I've generated the DDL. |
@moayadnajd I've pushed two images to Dockerhub from my personal fork, one for Dataverse 4.7 and one for Solr (it looks like the schema.xml changed since we last built our images). craigwillis/dataverse:4.7 You can see the changes here: This hasn't been fully tested and I've done nothing specific to change the configuration for handles. Any feedback welcome. |
@craig-willis thanks! @moayadnajd @omaralsoudanii the Handle config is documented at http://guides.dataverse.org/en/4.7/installation/config.html#persistent-identifiers-and-publishing-datasets but if you have any trouble with Handle support, please open an issue at https://github.com/IQSS/dataverse/issues . Thanks. I was wondering what DDL meant (talking about it with @pameyer and @bjonnh at http://irclog.iq.harvard.edu/dataverse/2017-07-06#i_54184 ) but from master...craig-willis:upgrade-4.7 it's now obvious to me that it comes from |
Thanks, @pdurbin. By DDL I was referring to data definition language, aka SQL schema. During my initial Dockerization of Dataverse, I had problems with EclipseLink as it is configured. In the Docker environment, containers can be easily brought up and down. The current persistence.xml defaults to “create-tables” when the webapp is deployed. If a container is restarted, resulting in redeployment of the webapp, startup fails during table creation. I worked around this with the static DDL/schema which is used to initialize the database during container startup if it's never been run before, and setting eclipselink.ddl-generation to "none" in persistence.xml. There may be a better way to do this with EclipseLink, but I never found it. |
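The "initialize only if it's never been run before" workaround can be sketched roughly as below. This is an illustration under stated assumptions, not the script used in the image: `sqlite3` stands in for PostgreSQL so the example runs anywhere, and `STATIC_DDL`, `MARKER_TABLE`, `schema_exists`, and `init_schema_once` are hypothetical names.

```python
import sqlite3

# Hypothetical one-table schema standing in for the pre-generated Dataverse DDL.
STATIC_DDL = "CREATE TABLE dataverse (id INTEGER PRIMARY KEY, alias TEXT);"
MARKER_TABLE = "dataverse"  # assumed: any table the DDL is known to create


def schema_exists(conn):
    """Return True if the marker table is already present."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (MARKER_TABLE,),
    ).fetchone()
    return row is not None


def init_schema_once(conn):
    """Apply the static DDL only on first startup; return True if it was applied."""
    if schema_exists(conn):
        return False
    conn.executescript(STATIC_DDL)
    return True


conn = sqlite3.connect(":memory:")
first_start = init_schema_once(conn)  # first container start: schema is created
restart = init_schema_once(conn)      # restart: nothing to do, and no failure
```

Against PostgreSQL the existence check would query `information_schema.tables` instead of `sqlite_master`; the control flow stays the same.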
@craig-willis thank you for your help. We will start testing and give you feedback soon. |
@pdurbin @craig-willis the build is working now, but the handle registration fails. Clicking on the handle link in the dataset leads to this page on handle.net:
The handle you requested -- 20.500.11766/SHEQW3 -- cannot be found. Please contact us if you wish to report this error. Please include information regarding where you found the handle
We added the private key file in the JVM options through dataverse.handlenet.admcredfile and used curl to set the database options mentioned in the documentation for handle generation:
curl -X PUT -d doi http://localhost:8080/api/admin/settings/:Protocol
curl -X PUT -d 10.xxxx http://localhost:8080/api/admin/settings/:Authority
Is there any way to debug the handle errors or check what the problem is? |
@omaralsoudanii I'm not familiar with the handle configuration. With our Docker image, the log file should be in /usr/local/glassfish4/glassfish/domains/domain1/logs in the Dataverse container. |
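One way to narrow down the log output is to filter it for lines that mention handles. The sketch below is only an illustration: the directory is the one quoted above, the file name `server.log` is an assumption, and the sample lines are made up purely to show the filter (they are not real Dataverse output).

```python
from pathlib import Path

# Directory quoted in the thread; the file name server.log is an assumption.
LOG_DIR = Path("/usr/local/glassfish4/glassfish/domains/domain1/logs")


def matching_lines(lines, keyword="handle"):
    """Return the lines that mention the keyword, ignoring case."""
    needle = keyword.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


def scan_log(path, keyword="handle"):
    """Read a log file and return the lines that mention the keyword."""
    with open(path, encoding="utf-8", errors="replace") as log_file:
        return matching_lines(log_file, keyword)


# Made-up sample lines, only to show the filter.
sample = [
    "first line about startup\n",
    "second line mentions Handle registration\n",
]
hits = matching_lines(sample)
```

Inside the Dataverse container, `scan_log(LOG_DIR / "server.log")` would apply the same filter to the actual log, assuming that file exists there.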
@omaralsoudanii for Handle you should be using "hdl" rather than "doi" like this:
For more on this topic, please see http://guides.dataverse.org/en/4.7/installation/config.html#configuring-dataverse-for-handles @omaralsoudanii for help with Dataverse, please contact us at http://guides.dataverse.org/en/4.7/installation/intro.html#getting-help |
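As a rough sketch of the two settings calls with "hdl" in place of "doi", the snippet below builds (but does not send) PUT requests shaped like the curl commands quoted earlier in the thread. The helper names are hypothetical, and using the handle prefix as the `:Authority` value is an assumption; the prefix shown is taken from the handle quoted above and should be replaced with your own.

```python
import urllib.request

# Host, port, and path as in the curl commands quoted in the thread.
BASE_URL = "http://localhost:8080/api/admin/settings"


def build_setting_request(name, value, base_url=BASE_URL):
    """Build (but do not send) a PUT request, roughly `curl -X PUT -d VALUE URL`."""
    return urllib.request.Request(
        f"{base_url}/{name}",
        data=value.encode("utf-8"),
        method="PUT",
    )


def handle_settings(authority):
    """Return the two requests that switch the protocol to Handle."""
    return [
        build_setting_request(":Protocol", "hdl"),
        build_setting_request(":Authority", authority),
    ]


# 20.500.11766 is the prefix of the handle quoted earlier; substitute your own.
requests_to_send = handle_settings("20.500.11766")
# To apply them against a running instance:
# for req in requests_to_send:
#     urllib.request.urlopen(req)
```

Building the requests separately from sending them makes it easy to print and double-check the URL and value before touching a live installation.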
@pdurbin sorry, I just copied them from the documentation. I know the value should be hdl. These are the real values for my curl:
You can find the handle in use on our DSpace production server at https://mel.cgiar.org/repo . |
@omaralsoudanii ah, ok. Hmm. I don't know a lot about Handle support and it's pretty new, included in Dataverse 4.6.2 and higher. Can you please try one or more of the channels listed at http://guides.dataverse.org/en/4.7/installation/intro.html#getting-help ? Thanks! |
@pdurbin thank you, I posted there. The logs mention this error: |
@omaralsoudanii thanks, I replied to you at http://irclog.iq.harvard.edu/dataverse/2017-07-11 |
@craig-willis hi! I just left a comment at IQSS/dataverse#4040 (comment) about how I'm trying to use your images on DockerHub in an OpenShift environment. Do you have any time to jump in #dataverse on freenode, or wherever you like, to talk about what I'm up to? 😄 |
I had a nice chat today with @bodom0015 about this repo at https://gitter.im/nds-org/ndslabs?at=59bff3d9cfeed2eb65247bfb . I got the NDS Dataverse 4.2.3 Docker image deployed to OpenShift and I'm working through problems. I'm posting status updates at IQSS/dataverse#4040 |
@craig-willis @moayadnajd @omaralsoudanii heads up that I'm planning on chatting with @aculich and others about Docker and Kubernetes in a couple hours (10:30am eastern) if you're interested in joining the discussion at #dataverse on freenode. It'll be logged at http://irclog.iq.harvard.edu/dataverse/2018-03-09 and for more background, you can read http://irclog.iq.harvard.edu/dataverse/2018-03-02#i_63995 . Thanks. |
Thanks @pdurbin. I wish I could join but will check out the chat logs. |
@craig-willis no worries. I'm making an attempt to list in one place all the Docker and Kubernetes-related stuff going on so I hope you don't mind that I listed your GitHub username over on my "Dev Efforts by the Dataverse Community" spreadsheet at https://docs.google.com/spreadsheets/d/1pl9U0_CtWQ3oz6ZllvSHeyB0EG1M_vZEC_aZ7hREnhE/edit?usp=sharing . For more context on what that spreadsheet is about, please see my "Which GitHub issues are being worked on by the Dataverse community?" post at https://groups.google.com/d/msg/dataverse-community/X2diSWYll0w/ikp1TGcfBgAJ . At the moment, I'm trying to characterize your status as wanting to move to official IQSS Docker images once they have been blessed by IQSS. This is the "status" column of the spreadsheet above. Thanks! |
Thanks, @pdurbin. That sounds like a good characterization. Read through notes of your discussion with @aculich -- thank you both for sharing. @Xarthisius @amoeba @mbjones -- you might also be interested.
The idea of integrating repository systems like Dataverse with containerized analysis environments is something we've discussed extensively in the NDS meetings and is actively part of the Whole Tale project. On the one hand is supporting the ability for researchers to explore/analyze/collaborate around data in research repositories. On the other is defining a way to publish a new class of research object -- data + code/notebook + image or image definition -- that would complement a data or paper publication, and support users re-running these (e.g., via link from repository or publisher site).
The Whole Tale project (http://wholetale.org/) defines a "Tale" as shareable/preservable research objects that combine data + code/narrative (e.g., notebook) with the computational environment for reproducibility (e.g., Docker image definition). Initial collaboration is with DataOne and Globus to define a package format that would allow the tale to be published and then run on-demand by users via the Whole Tale or similar platform. Serialization format discussion is going on here: whole-tale/whole-tale#24. Whole Tale is also developing a framework to pull data from external sources (e.g. via DOI/URL) for users to actively work on -- a sort of BinderHub with data.
If anyone is interested in continuing this discussion (I certainly am), maybe we can find a good place? We started a forum coming out of a related workshop this summer https://groups.google.com/forum/#!forum/container-analysis-environment, which hits others interested in this topic (SciServer, Cyverse, etc), but has almost no activity and misses the repository community. |
@craig-willis I'm glad you got something out of those notes. 😄 My impulse is to say that this discussion should happen anywhere but in some random GitHub issue like this, but at least it's public so people can read it. More on this later. I believe I first heard about the Whole Tale project when @victoriastodden gave a keynote address at the 2017 Dataverse Community Meeting entitled "Toward a Reproducible Scholarly Record". See slide 20 and beyond at https://osf.io/5euj9/ via https://projects.iq.harvard.edu/dcm2017/agenda#widget-2 . It sounds neat. I'm not sure how to get computation people talking to repository people. I don't know if this is reflected in my notes but @aculich chatted about this problem. Some thoughts on various channels:
Of course, I'm not sure how much time I have to invest in this conversation personally. I'm happy to join a mailing list or whatever. Maybe you can come to a future Dataverse Community Meeting. 😄 Or maybe you and I could have a phone call some day to compare notes. By the way, I did take a quick look at BinderHub yesterday. I put some first impressions into jupyterhub/mybinder.org-user-guide#80 |
@craig-willis would you be interested in joining the discussion at https://groups.google.com/d/msg/dataverse-community/VG6gTMEd_Ps/Xy7jDhVoBwAJ ? It was kicked off last Friday by @sean-dooher, Kevin Yang, and @aculich. |
@pdurbin @craig-willis @moayadnajd
To
and after the build, accessing: $DOMAIN_NAME/dataverse/root
@pdurbin I noticed that the official Docker support for Dataverse mentions that it's for development only. Is there a production version? Thank you! |
@omaralsoudanii thanks for opening IQSS/dataverse#4665 asking about a production version of Dataverse running in Docker. The short answer is no, there isn't one, but let's discuss further in that issue. |
@craig-willis Are you aware of the new https://github.com/IQSS/dataverse-docker repo? It's being maintained by the community and was recently updated to the latest version of Dataverse, which is 4.9.2 (see IQSS/dataverse-docker#3 ). How do you feel about using it for NDS Labs Workbench? |
@pdurbin We will be very happy to move to the community images. "Specs" describing application stacks for Workbench are pretty straightforward, so in the end this will be a PR to https://github.com/nds-org/ndslabs-specs/tree/master/dataverse. A couple of things:
|
@craig-willis hi! It was an absolute pleasure to chat with you at the 2018 Whole Tale workshop! I know the conversation in this GitHub issue is getting long, but I guess we'll keep going with it here. As you indicated, the fix for this issue is to update the configs at https://github.com/nds-org/ndslabs-specs/tree/master/dataverse to point to a newer version of Dataverse than 4.2.3 (4.9.2 is the latest release as of this writing). Currently, Docker images are being pulled from https://hub.docker.com/r/ndslabs and you'd like to know where you can pull newer images from. You should not pull images from https://hub.docker.com/r/iqss because @IQSS doesn't have the resources at this time to push images there and maintain them. Anything you find under "iqss" is still highly experimental and perhaps somewhat tied to OpenShift. You can read more about these experiments at http://guides.dataverse.org/en/4.9.2/developers/containers.html#future-production-use-on-minishift-openshift-kubernetes So, if not "iqss", where should you pull images from? I'm hoping that you can pull images that the Dataverse community has built. I recently created https://github.com/IQSS/dataverse-docker for the community and @4tikhonov @wilkos-dans and @xibriz from the Dataverse community have been iterating on the images there. What I don't know is if they have pushed images for public use such as your NDS Labs Workbench, which I have documented from the Dataverse perspective at http://guides.dataverse.org/en/4.9.2/installation/prep.html#nds-labs-workbench-for-testing-only . To be clear, NDS Labs Workbench is only for testing so if the Dataverse images aren't perfect, that's fine. They can continue to be improved in the future. I found some images at https://hub.docker.com/r/vtycloud/ that seem to belong to @4tikhonov. If he approves of the use of those images for the purpose of spinning up Dataverse for testing with NDS Labs Workbench, that's fine with me. 
I also wanted to circle back to the conversation above about the "idea of integrating repository systems like Dataverse with containerized analysis environments" and how there seems to be no good place to discuss this. This issue probably isn't the best place. 😄 During the Whole Tale workshop I mentioned a new forum that was launched by @aculich and others at PEARC18 during a talk called "On Launching a Research Computing Q&A Site using StackExchange and Discourse". Here's the abstract from https://pearc18.conference-program.com/?page_id=10&id=bof117&sess=sess200
The website for the new forum is https://ask.cyberinfrastructure.org and perhaps it would be a good place to have a vendor-neutral discussion. I was involved a bit in the effort to seed the site with questions at launch and I know they are friendly to questions about data repositories. For example, I created the question at https://ask.cyberinfrastructure.org/t/where-can-i-find-introductory-material-on-publishing-research-data/154 and I would be happy to create another one about integrating data repositories with containerized analysis environments. Anyway, it sounded like this new forum wasn't on your radar so I wanted to put it there. 😄 |
@pdurbin Indeed -- it was great to finally meet you and I thoroughly enjoyed our discussions. On the topic of the community Dataverse images, if you don't plan to push to the Thanks for the pointer to https://ask.cyberinfrastructure.org. I was at PEARC and should've tracked on this, but apparently have a poor radar. |
@craig-willis I wanted to give you a heads up that IQSS/dataverse#5317 was merged yesterday which means that you should have a bit more raw material to work with inside of Kubernetes, if you want. Specifically, we have added SQL files to create the database schema for Dataverse for all versions from 4.0 to 4.9.4 and have added instructions for ourselves to continue adding these SQL files for all future releases. I considered mentioning this on whole-tale/whole-tale#49 but this issue seemed a bit more on topic. Maybe we should update the title from 4.6.2 to 4.9.4, since that's the latest Dataverse release. Time flies! One other thing I'll mention is that I'd love for you to chat with @poikilotherm at some point. Over at http://irclog.iq.harvard.edu/dataverse/2018-11-28#i_80180 I was suggesting that I could create for him a git repo under IQSS called dataverse-kubernetes because he has some ideas of how to do docker and kubernetes properly, like you do, but he might benefit from a collaborator like yourself. I don't know. I'm always happy to create repos like this under IQSS, flag them as community-supported in our guides, and create the proper "team" in GitHub to let the hacking begin. Sometimes great things happen. Sometimes the projects never really take off. Mornings in http://chat.dataverse.org are usually a good time to catch him, because timezones. 😄 |
Great news, @pdurbin. I'll hopefully have time to revisit this next week (also in the context of IQSS/dataverse-docker#8) and will make a note to try to connect on IRC. |
@craig-willis please take a look at IQSS/dataverse#5373 because @poikilotherm wants to host a community call on Dataverse and Kubernetes. Everyone is welcome, of course! |
What should I change to make it work with the latest version?
I changed the download link to 4.6.2, but it is not working.