Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDM_SOURCE.VOCABULARY_VERSION #28

Open
ericaVoss opened this issue Aug 10, 2016 · 6 comments
Open

CDM_SOURCE.VOCABULARY_VERSION #28

ericaVoss opened this issue Aug 10, 2016 · 6 comments

Comments

@ericaVoss
Copy link
Contributor

I think I see a copy of the SynPUF out on Postgres Public. If this is truly a build then the CDM_SOURCE.VOCABULARY_VERSION would be more informative if you pulled this information off the Vocab you ran off of:

SELECT VOCABULARY_VERSION 
FROM vocabulary 
WHERE VOCABULARY_ID = 'None';

This way you know exactly which Vocab was used in the build.

@ChristopheLambert
Copy link
Contributor

Where do you think we should report this information? Here is the result of your query:

select VOCABULARY_VERSION from vocabulary where VOCABULARY_ID = 'None';

vocabulary_version

v5.0 08-JUN-16

@ericaVoss
Copy link
Contributor Author

Yes, put that in the CDM_SOURCE.VOCABULARY_VERSION column instead of "Version 5". The answer is more informative that just "Version 5" and still contains the V5 information.

@ChristopheLambert
Copy link
Contributor

I agree this is more informative. I'm confused what you are asking me to do. We do not ship vocabulary files as part of the ready-to-go data -- the VOCABULARY.csv file you get when you download the vocabulary contains the version information you are requesting. I don't believe any of the non-vocabulary tables contain a field for vocabulary version.

Perhaps I'm not understanding what Postgres Public is. I uploaded the CDM5 tables, excluding the vocabulary files to ftp://ftp.ohdsi.org/synpuf. Perhaps you are talking about a database somebody else loaded, and did not include a proper vocabulary table?

@ericaVoss
Copy link
Contributor Author

I'm asking that the CDM Builder for SynPUF populates this CDM_SOURCE.VOCABULARY_VERSION. CDM_SOURCE.VOCABULARY_VERSION is part of the CDM not the Vocabulary. The SynPUF CDM Builder should be generating the information in this table. Instead of "Version 5" it should tell the user what Vocabulary the CDM was built with. This is important because if you build SynPUF on Vocabulary 20160311 and I build on 20160620 we'll have slightly different builds. Users of SynPUF should know what Vocabulary the CDM was built on.

I can see however on the FTP there is no CDM_SOURCE defined so I'm guessing the SynPUF builder doesn't build this table? However on the OHDSI Postgres box I think I see a copy of SynPUF with the CDM_SOURCE filled out. I already send an email to @leeevans to try to figure out what is there and how it go there. Let me get back to you when I know more.

@ChristopheLambert
Copy link
Contributor

Ah, I see. No the SynPUF builder does not create the CDM_SOURCE table, and you are right that it should.

@ericaVoss
Copy link
Contributor Author

@ChristopheLambert - learned the problem, the data I was looking at was not your SynPUF but the Lee's 1K sample SynPUF. HAHAHA - so I logged an issue for a problem that isn't technically yours. However, the CDM_SOURCE table is a good table to have. Check out our manuals if you want examples of how it should be populated:
https://github.com/OHDSI/ETL-CDMBuilder/tree/master/man

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants