Releases: tedunderwood/noveltmmeta

NovelTM Metadata, Publication State

28 May 17:01

Metadata about 210,266 volumes identified as English-language fiction in HathiTrust Digital Library, 1700-2009. The metadata is divided into seven lists; several of the shorter lists have been manually corrected. This release of metadata for "NovelTM Datasets for English-Language Fiction" was timed to coincide with publication of the accompanying article. It represents missing data in the date field better than the first release did, and it also removes duplicate rows in volumemeta.tsv and weighted_subset.tsv, following a suggestion from Matt Wilkens.

NovelTM Metadata with Missing Dates

26 Apr 02:31

In previous releases, the values 0 and 2100 were used to mark missing values in the inferreddate and latestcomp columns. To avoid confusion, they have been replaced with blanks, which will be read as NaN.
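
A minimal sketch of how a reader might handle both conventions when loading the date columns, so that files from older releases (with 0 and 2100) and from this release (with blanks) yield the same result. The sample rows and the docid column are invented for illustration; only the column names inferreddate and latestcomp come from the note above, and the helper names are hypothetical.

```python
import csv
import io
import math

# Sentinels that earlier releases used for "date unknown", per the release note.
LEGACY_MISSING = {"0", "2100"}

def parse_date(raw):
    """Return the year as a float, or NaN for blank or legacy-sentinel values."""
    text = raw.strip()
    if text == "" or text in LEGACY_MISSING:
        return math.nan
    return float(text)

# Invented sample rows; the docid column is an assumption made for this sketch.
SAMPLE_TSV = (
    "docid\tinferreddate\tlatestcomp\n"
    "vol1\t1850\t1850\n"   # both dates known
    "vol2\t\t\n"           # blanks, as in this release
    "vol3\t0\t2100\n"      # sentinels, as in earlier releases
)

def load_dates(handle):
    """Read tab-separated rows and normalize both date columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        (row["docid"], parse_date(row["inferreddate"]), parse_date(row["latestcomp"]))
        for row in reader
    ]

rows = load_dates(io.StringIO(SAMPLE_TSV))
```

To use this on a real file, pass an open file handle instead of the in-memory sample. Libraries such as pandas treat empty fields in a numeric column as NaN by default, so with this release the blanks need no special handling there; the sentinel check only matters for files from the earlier releases.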

NovelTM metadata without confusing categories from MARC 008

04 Oct 12:15

For the final release, we filtered out some genre categories that we had derived from the fixed-length data elements in MARC field 008. These categories often conflicted with one another and seemed likely to create confusion.

NovelTM metadata (first release)

28 Aug 15:53
v1.0

changing title