Index card datasets for training and evaluating models that convert index cards to structured data/metadata
AI & ML interests
🤗 Hugging Face x 🌸 BigScience initiative to create open source community resources for LAMs.
This collection contains models, datasets and spaces related to historic language models
- dbmdz/bert-base-historic-multilingual-cased
  Fill-Mask • 0.1B • Updated • 175 • 8
- dbmdz/bert-base-historic-multilingual-64k-td-cased
  Fill-Mask • 0.1B • Updated • 16 • 1
- Riksarkivet/bert-base-cased-swe-historical
  Fill-Mask • 0.1B • Updated • 29 • 4
- dell-research-harvard/AmericanStories
  Updated • 6.58k • 153
Datasets which can help train or evaluate various approaches to automatic metadata generation and extraction.
- biglam/doab-metadata-extraction
  Viewer • Updated • 8.09k • 265 • 12
- biglam/rubenstein-manuscript-catalog
  Viewer • Updated • 49.7k • 162 • 3
- biglam/bpl-card-catalog
  Viewer • Updated • 838k • 839 • 5
- biglam/harvard-library-bibliographic-dataset
  Viewer • Updated • 11.1M • 528 • 2
Historic Newspaper Datasets on the Hub
- The Newspaper Navigator Dataset: Extracting And Analyzing Visual Content from 16 Million Historic Newspaper Pages in Chronicling America
  Paper • 2005.01583 • Published • 2
- bigscience-historical-texts/hipe2020
  Updated • 47 • 3
- bigscience-historical-texts/HIPE2020_sent-split
  Updated • 30
- biglam/bnl_newspapers1841-1879
  Viewer • Updated • 631k • 73 • 2