This repository has been archived by the owner on Nov 30, 2024. It is now read-only.

Commit

Guard against repository being blank
Fixes #85
edsu committed Apr 5, 2022
1 parent 696363e commit 286a566
Showing 6 changed files with 119 additions and 83 deletions.
2 changes: 1 addition & 1 deletion package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions package.json
@@ -4,6 +4,9 @@
"description": "A clearinghouse for tweet datasets",
"version": "0.0.2",
"author": "Documenting the Now <[email protected]>",
+"engines": {
+"node": "12"
+},
"dependencies": {
"@material-ui/core": "^4.11.2",
"@material-ui/styles": "^4.11.2",
6 changes: 3 additions & 3 deletions src/components/datasets.js
@@ -257,13 +257,13 @@ function filterSearch(datasets, search) {
const pattern = new RegExp(search, 'i')
const slugs = []
for (const d of datasets) {
-if (d.title.match(pattern)) {
+if (d.title && d.title.match(pattern)) {
slugs.push(d.slug)
-} else if (d.description.match(pattern)) {
+} else if (d.description && d.description.match(pattern)) {
slugs.push(d.slug)
} else if (d.creators.map(c => c.name).join(' ').match(pattern)) {
slugs.push(d.slug)
-} else if (d.repository.match(pattern)) {
+} else if (d.repository && d.repository.match(pattern)) {
slugs.push(d.slug)
} else if (d.subjects.join(' ').match(pattern)) {
slugs.push(d.slug)
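The guard pattern in this hunk can be illustrated with a small standalone sketch. It is not the code from this commit: `matchesField` and `filterSearchSketch` are hypothetical names, the returned array of slugs is an assumption (the hunk does not show what `filterSearch` returns), and the extra fallbacks for `creators` and `subjects` go beyond what the commit changes.

```javascript
// Hypothetical helper (not in the commit): true only when the value is a
// string that matches the pattern, so null or missing fields are skipped.
function matchesField(value, pattern) {
  return typeof value === 'string' && pattern.test(value)
}

// Sketch of a null-safe search over dataset records. Returning the slugs
// is an assumption made for illustration.
function filterSearchSketch(datasets, search) {
  const pattern = new RegExp(search, 'i')
  const slugs = []
  for (const d of datasets) {
    const fields = [
      d.title,
      d.description,
      // Assumed fallback: treat a missing list as empty.
      (d.creators || []).map(c => c.name).join(' '),
      d.repository,
      (d.subjects || []).join(' ')
    ]
    if (fields.some(f => matchesField(f, pattern))) {
      slugs.push(d.slug)
    }
  }
  return slugs
}

// Made-up sample records: the first has a blank repository, which is the
// case this commit guards against.
const sample = [
  { slug: 'a', title: 'Polish Web', description: null, creators: [], repository: null, subjects: [] },
  { slug: 'b', title: 'Other', description: 'x', creators: [], repository: 'GitHub', subjects: [] }
]
console.log(filterSearchSketch(sample, 'github'))
```

Collecting the fields into one array keeps the null check in a single place instead of repeating `d.x && d.x.match(...)` per branch. Whether that trade-off suits the real component is a judgment call for the maintainers.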
4 changes: 2 additions & 2 deletions src/datasets/aspw-twitter-dataset-2021-11-30.md
@@ -7,7 +7,7 @@ dates:
 - end: '2021-11-27'
 start: '2020-11-12'
 published: 2021-11-30
-repository:
+repository: GitHub
 subjects:
 - coronavirus
 - pandemia
@@ -18,7 +18,7 @@ subjects:
 - church
 - border crisis
 - vaccinations
-title: the Social Archive of the Polish Web
+title: The Social Archive of the Polish Web
 tweets: 4617353
 url: https://github.com/mw0000/aspw-twitter-dataset-2021-11-30
 ---
177 changes: 105 additions & 72 deletions static/data/datasets.json
@@ -1,4 +1,37 @@
[
+{
+"title": "The Social Archive of the Polish Web",
+"creators": [
+{
+"name": "Marcin Wilkowski",
+"email": "aspw[at]wilkowski.org"
+}
+],
+"added": "2021-11-30T23:42:14.000Z",
+"published": "2021-11-30T00:00:00.000Z",
+"dates": [
+{
+"start": "2020-11-12",
+"end": "2021-11-27"
+}
+],
+"repository": "GitHub",
+"subjects": [
+"coronavirus",
+"pandemia",
+"politics",
+"media",
+"cities",
+"LGBT",
+"church",
+"border crisis",
+"vaccinations"
+],
+"tweets": 4617353,
+"url": "https://github.com/mw0000/aspw-twitter-dataset-2021-11-30",
+"slug": "aspw-twitter-dataset-2021-11-30",
+"description": "<p>4617353 tweets IDs (4398351 unique) in Polish language covering topics like: coronavirus pandemia, politics, media, cities, LGBT, church, border crisis, vaccinations. For details, see meta.csv in every directory. All this data together with the URLs of web pages linked within that tweets can be accessed in <a href=\"https://github.com/mw0000/aspw-public-archive\">https://github.com/mw0000/aspw-public-archive</a> or <a href=\"https://aspw.pl/pakiety\">https://aspw.pl/pakiety</a>.</p>"
+},
{
"title": "#retweetthe8th: 2018 Referendum to repeal the 8th Amendment of the Constitution of Ireland",
"creators": [
@@ -2811,7 +2844,7 @@
],
"tweets": 5655632,
"url": "http://dx.doi.org/10.7910/DVN/TQBLWZ",
-"slug": "20170907-end-of-term-2016-us-government-twitter-archive",
+"slug": "20170907-end-of-term-2016-u-s-government-twitter-archive",
"description": "<p>This dataset contains the tweet ids of 5,655,632 tweets that were collected from approximately 3000 Twitter accounts affiliated with the U.S. government. They were collected between October 21, 2016 and January 21, 2017 from the Twitter API using Social Feed Manager. This dataset was created as part of the End of Term Web Archiving initiative. The lists of accounts came from the U.S. Digital Registry and by public submissions.</p>"
},
{
@@ -2845,7 +2878,7 @@
],
"tweets": 5655632,
"url": "http://dx.doi.org/10.7910/DVN/TQBLWZ",
-"slug": "20170907-end-of-term-2016-u-s-government-twitter-archive",
+"slug": "20170907-end-of-term-2016-us-government-twitter-archive",
"description": "<p>This dataset contains the tweet ids of 5,655,632 tweets that were collected from approximately 3000 Twitter accounts affiliated with the U.S. government. They were collected between October 21, 2016 and January 21, 2017 from the Twitter API using Social Feed Manager. This dataset was created as part of the End of Term Web Archiving initiative. The lists of accounts came from the U.S. Digital Registry and by public submissions.</p>"
},
{
@@ -3312,13 +3345,13 @@
],
"repository": "Harvard Dataverse",
"subjects": [
-"Womensmarch",
+"Women",
"Activism",
"Politics"
],
"tweets": 7275228,
"url": "https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/5ZVMOR",
-"slug": "20170203-womens-march-tweet-ids",
+"slug": "20170203-women-s-march-tweet-ids",
"description": "<p>This dataset contains the tweet ids of 7,275,228 tweets related to the Women's March on January 21, 2017. They were collected between December 19, 2016 and January 23, 2017 from the Twitter API using Social Feed Manager. See included README.txt for additional information.</p>"
},
{
@@ -3343,13 +3376,13 @@
],
"repository": "Harvard Dataverse",
"subjects": [
-"Women",
+"Womensmarch",
"Activism",
"Politics"
],
"tweets": 7275228,
"url": "https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/5ZVMOR",
-"slug": "20170203-women-s-march-tweet-ids",
+"slug": "20170203-womens-march-tweet-ids",
"description": "<p>This dataset contains the tweet ids of 7,275,228 tweets related to the Women's March on January 21, 2017. They were collected between December 19, 2016 and January 23, 2017 from the Twitter API using Social Feed Manager. See included README.txt for additional information.</p>"
},
{
@@ -3645,31 +3678,6 @@
"slug": "20161230-the-fall-of-aleppo-tweets-aleppo-2016-12-13-through-2016-12-29",
"description": "<p>8,595,589 tweet ids for aleppo tweets captured during the fall of Aleppo in December 2016. Tweets can be \"rehydrated\" with Documenting the Now's twarc (<a href=\"https://github.com/DocNow/twarc\">https://github.com/DocNow/twarc</a>). twarc.py --hydrate aleppo<em>tweet</em>ids.txt > aleppo.json</p>"
},
-{
-"title": "#elxn42 tweets (42nd Canadian Federal Election)",
-"creators": [
-{
-"name": "Nick Ruest",
-"email": "[email protected]"
-}
-],
-"added": "2016-12-24T15:14:07.000Z",
-"published": "2015-11-19T00:00:00.000Z",
-"dates": [
-{
-"start": "2015-07-25",
-"end": "2015-11-05"
-}
-],
-"repository": "Scholars Portal Dataverse",
-"subjects": [
-"Politics"
-],
-"tweets": 3039804,
-"url": "http://hdl.handle.net/10864/11270",
-"slug": "20161224-elxn42-tweets-42nd-canadian-federal-election",
-"description": "<p>Tweet ids for #elxn42 tweets.</p>"
-},
{
"title": "Ferguson Tweets",
"creators": [
@@ -3700,6 +3708,31 @@
"slug": "20161224-ferguson-tweets",
"description": "<p>This item represents a collection of 13,480,000 tweet IDs that mentioned 'ferguson' from 2014-08-10 to 2014-08-27 and 15,080,078 tweet IDs that mention \"ferguson\" between 2014-11-11 and 2014-12-08.\nThe first set includes tweets for the two week period after the shooting of Michael Brown, and the second range includes tweets around the grand jury's decision not to indict police office Darren Wilson which was announced on 2014-11-24.\nThe first set of tweets were collected by Ed Summers at the University of Maryland and the second was a collaboration between Molly Loyd, Gregory Coleman, Kimberly Lamke, Benjamin Sugar and Ed Summers.</p>"
},
+{
+"title": "#elxn42 tweets (42nd Canadian Federal Election)",
+"creators": [
+{
+"name": "Nick Ruest",
+"email": "[email protected]"
+}
+],
+"added": "2016-12-24T15:14:07.000Z",
+"published": "2015-11-19T00:00:00.000Z",
+"dates": [
+{
+"start": "2015-07-25",
+"end": "2015-11-05"
+}
+],
+"repository": "Scholars Portal Dataverse",
+"subjects": [
+"Politics"
+],
+"tweets": 3039804,
+"url": "http://hdl.handle.net/10864/11270",
+"slug": "20161224-elxn42-tweets-42nd-canadian-federal-election",
+"description": "<p>Tweet ids for #elxn42 tweets.</p>"
+},
{
"title": "Yes All Women Twitter Dataset",
"creators": [
@@ -3802,54 +3835,54 @@
"description": "<p>Tweet ids for #panamapapers tweets.</p>"
},
{
-"title": "#thechalkening tweets",
+"title": "#paris #Bataclan #parisattacks #porteouverte tweets",
"creators": [
{
"name": "Nick Ruest",
"email": "[email protected]"
}
],
"added": "2016-12-23T22:40:17.000Z",
-"published": "2016-04-13T00:00:00.000Z",
+"published": "2015-12-12T00:00:00.000Z",
"dates": [
{
-"start": "2016-04-03",
-"end": "2016-06-06"
+"start": "2015-11-04",
+"end": "2015-12-08"
}
],
"repository": "Scholars Portal Dataverse",
"subjects": [
"Politics"
],
-"tweets": 115524,
-"url": "http://hdl.handle.net/10864/11591",
-"slug": "20161223-thechalkening-tweets",
-"description": "<p>Tweet ids for #thechalkening tweets.</p>"
+"tweets": 14939154,
+"url": "http://hdl.handle.net/10864/11312",
+"slug": "20161223-paris-bataclan-parisattacks-porteouverte-tweets",
+"description": "<p>Tweet ids for #paris #Bataclan #parisattacks #porteouverte tweets.</p>"
},
{
-"title": "#paris #Bataclan #parisattacks #porteouverte tweets",
+"title": "#thechalkening tweets",
"creators": [
{
"name": "Nick Ruest",
"email": "[email protected]"
}
],
"added": "2016-12-23T22:40:17.000Z",
-"published": "2015-12-12T00:00:00.000Z",
+"published": "2016-04-13T00:00:00.000Z",
"dates": [
{
-"start": "2015-11-04",
-"end": "2015-12-08"
+"start": "2016-04-03",
+"end": "2016-06-06"
}
],
"repository": "Scholars Portal Dataverse",
"subjects": [
"Politics"
],
-"tweets": 14939154,
-"url": "http://hdl.handle.net/10864/11312",
-"slug": "20161223-paris-bataclan-parisattacks-porteouverte-tweets",
-"description": "<p>Tweet ids for #paris #Bataclan #parisattacks #porteouverte tweets.</p>"
+"tweets": 115524,
+"url": "http://hdl.handle.net/10864/11591",
+"slug": "20161223-thechalkening-tweets",
+"description": "<p>Tweet ids for #thechalkening tweets.</p>"
},
{
"title": "#YMMfire tweets",
@@ -3876,31 +3909,6 @@
"slug": "20161223-ymmfire-tweets",
"description": "<p>Tweet ids for #YMMfire tweets captured during the 2016 Fort McMurray Wildfire from 2016-05-01 to 2016-06-25.</p>"
},
-{
-"title": "Election 2012 Tweet ID dataset",
-"creators": [
-{
-"name": "Microsoft",
-"email": null
-}
-],
-"added": "2016-12-23T22:03:14.000Z",
-"published": "2016-05-12T00:00:00.000Z",
-"dates": [
-{
-"start": "2012-07-01",
-"end": "2012-11-07"
-}
-],
-"repository": "Microsoft",
-"subjects": [
-"Politics"
-],
-"tweets": 38000000,
-"url": "https://www.microsoft.com/en-us/download/details.aspx?id=52598",
-"slug": "20161223-election-2012-tweet-id-dataset",
-"description": "<p>This data set identifies 38M tweets collected for the analysis of social media messages related to the 2012 U.S. Presidential election. The data set provides tweet IDs for tweets containing the words \"obama\", \"romney\", or both (case-insensitive matching) during the period from July 1, 2012 through November 7, 2012. The paper, “Online and Social Media Data As an Imperfect Continuous Panel Survey.” PLoS ONE 11(1): e0145406 by Diaz et al. provides further description of the dataset.</p>"
-},
{
"title": "2016 United States Presidential Election Tweet Ids",
"creators": [
@@ -3934,6 +3942,31 @@
"slug": "20161223-2016-united-states-presidential-election-tweet-ids",
"description": "<p>This dataset contains the tweet ids of approximately 280 million tweets related to the 2016 United States presidential election. They were collected between July 13, 2016 and November 10, 2016 from the Twitter API using Social Feed Manager. These tweet ids are broken up into 12 collections. Each collection was collected either from the GET statuses/user_timeline method of the Twitter REST API or the POST statuses/filter method of the Twitter Stream API.</p>"
},
+{
+"title": "Election 2012 Tweet ID dataset",
+"creators": [
+{
+"name": "Microsoft",
+"email": null
+}
+],
+"added": "2016-12-23T22:03:14.000Z",
+"published": "2016-05-12T00:00:00.000Z",
+"dates": [
+{
+"start": "2012-07-01",
+"end": "2012-11-07"
+}
+],
+"repository": "Microsoft",
+"subjects": [
+"Politics"
+],
+"tweets": 38000000,
+"url": "https://www.microsoft.com/en-us/download/details.aspx?id=52598",
+"slug": "20161223-election-2012-tweet-id-dataset",
+"description": "<p>This data set identifies 38M tweets collected for the analysis of social media messages related to the 2012 U.S. Presidential election. The data set provides tweet IDs for tweets containing the words \"obama\", \"romney\", or both (case-insensitive matching) during the period from July 1, 2012 through November 7, 2012. The paper, “Online and Social Media Data As an Imperfect Continuous Panel Survey.” PLoS ONE 11(1): e0145406 by Diaz et al. provides further description of the dataset.</p>"
+},
{
"title": "#JeSuisCharlie, #JeSuisAhmed, #JeSuisJuif, #CharlieHebdo tweets",
"creators": [