    Repositories list

    • The core pipeline used to ingest online news stories in the Media Cloud archive.
      Python
      Apache License 2.0
      52384 Updated Nov 19, 2024
    • UNDER CONSTRUCTION - A package containing a library of issue validators in a flexibly deployable wrapper.
      Jupyter Notebook
      1040 Updated Nov 19, 2024
    • Code that drives the public web-based tools for the Media Cloud Online News Archive and Directory.
      JavaScript
      Apache License 2.0
      1510495 Updated Nov 13, 2024
    • Public client for consuming content from the Media Cloud Online News Archive & Directory.
      Python
      Apache License 2.0
      277132 Updated Nov 7, 2024
    • Internal library to allow querying multiple media platforms with a consistent API.
      Python
      2050 Updated Nov 4, 2024
    • How Media Cloud approaches extracting metadata from online news stories
      Python
      Apache License 2.0
      51250 Updated Oct 26, 2024
    • Internal API server that offers search access to the Media Cloud Online News Archive (in Elasticsearch).
      Python
      GNU Affero General Public License v3.0
      31100 Updated Oct 25, 2024
    • sous-chef

      Public
      Configurable Data Analytics Pipeline
      Python
      0180 Updated Oct 21, 2024
    • Intelligently fetch lists of URLs from a large collection of RSS Feeds as part of the Media Cloud Directory.
      Python
      Apache License 2.0
      66111 Updated Oct 16, 2024
    • An internal client library to access the new Mediacloud news archive search.
      Python
      Apache License 2.0
      2031 Updated Oct 10, 2024
    • Find rss, atom, xml, and rdf feeds on webpages
      Python
      MIT License
      133041 Updated Oct 10, 2024
    • Simple toolkit for consuming sitemaps
      Python
      Apache License 2.0
      1420 Updated Oct 9, 2024
    • mc-manage

      Public
      Python
      0000 Updated Oct 8, 2024
    • sc-buffet

      Public
      Sous-chef buffet - Self-service data access for sous-chef.
      Python
      0050 Updated Oct 3, 2024
    • Daily performance metrics for the mediacloud application
      Python
      0010 Updated Sep 20, 2024
    • devops tools
      Python
      Apache License 2.0
      0010 Updated Aug 26, 2024
    • A Python client for the CLIFF geoparsing tool
      Python
      MIT License
      5501 Updated May 21, 2024
    • A client library to access the Wayback Machine news archive search.
      Python
      Apache License 2.0
      2410 Updated Dec 15, 2023
    • web-tools

      Public archive
      The shared repository for Media Cloud web apps (Explorer, Source Manager, Topic Mapper)
      JavaScript
      Apache License 2.0
      3064314 Updated Dec 14, 2023
    • A set of jupyter notebooks demonstrating how to use the Media Cloud API.
      Jupyter Notebook
      143400 Updated Dec 13, 2023
    • backend

      Public archive
      Media Cloud is an open source, open data platform that allows researchers to answer quantitative questions about the content of online media.
      Python
      GNU Affero General Public License v3.0
      8728113125 Updated Nov 20, 2023
    • Dokku app that serves a static HTML catch-all page, displayed for bad domains
      HTML
      0000 Updated Oct 25, 2023
    • A simple homepage for the CLIFF project
      HTML
      MIT License
      1100 Updated May 30, 2023
    • Tag news stories based on models trained on the NYT corpus.
      Python
      Apache License 2.0
      124016 Updated Mar 1, 2023
    • Text-to-sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder.
      Python
      Other
      3023022 Updated Nov 7, 2022
    • glimpse

      Public archive
      Get a glimpse of attention to a topic on social media.
      Python
      Apache License 2.0
      2280 Updated Sep 19, 2022
    • Helpful micro-service to return results from word2vec models
      Python
      MIT License
      4200 Updated Jul 29, 2022
    • cliff-annotator

      Public archive
      A lightweight server to allow HTTP requests to the Stanford Named Entity Recognizer and a heavily modified CLAVIN geoparser.
      Java
      Apache License 2.0
      35119910 Updated May 20, 2022
    • Notebook demonstrating how to create and update a Media Cloud collection.
      Jupyter Notebook
      0000 Updated Mar 30, 2022
    • Temporal server configuration
      0000 Updated Jan 4, 2022