OSN Use Cases - Status
The following summarizes active use cases that either already share data on OSN or will soon store data there. PoC: point of contact.
Intensively Managed Landscapes Critical Zone Observatory
Scientific PoC: Praveen Kumar (pkumar3691@gmail.com, kumar1@illinois.edu)
Technical PoC: Luigi Marini (lmarini@illinois.edu)
Institution: University of Illinois at Urbana-Champaign
Related disciplines: Earth sciences, Hydrology, Ecology, Microbiology
General dataset description: multi-format data from field instruments and remote sensors
Total dataset size: 50 TB
Currently stored OSN data: yes
Type: instrument data, LiDAR source data
Volume: 3 TB
Access methods: Clowder instance, S3 API
Terra Fusion
Scientific PoC: Larry Di Girolamo (gdi@illinois.edu)
Technical PoC: Don Petravick (petravic@illinois.edu), Matias Carrasco-Kind (mcarras2@illinois.edu)
Institution: University of Illinois at Urbana-Champaign
Related disciplines: Earth sciences
General dataset description: multi-instrument integrated data cubes from NASA’s Terra mission satellite sensors
Total dataset size: 2.4 TB
Currently stored OSN data: yes
Type: HDF5 datasets
Volume: 160 TB
Access methods: S3 API
Pangeo Transfer Benchmarking
Scientific PoC: Ryan Abernathey (rpa@ldeo.columbia.edu)
Technical PoC: Ryan Abernathey (rpa@ldeo.columbia.edu)
Institution: Pangeo multi-institution collaboration
Related disciplines: Earth sciences
General dataset description: multi-instrument data
Total dataset size: 173 TB
Currently stored OSN data: yes
Type: HDF5 datasets
Volume: 15 TB
Access methods: S3 API
Bhargava Spectral Bioimaging Lab
Scientific PoC: Rohit Bhargava (rxb@illinois.edu), Shachi Mittal (mitta@illinois.edu)
Technical PoC: Shachi Mittal (mitta@illinois.edu)
Institution: University of Illinois at Urbana-Champaign
Related disciplines: Spectroscopy, biomedicine, cancer research, machine learning
General dataset description: Annotated multispectral biopsy images
Total dataset size: 2 PB
Currently stored OSN data: in progress - working on data migration
Type: spectral images and calibration data from cancer biopsy samples in HDF5 format
Volume: 10 TB
Access methods: S3 API
HTRC Extracted Features
Scientific PoC: Stephen Downie (jdownie@illinois.edu), Jacob Jett (jjett2@illinois.edu)
Technical PoC: Boris Capitanu (capitanu@illinois.edu)
Institution: University of Illinois at Urbana-Champaign
Related disciplines: Library and information sciences, digital humanities
General dataset description: Metadata and unigrams for volumes in HathiTrust
Total dataset size: 20+ TB
Currently stored OSN data: in progress - awaiting coordination with HTRC
Type: XML metadata files and text files containing unigrams from individual volumes in a file structure
Volume: 6 TB
Access methods: S3 API; rsync support is desired
Global Ocean Modeling
Scientific PoC: Chris Hill (cnh@mit.edu)
Technical PoC: Jim Culbert (culbertj@mghpcc.org)
Institution: MIT, MGHPCC
Related disciplines: Climatology and Oceanography
General dataset description: large-scale simulation outcomes
Total dataset size: 4 PB
Currently stored OSN data: pending - iRODS setup in process
Type: TBD
Volume: 5 TB
Access methods: S3 API
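Every active use case above lists the S3 API as an access method. As a rough illustration of what that access can look like, the sketch below lists the objects in a bucket using the standard S3 ListObjectsV2 REST call and only the Python standard library. It assumes the bucket allows anonymous (unsigned) reads; the endpoint URL and bucket name in the usage note are hypothetical placeholders, not actual OSN locations, and the helper names are invented for this example.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by S3 ListObjectsV2 responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def build_list_url(endpoint, bucket, prefix=""):
    """Build a path-style ListObjectsV2 URL (hypothetical helper)."""
    query = urllib.parse.urlencode({"list-type": "2", "prefix": prefix})
    return f"{endpoint.rstrip('/')}/{bucket}?{query}"


def parse_list_response(xml_text):
    """Return (key, size_in_bytes) pairs from a ListObjectsV2 XML body."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext(f"{S3_NS}Key"), int(item.findtext(f"{S3_NS}Size")))
        for item in root.iter(f"{S3_NS}Contents")
    ]


def list_objects(endpoint, bucket, prefix=""):
    """Fetch one page of object listings; assumes anonymous read access."""
    with urllib.request.urlopen(build_list_url(endpoint, bucket, prefix)) as resp:
        return parse_list_response(resp.read())
```

A call such as list_objects("https://osn.example.org", "demo-bucket", "lidar/") would return the first page of keys under that prefix. Buckets that require credentials need signed requests, which an S3 client library handles; this sketch does not cover that case or pagination.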
Past or Deprecated Use Cases:
Combined Array for Research in Millimeter-wave Astronomy (CARMA)
Scientific PoC: Athol Kemball (akemball@illinois.edu)
Technical PoC: Athol Kemball (akemball@illinois.edu)
Institution: University of Illinois at Urbana-Champaign
Related disciplines: Astronomy
General dataset description: Millimeter array observation blocks
Total dataset size: 50+ TB
Currently stored OSN data: pending - need to migrate from FITS to ALMA data format
Type: FITS/ALMA data format
Volume: 25 TB
Access methods: S3 API