Seed Effort Work Breakdown Structure

Future work
Current work
Completed work

1. NDS Labs

  1. Developer/User Tools
    1. Encapsulation of services
      1. e.g. Docker containers, JSON descriptions
      2. Determine how to encapsulate external instances of services as well (e.g. Globus, BrownDog, ...)
    2. Cluster provisioning & monitoring tools
      1. Command line tools leveraging Kubernetes to simplify the process of setting up a cluster
      2. Containerized ELK deployed as a standard service
    3. Command line & REST API interface
      1. Allow users to set up new project spaces and add users
      2. Allow user to add a new service
      3. Allow user to select and deploy services
      4. Allow user to provision resources, e.g. number of CPUs, amount of storage, computation resource to use, storage resource to use
      5. Tools to monitor deployed services, start/stop services
    4. Graphical Web Interface on top of REST interface
      1. Catalog of available services
      2. Canvas to deploy/control services
      3. Links to deployed services
      4. Links to specific service logs
      5. GUI access to all CLI functionality (e.g. admin abilities)?
    5. Catalog of services
      1. Database of services (e.g. persistently within etcd)
      2. Easy means of updating added services (e.g. fetching latest versions)
      3. Web portal (i.e. an app-store-like site for services)
        • Establish workflow for ingesting new services (as automatic and simple as possible)
        • Ability to browse services along with documentation on each service (e.g. external links, APIs, ...)
        • Database
        • Interface allowing separate NDS Labs deployments to pull services from here
    6. Development Environment
      1. Encapsulate development environment around deployed tools and provide as a single container
      2. Documentation on developing within NDS Labs
      3. YouTube videos on developing within NDS Labs
    7. Standup official NDS Labs instance
      1. Deploy on resources such as Nebula
      2. Web interface to request account/access (will be used by pilots)
    8. Production Deployment Support
      1. Ability to issue a command or push a button from the GUI to deploy on the user's resources (e.g. AWS)
      2. Explore means of indexing data added in future to deployed services (to be used by NDS share portal)
    9. Easy update mechanism to allow distributed instances to remain up to date
    10. Explore means of centralized authentication across resources as well as deployed services
    11. Maintenance
      1. Refactor code base, separate into distinct components
      2. Add documentation to code

  2. Support for Multiple Resources
    1. Modification of tools to abstract away underlying resources and allow multiple resources to be leveraged simultaneously
    2. Provide a means of allowing users to enter credentials (e.g. Amazon account, XRAC)
    3. Intelligent resourcing tools (e.g. select compute resources near data)
    4. Manual resourcing tools (e.g. CLI/GUI modifications allowing users to deploy specific services on specific resources)
    5. Migration tools to move services across underlying platforms
    6. Deploy official NDS Labs instance across available resources
      1. SDSC Cloud and TACC Rodeo
      2. PSC Jetstream
      3. Amazon

  3. Populate Services Catalog
    1. Select several technologies for each required NDS component (e.g. archives, publishing) and encapsulate them
      1. e.g. DataONE, iRODS, SciServer, ...
    2. Identify other relevant technologies for each required NDS component and engage their developers to encapsulate them

  4. Tool Launcher (as one mechanism of running code near data)
    1. Support for various data sources/services within NDS Labs
    2. Support for various tools within NDS Labs (e.g. Jupyter, RStudio, ...)

2. NDS Share

  1. Repository of Last Resort
      1. Globus endpoint on NCSA hardware (santiago)
      2. Skinned front face
    2. ...
    3. Build up and categorize datasets per domain which can be utilized for experimentation with NDS Labs

  2. NDS Share Portal
    1. Federated search across NDS component archives and NDS Labs deployed resources
      1. Utilize repos listed here:
      2. Pilot effort among DataNets/DIBBs
    2. Web interface
      1. Search box
      2. Catalog of available archives

  3. Repository Recommender (decision support tool to suggest an archive based on a number of criteria, e.g. scientific domain)
    1. Explore leveraging DataNet SEAD virtual archive component
    2. Command Line tool
      1. Run in folder containing data
    3. Web Interface
    4. Data migration support
      1. Should an archive/repository be going away, identify a new archive and a plan to move the data to the new repository (possibly an executable workflow)

  4. Published datasets
    1. Identify storage options
    2. Mint DOIs

3. NDS Mission

  1. Protocols, Interfaces, Standards
    1. Work with RDA to identify standards that can be implemented within components that can be leveraged by US NDS
      1. Disseminate recommendations to components
      2. Serve as a testbed for RDA efforts
        1. Rice Genome Variant Discovery
        2. ...
      3. Implementation tasks:
        1. Data Types Registry WG: Host DTR
        2. Data Types Registry WG: Add support in BrownDog to use DTR to identify types and generate previews
        3. Data Description Registry Interoperability WG: Leverage Research Data Switchboard towards federated search portal in NDS Share
        4. Data Description Registry Interoperability WG: Implement utilized protocols within collaborating archive efforts
        5. Metadata Standards Directory WG: Add support for Metadata transformations in BrownDog, use case TERRA
        6. ...
    2. Work towards implementing standards within funded NDS efforts
      1. Towards data preservation within each project and the long-term management of its data products
      2. Begin implementing support for:
        1. OAI-PMH
          • Adapters that transform native interface?
          • Directly modify code?
        2. BagIt
        3. ...

  2. Outreach
    1. NDSC Workshops
      1. Two per year
    2. NDS Labs and NDS Share Tutorials
      1. NDSC workshops
      2. YouTube recordings
      3. Other conferences, workshops, venues (e.g. IEEE eScience, SC, XSEDE, ...)
    3. NDS Labs and Share user support
      1. Pilot efforts
      2. Others?
    4. Community organization
      1. Committee recruitment and meetings
      2. Explore sustainability plans
      3. Maintain NDS wiki(s)
      4. Social media (e.g. Twitter, blog)
    5. NDS website