Description

The Comprehensive Knowledge Archive Network (CKAN) is an open-source, web-based system for storing, managing, and distributing open data. From the CKAN User Guide:

"CKAN is a tool for making open data websites. (Think of a content management system like WordPress - but for data, instead of pages and blog posts.) It helps you manage and publish collections of data. It is used by national and local governments, research institutions, and other organizations who collect a lot of data."

Demonstration video

This video demonstrates how to start CKAN in Labs Workbench, register for a CKAN account, and add an account as a sysadmin user.

Getting started with CKAN

Note:

Customizing ckan.ini

This is an advanced feature of Labs Workbench. In Labs Workbench, every application runs as a Docker container, which is similar to a very lightweight virtual machine. Docker can map folders (or volumes) into a container, much like mounting a drive on your computer.

To customize the ckan.ini:

Uploading data using the CKAN API

This example shows how to create a dataset via the CKAN REST API, add extra metadata fields, and upload files to the dataset.

It assumes that you have access to the curl command. If you do not have curl installed locally, you can run it from the console application for your CKAN instance. Open the console and run:

apt-get install curl nano -y

This installs curl and a simple text editor called nano. Then change to your home directory:

cd /home/<you>

To upload data via curl, you'll need some information:
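
The commands in this example refer to the shell variables $CKAN_HOST and $CKAN_API_KEY. A minimal sketch for setting them is shown here; both values are placeholders (assumptions), so substitute the hostname of your own CKAN instance and your own API key:

```shell
# Placeholder values (assumptions): replace both with your own settings.
export CKAN_HOST="ckan.example.org"            # hostname only, without https://
export CKAN_API_KEY="replace-with-your-api-key"

# Print the base URL that the later curl commands will use.
echo "API base URL: https://$CKAN_HOST/api/action"
```

Exporting the variables once keeps the later curl commands short and avoids pasting the key into each command.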


This example assumes that you've created an organization in CKAN. To get the organization ID of an existing organization:


$ curl https://$CKAN_HOST/api/action/organization_list
{
    "help": "https://$CKAN_HOST/api/3/action/help_show?name=organization_list",
    "result": [
        "test"
    ],
    "success": true
}

The organization here is "test".

Now, create a file called dataset.json containing the following (assumes your organization name from the above request is "test"). You can either create files using nano or using the File Manager application in Workbench:

{
    "name": "new_dataset",
    "notes": "A long description of my dataset",
    "owner_org": "test",
    "extras": [
        { "key": "key1", "value": "value1" },
        { "key": "key2", "value": "value2" },
        { "key": "key3", "value": "value3" },
        { "key": "key4", "value": "value4" },
        { "key": "key5", "value": "value5" }
    ]
}

Note that this adds five custom metadata fields, whereas the main user interface supports only three.
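
As an alternative to nano or the File Manager, the same file can be written from the shell with a here-document and checked before it is sent. This is only a sketch, and it assumes that python3 is on your PATH (it is used solely for its json.tool module, which validates the file):

```shell
# Write dataset.json with a here-document (same content as shown above).
cat > dataset.json <<'EOF'
{
    "name": "new_dataset",
    "notes": "A long description of my dataset",
    "owner_org": "test",
    "extras": [
        { "key": "key1", "value": "value1" },
        { "key": "key2", "value": "value2" },
        { "key": "key3", "value": "value3" },
        { "key": "key4", "value": "value4" },
        { "key": "key5", "value": "value5" }
    ]
}
EOF

# Validate the file; json.tool exits with a non-zero status on malformed JSON.
if python3 -m json.tool dataset.json > /dev/null; then
    echo "dataset.json is valid JSON"
else
    echo "dataset.json is malformed" >&2
fi
```

Catching a stray comma or missing quote locally is usually quicker than interpreting an error returned by the API.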

Create a new dataset using this information:

curl https://$CKAN_HOST/api/action/package_create -d @dataset.json -H "Authorization: $CKAN_API_KEY"

You should get some JSON output indicating that the package was successfully created.
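
The organization_list response shown earlier contains a "success" field. Assuming the package_create response has the same shape, the field can be checked from the shell. In this sketch, ckan_ok is a hypothetical helper name (not part of CKAN or curl), and it relies on a plain text match rather than a full JSON parser:

```shell
# Hypothetical helper: reads a CKAN API response on stdin and reports
# whether it contains "success": true. This is a quick text match only,
# not a JSON parser.
ckan_ok() {
    if grep -Eq '"success"[[:space:]]*:[[:space:]]*true'; then
        echo "ok"
    else
        echo "failed"
        return 1
    fi
}

# Example with a canned response, so no network access is needed.
printf '%s\n' '{"result": ["test"], "success": true}' | ckan_ok
```

To use it with a real request, pipe the output of curl into the helper, adding curl's -s option to suppress the progress meter.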

Next, upload a file to your dataset:

curl https://$CKAN_HOST/api/action/resource_create --form upload=@test.csv --form package_id=new_dataset -H "Authorization: $CKAN_API_KEY"
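
The upload command assumes that a file named test.csv exists in the current directory. If you have no data at hand, a small sample file can be created first; the column names and rows in this sketch are made up purely for illustration:

```shell
# Create a three-line sample CSV (a header plus two made-up rows).
printf '%s\n' 'id,name' '1,alpha' '2,beta' > test.csv

# Show what will be uploaded.
cat test.csv
```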

After uploading the file, you should be able to browse to your CKAN instance and view the dataset and uploaded file.

For more information, see the CKAN API documentation.


See also