CKAN

Description

The Comprehensive Knowledge Archive Network (CKAN) is an open source web-based system for the storage, management and distribution of open data. From the CKAN User Guide:

"CKAN is a tool for making open data websites. (Think of a content management system like WordPress - but for data, instead of pages and blog posts.) It helps you manage and publish collections of data. It is used by national and local governments, research institutions, and other organizations who collect a lot of data."

Demonstration video

This video demonstrates how to start CKAN in Labs Workbench, register for a CKAN account, and add an account as a sysadmin user.

Getting started with CKAN

  • Select the "Register" link, enter the required information, and select "Create Account".
  • By default, new CKAN users cannot create new datasets or upload data. You will first need to add your account as a sysadmin user.
  • To add yourself as a sysadmin user, use the CKAN command-line interface:
    • First, in Labs Workbench open the Console for the CKAN application
    • Next, run the following commands:

      . /usr/lib/ckan/default/bin/activate
      cd /usr/lib/ckan/default/src/ckan
      paster sysadmin add <your-ckan-username> -c /etc/ckan/default/ckan.ini
  • At this point, your user is now an administrator. You can create organizations, upload datasets, and manage users.
  • As a sysadmin, you can also customize the UI via the /ckan-admin/config page.

Note:

  • The Explore/preview feature only works on public datasets. For a private dataset the interface "spins" but never loads, which appears to be a bug in CKAN.

Customizing ckan.ini

This is an advanced feature of Labs Workbench. In Labs Workbench, every application runs as a Docker container, which is similar to a very lightweight virtual machine. Docker can map folders (or volumes) into a container, much like mounting a drive on your computer. Mapping your own copy of ckan.ini over /etc/ckan/default/ckan.ini lets you change the CKAN configuration.

To customize the ckan.ini:

  • In a running instance of CKAN, open the console
  • cd /home/<workbench-username>
  • mkdir ckan
  • cp /etc/ckan/default/ckan.ini /home/<workbench-username>/ckan/
  • This copies the pre-configured ckan.ini to your Workbench home directory.
  • Stop the CKAN instance
  • You can now edit the copied file using the built-in File Manager or a command-line editor. For testing, open your copy of ckan.ini and change the default locale from English "en" to French "fr".
  • To map this file into your CKAN instance, select the "Edit" button for CKAN
  • On the "Data" tab add a new folder mapping from your local "ckan/ckan.ini" to "/etc/ckan/default/ckan.ini". Select the "plus" button to add it, then save.
  • Start your CKAN instance.
  • Once started, you should see the interface in French.
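
The locale edit described above can also be made from the console rather than in an editor. The following is a minimal sketch, assuming your ckan.ini sets the language on a line beginning with ckan.locale_default (check your copy, since the pre-configured file may differ). It works on a scratch file so that it is safe to run anywhere; in Workbench you would point the ini variable at /home/<workbench-username>/ckan/ckan.ini instead.

```shell
# Scratch stand-in for your copied ckan.ini (assumption: the real file
# contains a "ckan.locale_default = en" line).
ini="$(mktemp)"
printf '%s\n' '[app:main]' 'ckan.locale_default = en' > "$ini"

# Rewrite the locale line. Output goes to a temporary file first so the
# original is only replaced if sed succeeds.
sed 's/^ckan\.locale_default *=.*/ckan.locale_default = fr/' "$ini" > "$ini.new" \
    && mv "$ini.new" "$ini"

# Show the resulting setting.
grep '^ckan\.locale_default' "$ini"
```

Writing to a separate file and then moving it avoids relying on in-place editing flags, which differ between sed implementations.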

Uploading data using the CKAN API

This example will show you how to create a dataset via the CKAN REST API, add extra metadata fields, and upload files to the dataset.

This example assumes that you have access to the curl command.  If you do not have curl installed locally, you can run it from the console application for your CKAN instance. Open the console and run:

apt-get install curl nano -y

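This installs curl and a simple editor called nano. Then change to your Workbench home directory, where the files for this example will be created:

cd /home/<workbench-username>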

To upload data via curl, you'll need some information:

  • Your API key, which you can find on your user page in CKAN after logging in (we'll call it CKAN_API_KEY)
  • The hostname for your CKAN instance (we'll call it CKAN_HOST)
  • The ID of an organization in CKAN
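
The curl commands in this example refer to the host and API key as shell variables, so it can help to set them once at the start of the session. This is a minimal sketch; both values shown are placeholders, not real credentials or a real host.

```shell
# Placeholder values (assumptions): replace both with your own.
CKAN_HOST="ckan.example.org"
CKAN_API_KEY="replace-with-your-api-key"
export CKAN_HOST CKAN_API_KEY

# Base URL used by the API calls in this example.
echo "https://${CKAN_HOST}/api/action"
```

The variables only last for the current shell session, so they need to be set again if you reopen the console.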


This example assumes that you've created an organization in CKAN. To get the organization ID of an existing organization:


$ curl https://$CKAN_HOST/api/action/organization_list
{
    "help": "https://$CKAN_HOST/api/3/action/help_show?name=organization_list",
    "result": [
        "test"
    ],
    "success": true
}

The organization here is "test".

Now, create a file called dataset.json containing the following (assumes your organization name from the above request is "test"). You can either create files using nano or using the File Manager application in Workbench:

{
    "name": "new_dataset",
    "notes": "A long description of my dataset",
    "owner_org": "test",
    "extras": [
        { "key": "key1", "value": "value1" },
        { "key": "key2", "value": "value2" },
        { "key": "key3", "value": "value3" },
        { "key": "key4", "value": "value4" },
        { "key": "key5", "value": "value5" }
    ]
}

Note that this adds five custom metadata fields; the main user interface only supports three.
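
As an alternative to nano or the File Manager, the same file can be written directly from the console with a here-document. This is a sketch that assumes the organization is named "test", as in the response above, and that you are in the directory where you want dataset.json created.

```shell
# Write dataset.json in the current directory. Quoting 'EOF' stops the
# shell from expanding anything inside the document.
cat > dataset.json <<'EOF'
{
    "name": "new_dataset",
    "notes": "A long description of my dataset",
    "owner_org": "test",
    "extras": [
        { "key": "key1", "value": "value1" },
        { "key": "key2", "value": "value2" },
        { "key": "key3", "value": "value3" },
        { "key": "key4", "value": "value4" },
        { "key": "key5", "value": "value5" }
    ]
}
EOF

# Quick sanity check: count the custom metadata entries (prints 5).
grep -c '"key":' dataset.json
```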

Create a new dataset using this information:

curl https://$CKAN_HOST/api/action/package_create -d @dataset.json -H "Authorization: $CKAN_API_KEY"

You should get some JSON output indicating that the package was successfully created.

Next, upload a file to your dataset. This example assumes a file named test.csv in the current directory:

curl https://$CKAN_HOST/api/action/resource_create --form upload=@test.csv --form package_id=new_dataset -H "Authorization: $CKAN_API_KEY"

After uploading the file, you should be able to browse to your CKAN instance and view the dataset and uploaded file.
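
Because both API calls above share the same host and Authorization header, they can be wrapped in a small shell function. This is a sketch only: ckan_api and CKAN_CURL are hypothetical names introduced here and are not part of CKAN or curl. Setting CKAN_CURL=echo gives a dry run that prints the arguments instead of contacting the server. The host and key values are placeholders.

```shell
# Placeholder values (assumptions): replace both with your own.
CKAN_HOST="ckan.example.org"
CKAN_API_KEY="replace-with-your-api-key"

# Call a CKAN API action, passing any extra arguments through to curl.
# CKAN_CURL (hypothetical) overrides the command that is run.
ckan_api() {
    action="$1"
    shift
    "${CKAN_CURL:-curl}" "https://${CKAN_HOST}/api/action/${action}" \
        -H "Authorization: ${CKAN_API_KEY}" "$@"
}

# Dry run: print the arguments curl would receive instead of sending them.
CKAN_CURL=echo
ckan_api package_create -d @dataset.json
ckan_api resource_create --form upload=@test.csv --form package_id=new_dataset
```

Run `unset CKAN_CURL` to send the requests for real once the printed arguments look right.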

For more information, see the CKAN API documentation.


See also