What is the ThinkChicago Workbench?

The ThinkChicago Workbench is a cloud-based service that provides a set of general-purpose development and data analysis environments to help you explore your ideas with the ThinkChicago data. All applications run as Docker containers on a system hosted by the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign.

A few guidelines:

  • Start only the services that you need. Each account has limited resources. You will likely only need to run one or two of the provided services.
  • For the larger datasets, don't try to read them all at once. You'll need to work with subsets of the data. 
  • If you have questions or problems, post to Slack (https://thinkchicago.slack.com/).

What data is available?

The ThinkChicago Workbench provides access to the following datasets. Many of them are also available via the City of Chicago Data Portal REST API.

Dataset | Description | API | Size (Format) | Records
------- | ----------- | --- | ------------- | -------
2FM Tech Challenge | Fleet and Facilities Management (2FM) vehicle and equipment data. | N/A | 534M CSV |
Array of Things Locations | Locations of Array of Things sensor nodes. | REST API | 6.5K CSV; 28K JSON |
Crimes 2001 - present | Incidents of crime since 2001. | REST API | 1.4G 5G CSV | 6.39 million
Divvy Trips | Individual Divvy bike sharing trips, including the origin, destination, and timestamps for each trip. | REST API | 2.4G CSV.8G CSV | 11.5 million
Divvy Bicycle Stations (historical) | Historical availability of bicycles and docks to return bicycles at the Divvy stations. | REST API | 9.4G CSV14G CSV | 87 million
Taxi Trips | Taxi trips reported to the City of Chicago. | REST API | 8.0GB CSV42GB CSV | 111 million rows


Each of these datasets is available in the /shared directory of any running application in Workbench. 

Note that some of these files are too large to read into memory in full. Plan accordingly (and see the resource limits below). You can either work with subsets of the data (for example, with commands like head -n 1000) or use the provided REST APIs.
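
As a rough illustration of working with a subset, the Python sketch below reads only the first rows of a CSV file and counts rows by streaming, so the whole file is never held in memory. It uses only the standard library. The function names read_head and count_rows are made up for this example and are not part of Workbench.

```python
import csv
from itertools import islice


def read_head(path, n_rows=1000):
    """Return the header and the first n_rows data rows of a CSV file.

    Only n_rows rows are read, so this is safe for multi-gigabyte files.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)          # first line is assumed to be the header
        rows = list(islice(reader, n_rows))  # stop after n_rows data rows
    return header, rows


def count_rows(path):
    """Count data rows by streaming the file one row at a time."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)                   # skip the header line
        return sum(1 for _ in reader)
```

For example, header, rows = read_head("/shared/your-dataset.csv", 1000) would load the first 1,000 records. The file name here is a placeholder; check the /shared directory for the actual names. Counting rows in the largest files will still take a while, since every line has to be read once.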

F.A.Q.

What applications are available?

...