Install Open Spending

Installation and setup

Steps to set up Open Spending/WDMMG from scratch. This may not be comprehensive. Please edit.

Use a virtualenv!

We strongly recommend installing into a virtualenv. The rest of this documentation assumes that you have created a virtual environment in /path/to/env/ and activated it. If you have not created one yet, do so with:

$ virtualenv --no-site-packages /path/to/env

Now activate the environment. Your prompt will be prefixed with the name of the environment:

$ cd /path/to/env
/path/to/env$ . /path/to/env/bin/activate
(env)/path/to/env$

Install MongoDB

Install MongoDB on your machine. Version 1.5.3 or later is required (for $or queries).
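
The version requirement can be checked with a few lines of Python. The following is a minimal sketch, not part of wdmmg: the function names are made up for illustration, and the sample string only assumes that the text you feed in (for example the output of mongod --version) contains a dotted version number somewhere.

```python
import re

# Minimum MongoDB version named in this guide (needed for $or queries).
MINIMUM_VERSION = (1, 5, 3)


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    # A missing patch level (e.g. "2.0") is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def is_supported(text):
    """Return True if the version found in text is at least MINIMUM_VERSION."""
    return parse_version(text) >= MINIMUM_VERSION
```

Comparing tuples of integers rather than strings avoids the pitfall that "1.10.0" sorts before "1.5.3" as text.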

Create and edit a paster configuration

Create a development.ini file. It holds the configuration options needed to run wdmmg:

(env)/path/to/env/wdmmg$ paster make-config wdmmg development.ini

Edit development.ini with relevant details for your local machine. The options in the file are commented. Some of the important options in [app:main] are:

mongodb.database = wdmmg_dev

extra_public_paths =
    path/to/extra/public/resources
    /another/extra/public/resource/from/plugin

# setup plugins
wdmmg.plugins = treemap datatables etc.

# Credentials for retrieving data from Google Documents.
gdocs_username = <your username>
gdocs_password = <your password>

Set up Solr

Install Solr and create a Solr home directory to use with Open Spending, for example by copying the example folder of the Solr installation. Remove the schema.xml file from it and symlink solr/wdmmg_schema.xml from the wdmmg package in its place. Start Solr with that folder as a parameter.

In your development.ini you have to configure the Solr URL, typically http://localhost:8983/solr:

solr.url = http://localhost:8983/solr
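
Before starting the application it can help to confirm that the options mentioned so far are actually set. The sketch below uses Python 3's standard configparser module to read the [app:main] section. The helper name and the choice of which options count as required are assumptions made for this example; they are not something wdmmg itself defines.

```python
import configparser

# Options from this guide that the sketch treats as required (an assumption).
REQUIRED_OPTIONS = ("mongodb.database", "solr.url")


def missing_options(ini_text, required=REQUIRED_OPTIONS, section="app:main"):
    """Return the required options that are absent or empty in the section."""
    # RawConfigParser leaves values such as %(here)s untouched.
    parser = configparser.RawConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        return list(required)
    return [
        name
        for name in required
        if not parser.get(section, name, fallback="").strip()
    ]


SAMPLE = """\
[app:main]
mongodb.database = wdmmg_dev
solr.url = http://localhost:8983/solr
"""
```

To check a real file, read its contents and pass them in, for example missing_options(open("development.ini").read()). An empty list means every option in REQUIRED_OPTIONS has a value.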

Set up Celery

Celery is used to manage job queues for background tasks. It is installed automatically as a dependency of wdmmg. You can find several celery* commands in your virtualenv’s bin directory.

Adapt celeryconfig.py to your needs. The default configuration, wdmmg/celeryconfig.py, uses MongoDB as storage and queue backend. Run Celery with:

(env)/path/to/env/wdmmg$ celeryd celeryconfig.py

Test the installation

Run the tests and fix anything that is broken:

(env)/path/to/env/wdmmg$ nosetests

Import data into Open Spending

To import data into Open Spending you need a data loader. The Open Spending project ships a set of loaders in the wdmmg-ext package, which is required for these steps. Each loader loads one dataset into the database. The installation of wdmmg-ext is described in Check out and install Open Spending and related packages.

Loading a big dataset can take a long time.

Load a complete dataset

To load a dataset you first have to download it. Installing wdmmg automatically generated a datapkg script in your virtualenv’s bin directory. Your development.ini file defines a getdata_cache directory; the default is ./pylons_data/getdata inside the wdmmg package directory. We will now download the “cra” data package to that directory:

(env)/path/to/env/wdmmg$ datapkg download ckan://ukgov-finances-cra \
> ./pylons_data/getdata

Now you can load the cra2010 dataset into the database:

(env)/path/to/env/wdmmg$ paster load cra2010

After that, update the Solr index. We provide a paster command for that:

(env)/path/to/env/wdmmg$ paster solr load cra2010
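
If you load datasets often, the two paster commands above can be wrapped in a small script. This is only a sketch: the function names are invented for this example, and it assumes that it is run from the wdmmg directory with the virtualenv activated, so that paster is on the PATH.

```python
import subprocess


def load_commands(dataset):
    """Return the two paster invocations for a dataset, in the order to run them."""
    return [
        ["paster", "load", dataset],          # load the dataset into the database
        ["paster", "solr", "load", dataset],  # then update the Solr index
    ]


def load_and_index(dataset, run=subprocess.check_call):
    """Run both commands in order; run is replaceable for dry runs and tests."""
    for command in load_commands(dataset):
        # check_call raises CalledProcessError on a non-zero exit status,
        # so the index is not updated if the load itself failed.
        run(command)
```

Calling load_and_index("cra2010") runs both steps. Passing run=print instead shows the commands without executing them.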

Load sample data

Alternatively, you can load a set of sample data and update the Solr index for it. Be aware that this will empty the database first:

(env)/path/to/env/wdmmg$ paster fixtures setup
(env)/path/to/env/wdmmg$ paster solr load cofog
(env)/path/to/env/wdmmg$ paster solr load cra

Run the site

Finally, run the site from development.ini:

(env)/path/to/env/wdmmg$ paster serve --reload development.ini

How to upgrade production service

There are three databases/systems:
  • data.wheredoesmymoneygo.org - P1
  • data.wheredoesmymoneygo.org.2 - P2
  • data.staging.wheredoesmymoneygo.org - S1

Suppose P1 is the active production system and P2 the inactive one:

  • Shut down writes to the main system (hacky way: LimitExcept GET in Apache)
  • Dump the active production db and load it into the inactive production db
  • Upgrade the inactive system and set it to use the inactive production db
      • Use it and test it
  • Switch over from P1 to P2
      • In Apache, move the wsgi script to point to the other one and restart apache2
  • If this fails, just switch back and you are operational again