As is often the way, events have conspired to prevent me from producing this third and final part in this How & Why of Local Government Spending Data series as soon as I wanted. My apologies to those eagerly awaiting this latest instalment.
To quickly recap, in Part 1 I addressed why spending data makes a good starting point for Linked Data in Local Government, and indeed why to go for Linked Data at all. In Part 2, I used some of the excellent work that Stuart Harrison at Lichfield District Council has done in this area as examples of how you can publish spending data as Linked Data, for both human and programmatic consumption.
I am presuming that you are still with me on my basic assumptions “…publishing this [local government spending] data is a good thing” and “Publishing Local Authority data, such as local spending data, as ‘Linked Data’ is also a good thing”, and that the technique of using URIs to name things in a globally unique way (which also provides a link to more information) is not giving you mental indigestion. So, I now want to move on to some of the issues that are causing debate in the community, which come under the headings of ontologies and identifiers.
An ontology, according to Wikipedia, is a formal representation of knowledge as a set of concepts within a domain – an ontology provides a shared vocabulary, which can be used to model a domain – that is, the type of objects and/or concepts that exist, and their properties and relations. So in our quest to publish spending data, what ontology should we use? The Payments Ontology, with the accompanying guide to its application, is what is needed. Using it, it becomes possible to describe individual payments, or expenditure lines, and their relationships to the authority (payment:payer), the supplier (payment:payee), the category (payment:expenditureCategory), etc. The next question is how you identify the things that you are relating together using this ontology.
Let's take this one step at a time:
- Give the expenditure line, or individual payment, an identifier, possibly generated by our accounts system, e.g. 8605670.
- Make that identifier unique to our local authority by prefixing it with our internet domain name, e.g. http://spending.lichfielddc.gov.uk/spend/8605670. Note the prefix of ‘http://’. This enables anyone wanting detail about this item to follow the link to our site to get the information.
- Associate a payer with the payment using an RDF statement (or triple) from the Payments Ontology, with the payment URI as the subject, payment:payer as the predicate, and the authority's URI as the object.
Note that I am using an identifier for the payer that is published by statistics.data.gov.uk. That is so that everyone else will unambiguously understand which authority is the one responsible for the payment.
- Follow the same approach for associating the payee with the payment http://spending.lichfielddc.gov.uk/spend/8605670, this time using payment:payee.
- And then repeat the process for categorisation, payment value etc.
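The steps above can be sketched in a few lines of Python that emit N-Triples. This is a minimal illustration, not Lichfield's actual implementation. The Payments Ontology namespace URI is my assumption and should be checked against the published ontology, and the payer, supplier, and category URIs are hypothetical placeholders; only the payment URI and the three property names come from the text above.

```python
# Assumed namespace for the Payments Ontology (verify against the published ontology).
PAYMENT = "http://reference.data.gov.uk/def/payment#"


def triple(subject, predicate, obj):
    """Format one N-Triples statement whose subject and object are both URIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def payment_triples(base, local_id, payer_uri, payee_uri, category_uri):
    """Build the payer, payee, and category statements for one payment."""
    # Steps 1 and 2: the accounts-system identifier made globally unique by the domain.
    payment_uri = f"{base}/spend/{local_id}"
    return [
        triple(payment_uri, PAYMENT + "payer", payer_uri),
        triple(payment_uri, PAYMENT + "payee", payee_uri),
        triple(payment_uri, PAYMENT + "expenditureCategory", category_uri),
    ]


triples = payment_triples(
    base="http://spending.lichfielddc.gov.uk",
    local_id="8605670",
    # Hypothetical placeholder: substitute the authority URI published by statistics.data.gov.uk.
    payer_uri="http://statistics.data.gov.uk/id/local-authority/EXAMPLE",
    # Hypothetical locally minted supplier and category URIs.
    payee_uri="http://spending.lichfielddc.gov.uk/supplier/example-supplier",
    category_uri="http://spending.lichfielddc.gov.uk/category/example-category",
)
print("\n".join(triples))
```

Each printed line is one statement: the payment URI, a property from the ontology, and the URI of the thing it relates to. Payment values are left out because they are literals rather than URIs and need a property name I have not confirmed.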
This immediately throws up a couple of questions, such as why use a locally defined identifier for the payee. Surely there is an identifier I can use that others will recognise, such as a company or VAT number! There are, but at the moment there are no established sets of URI identifiers for them. OpenCorporates.com are doing some excellent work in this area, but Companies House, the logical choice for publishing such identifiers, have yet to do so. Pragmatically it is probably a good idea to have a local identifier anyway and then associate it with another publicly recognised identifier.
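One way to make that association is a single extra statement linking the two identifiers. The sketch below uses owl:sameAs, which is my choice of linking property for illustration and not something the Payments Ontology guide is known to mandate. Both the local supplier URI and the OpenCorporates-style URI, including the company number, are hypothetical placeholders.

```python
# The standard OWL property asserting that two URIs identify the same thing.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as(local_uri, public_uri):
    """Return an N-Triples statement linking a local URI to a public one."""
    return f"<{local_uri}> <{OWL_SAME_AS}> <{public_uri}> ."


# Hypothetical local supplier identifier minted by the authority.
local_supplier = "http://spending.lichfielddc.gov.uk/supplier/example-supplier"
# Hypothetical public identifier; the URI pattern and company number are placeholders.
public_company = "http://opencorporates.com/id/companies/uk/00000000"

link = same_as(local_supplier, public_company)
print(link)
```

Keeping the local identifier means the payment data does not have to change if a better public identifier appears later; only another link statement needs adding.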
Because this is all very new and still emerging, we now find ourselves in a bit of a chicken-or-egg situation. I presume that most authorities have not built a mini spending website, like Lichfield District Council has, to serve up details when someone follows a link like this: http://spending.lichfielddc.gov.uk/spend/8605670
You could still use such an identifier on your authority's domain, and plan to back it up later with a web service that provides more information. Or you could let someone else, who takes a copy of your raw data, do it for you, as OpenlyLocal might: http://openlylocal.com/financial_transactions/135/2010/33854 or as the project we are working on with LGID might: http://id.spending.esd.org.uk/Payment/36UF/ds00024616. In the open, flexible world of Linked Data it doesn’t matter too much which domain an identifier is published from, or for that matter how many [related] identifiers are used for the same thing.
It does matter, however, for those looking to the identifying URI for some idea of authority. As I say above, technically it doesn’t matter whose domain the identifier comes from, but I believe it would be better overall if it came from the authority whose payment it is identifying. That puts us back in the chicken-or-egg situation of resolving the URI to serve up more information. The joy of Linked Data is that, provided aggregators make sure they can accurately identify the source authority's data when they encode it, it should be possible to automatically retrofit links between URIs at a later date.
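As a rough sketch of what such retrofitting could look like, the Python below turns an aggregator URI into a link to the authority's own URI once the authority has one. It assumes the aggregator URI follows the pattern Payment/<authority code>/<source id>, which is only my reading of the single LGID example above. The authority base URI in the lookup table is hypothetical, and owl:sameAs is again my choice of linking property.

```python
from urllib.parse import urlparse

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical lookup from authority code to the base URI the authority later publishes.
AUTHORITY_BASES = {"36UF": "http://spending.example.gov.uk/spend"}


def retrofit_link(aggregator_uri, bases=AUTHORITY_BASES):
    """Link an aggregator's payment URI to the source authority's own URI.

    Returns None when the URI does not match the assumed pattern or the
    authority has not yet published a base URI.
    """
    parts = urlparse(aggregator_uri).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "Payment":
        return None
    _, authority_code, payment_id = parts
    base = bases.get(authority_code)
    if base is None:
        return None
    return f"<{aggregator_uri}> <{OWL_SAME_AS}> <{base}/{payment_id}> ."


print(retrofit_link("http://id.spending.esd.org.uk/Payment/36UF/ds00024616"))
```

The link can only be generated because the aggregator URI preserves both the authority code and the source identifier, which is the point about encoding source data accurately.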
In summary, over this series of posts we have seen a technology which, although it has obvious benefits, is still early on the development curve, being applied to a process which is also new and scary for many. It is an ideal breeding ground for cries of pain and assertions of ‘it doesn’t work’ or ‘not worth bothering’, yet it has the potential to provide a powerful foundation for a future data-rich environment that is open, accessible, and beneficial to authorities, government, citizens, and UK Plc. Yes, it is worth bothering; just don’t expect benefits on day, or even month, one.
This post was also published on the Nodalities Blog