I’ll cover the rest of the event in another post but here I’d like to write about the topic of my presentation. You can see slides below or watch on the Slideshare website to see along side my notes:
It took me a while to think of a subject to talk about but eventually I started considering the role higher education institutions play in mashups and in particular what we can bring to the party.
This actually builds on some ideas I’ve been thinking about for a while. On Christmas Eve last year I posted as part of our 25 days series an entry about 2010 being the year of open data. Edge Hill was closed by that point so I’m not surprised few people read it but I said:
I believe there will be an increasing call for Higher Education to open up its data. Whether that’s information about courses using the XCRI format, or getting information out of the institutional VLE in a format that suits the user not the developer, there is lots that can be done. I’m not pretending this is an easy task but surely if it can be done it should because it’s the right thing to do.
So my presentation expanded on some of these ideas. Firstly we need to accept that what we do online isn’t going to suit everyone. HEI websites are huge unwieldy beasts. Doing a Google search for site:edgehill.ac.uk produces over 8,000 results; warwick.ac.uk has 236,000 pages! Combine that with Sturgeon’s Law and we’re in trouble:
Sturgeon’s Law: 90% of everything is crud.
[Before anyone says it… yes that means 90% of what I say is crap!]
If we accept that our websites aren’t going to deliver everything to everyone we have two options: firstly we could throw resources at the problem to add more and more content, but we know from experience how that ends up:
Alternatively we can strip down to our core audience and find other ways to satisfy the so-called “long tail”. To me that means providing data in an open, accessible form that users can take and use in ways that suit them. Let’s do that.
At IWMW 2008, Tony Hirst submitted an innovation competition entry to show what autodiscoverable feeds HEIs feature on their homepages. It seems in the two years since Aberdeen the number of sites with discoverable feeds has crept up but is still less than half.
Typically these feeds contain easily available information. News stories are recycled press releases. Often forthcoming events are available as an RSS or Atom feed or even as an iCal feed that can be subscribed to in Google or Outlook Calendar.
Universities also run courses and there’s a standard format for publishing them – XCRI-CAP.
So far, so general, but what other information are people looking for? Freedom of Information legislation came into force on 1st January 2005 applying to all public bodies including Universities. WhatDoTheyKnow from MySociety allows anyone to submit and track FOI requests – simultaneously the most awesome and scary thing for anyone working in a public sector organisation!
Most HEI websites also contain FOI pages or a publication scheme but often the information available is locked up in difficult to access documents. PDFs and unstructured webpages are typically the format of choice. We can make this information more open by publishing in more accessible formats. Maybe uploading to Google Docs (which allows export as CSV or through an API) would be an easy thing to do.
We also have systems containing interesting data – HR, Student Record System, VLE, Library Catalogue – but getting information out can be difficult. When procuring new systems we need to be asking the right questions about vendor’s approach to open data and APIs and building this into the requirements specification.
So my challenge is for us to create data.ac.uk. It would be great to do something sector wide along the lines of data.gov.uk (but, y’know, better!) but an easier model to get started with is something like the Guardian’s Data Store. Let’s start in the areas we have control over:
Let’s create a webpage, publish some links to existing data. If we have spreadsheets upload them to Google Docs and post the link. If we have systems with a rubbish API, let’s knock up a wrapper layer do expose something more useful. data.ac.uk isn’t going to happen overnight but each of us can do our bit to build a more open sector.
In the pub following the event the discussion continued with Brian Kelly from UKOLN making an interesting point:
If that’s not a good enough reason to take open data seriously, I don’t know what is.
One final point about the presentation. I noticed a tweet from Alison Gow of the Liverpool Daily Post and Echo:
Not sure about bravery – I didn’t actually recognise Julian until after the presentation was over which is probably a very good thing!