Tag Archives: mysociety

data.ac.uk

Last Friday I went to Liver and Mash, the Liverpool Mashed Libraries event held at Parr Street Studio 2. I’d been asked to speak by Mandy Phillips (formerly of this Parish) about – erm – something!

I’ll cover the rest of the event in another post but here I’d like to write about the topic of my presentation. You can see the slides below, or view them on the Slideshare website alongside my notes:

It took me a while to think of a subject to talk about but eventually I started considering the role higher education institutions play in mashups and in particular what we can bring to the party.

This actually builds on some ideas I’ve been thinking about for a while. On Christmas Eve last year I posted as part of our 25 days series an entry about 2010 being the year of open data. Edge Hill was closed by that point so I’m not surprised few people read it but I said:

I believe there will be an increasing call for Higher Education to open up its data. Whether that’s information about courses using the XCRI format, or getting information out of the institutional VLE in a format that suits the user not the developer, there is lots that can be done. I’m not pretending this is an easy task but surely if it can be done it should because it’s the right thing to do.

So my presentation expanded on some of these ideas. Firstly we need to accept that what we do online isn’t going to suit everyone. HEI websites are huge unwieldy beasts. Doing a Google search for site:edgehill.ac.uk produces over 8,000 results; warwick.ac.uk has 236,000 pages! Combine that with Sturgeon’s Law and we’re in trouble:

Sturgeon’s Law: 90% of everything is crud.

[Before anyone says it… yes that means 90% of what I say is crap!]

If we accept that our websites aren’t going to deliver everything to everyone we have two options: firstly we could throw resources at the problem to add more and more content, but we know from experience how that ends up:

Alternatively we can strip down to our core audience and find other ways to satisfy the so-called “long tail”. To me that means providing data in an open, accessible form that users can take and use in ways that suit them. Let’s do that.

At IWMW 2008, Tony Hirst submitted an innovation competition entry to show what autodiscoverable feeds HEIs feature on their homepages. It seems in the two years since Aberdeen the number of sites with discoverable feeds has crept up but is still less than half.
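For anyone who wants to check their own homepage, here is a rough sketch of how feed autodiscovery can be detected. It looks for `<link rel="alternate">` elements with an RSS or Atom MIME type, which is the usual convention. The sample markup, feed titles and paths are made up for illustration, and this is not how Tony's competition entry worked.

```python
from html.parser import HTMLParser

# MIME types conventionally used for feed autodiscovery links.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects <link rel="alternate"> elements that advertise a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. <link rel>), so normalise to "".
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        if "alternate" in rels and attr.get("type", "").lower() in FEED_TYPES:
            self.feeds.append((attr.get("title", ""), attr.get("href", "")))


def find_feeds(html):
    """Return a list of (title, href) pairs for every discoverable feed."""
    parser = FeedLinkParser()
    parser.feed(html)
    return parser.feeds


# Hypothetical homepage markup, invented for this example.
SAMPLE_HOMEPAGE = """
<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="News" href="/news/feed.rss">
<link rel="alternate" type="application/atom+xml" title="Events" href="/events/feed.atom">
</head><body></body></html>
"""

feeds = find_feeds(SAMPLE_HOMEPAGE)
for title, href in feeds:
    print(title, href)
```

An empty list from `find_feeds` means the page advertises no feeds, even if some exist elsewhere on the site.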

Typically these feeds contain easily available information. News stories are recycled press releases. Often forthcoming events are available as an RSS or Atom feed or even as an iCal feed that can be subscribed to in Google or Outlook Calendar.

Universities also run courses and there’s a standard format for publishing them – XCRI-CAP.

So far, so general, but what other information are people looking for? Freedom of Information legislation came into force on 1st January 2005 applying to all public bodies including Universities. WhatDoTheyKnow from MySociety allows anyone to submit and track FOI requests – simultaneously the most awesome and scary thing for anyone working in a public sector organisation!

Most HEI websites also contain FOI pages or a publication scheme but often the information available is locked up in difficult to access documents. PDFs and unstructured webpages are typically the format of choice. We can make this information more open by publishing in more accessible formats. Maybe uploading to Google Docs (which allows export as CSV or through an API) would be an easy thing to do.
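To give a flavour of how little work a more accessible format can be, here is a minimal sketch that writes a small table out as CSV using Python's standard library. The rows, field names and reference numbers are all invented for the example; a real disclosure log would come from wherever the institution keeps it.

```python
import csv
import io

# Hypothetical disclosure-log rows, invented for this example.
FIELDS = ["reference", "received", "subject"]
ROWS = [
    {"reference": "FOI-001", "received": "2010-01-04", "subject": "Library opening hours"},
    {"reference": "FOI-002", "received": "2010-01-11", "subject": "Catering spend, 2008/09"},
]


def to_csv(rows, fields):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)  # values containing commas are quoted automatically
    return buffer.getvalue()


csv_text = to_csv(ROWS, FIELDS)
print(csv_text)
```

The resulting text can be saved as a `.csv` file and linked from the publication scheme page alongside the original document.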

We also have systems containing interesting data – HR, Student Record System, VLE, Library Catalogue – but getting information out can be difficult. When procuring new systems we need to be asking the right questions about vendors’ approach to open data and APIs and building this into the requirements specification.

So my challenge is for us to create data.ac.uk. It would be great to do something sector wide along the lines of data.gov.uk (but, y’know, better!) but an easier model to get started with is something like the Guardian’s Data Store. Let’s start in the areas we have control over:

  • data.metropolis.ac.uk
  • www.metropolis.ac.uk/library/data
  • people.metropolis.ac.uk/~smith/data

Let’s create a webpage and publish some links to existing data. If we have spreadsheets, upload them to Google Docs and post the link. If we have systems with a rubbish API, let’s knock up a wrapper layer to expose something more useful. data.ac.uk isn’t going to happen overnight but each of us can do our bit to build a more open sector.
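As a sketch of what a wrapper layer might look like, the example below takes output from an imaginary legacy system and re-exposes it as JSON. Everything about the legacy side is an assumption: the `legacy_opening_hours` function, the pipe-delimited format and the data are invented stand-ins for whatever awkward interface a real system offers.

```python
import json


def legacy_opening_hours():
    """Stand-in for a call to a legacy system (hypothetical format and data)."""
    return "MON|0900|2100\nTUE|0900|2100\nSAT|1000|1600\n"


def parse_record(line):
    """Turn one 'DAY|HHMM|HHMM' line into a dictionary with readable times."""
    day, opens, closes = line.split("|")
    return {
        "day": day.title(),
        "opens": f"{opens[:2]}:{opens[2:]}",
        "closes": f"{closes[:2]}:{closes[2:]}",
    }


def opening_hours_json():
    """Wrap the legacy output in a JSON document that is easier to reuse."""
    lines = legacy_opening_hours().splitlines()
    records = [parse_record(line) for line in lines if line]
    return json.dumps({"opening_hours": records}, indent=2)


print(opening_hours_json())
```

The point of the design is that consumers only ever see the JSON, so the legacy format can change or be replaced without breaking anyone's mashup.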

In the pub following the event the discussion continued with Brian Kelly from UKOLN making an interesting point:

Interesting thought from @briankelly post #mashliv: expect the telegraph/daily mail to hit on public sector/HEIs about transparency/opendata

If that’s not a good enough reason to take open data seriously, I don’t know what is.

One final point about the presentation. I noticed a tweet from Alison Gow of the Liverpool Daily Post and Echo:

Not sure about bravery – I didn’t actually recognise Julian until after the presentation was over which is probably a very good thing!

2010: The Year of Open Data?

I don’t like to predict the future – usually because I’m wrong – but I’m going to put my neck out on one point for the coming year.  2010 will be the year that data becomes important.

I’ve long been a believer in opening up sources of data.  As far as possible, we try to practice what we preach by supplying feeds of courses, news stories, events and so on.  We also make extensive use of our own data feeds so I’m always interested to see what other people are doing.  Over the last year there has been growing support for opening up data to see what can be done with it and there’s potentially more exciting stuff to come.

A big part of what many consider to be “Web 2.0” is open APIs to allow connections to be made and they have undoubtedly led to the success of services like Twitter.

Following in their footsteps have been journalists, both professional and amateur, who are making increasing use of data sources and in many cases republishing them.  The MPs’ expenses issue showed an interesting contrast in approaches.  While the Daily Telegraph broke the story and relied on internal manpower to trawl through the receipts for juicy information, the Guardian took a different route.  As soon as the redacted details were published, the Guardian launched a website allowing the public to help sort through the pages and identify those of interest.  Both the Guardian and the Times have active data teams releasing much of their source data for the public to mash up.

The non-commercial sector has produced arguably more useful sources of data.  MySociety have a set of sites which do some really cool things to help the public better engage with their community and government.

In the next few months there looks set to be even more activity.  The government asked Tim B-L to advise on ways to make the government more open and whether due to his influence or other factors there are changes on the horizon.

But it’s the election, which must be held before [June], that could do the most.  Data-based projects look set to pop up everywhere.  One project – The Straight Choice – will collect flyers and leaflets distributed by candidates in order to track promises during and after the election.  Tweetminster tracks Twitter accounts belonging to MPs and PPCs and has some nice tools to visualise and engage with them.

I believe there will be an increasing call for Higher Education to open up its data.  Whether that’s information about courses using the XCRI format, or getting information out of the institutional VLE in a format that suits the user not the developer, there is lots that can be done.  I’m not pretending this is an easy task but surely if it can be done it should because it’s the right thing to do.

Since I started writing this entry a few days ago, the Google Blog has published a post on The Meaning of Open. Of course they say things much better than I could, so I’ll leave you with one final quote:

Open will win. It will win on the Internet and will then cascade across many walks of life: The future of government is transparency. The future of commerce is information symmetry. The future of culture is freedom. The future of science and medicine is collaboration. The future of entertainment is participation. Each of these futures depends on an open Internet.

Let’s do our bit to contribute to that future.

Mapumental: where can I live?

Channel 4 and mySociety – the non-profit organisation who build cool stuff for the public good – have teamed up to create a new website to help people work out where to live, work or holiday.

Mapumental, currently in invite-only beta, takes data about public transport, house prices and scenic-icity and combines them with free mapping to clearly show where you can get to in a given time. I’ll discuss some of the data in a moment, but first watch the demo:

For travelling into Edge Hill you can see that most of North and central Liverpool is accessible by public transport in an hour or less. Nudging the time up to 1h15m allows me to get the train in, which is pretty much spot on:

Mapumental 1:00 Mapumental 1:15

The data they combine comes from an interesting range of sources. Traveline supply the National Public Transport Data Repository (for ~£9000 – a snip!). House prices for England and Wales are supplied by the Land Registry. The other data, however, is free!

The base mapping layer is from OpenStreetMap – a project to create a free (as in beer and speech) map similar to the ones available from Google Maps, or even from the OS. It’s created by volunteers who go out with GPS and plot the routes online. Almost all major roads are on there already and certain areas have excellent quality coverage – take a look at South Liverpool for an example of how good it can get.

Edge Hill University Faculty of Health. Copyright Bryan Pready, Creative Commons Licence

The scenic-icity of places was determined by mashing up some other data. Geograph is a project aiming to have a photo of every 1km x 1km grid square in the country. All photographs submitted are under Creative Commons licence so you’re free to use them (with some restrictions).

mySociety took the images and created a game, ScenicOrNot, asking people to rate how scenic a photo looks – nearly 15,000 people took part building up the third layer of information.

The kind of information Mapumental exposes is stuff that’s previously only been known through experience or painful manual analysis of train/bus timetables and estate agent windows. In a time when many people are trying harder to make better use of public transport, knowing all your options is essential.

If you’ve not come across mySociety before, check out some of their other websites:

Channel 4’s involvement in the project is through its new 4iP fund for investing in public service media.