Liver and Mash

I’ve already blogged about my own Mashed Library Liverpool talk but I promised to say something about the rest of the event, so here goes!

Mandy Phillips and Owen Stephens

Mandy Phillips kicks off Liver and Mash

The day kicked off with a welcome and introductions from Mandy and Owen. I’d heard bits about Mashed Library events before and I know the basics of mashups, but I didn’t really know who would be there or what to expect. There was a good mix of attendees and speakers presenting “lightning talks”, “Pecha Kucha 20:20” talks and workshops. The thing that persuaded me to agree to speak and convinced me that it wouldn’t just be a bunch of librarians (!!) was the scattering of local speakers…

Alison Gow

Alison Gow

Alison is Executive Editor (Digital) for Trinity Mirror Merseyside, publishers of the Liverpool Daily Post and Echo. Despite “knowing” her through the Twitter, Friday’s Mashed Libraries event was the first time I’d met her IRL! The slides of her talk “Open Curation of Data” are online and cover some of the things journalists and the newspaper industry have had to deal with since the superinterweb came along.

Aidan McGuire and Julian Todd

Julian Todd and Aidan McGuire on ScraperWiki

Aidan and Julian demonstrated ScraperWiki, a project supported by 4iP that aims to free data from inaccessible sources and make it available for those who wish to use it in new and innovative ways, for example mashups. “Screen scraping” isn’t a new idea but typically it’s done by individuals and embedded into their own systems. If the scraped website changes then the feed breaks, and there’s no way for others to build on the work done.

ScraperWiki aims to change that by providing a community-driven home for storing scrapers. It’s like Wikipedia for code, allowing you to take a scraper I’ve written and modify it for your own purposes.

There are already dozens of scraped data sources and more are being added every day. It currently supports Python but my language of choice – PHP – will be added soon so I’ll be giving it a go then.
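For anyone who hasn’t seen screen scraping before, here is a rough sketch of the idea in Python. It is not ScraperWiki’s own API, just the standard library, and the HTML fragment, the `TableScraper` class and the `scrape_table` helper are all made up for illustration. A real scraper would fetch the page over HTTP first.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would download this first.
SAMPLE_HTML = """
<table>
  <tr><th>Library</th><th>Opening hours</th></tr>
  <tr><td>Central</td><td>9am-8pm</td></tr>
  <tr><td>Harbour</td><td>10am-4pm</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # completed rows
        self._row = None       # row currently being read, if any
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._in_cell = False
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that sits inside a cell of an open row.
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def scrape_table(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableScraper()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = scrape_table(SAMPLE_HTML)
```

The fragility is easy to see: if the site swaps the table for a list, `scrape_table` quietly returns nothing, which is exactly why sharing scrapers so that others can fix them is appealing.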

John McKerrell

John McKerrell on Mapping

John’s talk about mapping attracted the most interest, so he presented it to all attendees, briefly covering mapping APIs, OpenStreetMap and tracking your location with mapme.at.

Phil Bradley

The first Pecha Kucha 20:20 talk was about social media search tools. I wasn’t writing down the links, so check Phil’s Slideshare page for the presentation when it comes out. I will say that Google’s support for Twitter is now much better than he seemed to suggest – for example allowing you to drill into tweets for a particular time. It can also be more reliable than search.twitter.com when using shared IP addresses at a conference.

Gary Green

Gary Green 20/20 talk

Gary mentioned that this was his first presentation so I’m not sure a 20:20 talk was the best idea but he handled it pretty well!

Tony Hirst

Tony Hirst talking about Yahoo! Pipes

The afternoon was dedicated to one of three workshops – Arduino with Adrian McEwen, Mapping with John McKerrell or Mashups with Tony Hirst. I’ve done a bit of each before so I sat at the back of Tony’s talk to try to soak up some new tips.

After a final cake break there was the prize giving for the mashup suggestions competition.

@briankelly, @m8nd1 and @ostephens presenting prizes

So all in all a really interesting day! Congratulations to Mandy Phillips and all the organising team for an excellent event.

data.ac.uk

Last Friday I went to Liver and Mash, the Liverpool Mashed Libraries event held at Parr Street Studio 2. I’d been asked to speak by Mandy Phillips (formerly of this Parish) about – erm – something!

I’ll cover the rest of the event in another post but here I’d like to write about the topic of my presentation. You can see the slides below or view them on the Slideshare website alongside my notes:

It took me a while to think of a subject to talk about but eventually I started considering the role higher education institutions play in mashups and in particular what we can bring to the party.

This actually builds on some ideas I’ve been thinking about for a while. On Christmas Eve last year I posted as part of our 25 days series an entry about 2010 being the year of open data. Edge Hill was closed by that point so I’m not surprised few people read it but I said:

I believe there will be an increasing call for Higher Education to open up its data. Whether that’s information about courses using the XCRI format, or getting information out of the institutional VLE in a format that suits the user not the developer, there is lots that can be done. I’m not pretending this is an easy task but surely if it can be done it should because it’s the right thing to do.

So my presentation expanded on some of these ideas. Firstly we need to accept that what we do online isn’t going to suit everyone. HEI websites are huge unwieldy beasts. Doing a Google search for site:edgehill.ac.uk produces over 8,000 results; warwick.ac.uk has 236,000 pages! Combine that with Sturgeon’s Law and we’re in trouble:

Sturgeon’s Law: 90% of everything is crud.

[Before anyone says it… yes that means 90% of what I say is crap!]

If we accept that our websites aren’t going to deliver everything to everyone we have two options: firstly we could throw resources at the problem to add more and more content, but we know from experience how that ends up:

Alternatively we can strip down to our core audience and find other ways to satisfy the so-called “long tail”. To me that means providing data in an open, accessible form that users can take and use in ways that suit them. Let’s do that.

At IWMW 2008, Tony Hirst submitted an innovation competition entry to show which autodiscoverable feeds HEIs feature on their homepages. It seems that in the two years since Aberdeen the number of sites with discoverable feeds has crept up but is still less than half.
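“Autodiscoverable” just means the homepage advertises its feeds with `<link rel="alternate">` tags in the page head. As a rough sketch of how you might check a page for them (this is not Tony’s code; the `FeedLinkFinder` class, the `find_feeds` helper and the sample markup are my own invention), something like this would do in Python:

```python
from html.parser import HTMLParser

# MIME types conventionally used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

# Hypothetical homepage head; the URLs are placeholders.
SAMPLE_HOMEPAGE = """
<html><head>
  <title>University of Metropolis</title>
  <link rel="stylesheet" type="text/css" href="/style.css">
  <link rel="alternate" type="application/rss+xml" title="News" href="/news/rss.xml">
  <link rel="alternate" type="application/atom+xml" title="Events"
        href="https://www.metropolis.ac.uk/events/atom.xml" />
</head><body></body></html>
"""


class FeedLinkFinder(HTMLParser):
    """Record the href of every <link> tag that advertises a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None for bare attributes, so default to "".
        attr = {name: (value or "") for name, value in attrs}
        rel = attr.get("rel", "").lower().split()
        feed_type = attr.get("type", "").lower()
        if "alternate" in rel and feed_type in FEED_TYPES and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html):
    """Return the feed URLs advertised in the page, in document order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


feeds = find_feeds(SAMPLE_HOMEPAGE)
```

Here `feeds` ends up holding the two feed URLs and skips the stylesheet. Relative links like `/news/rss.xml` would still need resolving against the page address before you could fetch them.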

Typically these feeds contain easily available information. News stories are recycled press releases. Often forthcoming events are available as an RSS or Atom feed or even as an iCal feed that can be subscribed to in Google or Outlook Calendar.

Universities also run courses and there’s a standard format for publishing them – XCRI-CAP.

So far, so general, but what other information are people looking for? Freedom of Information legislation came into force on 1st January 2005 applying to all public bodies including Universities. WhatDoTheyKnow from MySociety allows anyone to submit and track FOI requests – simultaneously the most awesome and scary thing for anyone working in a public sector organisation!

Most HEI websites also contain FOI pages or a publication scheme but often the information available is locked up in difficult to access documents. PDFs and unstructured webpages are typically the format of choice. We can make this information more open by publishing in more accessible formats. Maybe uploading to Google Docs (which allows export as CSV or through an API) would be an easy thing to do.

We also have systems containing interesting data – HR, Student Record System, VLE, Library Catalogue – but getting information out can be difficult. When procuring new systems we need to be asking the right questions about vendors’ approach to open data and APIs and building this into the requirements specification.

So my challenge is for us to create data.ac.uk. It would be great to do something sector wide along the lines of data.gov.uk (but, y’know, better!) but an easier model to get started with is something like the Guardian’s Data Store. Let’s start in the areas we have control over:

  • data.metropolis.ac.uk
  • www.metropolis.ac.uk/library/data
  • people.metropolis.ac.uk/~smith/data

Let’s create a webpage and publish some links to existing data. If we have spreadsheets, let’s upload them to Google Docs and post the link. If we have systems with a rubbish API, let’s knock up a wrapper layer to expose something more useful. data.ac.uk isn’t going to happen overnight but each of us can do our bit to build a more open sector.
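To give a flavour of what a “wrapper layer” might look like, here is a very rough Python sketch. It assumes a legacy system that can only spit out pipe-delimited text. The export format, the field names and the `legacy_to_json` function are all hypothetical, not any real vendor’s output.

```python
import csv
import io
import json

# Made-up export from an imaginary legacy catalogue system.
LEGACY_EXPORT = (
    "ID|TITLE|COPIES\n"
    "101|Example Title One |3\n"
    "102|Example Title Two|0\n"
)


def legacy_to_json(text):
    """Convert the pipe-delimited export into a JSON array of records."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    records = [
        {
            "id": row["ID"],
            "title": row["TITLE"].strip(),  # tidy stray whitespace
            "copies": int(row["COPIES"]),   # numbers as numbers, not strings
        }
        for row in reader
    ]
    return json.dumps(records, indent=2)


published = legacy_to_json(LEGACY_EXPORT)
```

Put the output of something like this behind a stable URL and, as far as anyone building a mashup is concerned, the rubbish API underneath no longer matters.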

In the pub following the event the discussion continued with Brian Kelly from UKOLN making an interesting point:

Interesting thought from @briankelly post #mashliv: expect the telegraph/daily mail to hit on public sector/HEIs about transparency/opendata

If that’s not a good enough reason to take open data seriously, I don’t know what is.

One final point about the presentation. I noticed a tweet from Alison Gow of the Liverpool Daily Post and Echo:

Not sure about bravery – I didn’t actually recognise Julian until after the presentation was over which is probably a very good thing!
