The Tales of Xcri-Cap

One of the first projects in my new job here at Edge Hill was JISC’s XCRI-CAP. Where would we be without acronyms? I was thoroughly pleased to have some new ones. Here are the definitions:

  • JISC – a company that pushes innovative digital technology into UK education
  • XCRI-CAP – eXchanging Course Related Information, Course Advertising Profile. It’s a UK standard to describe course information for marketing.

JISC’s goal with XCRI-CAP is to share course information with the organisations who publish it, such as

Xcri - eXchanging Course Related Information




We’re making a feed for our Health CPD courses in the XCRI format so prospective students can find the course they’re looking for.

We moved all our health courses onto WordPress – our CMS (Content Management System) for most of our course information now. It was a good clean break, as we got to redesign the information’s structure to fit the XCRI specification. With the health courses in the CMS, we made the XCRI feed. It passes the course information to the feed aggregators so they can share the information.

Here’s a little information about each phase:

Database Mapping

This involved translating the old database content into the new XCRI fields, where possible, and retaining the other useful information. We used the opportunity to strip out any useless information.

This was the ‘get out your spreadsheet and be very thorough’ phase. It was pretty important because the information had to be labelled right so websites could understand it and use it properly.
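To give a flavour of what the spreadsheet turned into, here is a minimal Python sketch of a field mapping. The old column names (`mod_title`, `mod_desc`, `mod_start`, `internal_code`) and the target field names are invented for this example; they are not our real database columns or the exact XCRI element names.

```python
# Hypothetical mapping from old database columns to new XCRI-style fields.
# None of these column or field names come from the real system.
FIELD_MAP = {
    "mod_title": "title",
    "mod_desc": "description",
    "mod_start": "start",
}

# Old columns worth keeping even though they have no XCRI equivalent.
KEEP_AS_IS = {"internal_code"}


def map_record(old_row):
    """Translate one old database row into the new structure.

    Columns that are neither mapped nor explicitly kept are dropped,
    which is the 'strip out useless information' step.
    """
    new_row = {}
    for column, value in old_row.items():
        if column in FIELD_MAP:
            new_row[FIELD_MAP[column]] = value
        elif column in KEEP_AS_IS:
            new_row[column] = value
    return new_row
```

Keeping the mapping in one table like this makes it easy to check each field against the spreadsheet before any data is moved.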

We continued to the content management system…

Built a new Content Management System (CMS)

We actually extended WordPress, the CMS we use for most of our courses, by making a plugin that saves the health CPD modules. WordPress makes it fairly easy to add new functionality by providing hooks you can latch onto to adapt the software – basically slots to inject your customisations in.

Using WordPress’s custom post types we added courses and presentations to the CMS, and customised the admin area with the right input boxes. A presentation is a living instance of a course – for example, on the Tissue Viability course the Feb 2012 intake and July 2012 intake are different presentations.
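The real thing is built with WordPress custom post types, but the relationship between a course and its presentations can be sketched in a few lines of Python. The class and field names below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Presentation:
    """One intake (a 'living instance') of a course."""
    start: str  # free-text label such as "Feb 2012"; a real system would likely store a date


@dataclass
class Course:
    """A course, which can run many times as separate presentations."""
    title: str
    presentations: List[Presentation] = field(default_factory=list)


# The Tissue Viability example: one course, two presentations.
course = Course(title="Tissue Viability")
course.presentations.append(Presentation(start="Feb 2012"))
course.presentations.append(Presentation(start="July 2012"))
```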

WordPress makes it quite easy to add functionality by using existing plugins. We wanted more precise control over the admin page, so we did some of the building ourselves.

Customised WordPress admin menu

Once the CMS was done we transferred the course information into it.

Built new web pages

We made a page template that displays an individual course’s information.

Front end website page template

We’ve still got some work to do on the template, like adding side navigation that links to similar courses.

Making the feed

First we built an example feed that holds dummy information. We passed this through the XCRI 1.2 validator to make sure it was right. Eventually it stopped telling us we had made massive human errors, so the feed was right. Here’s the static version:

Static Xcri Feed

We made the real feed by passing the course information from the database into this template.
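As a rough idea of what "passing course information into a template" means, here is a small Python sketch that builds an XCRI-like XML document. It is deliberately simplified: the element names are cut down, the XML namespaces are omitted, and the output would not pass the validator as it stands. The function name and the shape of the input data are assumptions for this example, not our plugin code.

```python
import xml.etree.ElementTree as ET


def build_feed(provider_name, courses):
    """Build a simplified, XCRI-like XML document as a string.

    `courses` is a list of dicts with a "title" and a list of
    "presentations" (each presentation is just a start label here).
    """
    catalog = ET.Element("catalog")
    provider = ET.SubElement(catalog, "provider")
    ET.SubElement(provider, "title").text = provider_name
    for course in courses:
        course_el = ET.SubElement(provider, "course")
        ET.SubElement(course_el, "title").text = course["title"]
        for start in course["presentations"]:
            presentation_el = ET.SubElement(course_el, "presentation")
            ET.SubElement(presentation_el, "start").text = start
    return ET.tostring(catalog, encoding="unicode")


feed_xml = build_feed(
    "Edge Hill University",
    [{"title": "Tissue Viability", "presentations": ["Feb 2012", "July 2012"]}],
)
```

Building the document from elements rather than gluing strings together means the XML is always well formed, which leaves the validator free to complain about the things that actually matter.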

And that’s where we are so far. It’s nearly finished, and hopefully it will make it much easier for people to find the right course.

Was 2010 the year of Open Data?

In a little-read post published last Christmas Eve as part of our previous 25 days project I suggested 2010 might be the year open data became important:

I don’t like to predict the future – usually because I’m wrong – but I’m going to put my neck out on one point for the coming year. 2010 will be the year that data becomes important.

So let’s look at what’s happened over the last year.

  • Ordnance Survey Code-Point® Open data containing the location of every postcode in the country. With this people have been able to build some nice cool services like a wrapper API to give you XML/CSV/JSON/RDF as well as a hackable URL: (that’s Edge Hill, by the way)
  • The OS also released a bunch of other data from road atlases in raster format through to vector contour data.  Of particular interest is OS VectorMap in vector and raster format – that’s the same scale as their paper Landranger maps and while it doesn’t have quite as much data, they’re beautifully rendered and suitable for many uses, but sadly not for walking.

OS VectorMap of Ormskirk. Crown copyright and database rights 2010 Ordnance Survey.

  • Manchester has taken a very positive step in releasing transport data (their site is down as I type) – is it too much to hope that Merseytravel will follow suit?
  • London has gone one step further with the London Datastore.
  • now has over 4600 datasets.  Some of them are probably useful.

In May I gave a talk at Liver and Mash expanding on some ideas about Since then lots of other people have been discussing in far more detail than I, including the prolific Tony Hirst from the Open University who have become (I believe) the first with the release of

So things are starting to move in the Higher Education open data world. I think things will move further with the HEFCE consultation on providing information to prospective students and maybe XCRI’s time has come!

Maybe 2011 will be the year people start to do data without even thinking about it?

Last Friday I went to Liver and Mash, the Liverpool Mashed Libraries event held at Parr Street Studio 2. I’d been asked to speak by Mandy Phillips (formerly of this Parish) about – erm – something!

I’ll cover the rest of the event in another post but here I’d like to write about the topic of my presentation. You can see the slides below or watch on the Slideshare website to see them alongside my notes:

It took me a while to think of a subject to talk about but eventually I started considering the role higher education institutions play in mashups and in particular what we can bring to the party.

This actually builds on some ideas I’ve been thinking about for a while. On Christmas Eve last year I posted as part of our 25 days series an entry about 2010 being the year of open data. Edge Hill was closed by that point so I’m not surprised few people read it but I said:

I believe there will be an increasing call for Higher Education to open up its data. Whether that’s information about courses using the XCRI format, or getting information out of the institutional VLE in a format that suits the user not the developer, there is lots that can be done. I’m not pretending this is an easy task but surely if it can be done it should because it’s the right thing to do.

So my presentation expanded on some of these ideas. Firstly we need to accept that what we do online isn’t going to suit everyone. HEI websites are huge unwieldy beasts. Doing a Google search for produces over 8,000 results; has 236,000 pages! Combine that with Sturgeon’s Law and we’re in trouble:

Sturgeon’s Law: 90% of everything is crud.

[Before anyone says it… yes that means 90% of what I say is crap!]

If we accept that our websites aren’t going to deliver everything to everyone we have two options: firstly we could throw resources at the problem to add more and more content, but we know from experience how that ends up:

Alternatively we can strip down to our core audience and find other ways to satisfy the so-called “long tail”. To me that means providing data in an open, accessible form that users can take and use in ways that suit them. Let’s do that.

At IWMW 2008, Tony Hirst submitted an innovation competition entry to show which autodiscoverable feeds HEIs feature on their homepages. It seems that in the two years since Aberdeen the number of sites with discoverable feeds has crept up but is still less than half.
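For anyone wondering what "autodiscoverable" means in practice: a page advertises its feeds with `<link rel="alternate">` tags in the HTML head. The Python sketch below shows one way to pick those out of a page. It is my own illustration rather than how Tony's entry worked, and the class and function names are made up.

```python
from html.parser import HTMLParser

# MIME types conventionally used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect href values of <link rel="alternate"> tags that point at feeds."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        link_type = (attributes.get("type") or "").lower()
        if "alternate" in rel and link_type in FEED_TYPES and attributes.get("href"):
            self.feeds.append(attributes["href"])


def find_feeds(html):
    """Return the feed URLs advertised in an HTML document."""
    parser = FeedLinkParser()
    parser.feed(html)
    return parser.feeds
```

Run against a homepage's HTML, an empty list means the site has no discoverable feeds, even if feeds exist somewhere deeper in the site.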

Typically these feeds contain easily available information. News stories are recycled press releases. Often forthcoming events are available as an RSS or Atom feed or even as an iCal feed that can be subscribed to in Google or Outlook Calendar.

Universities also run courses and there’s a standard format for publishing them – XCRI-CAP.

So far, so general, but what other information are people looking for? Freedom of Information legislation came into force on 1st January 2005 applying to all public bodies including Universities. WhatDoTheyKnow from MySociety allows anyone to submit and track FOI requests – simultaneously the most awesome and scary thing for anyone working in a public sector organisation!

Most HEI websites also contain FOI pages or a publication scheme but often the information available is locked up in difficult to access documents. PDFs and unstructured webpages are typically the format of choice. We can make this information more open by publishing in more accessible formats. Maybe uploading to Google Docs (which allows export as CSV or through an API) would be an easy thing to do.

We also have systems containing interesting data – HR, Student Record System, VLE, Library Catalogue – but getting information out can be difficult. When procuring new systems we need to be asking the right questions about vendors’ approach to open data and APIs and building this into the requirements specification.

So my challenge is for us to create It would be great to do something sector wide along the lines of (but, y’know, better!) but an easier model to get started with is something like the Guardian’s Data Store. Let’s start in the areas we have control over:


Let’s create a webpage, publish some links to existing data. If we have spreadsheets upload them to Google Docs and post the link. If we have systems with a rubbish API, let’s knock up a wrapper layer to expose something more useful. isn’t going to happen overnight but each of us can do our bit to build a more open sector.
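A wrapper layer doesn't have to be complicated. Here is a minimal Python sketch of the idea: a stand-in for an awkward system that returns pipe-delimited text, and a thin function that exposes the same data as JSON. The legacy function, its output format and the field names are all invented for this example.

```python
import json


def legacy_lookup(course_code):
    """Stand-in for an awkward legacy API (entirely made up for this sketch).

    Returns a pipe-delimited string rather than structured data.
    """
    return f"{course_code}|Tissue Viability|Health"


def course_as_json(course_code):
    """Wrap the legacy call and expose the same data as JSON."""
    code, title, subject = legacy_lookup(course_code).split("|")
    return json.dumps({"code": code, "title": title, "subject": subject})
```

The point is that consumers only ever see the JSON, so the messy system behind it can be replaced later without breaking anyone who uses the data.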

In the pub following the event the discussion continued with Brian Kelly from UKOLN making an interesting point:

Interesting thought from @briankelly post #mashliv: expect the telegraph/daily mail to hit on public sector/HEIs about transparency/opendata

If that’s not a good enough reason to take open data seriously, I don’t know what is.

One final point about the presentation. I noticed a tweet from Alison Gow of the Liverpool Daily Post and Echo:

Not sure about bravery – I didn’t actually recognise Julian until after the presentation was over which is probably a very good thing!

Choice Part 4: eProspectus

So on to the first of our web applications! Unsurprisingly, quite a large (and important) part of our audience is people interested in studying at Edge Hill and the courses we offer so it’s important that the website makes it easy for visitors to find the information they need. To meet this requirement, a lot of work has been done on the “Study” area of the site. It’s fair to say that we’ve almost completely overhauled every aspect from content to design and navigation.

I’m not going to take the credit for this – there are other people who’ve pored over every paragraph, photo and link to get it as good as possible – but I will talk about (and take the credit for ;-)) some of the systems we’ve developed. Just like the paper prospectus, sitting at the back of all the beautifully designed copy and images are the course listings. They might not be sexy, but they’re an important part of the process students go through before they apply to Edge Hill.

We’ve had an online course database for a number of years but the Big Brief has given us the opportunity to redevelop it from scratch and look at how people find the courses they’re interested in, the best ways to present information and some less interesting things on the backend that will make managing the information easier.

Over the last few years there has been much discussion and development in the HE community of ways of expressing and sharing course related information. Initial developments were maybe a little ambitious, encompassing DCDs and module information. More recently work has been done on a more focussed project – XCRI CAP – which is looking at the course marketing side of the equation and seems to be much more manageable.

With a growing buzz surrounding XCRI CAP we decided late last summer that it was the way to go when implementing our new eProspectus and just before Christmas we applied for and won JISC funding for a mini project to integrate CAP 1.1 with our systems.

Course information is now completely served from a database. The database structure is based on the information we require to produce XCRI feeds (which will be coming very soon!) so we store details not just of what courses we offer, but different presentations (for example 2008 or 2009 entry, where entry requirements might be different). With course information available in the database, we’re able to embed lists of courses into the relevant areas across the site so updating the eProspectus updates all the pages that use it.

The new back end has allowed us to add some new features to help visitors find what they’re looking for. As well as the A to Z list, you can now get lists of keywords or tags. Tagging is a useful way of navigating around information and we’re using it in a number of ways both internally and externally. We’ll talk about tags in the future because it’s an important part of working with the web.

The My Courses feature is still in development. Currently it allows you to add a number of courses to a “shopping basket” and then compare some key details about them. In the future, we’ll be using this to provide a personalised downloadable prospectus and allow you to save the information for future reference. For applicants this is one of the first points where they start to engage with Edge Hill online.

The search engine deserves an entire post of its own but from the eProspectus side, search results are fully integrated with the entire site search yet contain information relevant to courses rather than simply a summary extract.

Once you’ve found the course you’re interested in, the course details page is broken down into hopefully manageable chunks. Different years of entry have separate summaries; module information, which not everyone will be interested in, is shifted to a separate tab.

So that’s it for our first newly redeveloped application. If you’ve got any comments or questions, please leave a comment.

Bad URLs Part 4: Choose your future

So we’ve looked at examples of bad and good URLs, what the current situation is at Edge Hill, but what are we doing about it?

Lots, you’ll be pleased to know! As part of the development work we’ve been doing for the new website design, I’ve been taking a long hard look at how our website is structured and plan to make some changes. There are two areas to the changes – ensuring our new developments are done properly and changing existing areas of the site to fit in with the new structure.

Firstly the new developments. We’re currently working on three systems – news, events and prospectus. News was one of the examples I gave last time where we could make some improvements so let’s look at how things might change.

Firstly, all our new developments are being brought onto the main Edge Hill domain – – and each “site” placed in a top level address:

News articles will drop references to the technology used and the database IDs:

In this example the new URL is actually longer than the old one, but I can live with that because it’s more search engine friendly and the structure is human-readable. For example we can guess that the monthly archive page will be:

This idea is nothing new – for the first few years of the web most sites had a pretty logical structure – but it’s something that has been lost when moving to Content Management Systems.

The online prospectus is getting similar treatment. Where courses are currently referenced by an ID number, the URL will contain the course title:

As part of our JISC funded mini-project, we’ll be outputting XCRI feeds from the online prospectus. The URLs for these will be really simple – just add /xcri to the end of the address:

In the news site, feeds of articles, tags, comments and much more will be available simply by adding /feed to the URL. The same will apply to search results.
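To make the idea of title-based URLs concrete, here is a small Python sketch of turning a course title into a URL segment and then adding /xcri to the end. The `/courses/` prefix and the function names are assumptions for illustration; the site itself is not written in Python and the real addresses may differ.

```python
import re


def slugify(title):
    """Turn a title into a lowercase, hyphen-separated URL segment."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def course_url(title):
    # The "/courses/" prefix is an assumption for illustration only.
    return "/courses/" + slugify(title)


def xcri_url(title):
    # As described above: add /xcri to the end of the course address.
    return course_url(title) + "/xcri"
```

Because the feed address is derived from the page address by a fixed rule, anyone who knows a course's URL can guess where its feed lives.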

All this is great for the new developments, but we do have a lot of static pages that won’t be replaced. Instead, pages will move to a flatter URL structure. For example, the Faculty of Education site will be available directly through what is currently the vanity URL meaning that most subpages also have a nice URL:

Areas of the site which were previously hidden away three or four levels deep will be made more accessible through shorter URLs.

How are we doing this? The core of the new site is a brand new symfony based application. This allows us to embed our dynamic applications – news, events and prospectus – more closely into the site than has previously been possible. symfony allows you to define routing rules which, while they look complex on the backend because of the way they mix up pages, produce a uniform look to the structure of the site.

For our existing content we’re using a combination of some lookup tables in our symfony application and some Apache mod_rewrite rules to detect requests for existing content. All the existing pages will be redirected to their new locations so any bookmarks will continue to work and search engines will quickly find the new versions of pages.
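The combination of a lookup table and pattern-based rewrite rules can be sketched in Python as follows. This is only an illustration of the approach: the old and new paths are invented, and the real work is done by the symfony application and Apache mod_rewrite rather than by code like this.

```python
import re

# Hypothetical lookup table of old paths to new paths (not the real site map).
REDIRECTS = {
    "/faculties/education/index.htm": "/education",
}

# Hypothetical pattern rule, similar in spirit to a rewrite rule:
# it moves year/month archive pages from an old prefix to a new one.
PATTERN_RULES = [
    (re.compile(r"^/old-news/(\d{4})/(\d{2})/?$"), r"/news/\1/\2"),
]


def resolve_redirect(path):
    """Return the new location for an old path, or None if no rule matches."""
    # Exact matches in the lookup table win over pattern rules.
    if path in REDIRECTS:
        return REDIRECTS[path]
    for pattern, replacement in PATTERN_RULES:
        if pattern.match(path):
            return pattern.sub(replacement, path)
    return None
```

In a live setup the returned location would typically be sent back as a permanent (301) redirect, which is what lets bookmarks keep working and tells search engines to update their index.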

That’s all for this little series of posts about URLs. Hopefully it has helped explain some of my thinking behind the changes. If you’ve got any questions then drop me an email or post a comment.

eXchanging Course Related Information

Three weeks without a noise from the Web Services blog! How have you coped, dear reader?! We’ve got lots going on with some exciting developments you’ll hear about over the coming few weeks but I’m going to talk about something that’s probably not quite as exciting to most people!

Before Christmas we submitted a proposal for JISC funding for a mini-project looking into implementing and testing the XCRI format. XCRI is an application of XML which is designed for exchanging course information between organisations. For example universities could provide a feed of courses to websites which aggregate course information, reducing the need to retype information.

I’m happy to say that we heard just before the holiday that our proposal was accepted! So now the work begins on integrating XCRI into our systems. This isn’t as hard as it might be – part of the work we’re doing redeveloping the corporate website is on the eProspectus and we’re ensuring from the start that all the information required to output valid XCRI feeds is available.

About a week ago I attended the JISC CETIS Joint Portfolio SIG and Enterprise SIG Meeting at Manchester Met. I didn’t really know what to expect but there was a session outlining the XCRI project and developments from last year so I thought it would be useful.

The first morning session was from Peter Rees Jones about ePortfolios and how HE can integrate better with companies. More acronyms than you can shake a stick at, but many interesting thoughts.

Same for John Harrison’s session on “Personal Information Brokerage”. There are some obvious comparisons with OpenID, but it offers more than that. Edentity clearly think that Education (and delivery companies!) have the capacity to act as a hub for implementing some of the systems they propose. Personally, I suspect that the commercial sector will do more than they give it credit for. Looking at the criteria for selection:

  1. Need for further data sharing
  2. Clear organisational boundaries
  3. Capacity for collective action
  4. Demographics

John marked them down on 3 and 4 but I disagree. If that doesn’t describe Google, Amazon, Yahoo and a bunch of other online companies (including most that get a “Web 2.0” label), I don’t know what does. Okay, standards may be slow to establish at times, but when there’s the will it can happen!

So on to XCRI. There were a few presentations from people explaining the XCRI standard and how it’s been implemented in institutions. Mark Stubbs gave a good overview of the standard, where it’s come from and where it’s going. I’ve been using a useful diagram handout showing the proposed XCRI-CAP 1.1 schema for the last week to check that what we’re developing for the eProspectus is heading along the right lines.

A few of the last round of XCRI mini projects displayed their work – the University of Bolton probably most closely matching the work we’re doing at Edge Hill. They’ve not yet launched their new site but I’m keeping an eye out for it!

Some of the slides (including those from Selwyn Lloyd of Phosphorix – developers behind CPD Noticeboard) are on the website, so check it out if you’re interested.