Shorter URLs

Recently we successfully registered an additional domain name – – for the University. Rather than simply using this as an additional alias for the main website addresses, we’re using it to provide a URL shortening service.

URL shortening services are nothing new – TinyURL was launched in 2002 – but while for years they were used to shorten web addresses in emails, with the advent of Twitter and its 140-character limit these services have gained new popularity.

These services do have some major problems, however – notably, what happens if a service goes out of business, either through running out of money or through the top-level domain owner cancelling it? This has led many people to consider running their own service, and now that we have a nice short URL, we’re following suit.

We are using the popular YOURLS system, written in PHP with some custom plugins:

  • Lowercase URLs: we want short URLs to be case insensitive so that it doesn’t matter how people type them in
  • Top Level URLs keep their keyword for our main domain name, so maps to
  • For our own domain names we add in Google Analytics campaign keywords allowing us to determine where traffic comes from
  • URLs can be modified to include a source with just three extra characters which is then passed through as a Google Analytics medium
  • QR codes are available for all short URLs by simply adding .qr to the keyword
  • Certain keywords relate to the type of content, for example undergraduate courses have been seeded with their UCAS code, e.g. is BSc Computing
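The first two plugin behaviours – case-insensitive keywords and the .qr suffix – come down to a little string handling at lookup time. Here’s a minimal sketch in Python (YOURLS plugins are actually PHP, and the function name here is our own, so treat this as an illustration of the logic rather than real plugin code):

```python
def resolve_short_url(path: str) -> tuple[str, bool]:
    """Resolve a short-URL path to a lookup keyword, mimicking the
    lowercase and .qr behaviours described above. Hypothetical helper,
    not actual YOURLS plugin code."""
    keyword = path.strip("/").lower()      # case-insensitive: /NEWS == /news
    wants_qr = keyword.endswith(".qr")     # /news.qr -> QR code for /news
    if wants_qr:
        keyword = keyword[: -len(".qr")]
    return keyword, wants_qr
```

Once normalised like this, the keyword can be looked up in the short-URL table regardless of how the visitor typed it.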

This service is currently in beta for use with the new prospectus but we’ll be making use of it further in the near future, for example exposing short URLs for pages within GO.

Let us know if you have any ideas for other things we can do with this service!

Bad URLs Part 4: Choose your future

So we’ve looked at examples of bad and good URLs, and at the current situation at Edge Hill – but what are we doing about it?

Lots, you’ll be pleased to know! As part of the development work we’ve been doing for the new website design, I’ve been taking a long hard look at how our website is structured and plan to make some changes. There are two areas to the changes – ensuring our new developments are done properly and changing existing areas of the site to fit in with the new structure.

Firstly the new developments. We’re currently working on three systems – news, events and prospectus. News was one of the examples I gave last time where we could make some improvements so let’s look at how things might change.

To start with, all our new developments are being brought onto the main Edge Hill domain – – and each “site” is placed at a top-level address:

News articles will drop references to the technology used and the database IDs:

In this example the new URL is actually longer than the old one, but I can live with that because it’s more search engine friendly and the structure is human-readable. For example we can guess that the monthly archive page will be:

This idea is nothing new – for the first few years of the web most sites had a pretty logical structure – but it’s something that has been lost in the move to Content Management Systems.

The online prospectus is getting similar treatment: where courses are currently referenced by an ID number, the URL will contain the course title:

As part of our JISC funded mini-project, we’ll be outputting XCRI feeds from the online prospectus. The URLs for these will be really simple – just add /xcri to the end of the address:

In the news site, feeds of articles, tags, comments and much more will be available simply by adding /feed to the URL. The same will apply to search results.
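That “add /feed to the URL” convention is easy to picture as routing logic. A toy sketch in Python – the paths and the dictionary shape are illustrative, not our actual routing code:

```python
def route(path: str) -> dict:
    """Treat a trailing /feed segment as 'same resource, RSS format'.
    Illustrative only - the real routing lives in the symfony app."""
    parts = [p for p in path.strip("/").split("/") if p]
    as_feed = bool(parts) and parts[-1] == "feed"
    if as_feed:
        parts = parts[:-1]                 # drop the format segment
    return {"resource": "/".join(parts),
            "format": "rss" if as_feed else "html"}
```

The nice property is that any listing page – an archive, a tag, a search – gets its feed for free, because the feed URL is derived from the page URL rather than maintained separately.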

All this is great for the new developments, but we do have a lot of static pages that won’t be replaced. Instead, pages will move to a flatter URL structure. For example, the Faculty of Education site will be available directly through what is currently the vanity URL, meaning that most subpages also get a nice URL:

Areas of the site which were previously hidden away three or four levels deep will be made more accessible through shorter URLs.

How are we doing this? The core of the new site is a brand new symfony-based application. This allows us to embed our dynamic applications – news, events and prospectus – more closely into the site than has previously been possible. symfony lets you define routing rules which, while they look complex on the backend because of the way they mix up pages, produce a uniform structure across the site.
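To give a flavour of what those routing rules look like, here is a hypothetical fragment of a symfony 1.x routing.yml. The rule names, URL patterns and module/action pairs are invented for illustration, not taken from our actual configuration:

```yaml
# Each rule maps a clean public URL onto an internal module/action pair,
# so the "mixing up" happens here rather than in the visible URL.
news_article:
  url:   /news/:year/:month/:slug
  param: { module: news, action: show }

course:
  url:   /study/courses/:slug
  param: { module: prospectus, action: course }
```

The URL stays human-readable and guessable while the application behind it is free to be organised however suits the code.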

For our existing content we’re using a combination of some lookup tables in our symfony application and some Apache mod_rewrite rules to detect requests for existing content. All the existing pages will be redirected to their new locations so any bookmarks will continue to work and search engines will quickly find the new versions of pages.
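The mod_rewrite side of this can be as simple as a permanent redirect per legacy area. A sketch with invented paths – our real rules, combined with the lookup tables, are more involved:

```apache
RewriteEngine On
# 301 = moved permanently: bookmarks keep working and search engines
# transfer their index entries to the new, flatter location.
RewriteRule ^faculties/education/(.*)$ /education/$1 [R=301,L]
```

Using a 301 rather than a 302 matters here: it tells search engines the move is permanent, so the new URLs inherit the rankings of the old ones.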

That’s all for this little series of posts about URLs. Hopefully it has helped explain some of my thinking behind the changes. If you’ve got any questions then drop me an email or post a comment.

Bad URLs Part 3: Confessions time

Over the last couple of posts I’ve been looking at URLs, good and bad. Now it’s time to examine what we do at Edge Hill and see how we fare!

Most of our website is currently made up of static pages so it looks something like this:

Other areas of the site aren’t quite so great:

Not terribly bad, but it’s a little bit long – I wouldn’t like to read it out over the phone – and because the URL is structured to mirror the department, the URL could change whenever names change.

The site structure is quite deep which has led to some quite strange locations for pages, for example the copyright page linked to from every page on the site is within the Web Services area of the site:

For use in publications, there’s a whole bunch of “vanity URLs” like this:

And that will redirect you to the page you’re looking for. These are great – easy to read over the phone or type in – but since most of them redirect to the actual page, most people don’t know about them: if you copy and paste into a document you’re producing, you’ll get the long URL. They’re also not universal – short URLs exist for some departments and services but not others, and sometimes it can be hard to pick a good vanity URL.

When we look at some of our dynamic content however, things aren’t quite so great.

What’s wrong with this?

  • Mystery “info” server – splitting page rankings on Google
  • Telling the user what technology we’re using for the page (ASP)
  • Meaningless ID numbers
  • EHU_news – why not just news?
  • Nothing that search engines can pick up on for keywords

The first site I worked on for Edge Hill – Education Partnership – is a bit of a mixed bag:

It’s fairly readable with words describing the pages rather than ID numbers, but it’s hosted as part of the GO website despite being in the main website template. There’s also the “static” in there, which is a by-product of the implementation rather than being relevant to the URL. I’ve learnt my lesson and it won’t happen again.

Overall I think we score 5/10 – no nightmare URLs but lots of scope for improvement and next time I’ll be looking at some of the plans to change how our sites are structured and maybe get a little technical about how we’re implementing it!

Bad URLs Part 2: The Beauty of URLs

Last time I gave some examples of awful URLs, but not everyone gets it wrong. Let me give you some examples of truly beautiful URL structures and explain their benefits!

Ask Auntie

If you ask almost any UK-based web developer for a list of the best produced websites, the Beeb will be pretty high up. They do a lot of things very well, and you’d expect so with their budget! URLs are just one example. Think of a major TV programme on the BBC and add the name after and 95% of the time that’s the address of the website. Try it out…

Considering the size of the BBC site, they seem to have a very well organised structure. Not too many levels deep – usually only one or two – and URLs stay around for a very long time. Check out Politics 97, or the Diana Remembered website. See how even when names change the content follows – now takes you to an index page linking to CBeebies and CBBC.

A new development from the BBC, still in beta, is even more impressive. Their new BBC Programmes site is an index of every TV and radio programme shown on BBC stations. For each series it lists episodes and scheduling information. Great, but didn’t the channel listings do this already? No – those only showed the next week and didn’t contain an archive; the new site gives every series, episode and showing a unique, permanent URL.

Programmes are represented by a short alphanumeric identifier rather than their full name:

This has the advantage of being short but is hard to predict. In one of the comments on their introduction to the programmes site (and some other cool stuff they do with URLs), Michael Smethurst explains the reasoning behind their chosen structure:

We thought long and hard about the best way to make programmes addressable and, as ever, there’s no perfect solution. So…

…no channel cos not only do episodes get broadcast on multiple channels they can also change “channel ownership” over time.
and no brand > series > episode cos so many programmes don’t fit this model.
We’d love to have made human readable/hackable AND persistent urls (and have on the aggregation pages) but it just wasn’t possible

There’s another cool feature of BBC Programmes mentioned in that post:

We’re also working on additional views so that in the near future adding .json, .mobile, .rss, .atom, .iCal or .yaml to the end of the URL will give you that resource in that format.

You might not know (or care!) what each of those formats is, but what it means for every user is that they’re free to take the information that the BBC provide and use it within their own system. Already there is microformatted information embedded into every page.
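The suffix idea from the quote above is simple to implement: strip a known format extension off the path and serve the same resource in a different representation. A sketch in Python – the format list is the BBC’s, but the function and the placeholder identifier are ours:

```python
# Formats listed in the BBC's post; "html" is the default when no
# recognised suffix is present.
FORMATS = {"json", "mobile", "rss", "atom", "ical", "yaml"}

def negotiate(path: str) -> tuple[str, str]:
    """Split a path like '/programmes/b0000000.json' (placeholder
    identifier) into the resource and the requested representation."""
    base, dot, suffix = path.rpartition(".")
    if dot and suffix.lower() in FORMATS:
        return base, suffix.lower()
    return path, "html"
```

Because the resource part of the URL is identical across formats, every representation is discoverable from any other – exactly what makes the data reusable by other systems.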

Train Times done right

Another fantastic example of beautiful URL structure is from . This site is an alternative to the awful official site which provides rail information. They offer a fully accessible interface to train times and fares in a format much easier to browse and navigate than National Rail. But alongside the forms letting you search is some URL magic. Say you want to travel from Liverpool to London – simply tag it onto the end of the URL:

Not leaving right now. Okay…

Not leaving today? That’s fine too:

Want the price?

The Train Times site has so much flexibility – you can use station codes instead of the full name and it will recognise a variety of date formats. National Rail could learn a lot!
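This “hackable URL” pattern – path segments for origin, destination, date and time, with several date formats tolerated – can be sketched like so in Python. The path grammar and the accepted formats here are our guesses at the pattern, not the site’s actual implementation:

```python
from datetime import datetime

def parse_journey(path: str) -> dict:
    """Parse a hackable journey URL like '/liverpool/london/2008-06-21/09.00'.
    Hypothetical sketch - not the real site's URL grammar."""
    parts = [p for p in path.strip("/").split("/") if p]
    journey = {"from": parts[0], "to": parts[1], "date": None, "time": None}
    for part in parts[2:]:
        # Try a few tolerated date formats before giving up.
        for fmt in ("%Y-%m-%d", "%d-%m-%Y", "%d-%b-%Y"):
            try:
                journey["date"] = datetime.strptime(part, fmt).date().isoformat()
                break
            except ValueError:
                continue
        else:
            journey["time"] = part         # not a date, treat it as a time
    return journey
```

Every segment after the stations is optional, so /liverpool/london on its own is a valid query too – that forgiveness is what makes the URLs pleasant to type by hand.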

That’s enough examples for now, but there will be more later on. Next time I’ll be looking at Edge Hill’s URLs and seeing what we’re doing right, but more importantly where we can improve.