Monthly Archives: February 2008

Google Gone Phishing?

A couple of us in the office received an email from Google about Analytics and AdWords:

Hello,

It’s come to our attention that your AdWords account has not been properly linked with your Google Analytics account. If you do not link your accounts by March 5, 2008, we cannot ensure that your data from AdWords will continue to populate in your Analytics account.

It’s fast and simple to link both accounts. Please review the various scenarios below and follow the instructions that best suits your needs.

[bunch of links]

We ask for your cooperation to help us make your experience with Google Analytics the best that it can be.

Sincerely,

The Google Analytics Team

It looks fairly legitimate – the links actually point to the address shown in the email (scammers commonly show one address while linking to another) and it’s certainly possible that this is something we (and I personally – I got an email to my personal Gmail address) need to do.

But you can never be certain, so I Googled for “come to our attention that your AdWords” and received just one result – a forum post on Search Engine Watch about the very same email – so it seems that we’re not the only ones getting it. The general consensus is to hang on and wait until Google confirm it, so I’ll be keeping an eye on it until then.

What’s more interesting though is that the forum post I found was less than an hour old – Google had spidered it blisteringly fast! This is a massive difference to just a few years ago when search engines would update their index maybe every couple of months if you were lucky.

Let’s see how quickly this post gets indexed!

Update: less than 20 minutes!

Web Services Project Officer

Web Services are recruiting a new member of staff. We’re looking for someone to join our team to work with the Faculty of Health to create a system to link staff, students and external partners to aid communication and better manage information.

The first half of this twelve-month contract will focus on planning and implementing an extranet, so we’re particularly interested in people with experience of wikis, content or document management systems. Later in the year the successful applicant will be working on other projects for the Faculty to help improve communications.

You can find the full job description and person specification on the website but if you have any questions, please do not hesitate to contact me on michael.nolan@edgehill.ac.uk or phone 01695 584195.

Bad URLs Part 4: Choose your future

So we’ve looked at examples of bad and good URLs and at the current situation at Edge Hill, but what are we doing about it?

Lots, you’ll be pleased to know! As part of the development work we’ve been doing for the new website design, I’ve been taking a long hard look at how our website is structured and plan to make some changes. There are two areas to the changes – ensuring our new developments are done properly and changing existing areas of the site to fit in with the new structure.

Firstly the new developments. We’re currently working on three systems – news, events and prospectus. News was one of the examples I gave last time where we could make some improvements so let’s look at how things might change.

All our new developments are being brought onto the main Edge Hill domain – www.edgehill.ac.uk – and each “site” placed in a top level address:

http://info.edgehill.ac.uk/EHU_news/
becomes:
http://www.edgehill.ac.uk/news

News articles will drop references to the technology used and the database IDs:

http://info.edgehill.ac.uk/EHU_news/story.asp?storyid=765
becomes:
http://www.edgehill.ac.uk/news/2008/01/the-performance-that-is-more-canal-than-chanel

In this example the new URL is actually longer than the old one, but I can live with that because it’s more search engine friendly and the structure is human-readable. For example we can guess that the monthly archive page will be:

http://www.edgehill.ac.uk/news/2008/01
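As a rough sketch of how a date-and-title URL like this could be generated, here is a small Python example. It is not the code running on the site: the function names, the article title and the day of publication are all assumptions made for illustration.

```python
import re
import unicodedata
from datetime import date

BASE = "http://www.edgehill.ac.uk/news"


def slugify(title: str) -> str:
    """Lowercase the title, drop accents, and join the words with hyphens."""
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def archive_url(published: date) -> str:
    """Monthly archive address: /news/YYYY/MM."""
    return f"{BASE}/{published:%Y/%m}"


def article_url(published: date, title: str) -> str:
    """Article address: the monthly archive address plus the title slug."""
    return f"{archive_url(published)}/{slugify(title)}"


# Hypothetical title and day of the month, chosen to match the example URL.
published = date(2008, 1, 15)
print(article_url(published, "The performance that is more canal than Chanel"))
print(archive_url(published))
```

Because the article address is built on top of the archive address, trimming the last segment of any article URL always lands on a real page.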

This idea is nothing new – for the first few years of the web most sites had a pretty logical structure – but it’s something that has been lost when moving to Content Management Systems.

The online prospectus is getting similar treatment. Where courses are currently referenced by an ID number, the URL will instead contain the course title:

http://info.edgehill.ac.uk/EHU_eprospectus/details/BS0041.asp
becomes:
http://www.edgehill.ac.uk/study/courses/computing

As part of our JISC funded mini-project, we’ll be outputting XCRI feeds from the online prospectus. The URLs for these will be really simple – just add /xcri to the end of the address:

http://www.edgehill.ac.uk/study/courses/xcri
http://www.edgehill.ac.uk/study/courses/computing/2009/xcri

In the news site, feeds of articles, tags, comments and much more will be available simply by adding /feed to the URL. The same will apply to search results.
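The "just add a suffix" convention is simple enough to sketch in a few lines of Python. The helper name `with_suffix` is made up for this example, and keeping any query string in place is an assumption about how search result feeds might behave rather than a description of the real site.

```python
from urllib.parse import urlsplit, urlunsplit


def with_suffix(url: str, suffix: str) -> str:
    """Append a path segment such as 'feed' or 'xcri' to a page URL."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") + "/" + suffix
    # Query string and fragment (if any) are carried over unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(with_suffix("http://www.edgehill.ac.uk/study/courses", "xcri"))
print(with_suffix("http://www.edgehill.ac.uk/news", "feed"))
```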

All this is great for the new developments, but we do have a lot of static pages that won’t be replaced. Instead, pages will move to a flatter URL structure. For example, the Faculty of Education site will be available directly through what is currently the vanity URL meaning that most subpages also have a nice URL:

http://www.edgehill.ac.uk/Faculties/Education/Research/index.htm
becomes:
http://www.edgehill.ac.uk/education/research

Areas of the site which were previously hidden away three or four levels deep will be made more accessible through shorter URLs.

How are we doing this? The core of the new site is a brand new symfony-based application. This allows us to embed our dynamic applications – news, events and prospectus – more closely into the site than has previously been possible. symfony allows you to define routing rules which, while they look complex on the backend because of the way they mix up pages, produce a uniform look to the structure of the site.
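To give a flavour of what routing rules do, here is a minimal Python sketch of the idea. It is not symfony's routing syntax and not our actual configuration: the route names and patterns are invented, and only the URL shapes come from the examples in this post.

```python
import re

# Ordered list of (pattern, route name); the first match wins.
# Route names are hypothetical labels for whichever code handles the request.
ROUTES = [
    (re.compile(r"^/news/(?P<year>\d{4})/(?P<month>\d{2})/(?P<slug>[a-z0-9-]+)$"), "news_article"),
    (re.compile(r"^/news/(?P<year>\d{4})/(?P<month>\d{2})$"), "news_archive"),
    (re.compile(r"^/news$"), "news_index"),
    (re.compile(r"^/study/courses/(?P<course>[a-z0-9-]+)$"), "course_detail"),
    # Anything else falls through to the static pages.
    (re.compile(r"^/(?P<path>.+)$"), "static_page"),
]


def match_route(path: str):
    """Return (route name, captured parameters) for a request path, or None."""
    path = path.rstrip("/") or "/"
    for pattern, name in ROUTES:
        match = pattern.match(path)
        if match:
            return name, match.groupdict()
    return None


print(match_route("/news/2008/01"))
print(match_route("/education/research"))
```

The rules mix dynamic applications and static pages in one ordered list, which is why they look messy behind the scenes even though visitors only ever see one consistent structure.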

For our existing content we’re using a combination of some lookup tables in our symfony application and some Apache mod_rewrite rules to detect requests for existing content. All the existing pages will be redirected to their new locations so any bookmarks will continue to work and search engines will quickly find the new versions of pages.
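The combination of a lookup table and a pattern rule can be sketched roughly as follows. This is Python rather than the real symfony and mod_rewrite setup, the table contents and the fallback rule are assumptions, and the use of a permanent (301) redirect is also an assumption, since that is the status code search engines normally treat as "this page has moved for good".

```python
import re
from urllib.parse import urlsplit

NEW_SITE = "http://www.edgehill.ac.uk"

# Hypothetical lookup table for pages that need an explicit mapping.
LOOKUP = {
    "/Faculties/Education/Research/index.htm": "/education/research",
}

# Hypothetical fallback rule: drop the "Faculties" level and any trailing
# index page, then lowercase what is left.
FACULTY_RULE = re.compile(r"^/Faculties/(?P<rest>.+?)(?:/index\.html?)?$")


def redirect_for(old_url: str):
    """Return (status code, new URL) for an old address, or None if unknown."""
    path = urlsplit(old_url).path
    if path in LOOKUP:
        return 301, NEW_SITE + LOOKUP[path]
    match = FACULTY_RULE.match(path)
    if match:
        return 301, NEW_SITE + "/" + match.group("rest").lower()
    return None


print(redirect_for("http://www.edgehill.ac.uk/Faculties/Education/Research/index.htm"))
print(redirect_for("http://www.edgehill.ac.uk/Faculties/Education/index.html"))
```

Checking the table first means awkward one-off pages can be mapped by hand, while the pattern rule catches the bulk of pages that follow a predictable structure.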

That’s all for this little series of posts about URLs. Hopefully it has helped explain some of my thinking behind the changes. If you’ve got any questions then drop me an email or post a comment.

Bad URLs Part 3: Confessions time

Over the last couple of posts I’ve been looking at URLs, good and bad. Now it’s time to examine what we do at Edge Hill and see how we fare!

Most of our website is currently made up of static pages so it looks something like this:

http://www.edgehill.ac.uk/Faculties/Education/index.html

Other areas of the site aren’t quite so great:

http://www.edgehill.ac.uk/Faculties/FAS/English/History/index.html

Not terribly bad, but it’s a little bit long – I wouldn’t like to read it out over the phone – and because the URL is structured to mirror the department, when names change the URL could change as well.

The site structure is quite deep which has led to some quite strange locations for pages, for example the copyright page linked to from every page on the site is within the Web Services area of the site:

http://www.edgehill.ac.uk/Sites/ITServices/WebServ/Copyright.htm

For use in publications, there’s a whole bunch of “vanity URLs” like this:

http://www.edgehill.ac.uk/education

And that will redirect you to the page you’re looking for. These are great – easy to read over the phone or type in – but since most of them force a redirect to the actual page, most people don’t know about them: if you copy and paste into a document you’re producing, you’ll get the long URL. They’re also not universal – short URLs exist for some departments and services but not others, and sometimes it can be hard to pick a good vanity URL.

When we look at some of our dynamic content however, things aren’t quite so great.

http://info.edgehill.ac.uk/EHU_news/article.asp?id=4786

What’s wrong with this?

  • Mystery “info” server – splitting page rankings on Google
  • Tells the user what language we’re using for the page (ASP)
  • Meaningless ID numbers
  • EHU_news – why not just news?
  • Nothing that search engines can pick up on for keywords
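Some of these problems are mechanical enough to check automatically. The following Python sketch is only an illustration: the function name, the list of file extensions and the wording of the messages are all made up, and it covers just three of the points above.

```python
from urllib.parse import parse_qs, urlsplit

# Extensions that give away the server-side technology (illustrative list).
TECH_EXTENSIONS = (".asp", ".aspx", ".php", ".jsp", ".cfm")


def url_problems(url: str, main_host: str = "www.edgehill.ac.uk") -> list:
    """Return a list of the problems spotted in a URL (empty if none)."""
    parts = urlsplit(url)
    problems = []
    if parts.netloc != main_host:
        problems.append("not on the main domain")
    if parts.path.lower().endswith(TECH_EXTENSIONS):
        problems.append("exposes the server technology")
    query_values = [v for values in parse_qs(parts.query).values() for v in values]
    if any(value.isdigit() for value in query_values):
        problems.append("meaningless numeric ID in the query string")
    return problems


print(url_problems("http://info.edgehill.ac.uk/EHU_news/article.asp?id=4786"))
print(url_problems("http://www.edgehill.ac.uk/education"))
```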

The first site I worked on for Edge Hill – Education Partnership – is a bit of a mixed bag:

https://go.edgehill.ac.uk/ep/static/primary-mentors

It’s fairly readable with words describing the pages rather than ID numbers, but it’s hosted as part of the GO website despite being in the main website template. There’s also the “static” in there, which is a by-product of the implementation rather than being relevant to the URL. I’ve learnt my lesson and it won’t happen again.

Overall I think we score 5/10 – no nightmare URLs but lots of scope for improvement. Next time I’ll be looking at some of the plans to change how our sites are structured and maybe get a little technical about how we’re implementing it!