Virgin CEO: “This net neutrality thing is a load of bollocks”

Whilst I’m not one to cover a news story that’s already been covered elsewhere, TorrentFreak has a priceless quote from Virgin Media’s CEO: “This net neutrality thing is a load of bollocks”.

Virgin seem to have no problems in admitting that they are throttling sites that do not pay Virgin extra cash, despite the fact that these sites have already paid their own hosting providers. Read the full story here, and if you’re a customer, ring up, moan and consider switching.

Fixing SpamAssassin’s FORGED_HOTMAIL_RCVD false positives

Lately people have been telling me about emails that I never received. A quick analysis reveals that these ended up in my Trash folder - meaning that SpamAssassin gave them a high spam rating.

After delving into Perl and examining Hotmail’s mail headers, it seems that Hotmail recently changed the structure of their headers (probably with the merge with Windows Live Mail). As a result, SpamAssassin no longer finds the characteristic header style it expects from Hotmail’s SMTP servers, and concludes that a spammer is pretending to be Hotmail.

Luckily, SpamAssassin 3.2.x has a new FORGED_HOTMAIL_RCVD2 rule with the new Hotmail header structure defined. But this version isn’t available in the stable Debian Etch. Version 3.2.x is, however, in the Debian Lenny repository, and being all Perl, installs and runs fine on Etch with no extra dependencies. All of your settings from Etch’s SA will still work (or at least they did for me).

To upgrade, simply type the following from a root console (the old version of SA will be removed automatically):
wget http://ftp.uk.debian.org/debian/pool/main/s/spamassassin/spamassassin_3.2.3-1_all.deb
dpkg -i spamassassin_3.2.3-1_all.deb

If you haven’t discovered the joys of Debian, I’m sure you can find a SpamAssassin 3.2.x package for your distro if you hunt hard enough.

I found the new version to be slightly slower, but more accurate, than the old version.

Publisher 2000 BorderArt causing instant reboots

I have discovered a most annoying bug with Microsoft Publisher 2000’s BorderArt feature.

It all started when I was making a poster for dad on his Windows XP PC, using his version of Publisher 2000, the Office afterthought desktop publishing application. All of a sudden, dad’s PC rebooted instantly. That is to say it went straight from working to BIOS, with no errors and no visible shutdown sequence. Thinking it was probably a rural power blip, I just reloaded Publisher and remade the poster I had lost. However, by the third instant reboot, I began to see a pattern emerge, and tested it a final time to get some proof.

Dad has virtually all Office settings on the default for Office 2000 Small Business Edition, and rarely ventures outside of Thunderbird, Firefox, Word and Publisher. He never even got round to working out what Excel does. In Publisher, if you make a rectangle shape, then click Format, Line/Border Style, More Styles…, a dialog box will appear.

In the BorderArt tab, you can select one of many tacky picture borders for your rectangle.

Except on dad’s PC, once you scroll about two-thirds of the way down the list of possible BorderArt types, the PC will instantly reboot as if you’ve hit the reset button. No errors, no blue screen of death (thanks to SP2 suppressing one maybe), nothing.

Unfortunately I cannot do more extensive testing because the reboots are very annoying, and because I don’t have a pile of Windows boxes around to test on. However, I would be very interested to hear if this has happened to anybody else, and any insights or possible fixes would be much appreciated.

Random Topic Generator

It happens way too often. You’re in an MSN conversation with someone who’s really cool, then the conversation runs dry. But you really don’t want it to end. You could be completely random like me, and change the topic whenever the previous one runs dry.

/usr/share/dict/words contains a comprehensive list of wacky words (some that don’t look like real words to me, and many plurals that have apostrophes for no good reason). These could be potential topic ideas. It has many proper nouns, so first we must filter out words beginning with capital letters. Then take a random number, modulo the number of words, and find the nth element in the filtered list:

x=$RANDOM; let x%=$(grep -c "^[[:lower:]]" /usr/share/dict/words); grep "^[[:lower:]]" /usr/share/dict/words | head -n$((x+1)) | tail -n1
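One limitation of the one-liner is that bash’s $RANDOM only goes up to 32767, which is likely fewer than the number of lowercase words in the list, so the later words may never come up. Here is a rough Python sketch of the same steps (filter out capitalised words, take a random number modulo the count, pick the nth word). The function names load_topics and pick_topic are made up for illustration.

```python
import random


def load_topics(path="/usr/share/dict/words"):
    """Return the words in the file that start with a lowercase letter."""
    with open(path, encoding="utf-8", errors="replace") as f:
        stripped = (line.strip() for line in f)
        return [word for word in stripped if word and word[0].islower()]


def pick_topic(words, rng=None):
    """Pick one word: a random number, modulo the number of words."""
    if not words:
        raise ValueError("no candidate words")
    rng = rng or random
    # A 32-bit random number comfortably exceeds the size of any word list.
    # (The modulo introduces a tiny bias, which does not matter for topics.)
    index = rng.randrange(2 ** 32) % len(words)
    return words[index]
```

Calling pick_topic(load_topics()) prints nothing by itself; wrap it in print() to get a topic on the console.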

Recovering Truncated or Corrupt Tar Archives

I was unfortunate enough to use resize2fs to resize an ext3 partition. The result, which at first appeared OK, was a corrupt filesystem.

Using SSHFS, I mapped ~martin on Andrew’s laptop to /media/sshfs on mine. I then told tar to make an archive of what was on my partition, and save it to the SSHFS. It errored out midway because of the severely corrupted filesystem, but I didn’t think it would present a problem because the files in tars are simply stored back to back - it is a very basic archive tool.

After running badblocks, which was all clear, then formatting the partition to ext3 again, I attempted to extract the tar. GNU tar consumed CPU cycles for ages without writing any files, then complained about corruption. I assume it was walking the tar to check it wasn’t corrupt. This, as you can imagine, is not very handy. I perused the manpage looking for a switch that would disable the scan and just extract what it can, up to the corruption, but I could not see such a switch.

I found a website that suggested finding where the last intact file in the tar ended, and feeding tar just that part of the file. They supplied a perl script to do the job.

The problem is that this Perl script was unbelievably slow, and had probably only been tested on small files, not my 30GB tar, so Andrew wrote this Python script to do the same thing, using information from Wikipedia and from BestSolution’s Perl script. Andrew’s version started searching from the end of the tar, not the beginning, because we knew that my tar was corrupt at the end.

After a few minutes of watching it make a non-corrupt tar, we realised we were going to have to load the whole tar from the disk twice - once to repair it, and again to actually extract it. Surely a better way would be to just start extracting the truncated one, and give up when we get to the truncated file. Andrew’s Aerauntar does just that.

Usage:
python aerauntar.py archivename destinationfolder
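For anyone curious what the “extract until it breaks” idea looks like, here is a minimal sketch using Python’s standard tarfile module. This is not Andrew’s Aerauntar code, just my own illustration of the approach. The name salvage_tar is made up, and it only handles regular files (no symlinks, permissions or ownership).

```python
import os
import tarfile


def salvage_tar(archive_path, dest_dir, chunk_size=1 << 20):
    """Extract regular files in order, stopping at the first damaged one.

    Returns the names of the files that were recovered in full.
    """
    recovered = []
    dest_root = os.path.realpath(dest_dir)
    os.makedirs(dest_root, exist_ok=True)
    try:
        # "r|" reads the archive as a stream: one pass, no seeking back.
        with tarfile.open(archive_path, mode="r|") as tar:
            for member in tar:
                if not member.isfile():
                    continue  # sketch only: skip dirs, links, devices
                target = os.path.realpath(os.path.join(dest_root, member.name))
                if os.path.commonpath([dest_root, target]) != dest_root:
                    continue  # ignore names that would land outside dest_dir
                os.makedirs(os.path.dirname(target), exist_ok=True)
                written = 0
                try:
                    with tar.extractfile(member) as src, open(target, "wb") as out:
                        while True:
                            chunk = src.read(chunk_size)
                            if not chunk:
                                break
                            out.write(chunk)
                            written += len(chunk)
                except (tarfile.ReadError, EOFError):
                    pass  # archive ended in the middle of this file
                if written != member.size:
                    os.remove(target)  # drop the partial file and give up
                    break
                recovered.append(member.name)
    except (tarfile.ReadError, EOFError):
        pass  # damaged header: keep whatever was extracted so far
    return recovered
```

Because the archive is only read once, from the front, nothing has to be repaired first. The files before the damage are written out, and the half-written one is deleted rather than left looking like a good copy.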

So the moral of the story is not to tar backups from corrupt filesystems, always pay attention to errors and never assume that a program will operate as you think it will.

GNER’s AJAX Ticket Booking Website

I needed to buy some train tickets today, after getting the times from National Rail. Of course, National Rail don’t sell tickets, instead referring you to the train companies themselves. Well, I know what you’re thinking - that I could just buy tickets from my station. My station doesn’t have a ticket machine, nor is it staffed. So I tried a few train companies’ sites. It seems they all subcontract out to The Train Line’s buggy system. The Train Line is a horrid site to use. It relies heavily on sessions, needs you to register before it shows you prices and generally irks me all of the time.

By chance I stumbled onto GNER’s site. They have recently moved to their own custom-designed ticket sales system, and I must say they’ve done a very good job indeed. Not only does it have a Web 2.0 “feel” (being clean and intuitive), it clearly explains the difference between the ticket types, and has AJAX light-boxes displaying each route after you click the more info buttons on them.

Furthermore, it shows a list of prices and a list of possible route-times. Clicking the price you want greys out the routes you are then not allowed to use, and clicking the route you want will grey out the ticket types that can’t be used with this route. Details of train changes are updated in realtime using AJAX as you highlight different routes. It also managed to find a great deal more routes than The Train Line did, in less time. And what’s more, you can of course buy tickets for any UK train from any UK train company. In future, I’ll be buying all of my advance tickets online from GNER, as their website is much more intuitive than the others. Good work GNER!

Integrate NHS computer systems

For all you foreigners, the NHS is Britain’s free state-funded health system.

A lot of people are opposed to the NHS’s ongoing computerization of patients’ records, because it is seen as a waste of taxpayers’ money. Whilst I agree that our government has a history of badly-implemented IT projects that have gone vastly overbudget, the NHS should have been centralized years ago. Surgeries all have their own systems, and most parts of the NHS rely at least partly on paper records. When you move house, you change your registered doctor, and your old surgery sends your records in a bundle to your new surgery.

At St. John’s College and, presumably, many other Oxford colleges, there is a policy whereby you must be registered with an Oxford doctor. This means you cannot be registered with a doctor in your home town. Last summer, when I wanted something as routine as a repeat prescription, my home surgery initially refused because I wasn’t “on their books”. After persevering, I had to fill in a temporary resident form (or something to that effect), which needed my NHS number, something I don’t carry around on me.

Of course now I’m old enough to have to pay for my prescriptions, the amount dispensed seems to have reduced. (You pay per type of medication, no matter how much of that particular drug you are dispensed.) Now I’m back in Oxford, I’ll have to make sure I top up my supplies before going home. But the doctors here have never prescribed me that medicine. Will they issue a repeat prescription when given an old repeat prescription from another surgery? I’ll probably have to book an appointment with one of the doctors, wasting a slot, just to get them to do some paperwork.

This is 2007. Virtually every other sizable organisation has integrated computer systems. Why should the NHS be any different? And why do I need to get a new repeat prescription printed for every instance of the repeat prescription? Surely they can make a form that says “Repeat prescriptions every 60 days until 01/01/2008” that is stamped every time you make a claim off it, or something similar? The current system wastes everybody’s time and causes unnecessary inconvenience.

Hyperlinking to Piracy Sites is against UK law?

Slashdot and The Guardian recently reported on the arrest of the owner of tv-links.co.uk and the site’s subsequent closure. TV Links was a site which linked to videos on other sites (like YouTube and Veoh) where users could see TV series. The arrest was made by officers from Gloucestershire County Council trading standards in conjunction with investigators from Fact and Gloucestershire Police.

The biggest use of TV links that I know of was people watching sci-fi series that had been released in North America, but that the industry wouldn’t make available to UK viewers. So these series-followers had no legal route to get the series in the UK when it was released in America. Maybe the industry should look at why people are using TV links and provide a legal route for them to get the programmes, without making them wait weeks after it’s been aired in America. No-one wants to wait to see programmes that have been aired.

The big concern to me is the Americanization of our country. It seems it is now illegal to link to a site which could be used for piracy. In effect this makes merely distributing information on how to copy copyrighted material illegal. Websites like the BBC seem to have covered themselves by not linking to any of the sites in question when reporting news stories about piracy, but is there any difference between telling people in plain text that they can get movies from The Pirate Bay, and telling them the same thing with a hyperlink? An interesting question would be whether telling someone they can buy pirate DVDs at a certain place at Hemswell market also counts as “facilitation of copyright infringement”.

Fixing File Uploads after upgrading to PHP5

As some of you may know, I upgraded from PHP4 to PHP5 recently. And it went smoothly - or so I thought. Now, it seems I overlooked the fact that the reference $HTTP_POST_FILES has been phased out. This has been replaced with $_FILES. This is great, since Zen Cart, custom scripts, old phpBB installations and other old PHP scripts where users can upload a file now fail spectacularly as they access an empty array.

First I wanted to write a Perl script which I fed a list of files, but with 10GB of websites, this would take ages (and given the whole list, the number of arguments exceeded the command-line limit). So instead I ran these commands from /var/www/vhosts:
find . -name '*.php' -exec perl -p -i -e 's/\$HTTP_POST_FILES/\$_FILES/g' {} \;
find . -name '*.inc' -exec perl -p -i -e 's/\$HTTP_POST_FILES/\$_FILES/g' {} \;

This spawns a new perl instance per matching file, without wasting time checking jpegs and stuff.
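If you’d rather see which files would be touched before changing anything, the same search-and-replace can be sketched in Python. This is only an illustration, not what I actually ran. The function name replace_in_tree and its dry_run option are made up for the example, and it works on raw bytes so that odd file encodings are left alone.

```python
import os

OLD = b"$HTTP_POST_FILES"
NEW = b"$_FILES"


def replace_in_tree(root, extensions=(".php", ".inc"), dry_run=False):
    """Replace OLD with NEW in matching files under root.

    Returns the paths of files that contain OLD. With dry_run=True the
    files are only listed, not modified.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue  # skip jpegs and anything else that is not PHP
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                data = f.read()
            if OLD not in data:
                continue
            if not dry_run:
                with open(path, "wb") as f:
                    f.write(data.replace(OLD, NEW))
            changed.append(path)
    return changed
```

Running replace_in_tree("/var/www/vhosts", dry_run=True) first gives a list of affected files to eyeball, and a second call without dry_run does the rewrite.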

Cycling from Rasen to Oxford

First, let me apologise for neglecting my trusty readers. But I’m back, back from my magical journey.

On the Sunday at the start of freshers’ week I cycled from Middle Rasen to St John’s College, Oxford in 15 hours 59 minutes, including all stops. Unfortunately, due to a dark start at 5.09am, I couldn’t set up my odometer to accurately measure the distance, but I’ll attach a map to show you. Note that I didn’t go as the crow flies (120 miles), but sought out backroads, so the distance is a bit more.

It was a nice cycle, because I wisely chose to carry virtually nothing and get my mum to bring it all in the car the following day. I wouldn’t say it was overly exerting, but I didn’t quite realise how hilly central England is, living in Lincolnshire and all. By two-thirds of the way there’s definitely only one possible speed.

Donov blogged this first. Here’s what he had to say:

Martin is INSANE
Mon, 01 Oct 2007
yesterday martin completed a 150 MILE cycle rids from his house in middle rasen to OXFORD, this journey took him 17 hours in total and has made me think that he is insane.

