Main

March 18, 2005

Two birds with one stone

Since my employer Sun Microsystems informed me that I was being made redundant last month, I've been on so-called "gardening leave", and for the first time in a very long time I've had time to do stuff just for the hell of it. I decided I'd take a look at NetBeans, an IDE for Java that I'd heard good things about.

After grabbing the latest JDK, at Gary's suggestion I downloaded a copy of the Open Source NetBeans IDE to have a play. I've never been a particular fan of IDEs, but NetBeans is actually very good. As well as the editor and debugger there's also a GUI designer and a load of other bits, and it all works together very well. I particularly like the 'as you go' syntax checking which highlights errors in your code in much the same way as the auto spellchecker works in a word processor - the erroneous code is underlined in red, and if you move the cursor over the line you get a diagnostic message. The editor also supports folding, something I first saw years ago in the Occam editor.

The next job was to think of something small but useful to write. Although I've already deployed some anti-blogspam measures on this site, I'm beginning to notice a gradual increase in attacks - inevitably the spammers are getting wise to the more common tricks used to put them off. Some countermeasures such as the "answer this maths question" approach used by blogs.sun.com are trivially circumventable. The most popular and successful countermeasure at the moment seems to be to use a captcha, but personally I don't like them as I feel they are intrusive, and despite the hype about them the implementations often have flaws that still leave them open to attack. The problem is that HTTP is a stateless protocol, so each page has to contain enough context to enable the server to verify that the response to the captcha is correct, whether that be a hidden form field, a cookie or whatever. Because of that, any such scheme is vulnerable to capture/replay attacks. Even using HTTPS to encrypt the communication channel doesn't protect against the attacker viewing the page and/or cookie source and figuring out the protection mechanism.

I therefore decided that obfuscation of the communication between the webserver and browser was probably a reasonable approach, and one way of doing this was to implement comment submission using a Java applet. However, MovableType uses HTML forms and HTTP POST requests to submit comments, and as I didn't want to rewrite the back-end I had to figure out how to get a Java applet to behave as if it were an HTML form.
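To make "behave as if it were an HTML form" a bit more concrete, here is a minimal sketch of what any non-browser client has to do: encode the fields the way a form would and send them as the body of a POST. It is written in Python rather than as a Java applet, and the URL and field names are hypothetical placeholders, not MovableType's real ones.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_comment_post(url, fields):
    """Build (but do not send) a POST request shaped like an HTML form submission."""
    # A form with the default enctype sends its fields URL-encoded in the body.
    body = urlencode(fields).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical script URL and field names, for illustration only.
req = build_comment_post(
    "http://example.org/cgi-bin/comments",
    {"entry_id": "42", "author": "A Reader", "text": "Nice post & thanks!"},
)
```

Passing req to urllib.request.urlopen would actually submit it; the sketch stops short of that so it can be tried without a server.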

Continue reading "Two birds with one stone" »

July 02, 2004

Blogspammed!

I've had 60 comments since I started this site, so I was more than a little surprised to see 200 comment notification emails in my inbox this evening - yes, you guessed it, I'd been blogspammed. Quite a few of my friends don't allow comments on their blogs for this very reason, but I'm loath to be forced to do the same. I've had a few in the past, but tonight was something else - they all arrived within a ten minute period, they were spread across nearly all the entries on my blog and they came from a range of different IP addresses. At a guess I'd say that I'd been attacked by a swarm of zombie machines.

First thing was to take the website down to stop any more arriving and clean up the mess - MovableType really sucks at this, you have to go through each entry individually to remove them. I did think of exporting the blog content, removing the spam and reimporting it, but I suspect all the entry numbers would change, so any links that people have to the site would break. Half an hour and a case of incipient RSI later I was clean.

I've already deployed one of the more common countermeasures by renaming my comment script, plus my site runs under mod_perl so the URLs aren't the normal ones anyway, but whatever hit my site was clever enough to figure out the correct URL for the comment submission script despite this.

I've now added a few more slightly more subtle countermeasures (no I'm not going to tell you what they are :-) and I'll see what happens. If I get hit again I'll have to seriously consider turning off comments, which would be a shame, but my tolerance just won't extend to deleting a series of spam tsunamis.

June 07, 2004

Ooof, I've been TimBrayed...

Tim Bray mentioned my post with the pictures of the bluebells over on his blog, and my number of visitors has shot through the roof, more than twice what I normally get and the day isn't even finished yet - a kind of one-man slashdot effect :-)

May 16, 2004

My last word on MovableType, I promise

Prompted by one of the bazillions of posts on the new MovableType license, I actually went and looked at the new 3.0 licenses, and I've come to the conclusion that the so-called Personal use license is so restrictive that virtually nobody who writes a blog will be able to use it:

“Non-Commercial Purposes” means use of the Software by an individual for publishing on a personal blog site on a single server that does not directly or indirectly support any commercial efforts.

So if I mention anything about my job, it would break the clause about indirectly supporting commercial efforts. For most people their job is a major component of their lives, and it's just ridiculous to assume that they aren't at some point going to mention something that happened to them at work, or rave about some cool work-related project they are working on, for example.

I'm sure I'll probably get someone either working for SixApart or related to them telling me that I'm misinterpreting the license, but it's so vague and badly written that misinterpretation is inevitable. I've also seen people assuming that based on what was said in the original SixApart announcement that they had been given 'A nod and a wink' to carry on using MT in the same way they always had. Sorry - read the license, that's what governs what you can and can't do, not what you read between the lines of some blog entry.

There's also an excellent analysis of the MT3.0 screwup in marketing terms in Elizabeth Lane Lawley's blog:

You don’t just need to know what the random(user) thinks, you need to know what the opinion makers and change agents think - because since Movable Type users are all publishers, with audiences, those people will have an immediate impact on other users with their public reactions.
...
The fact that the response to the new licenses surprised them so much says volumes about how little they understood their users.

Exactly so. Finally, there's a rather good comparison chart of the various MT alternatives in Unbrand's blog on unbounded.org.

May 15, 2004

MovableTripe

Well, after all the furore over the new MovableType license that I mentioned in my last post, SixApart have tried to mollify their users, and many of the MT users seem to have breathed a collective sigh of relief. However, perhaps I'm an old cynic (OK, I am an old cynic), but I just don't buy it, sorry - and here are the reasons why:

  • The people trying to defend SixApart have made much of the fact that nobody has been forced to upgrade to MT3.0, and that it's a 'Developer only' version. Well, excuse me, but have any of you actually tried to find 2.661 on the MovableType website? No? I thought not, because it's not actually there any more, as far as I can tell. So if 3.0 is only intended to be for developers, where is the non-developer version? What are new MT users supposed to use? And before anyone says "Oh I can get a copy from my friend", go read the license, you can't:

    Personal, Non-Commercial Use License. The Software is licensed to you for your personal, non-commercial use. You may install the Software on a single website, residing on a single server for your personal, non-commercial use only. Enjoy!

    and:

    Restrictions on Use. Licensor grants you the non-exclusive, non-transferable right to use the Software to manage and update your personal, non-commercial website. You may not redistribute the Software without Licensor's prior written consent. Although you may modify or alter the Software for your own use (including copies that extend, or enhance the Software), you may not distribute, transfer, or resell the modified or derivative copies of the Software; you may not use such copies for other than personal, non-commercial purposes; and you may not use such copies in a way that violates the terms of this Agreement.

    So to summarise, you can't download 2.661 from the MT website any more, and you can't get a copy from any other source without breaking the license. Conclusion: 2.661 is dead.


  • If this isn't an attempt to force people into paying for 3.0, where is the stuff to allow you to downgrade from 3.0 back to 2.661? How do 3.0 beta testers revert back to 2.661 if they aren't happy with the 3.0 licensing terms?

  • Despite appearances, they haven't actually dropped the prices. If you read the announcement carefully, they've just increased the user counts slightly, so it's still $100 for the cheapest version.

  • Let's not forget that MT is just a few perl CGI scripts where the interface has had some graphic design talent applied - not exactly rocket science. In fact its performance is pretty appalling, and that's not the fault of perl, it's the sign of an indifferently written app. Perl, used correctly, is fast enough for the light text munging needed to run a blog - I have a perl app I wrote for the day job that churns out 146MB of HTML - 31,000 files - in about 10 minutes. No way am I shelling out $100 for MT, because frankly it just isn't worth that kind of money, not when for only $84 I could buy this, for example.

As my friend Elaine pointed out in her comment to my first post, SixApart are no longer just the cute Mena and Ben that started the company, they've taken money from some Venture Capitalists, and as CM Harrington puts it so succinctly in a comment Elaine referred to over here:

6A needs to get paid. I don't think anyone disputes that. People also need to realise that 6A is no longer "Ben and Mena". It's a full-fledged VC funded company with offices around the world.

When 6A became VC funded, a very radical change occurred. The VC, because they are the ones fronting the money, are the ones calling the shots. Don't be fooled by Mena's posts. TypeKey, TypePad, the MT3 Licensing, and the lack of communication are all influenced by the VC. That's just what happens when you become funded. That is the entire point. Other people give you huge chunks of money so they can make more money.

6A probably signed a deal that told the VC that they will get X% return on investment in T time (where X is large, and T is small). How do you do that? Usually it involves doing things you wouldn't normally do to customers or a community you are a part of. The licensing scheme is the obvious result. If they didn't do this, the VC people would probably be able to sell the whole shebang to a large company and sack Mena and Ben without batting an eyelash.

I don't know either of the SixApart founders, but everything I've read suggests that they are nice people, and I have no reason to doubt this. However, they've taken the thirty pieces of silver from the Venture Capitalists, and from what has happened over the MT3.0 release it seems clear that they aren't really calling the shots any longer. I feel sorry for them, but I don't think it's reasonable for them to expect their users (many of whom contributed much of the intellectual capital and code in MT) to cover their debt to the VC vultures. I'll make a prediction - in 12 months, they won't be at the helm of the company any longer.

SixApart employees, start polishing your resumes.

May 14, 2004

MovableType's business model implodes

MovableType is the software I use to run this blog and is produced by a company called SixApart. Up until now it has always been free, but MT3.0 has just been released, and it's far from free. SixApart have tried to spin the announcement as best they could, but judging from the hailstorm of negative comments it's backfired very, very badly. The amusing thing is that people are registering their displeasure using the very trackback feature that SixApart themselves promoted so strongly - everyone who reads the announcement also gets to see the torrents of vituperation being hurled at SixApart. Nice PR move guys!

SixApart do claim to offer a free version of MT, but it's very restricted (one author, max of three blogs), and you have to register with their TypeKey service before you can download the new version. It won't affect me as I only run this blog, but it sure has pissed off all the people who let their family members run a blog on their machines.

It seems that SixApart have ignored one of the primary tenets of releasing something as open source (OK, I know the license is a bit odd, but you could get the source for free):

Once the genie is out of the bottle, you can't put it back in.

If you release something as open source, you need to be sure that either it doesn't have any direct revenue implications for your business, or that any revenue you lose as a consequence is compensated for by an increase elsewhere - for example either by using the release to drive adoption of other related products, or offering a superior services package on top. Even offering a 'premium' non-free version (à la OpenOffice/StarOffice) will leave you wide open to accusations of releasing 'crippleware' and of 'Not understanding the whole open source thing'. What you absolutely can't do is bait and switch - release something as free then at some point in the future try to screw money out of people for it, as Borland found with their loused-up attempt to open source their InterBase RDBMS, now known as Firebird.

SixApart also seem to have forgotten two other little facts - firstly there is actually nothing that requires you to move up from the current free version (2.661), and secondly there are a plethora of free alternatives out there, some of them possibly better than MT. Alienating the bulk of your user community when your business depends entirely on an easily replaceable asset seems to be somewhat suicidal. The people who I really feel sorry for are the MT3.0 alpha and beta testers, who will have undoubtedly invested a lot of time and effort in helping get MT3.0 out of the door, only to be asked to pay for the very thing they have been working on for free. Nice.

And hey guys, if you're gonna charge for MT, when are you going to move your domain out of .org and into .com? ;-)

April 08, 2004

Opinions may vary

According to Glynn Foster,

"We still have people being all rude and not publishing their entire blog entry in their RSS feed. It's hugely infuriating having to click through to various blogs just to read them. Hopefully people are reading this and making suitable changes. Ahem."

I made a conscious decision not to include the entire content of my blog in the RSS feed, for several reasons:

  • If someone can't be bothered to click through to my site they probably aren't interested in what I have to say, so I think I shouldn't force them to read or at least scan past my posts in their entirety.
  • I'm not always interested in everything picked up by PlanetSun. I think it's inconvenient to have to scroll down through pages of stuff I'm not interested in.
  • I'm already getting a hit every 15 minutes from PlanetSun. My blog often includes photos, and I don't really want them being pulled every 15 minutes.
  • Blogs that don't strip out the HTML tags from their RSS feeds make it much more difficult to handle the feeds in a graceful way. As Mark Pilgrim points out, putting HTML in RSS feeds, whilst allowed, is potentially dangerous.
  • I access PlanetSun through my existing aggregator, the most excellent RSS reader plugin for FireFox. As I move the cursor over each entry the contents of the RSS feed in question appear in a popup, and if the entire article is delivered in the feed I get a series of huge boxes (usually full of HTML) flying up the screen.

Perhaps the solution is to have two feeds, one containing the plaintext synopsis and the other containing the full contents of the posts?
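For what it's worth, the plaintext synopsis for such a feed could be produced along these lines. This is only a rough Python sketch: the function name and the 40-word cut-off are my own assumptions, and it has nothing to do with how MovableType actually builds its excerpts.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect the text between tags and throw the tags away."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def plaintext_synopsis(html, max_words=40):
    """Return the first max_words words of html with all markup removed."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    # Joining chunks with a space keeps adjacent paragraphs apart; the price
    # is a stray space where a tag sits in the middle of a word.
    words = " ".join(parser.chunks).split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + "..."
```

The full-content feed would carry the post untouched, and the synopsis feed would carry the output of something like this.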

My own personal gripe is about people who don't have comments enabled on their blogs - hopefully people are reading this and making suitable changes. Ahem.

I also have my doubts about how scalable the PlanetSun approach is - as the number of blogs it aggregates grows it is going to put an increasing strain on the ADSL line that it is at the end of. Most blogging packages allow you to set up notifications using either email or XML-RPC, and something like that would make it possible for the individual blog owners to decide if they wanted to be on PlanetSun.

April 02, 2004

Sun weblog aggregator

I was looking through my web server stats for today (with http://awstats.sourceforge.net/, highly recommended) and I noticed that someone or something on a BT ADSL line had been accessing my site quite heavily. Being a nosey sod I went exploring to find out who it was - it turns out it is another Sun engineer down in Watford (Hi Dave!), and he's put together a web RSS aggregator for all the Sun folks he's been able to find who have external blogs. It's kinda neat - see http://planetsun.org/

December 14, 2003

Blogging disaster narrowly averted

I upgraded my webserver to the latest Solaris 10 bits this week, and evidently I didn't get it quite right as the machine died a grisly death soon after. Being paranoid, I have a copy of the OS on a second bootable partition, as well as a complete backup of everything on tape, so getting the machine up just meant booting off the backup partition, followed by a complex dance to resize both the primary and secondary boot partitions from 2GB to 3GB as they were getting a little short on space. Unfortunately I didn't have a backup of my last three blog entries, including all the photos, nor of the modifications I'd made to some of the static pages on the site. However, I remembered that after I'd made the last set of changes I'd checked the layout of the site with Internet Explorer on the PC upstairs, so I was able to get all the missing content from the IE cache - about the only way that IE scores over Mozilla/Firebird is that it gives you an easily browsable list of all the stuff it has stored in its cache. Thank goodness for big disks and generous cache settings :-)

November 07, 2003

Putting GoogleBot on a leash

As I noted earlier, I get a lot of hits from the Google web indexer GoogleBot (252 visits last month); in fact various web crawlers are the most frequent visitors to my site. Whilst most webwranglers know about the robots.txt file and how to use it to control the activities of robots when they visit your site, it is a bit of a blunt instrument, as it can only exclude entire subtrees from being indexed.

There is a more fine-grained way of controlling the way Google and other robots index your site using a <meta> tag to direct the robot. This is mentioned on the GoogleBot page linked to above, and the official specification is available here. The basic principle is very simple: you need to add a line of the form

<meta name="robots" content="noindex,follow" />

in the <head> section of your HTML documents - for MovableType you can add them to your templates. The content attribute has just four possible permutations:

  • content="index,follow"
    Index the page itself, and follow all links from the page.
  • content="noindex,follow"
    Don't index the page itself, but follow all links from the page.
  • content="index,nofollow"
    Index the page itself, but don't follow any links from the page.
  • content="noindex,nofollow"
    Don't index the page itself, and don't follow any links from the page.

Not all robots take notice of this directive, but Google certainly does, and you can use it to prevent it indexing rapidly-changing and low-content pages such as your main index page and your TrackBack entries.
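If you want to check what a page is actually telling the robots, something like the following Python sketch will pull the directive out and reduce it to the two yes/no answers described above. The class and function names are made up for illustration, and treating a missing tag as "index,follow" is my assumption about the default.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Pick up <meta name="robots" content="..."> and record what it allows."""

    def __init__(self):
        super().__init__()
        # Assumed default when no robots meta tag is present: index and follow.
        self.index = True
        self.follow = True

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also arrive here via handle_startendtag.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        for token in (attributes.get("content") or "").lower().split(","):
            token = token.strip()
            if token == "index":
                self.index = True
            elif token == "noindex":
                self.index = False
            elif token == "follow":
                self.follow = True
            elif token == "nofollow":
                self.follow = False


def robots_policy(html):
    """Return (index, follow) booleans for the given HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.index, parser.follow
```

Run over the rendered index page and a few individual entries, this makes it easy to confirm that the templates are emitting the permutation you intended.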

It did occur to me that I could perhaps hack up my favicon.ico Apache module to insert the meta tags automatically, but for my limited usage it didn't seem worth the effort.

More favicon.ico fiddling

I finally got bored with sticking <link rel="shortcut icon" href="/favicon.ico"> tags into all my HTML files to prevent favicon.ico clutter filling my Apache logs, so I wrote an Apache module to do it automagically for me. The module allows you to specify which favicon.ico file you want for a particular subtree of your website like this, for example:

# Blog
Alias /blog /var/apache/blog
<Directory /var/apache/blog>
        SetHandler      perl-script
        PerlHandler     Apache::FavIcon
        PerlSetVar      FavIcon /blog/images/favicon.ico
        : 
</Directory>

You can also override any existing tag by using the PerlSetVar FavIconOverride On flag. The module is also Apache::Filter aware, so it can also be used with CGI scripts. Download the Perl module if you want to have a play, and you can get a blank favicon.ico file here. If enough people are interested I may get round to packaging it up properly and putting it on CPAN.
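The rewriting itself boils down to something like the sketch below. To be clear, this is not the Apache::FavIcon code (that is Perl and runs inside Apache); it's a standalone Python illustration of the behaviour described above, with a function name and an override argument I've invented to mirror the FavIconOverride flag.

```python
import re

# An existing <link rel="shortcut icon" ...> tag, quoted or not.
LINK_RE = re.compile(
    r'<link\b[^>]*\brel=["\']?shortcut icon["\']?[^>]*>', re.IGNORECASE
)
# The opening <head> tag (the \b stops it matching <header>).
HEAD_RE = re.compile(r"<head\b[^>]*>", re.IGNORECASE)


def add_favicon_link(html, icon_url, override=False):
    """Insert a shortcut-icon link just after <head>, unless one is already there."""
    tag = '<link rel="shortcut icon" href="%s">' % icon_url
    if LINK_RE.search(html):
        if override:
            # Replace the first existing tag with ours.
            return LINK_RE.sub(lambda _match: tag, html, count=1)
        return html
    head = HEAD_RE.search(html)
    if head is None:
        # No <head> to put it in, so leave the document alone.
        return html
    return html[: head.end()] + tag + html[head.end():]
```

A regular expression is a crude way to edit HTML, but for a single well-known tag in the head of a page it is about as much machinery as the job deserves.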

October 05, 2003

Blog introspection

In one of those occasionally odd instances of synchronicity, on Friday I installed AWstats on my machine so I could look at the access logs for my site a bit more easily, and on Saturday I found this survey of weblogs referred to by The Register. This got me thinking (hey, it doesn't happen that often) about blogging in general, and my blog in particular.

Continue reading "Blog introspection" »

September 07, 2003

Why doesn't CSS have macros?

I've been fiddling with the stylesheet for this blog, and one of the annoying things is that CSS doesn't seem to have any mechanism for defining macros. For example, I may want to use the same colour in several different CSS rules, and rather than having to specify it explicitly each time I'd like to define it once and then use the symbolic name. This would make it much easier to make global changes to stylesheets, e.g. changing the background colour of all elements. I have no idea why this wasn't in the CSS spec - it seems like such an obvious requirement.

Anyway, I noticed that MovableType has a mechanism for defining and using variables in its templates, like this:

<MTSetVar name="background" value="#404040">
...
.someclass {
        ... 
        background:        <MTGetVar name="background">;
        ...
}

Which seems like just the job, so I've changed my stylesheet template to use that as a way of specifying and using various global values.

However, it doesn't seem to be possible to get MovableType to just expand the stylesheet template, and regenerating the whole site just to try out a stylesheet change very quickly gets to be a pain. I've therefore hacked up this little script to expand the MTSetVar and MTGetVar tags so you can do this:

$ expand_ss templates/stylesheet.tmpl > stylesheet.css
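The expansion is simple enough that the idea fits in a few lines. What follows is a hypothetical Python re-creation of the approach, not the expand_ss script itself: it only understands the exact name="..." value="..." attribute layout shown above, and leaving an undefined variable's tag untouched is my own choice rather than anything MovableType promises.

```python
import re

# <MTSetVar name="x" value="y">, plus the newline after it if there is one.
SETVAR_RE = re.compile(r'<MTSetVar\s+name="([^"]*)"\s+value="([^"]*)"\s*>\n?')
# <MTGetVar name="x">
GETVAR_RE = re.compile(r'<MTGetVar\s+name="([^"]*)"\s*>')


def expand_stylesheet(template):
    """Expand MTSetVar/MTGetVar tags in a stylesheet template string."""
    variables = {}

    def remember(match):
        # Record the variable and remove the MTSetVar tag from the output.
        variables[match.group(1)] = match.group(2)
        return ""

    # Two passes: collect every definition first, then substitute. This means
    # the last MTSetVar for a name wins, wherever it appears in the template.
    without_sets = SETVAR_RE.sub(remember, template)
    # Unknown variables are left as-is so they are easy to spot in the output.
    return GETVAR_RE.sub(
        lambda m: variables.get(m.group(1), m.group(0)), without_sets
    )
```

Any other MT tags in the template pass straight through unexpanded.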

Update: I've found a MovableType extension script that will allow you to rebuild bits of your MT blog from the command-line here, which seems like a much better solution, as it will expand all and any MT tags.

August 14, 2003

More Apache stuff: Limiting the bandwidth your website uses

If like me your website is hosted on a machine sat on your ADSL line, the total amount of bandwidth used by people hitting your site can be a concern. My friend Stephen pointed me at this rather cool utility - Trickle - that you can use to limit the total bandwidth used by people downloading - the really useful thing is that it can limit more than one application at once, so if you are running say a website and an FTP server you can limit the aggregate bandwidth used by both. Another friend Gary who has libxml2 binaries for Solaris on his site has used it to limit his bandwidth, and says it works well, although I haven't needed to deploy it myself (yet!)

Apache logs part II - stopping those favicon.ico requests from filling your logs

Following on from my earlier blog entry about stopping worms from cluttering your Apache logs, here's how to get rid of all those failed requests for favicon.ico, for which you can blame Internet Explorer, which requests this file every time it visits your website. The easiest way is to put a 16x16 Windows ICO file in your htdocs directory. You can grab a blank one from here - this is just a white square. If you wish to make your own, start off with a 16x16 GIF image, and then use something like IrfanView to convert it (if anyone knows of a lightweight, non-Windows alternative I'd be glad to hear about it).

You can also make the icons show up in Mozilla if you want, but you need to explicitly put the following in the <head> section of each web page:

<link rel="shortcut icon" href="/favicon.ico">

So you could have different icons for different parts of your site, if you really cared that much ;-)

February 03, 2003

Stopping worms cluttering up your Apache logs

Fed up of 90% of your Apache log entries being failed requests for root.exe, cmd.exe and default.ida (caused by the Nimda worm)? Here's how to stop it: put the following in your httpd.conf:

# Ignore worms
SetEnvIf        Request_URI "/(cmd\.exe|root\.exe|default\.ida)$" DontLog
RewriteEngine   on
RewriteCond     %{REQUEST_URI}  "/(cmd\.exe|root\.exe|default\.ida)$"
RewriteRule     ^.*$    - [forbidden]

and on your CustomLog line, append !DontLog, so it looks like this:

CustomLog               /var/apache/logs/access_log common env=!DontLog

The SetEnvIf and DontLog bits stop the request showing up in your access_log, and the Rewrite bits stop the failed request showing up in your error_log, as well as returning a 403 FORBIDDEN to the requesting PC.
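Before reloading Apache it can be worth checking that the pattern catches what you expect and nothing else. Here's a small Python sketch that applies the same regular expression to a request target. The helper name is made up, and dropping the query string before matching is my attempt to mimic how I understand Request_URI to behave, so treat that part as an assumption.

```python
import re
from urllib.parse import urlsplit

# Same pattern as in the SetEnvIf and RewriteCond lines above.
WORM_RE = re.compile(r"/(cmd\.exe|root\.exe|default\.ida)$")


def is_worm_probe(request_target):
    """Return True if the path part of the request target matches the worm pattern."""
    # Assumption: match against the path only, ignoring any ?query part.
    path = urlsplit(request_target).path
    return WORM_RE.search(path) is not None
```

Feeding it a handful of request targets copied out of an old access_log is a quick way to see which ones would have been dropped.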

January 18, 2003

Mozilla is a gorilla

Partly prompted by the fact that the cruddy old version of Netscrape I was using wouldn't render the default Movable Type stylesheet properly, I finally took the plunge and switched to mozilla 1.2.1, at both home and work.

Continue reading "Mozilla is a gorilla" »