Google Experiment: Prelude

I’ve been using DreamHost to host this blog and most of my persistent online activity for the past 10 months. All in all I have been really happy with them. Others seem to have issues, but I don’t have any gripes. I’m only paying a paltry fee, so I don’t complain too much if my server isn’t five-nines reliable. However, I am starting to have a problem with one of the services available: the email support is somewhat lax. My current situation is that I am using mutt as a reader on the server itself. This requires me to SSH into the server and read the mail there. Because DreamHost doesn’t really expect people to work this way (most people check their mail by IMAP or webmail), it sometimes sees my mutt process as long-running and decides to kill it with a HUP on the SSH connection. This is pretty annoying.

Also, I get a lot of spam. No, really: on a typical day, I receive over 400 spam messages. Currently I am using SpamBayes for spam filtering, which works reasonably well: it catches most spam, or at least puts it in the unsure folder. The same problem I have with mutt shows up with SpamBayes though: it takes up too much CPU and gets killed by DH’s watchdogs.

These problems led me to this experiment. The hassle with my mail is just too large, and I hear that GMail’s spam filters are reasonably aggressive and work well after you train them slightly. So I have decided to run an experiment. Starting November 1st, all of my mail will forward to my GMail account. I will document the spam filtering abilities (or lack thereof) here on Base Zero, and I’ll see how it works out. I have years of data from other methods of spam filtering, so I can compare.

However, this experiment isn’t only going to be about spam filtering. GMail also employs some different mail models than what I am used to. The first that I am interested in is tags. GMail uses tags instead of folders, the normal way of organizing mail. This is touted as better because you can have more than one tag assigned to an email at once, and can therefore better organize mail which belongs in two categories. For messages which are in the Inbox you just use an Inbox tag. Spam messages get a Spam tag. So I will be organizing my email with tags as much as possible. The second thing which intrigues me is the idea of emails being conversations instead of threads. GMail puts related emails together, but uses a flat structure as opposed to the tree structure that I am used to. I wonder if this will confuse me, because I am on a number of lists which tend to use the tree structure a lot, with many branches. I may not like some of the things that GMail does; in those cases I will look for hacks, or build some myself, to solve the problems. I am also probably going to try to do some user interface critique.
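
To make the two differences concrete for myself, here is a rough sketch in Python. It is only my mental model, not how GMail actually stores anything: the `Message` class, the `sent` counter standing in for a timestamp, and the helper names are all made up for illustration. One message can carry several tags at once, and the same set of replies can be viewed either as a tree or as a flat, time-ordered conversation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    # Hypothetical model of an email; every field name is an assumption.
    msg_id: str
    sent: int                         # send order; stands in for a timestamp
    parent_id: Optional[str] = None   # the message this one replies to
    tags: set = field(default_factory=set)


def with_tag(messages, tag):
    """Tag model: a message shows up under every tag it carries."""
    return [m.msg_id for m in messages if tag in m.tags]


def thread_tree(messages):
    """Tree model: map each message id to the ids of its direct replies."""
    children = {m.msg_id: [] for m in messages}
    for m in messages:
        if m.parent_id is not None:
            children[m.parent_id].append(m.msg_id)
    return children


def conversation(messages):
    """Flat model: the same messages in send order, reply structure ignored."""
    return [m.msg_id for m in sorted(messages, key=lambda m: m.sent)]


mail = [
    Message("a", 1, None, {"Inbox", "lists"}),
    Message("b", 2, "a", {"Inbox", "lists", "todo"}),
    Message("c", 3, "a", {"lists"}),
    Message("d", 4, "b", {"lists"}),
]

print(with_tag(mail, "Inbox"))   # messages tagged Inbox
print(with_tag(mail, "lists"))   # the same messages can also appear here
print(thread_tree(mail))         # who replied to whom
print(conversation(mail))        # one flat list, oldest first
```

With folders, message "b" would have to live in exactly one place; with tags it appears under Inbox, lists, and todo all at once. The flat conversation keeps every message but loses the information that "d" was a reply to "b" rather than to "a", which is exactly what I suspect will bother me on busy lists.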

Some people are concerned about GMail scanning emails. This doesn’t concern me that much: most ISPs can scan your emails for viruses and spam nowadays anyway, and webmail providers particularly so. I can understand the effect that this could have on some people, however. If anyone reading this would like their email kept out of GMail, just reply to this post or email me. I will oblige by filtering your subsequent emails to a special folder on my server. Just realize that I will only read those emails weekly at best.

GMail will not be the only Google service which gets tested by me this month. Stay tuned for more information. Until then, hail to Google, our dear and glorious leader.