Tag Archives: Computers

Why do we code

Last night, as the hours went by and I worked away optimising some code on a website, I got to thinking about why we do it. Why do we code? And by code, I’d guess you could apply this to anyone in the open-source community too. It’s all a lot of effort for no measurable payoff. Or is there?

The project I’ve been working on will never make me money. It’ll probably never be in a portfolio. My name isn’t visible on the site and nothing is credited back to me. And yet I’ve spent at least a few hundred hours on it at this stage with many more to come.

For as long as I’ve coded, I’ve also been doing unpaid coding. Hell, it’s pretty much how we learn most languages, unless we are lucky enough to have an understanding workplace. But no, this project is written in PHP, a language I’ve 10+ years of experience in.

So why do we do it?

The three reasons I have are pretty simple, although explaining them to non-coders never works. If you code, you’ll just nod along, but if you don’t, well, the usual response is just “but why?”

  1. It’s fun. (a.k.a. the “what the hell are you smoking” reason)
    No really, most coders I know actually enjoy it. It’s quite satisfying to look at a problem, then solve it. Or to be creating something, even if that something takes days of work and ends up as just a small dot on screen.
    In short, the best coders like their work.
  2. I get some use out of it. (a.k.a. the selfish reason)
    In this case it started as some fun, a challenge even. Then it turned into some useful things that I like to see on a regular basis. It just so happens that others also like to see the same stuff so why not share the love?
  3. Experience. (a.k.a. the job reason)
    Some things in computers you can’t do without real-world, high-traffic use. Yes, you can simulate things, but it is never the same as that crazy user who does that thing you never thought possible. And when you get a few hundred of those, well, that is when things get interesting. Sure, you can say you built a site that does a, b, c, but it is much more impressive to have that site handle X number of users.

Overall I guess the whole thing loops around on itself a lot. It starts because it’s something you need or want, but that gives you some enjoyment so you keep at it, and then it grows only to become something that gives a bit of experience while building something else that you’ll get some use out of. And that gives you some enjoyment so you…

Dirvish Backup – Multiple seperate backup schedules

I’ve been using Dirvish now for just over a year. It replaced a number of rsync replication scripts I had running rolling backups. While moving to Dirvish has required a few extra scripts to be written, it has been a worthwhile move. My own scripts weren’t able to hold backups for longer periods, at least not gracefully. The biggest issue we had was trying to get Dirvish to do different backups on different schedules. The Dirvish config, while it may look like it allows this at first glance, really isn’t set up for it. One backup per day is its bread and butter.

Hopefully this will help clear up a few minor issues with Dirvish and get you running with multiple independent schedules.

Credit where credit is due, some of this is the result of various sources found through Google. We have modified it a number of times over the last year to fit our needs, so I’m not totally sure how much of the original remains.

Note: This is from a Debian-based system. Paths reflect that.

Initial Dirvish Configuration

For this guide, our setup consists of one host which we back up once per day, and a directory on that same host which gets backed up once per hour. Backups are stored under /storage/Backups/dirvish.

Our master.conf file – notice that no hosts are actually defined here.

bank:
     /storage/Backups/dirvish
exclude:
     lost+found/
     *~
     .nfs*
expire-default: +15 days
expire-rule:
#       MIN HR    DOM MON       DOW  STRFTIME_FMT
    *   *     *   *         1    +3 months
    *   *     1-7 *         1    +6 months
    *   *     1-7 1,4,7,10  1    +6 months
    *   10-20 *   *         *    +4 days
#   *   *     *   *         2-7  +15 days

runall-daily.conf

Runall:
     host     02:00

runall-hourly.conf

Runall:
      host/hourly/folder

/etc/cron.d/dirvish – This is what calls the jobs

00 01 * * *     root    /usr/sbin/dirvish-expire --quiet   # Expire old images
00 02 * * *     root    /usr/sbin/dirvish-runall --quiet --config runall-daily.conf
00 * * * *      root    /usr/sbin/dirvish-runall --quiet --config runall-hourly.conf

With those in place, our host backs up at 2am every day and the hourly job kicks in every hour. We set up the vaults as normal in the folders defined above, so the hourly one is /storage/Backups/dirvish/host/hourly.

The only thing that needs changing is the image-default option in each vault’s config.

Daily vault: image-default: %Y%m%d
Hourly vault: image-default: %Y%m%d%H
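
For completeness, each vault also gets its own default.conf under its dirvish/ directory in the bank, set up just as the Dirvish docs describe. A rough sketch of our hourly one is below; the tree path is a placeholder rather than the real directory, and xdev/index are simply the options we happen to use:

client: host
tree: /path/to/folder
xdev: 1
index: gzip
image-default: %Y%m%d%H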

Living with it

This setup has run great for us. We get what we need backed up when we need it backed up. There have been a few notable exceptions, however.

First, one of our hosts started locking up during backup windows. Dirvish then went nuts and somehow started marking incomplete backups as complete. We only noticed when our bandwidth shot up as it began recopying full machine images across.
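
Since then we keep half an eye on the image summary files. Assuming the Status: success line that Dirvish writes into each completed image’s summary, a quick check along these lines lists anything that didn’t finish cleanly:

find /storage/Backups/dirvish -name summary | xargs grep -L 'Status: success'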

Second, our expire rules tend to leave too much data. Yes, we could probably fix this, and we probably will when space becomes an issue. I guess the first step is to thin the hourlies down to one per day on the older sets instead of keeping the whole day. But since Dirvish is so good with space, a few months of hourlies isn’t taking up too much room.

Overall, however, I can’t say we’ll be moving away from Dirvish anytime soon, at least for our Linux machines.

Windows 7 Aero Peek and System Administration

Windows 7 brought a number of improvements for my system administrator job, most notably the fact that it connects to the Win2k8 servers while the old XP install didn’t. But this morning, during a mass patching, the Aero Peek feature really showed its worth.

Right now I’m installing patches on six identical machines to bring them up to date with the patches from our WSUS server. (There are reasons why we don’t clone the boxes to get patches on.) Picture the old, XP way of doing this: six remote desktop sessions and randomly switching between them to see the progress. Switch to the Windows 7 way and it becomes: open six Hyper-V connections, then mouse over the taskbar icon to get a live thumbnail of every session at once.

Completely wonderful. Total time saver.

In the time normally spent flicking between machines, I’ve written this post, added some more Nagios alerts, and checked on a few other servers.

Double disk failures – A storage nightmare

Anyone who has worked with storage systems, or even large personal installs has heard of them. Double disk failures. Words you never say. Ever! You can be banished from the server room for even suggesting it is possible!

But the reality is it can, and does, happen. It is why we have hot-swappable disks, or even hot spare drives. I’m even looking at an array from NetApp with something called RAID-DP, or Dual Parity, which, they say, can handle two separate disk failures without taking down the array. Something that sounds very interesting really. The Dell / EqualLogic array we have on test currently runs a type of RAID 50, so you can lose two disks, but only one from each sub-array. The other two disks run as hot standby disks to allow for online rebuilding.

The setup

My current dilemma, we’ll call it, is with a much simpler setup: an Intel-based server with 8 SATA disks connected to a 3ware card doing RAID 5. It is a high-end 3ware card too, a 9650. (I do NOT recommend these cards. We have numerous other performance issues with them in both Windows and Linux, the Windows ones being much worse and currently stopping me copying backups.) Anyway, to make things a little more challenging, because something every admin loves in their day is a challenge, the server is remote. In another country remote.

Anyway, this machine had been running fine for nearly a year, the RAID array sitting there taking files happily enough. When I started testing some further backups recently, I ran into some trouble. Most of it looked to be Windows related, so the usual: apply the updates, reboot the machine and see what happens. Only on the first reboot, wham, disk 8 offline. Ok, so I’ll finish the updates and then worry about getting another disk over to be put into the machine. Next reboot, the disk comes magically back online but in a degraded state. Strange. We’ll let this rebuild and return tomorrow, see if life has returned to normal.

Normal is just a cover

I slept on it and let the array rebuild, and everything looked great: just a temporary problem that we could forget about and move on from. Never a good idea, but when you are overworked, what can you do?

Another day passes trying to move backups across and we hit another Windows error, this time requiring a registry fix to increase the IRPStackSize. So I bang in the first change and reboot. Log in and, strangely, the system appears to be locking up. I open the 3ware web interface and get prompted with something I’d not seen until now.

Raid Status: Inoperable

Luckily these are backups, so no live data lost. We can fix this. Hell, let’s try a reboot and see. It can’t do any more damage, can it?

The Recovery?

Rebooting fixes disks, magically. Both disks back online. Array in a consistent state. Why not leave well enough alone?

More Windows problems and another reboot. Back to two disks offline. Reboot again and one disk gone. Useless, useless, useless.

Solutions…

If this were a live server, with live data? I’d probably cry. There’d not be much else to do. You could probably get it to rebuild by replacing the disk that was going offline the most, but I’d move as much off as quickly as possible. In this case, since it is a backup server, I’ll be getting the guys local to the machine to remove and reseat all the drives. And check the cables inside the case. And then destroy and recreate the array, and the filesystem, with full formats all around.

And then to top it off, 10 reboots, minimum, when the server isn’t looking! If they all work, then maybe, just maybe I’ll look at trusting it again. Any problems and I guess I’m on a plane 🙁

Lessons learned

Well, I think I’ll be putting the really critical data onto more than one backup server in future. At least more of the fileserver data anyway. The massive Exchange backups will need to be looked at.

Enterprise-level SANs are cheaper than you think when you factor in the cost of fixing setups like this. Okay, you aren’t going to get a SAN for twice the price of a server with 16x1TB drives in it, or even three times. You may get a low-spec one, however, and if it gives more peace of mind, maybe that is worth the cost? I know that if faced with the decision in future, I’ll probably recommend a SAN and attached server for a file server, assuming it is above the 1TB mark. Below that, you can probably get away with multiple servers, replication software AND backups. Replication software is NOT backup software: delete from live and it deletes from the backup too.

And what now?

I don’t know. All I can hope is that reseating disks and cables fixes the array, gets it online and lets me start transferring backups offsite. Another box is going to be added to give more backups, hopefully point-in-time backups too.

Backups really are the largest cost for something you hope never to use. I do honestly hope I never have to pull any data from backups, ever. It should be possible, what with Volume Shadow Copies on file servers and RAID disks in the servers. And maybe real permissions for applications, but that is another day!

In Private Browsing – aka Porn Mode

I’ve recently started using the InPrivate Browsing feature of IE8 more and more, and no, not for porn!

For testing sites I’m developing, or doing a clean Google search, it used to involve closing the browser, clearing cookies / cache etc. and then restarting. It is now reduced to Tools -> InPrivate Browsing, and bang, you’ve got a clean browser session. And I know Firefox supports this, but they really make it unusable if you run Firefox with lots of open tabs (currently I have 36 and it isn’t a busy day) because it shuts the browser down, opens the private browsing mode and then restores everything after it is finished.

I guess I’ve long since used two browsers: Firefox for personal stuff and general web development tasks, IE for intranet applications and now the InPrivate feature.

If only someone would invent some proper workspaces for a browser. And some better way of storing/organising favourites.

Dell overheating problems, Windows Search and Acronis restore

So it seems that Dell, or more so Nvidia, have some overheating problems with some of their GPUs. My D630c had been running really hot for quite a while and the fan was going a bit nuts during Windows startup, until last weekend… when the system decided to put random characters onscreen and die. As with all these things, the laptop booted up fine on the Tuesday when I called Dell; however, running the system diagnostics did reproduce the problem. While testing further, a hard drive problem popped up, so they agreed to change the disk. When I mentioned the heat problem, I was quickly put on hold for a few minutes, then they came back and said they were replacing the motherboard and fans. A quick Google did show up a few things about the failing GPUs.

Anyway, Dell did replace everything and things have been working fine since. The replacement hard disk is a bit louder than the last one, but it works, so I’m finally getting back to normal. It has taken nearly a week to get everything restored, mostly due to Acronis being unable to restore large files individually. It kept getting stuck about 1.8GB into the large files on my laptop. Doing a full disk restore worked fine.

The other annoying issue is that Windows Search stops working in Outlook after installing the Exchange Administrator tools. Easy fix, however: close Outlook, run System32\fixmapi.exe, open Outlook and let the search reindex everything. You may have to open the Windows Search options, select Outlook, then hit rebuild on the index.
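
In command form, the whole fix is just that one tool, run while Outlook is closed:

rem Repair the MAPI configuration, then reopen Outlook and let Windows Search reindex.
%windir%\System32\fixmapi.exe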

Dell Keyboard layouts – why do they change them

It used to be that certain models didn’t have a keyboard I liked but others did; now, looking at the Dell site, none of the laptops have a keyboard I like. And it isn’t as if I’m after some crazy combination. All I want is a machine with the normal Irish keyboard. Even Wikipedia agrees on the format. Same as my D630c.

We currently order Vostro 1510 machines as standard, and this problem might have started when they messed up that keyboard by putting the whole bottom row in the wrong place, but the current keyboard model is closer than anything you see on the website. The only differences are that the left shift key is bigger, the right one is smaller, oh, and the backslash (\) key is on the far right instead of the left. A completely useless layout for anyone who uses the keyboard all the time for coding or working on Linux.

Worse than all this is the trend to make the Enter / Return key smaller, like on the American keyboards. For US people, fine, keep it small since they are used to it, but don’t go trying to force random keyboard changes on the rest of us. Hell, even the XPS on my desk has yet another layout.

Edit: So the new Vostro 1520 has a normal keyboard, or so it looks until you start typing. The bottom row suffers from a smaller-than-normal Ctrl key, meaning the left-hand keys (Ctrl, Fn, Windows, Alt) sit slightly to the left. Not a huge problem, and I’d take it over the older issue, but still a problem. Also, the keyboards on this model are bouncy. Yes, bouncy. The whole keyboard moves when you press keys in the centre.

Windows Virtual Server 2005 R2 Differencing disk size

I’ve been running Virtual Server 2005 R2 for a few months now, stress testing it on a machine to see where the limits are before real deployment. The machine is a dual-core P4 with 4GB of RAM. It has one Windows 2003 Server guest and 6-8 XP guests running at any one time. The main bottleneck always seems to be the hard disk; it just can’t keep up. When things are running, the disk queue is a solid line across the top of perfmon. Before you say it, the VMs aren’t doing lots of disk activity, and in fact if I reduced the number of VMs and increased their RAM, it’d probably drop the disk usage way down.

Anyway, when setting up these VMs for testing, differencing disks seemed to be the way to go. New VMs took a matter of minutes to set up and get running. Now that they have been running for nearly three months, it is looking not so hot. Each VM has about 4GB of space used inside its virtual disk, yet the master disk is around the 3GB mark with the differencing disks coming in at over 6GB. Something really wasn’t adding up; clearly a disk file shouldn’t take up more space than the data on the disk.

Merging the differencing disk

The first thing I tried was to merge the differencing disk. To do this, you inspect the disk, then under Actions, merge the disk and choose a new file.

Compacting the disk

Once the disk was merged, it gave the option to compact the disk. Running this did nothing but waste time, leaving the new disk no smaller than the original.

The Solution

So after some googling and trial and error, the following steps seem to have worked and made the disks a lot smaller.

  1. Merge the differencing disk into a new disk.
  2. Mount the new disk to the VM.
  3. Boot the VM and defrag the disk using Windows Defrag.
  4. Mount the precompactor tool in the VM. This tool is found in Precompact.iso in the Virtual Machine Additions folder in your Virtual Server install directory. (Usually C:\Program Files\Microsoft Virtual Server\Virtual Machine Additions\Precompact.iso)
  5. Run the precompactor in the VM. (This does take a while to run)
  6. Shutdown the VM and compact the hard disk image as above.

All in all, I got near a 50% reduction in real space usage, which brought it closer to what the VMs are actually using.

Magpie RSS

So I’ve used MagpieRSS for quite some time now. I first used it a few years back when I did a little work customising my own version of TorrentFlux. Someone back then introduced me to Magpie, which proved much easier than using my own XML parser.

Anyway, since then it has been used in quite a few places to auto-update sections of sites. One of the recent “bugs” I’ve come across is where it displays the message below from time to time.

Notice:  Undefined property:  etag in rss_fetch.inc on line 156

From the usual Googling, there is more than one place where this type of message shows up. Annoyingly, the bug is fixed in the development version of Magpie, and was actually fixed over two years ago.

The basic fix is to swap line 156

if( $rss and $rss->etag and $rss->last_modified) {

to

if ( $rss and (isset($rss->etag) and $rss->etag)
 and (isset($rss->last_modified) and $rss->last_modified) ) { 
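
For context, our usage is just the standard Magpie pattern, something like the snippet below (the feed URL is a placeholder). The notice itself seems to come from the conditional-fetch logic when the cached feed object never had an etag to begin with.

require_once 'rss_fetch.inc';

// Placeholder feed URL; fetch_rss() serves a cached object where it can,
// which is where the missing etag property turns up.
$rss = fetch_rss('http://example.com/feed.rss');

if ($rss) {
    foreach (array_slice($rss->items, 0, 5) as $item) {
        echo '<a href="' . htmlspecialchars($item['link']) . '">'
            . htmlspecialchars($item['title']) . '</a><br />';
    }
} else {
    echo 'Feed error: ' . magpie_error();
}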

On a side note, WordPress needs a nice way to handle code windows.