Archive for November, 2008

More memcached

Thursday, November 20th, 2008

While playing with memcached I wanted to find a way to monitor what was getting set.  You see, we are trying to pre-warm our cache with lots of colours (roughly 16.6 million of them :) ).

Anyway, I remembered I’d had a play a (long) while ago with expect and decided it would be just the thing…

So here we are – a status report from memcached every 5 minutes.

#!/bin/bash
# script to monitor memcached: prints its stats every 5 minutes

expect << 'EOF'
set timeout 1
spawn telnet localhost 11211
while 1 {
    # memcached ends each stats reply with a line reading END
    send "stats\n"
    expect "END"
    send "stats slabs\n"
    expect "END"
    puts "sleeping for 5 mins"
    sleep 300
}
EOF

# ends
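If you only care about one counter rather than the whole report, the reply can be filtered. This is a rough sketch rather than part of the script above: `stat_value` is a name made up for illustration, and the canned reply only stands in for real server output (memcached prints its stats as `STAT <name> <value>` lines).

```shell
#!/bin/bash
# Print the value of one counter from a memcached "stats" reply read on stdin.
# stat_value is a hypothetical helper, not part of the original script.
stat_value() {
    local name="$1"
    # Strip the carriage returns memcached sends, then match "STAT <name> <value>"
    tr -d '\r' | awk -v name="$name" '$1 == "STAT" && $2 == name { print $3 }'
}

# Canned reply for illustration. Against a live server the input might come
# from something like: printf 'stats\nquit\n' | nc localhost 11211
# (assumes nc is installed).
printf 'STAT pid 1234\r\nSTAT curr_items 319116\r\nEND\r\n' | stat_value curr_items
```

Asking for a counter that is not in the reply simply prints nothing.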

Our pre_warming script has been running for almost an hour and has warmed the cache with 319,116 items – so about 52 hours in total, or roughly 51 still to go….

In terms of memory usage, I am running memcached with a gig of RAM, and after the hour we are only using 4.7 meg of that :) . So my estimate is that we would need approximately 250MB.

Target = 16,581,375 items (colours)
1 hour = 319,116 items
1 hour = 5,005,741 bytes written
so 16,581,375/319,116 = 51.96 Hours
so 5,005,741*51.96 = 260,098,302 bytes = 248.05 MB

And now I’ve done these calculations, something makes me want to revisit my memcached monitor script so that I pass it a target and then get a “time left” estimate :)
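A first stab at that “time left” sum might look like the sketch below. It assumes the fill rate stays constant, and `hours_left` and its three arguments (target items, current items, hours elapsed) are names invented here for illustration, not anything from the real script.

```shell
#!/bin/bash
# Estimate the hours remaining until the cache holds the target number of
# items, assuming items keep arriving at the average rate seen so far.
# hours_left is a hypothetical helper: hours_left TARGET CURRENT ELAPSED_HOURS
hours_left() {
    local target="$1" current="$2" elapsed_hours="$3"
    awk -v t="$target" -v c="$current" -v e="$elapsed_hours" 'BEGIN {
        # No rate can be worked out until some items and some time exist
        if (c <= 0 || e <= 0) { print "unknown"; exit }
        rate = c / e                 # items per hour so far
        left = (t - c) / rate        # hours needed for the remaining items
        if (left < 0) left = 0       # already past the target
        printf "%.2f\n", left
    }'
}

# Figures from this post: 16,581,375 target, 319,116 items after 1 hour
hours_left 16581375 319116 1    # prints 50.96
```

That is the 51.96 hour total from the sums above, less the hour already spent.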

Colourphon ontology for digital images

Monday, November 17th, 2008

I was recently talking about my experiences with OWL over on ITO, and so have decided to go ahead and put my OWL ontology out there, and invite comment.  I am quite open to criticism, as I am sure that I will need to revise this, but here it is.

The aim of this ontology is to clarify some of the classes of owl:Thing that concern us here at Colourphon.

We are describing the colours found within digital images. Note: I use the term ‘digital image’ as opposed to just ‘image’ as we are specifically dealing with data gleaned from digital representations of images, and not images in any other format.

Currently the ontology deals with DigitalImage, the image itself; Pixel, a point within the image; Coordinate, the location of a pixel; RGBValue, the RGB value of the pixel; ColourName, the word or phrase used to describe the colour; and GuessedColourName, a word or phrase used to describe the colour (but one that may not be accurate).

BTW, this ontology is modelled using Protégé 4, if you want to see it as I see it…

If you are aware of any work already in this area, then please do let us know.

memcached

Friday, November 7th, 2008

The Wolverhampton Hell’s Angels are shaking our windows and foundations with their stomach-churning firework display, which I would be watching now rather than typing this, but for the trees at the back of the garden that are in the way!

This post comes in two parts, the good news and the bad news.

First the good news. At lunchtime, Rich and I had a chat while doing our usual circuit of the business park, and we were talking about ways to make Colourphon quicker.  We know we can do the analysis, which takes a bunch of coloured pixels and puts a human-friendly name to those occurring most frequently in the image.  The problem was that this was taking upwards of 30 seconds, so PHP, quite rightly, kept throwing back a maximum execution time error.

Rich has been thinking a lot about application architecture, and in particular memcached.

So tonight, on my Ubuntu development machine, I installed memcached:

sudo apt-get install memcached

Then I installed the memcache PECL extension for PHP:

sudo pecl install memcache

Then I added a function to instantiate a Memcache object with an array of servers, finding a neat way to get an array stored as a constant: the constant holds a string of PHP code (the var_export of the array) that returns the array when evaluated.

$arr=array(1,2,3);

define("ARRAY_CONSTANT","return ".var_export($arr,1).";");

Then when we want to use the constant we simply evaluate it.

foreach (eval(ARRAY_CONSTANT) as $val) {
    // do stuff with $val
}

The upshot is:
1st page load: 29.94 seconds – including several hundred calls to the class with most processing overhead.
2nd page load: 1.61 seconds – with not a single call to the class with most processing overhead!!

So if you need a massive performance boost, use memcache. It was originally designed for database calls, but if you want to cache a bunch of frequently used data, even objects, simply serialize it when you store it and deserialize it when you need it.

It has taken longer to write and proofread this post than it did to add the 12 lines of code to get it to work.

The bad news?  This isn’t live on a public machine yet :) .