Simon Engledew's Blog

Eldritch nomenclature, conjured for arcane contraptions to execute unerringly.

May 15, 2009 at 8:52am

Greedy and Reluctant Quantifiers

A common gotcha with regular expressions comes from the default (greedy) quantifiers +, ? and *. These quantifiers attempt to make the longest match they possibly can.

The regular expression:

/'(.*)'/

will successfully match and group the word there in hello 'there', but in the string hello 'there' how are 'you' it runs on to the last single quote, capturing there' how are 'you.

One solution is to restrict the set of characters the greedy quantifier is allowed to consume, ensuring the match finishes before it reaches the terminating character:

/'([^']*)'/

This works, but the more readable option is to turn the greedy quantifier into a reluctant one:

/'(.*?)'/

By adding a ? to the quantifier, the expression will try to match the shortest string that satisfies it.
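
To see the three patterns side by side, here is a minimal sketch using Python's re module (my choice for illustration; the patterns themselves are the ones above and behave the same in most regex flavours). The variable names are made up for the example.

```python
import re

text = "hello 'there' how are 'you'"

# Greedy: .* runs on to the last single quote in the string.
greedy = re.search(r"'(.*)'", text).group(1)

# Negated character class: the match cannot cross a single quote.
negated = re.findall(r"'([^']*)'", text)

# Reluctant: .*? stops at the first closing quote it can.
reluctant = re.findall(r"'(.*?)'", text)

print(greedy)     # there' how are 'you
print(negated)    # ['there', 'you']
print(reluctant)  # ['there', 'you']
```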

May 14, 2009 at 2:50pm

Making Java Processes Play Nice

Looking around, it seems that there is no easy way to stop Java from eating all your system resources when running a particularly heavy-going task.

Thankfully my lovely colleague Ben made me aware of a helpful UNIX command called nice.

By prefixing nice to any command you can ask the scheduler to be a bit more kind, running the process at a lower priority (by default nice adds 10 to its niceness) to ensure it doesn’t starve other processes of CPU time:

nice java ExpensiveTask
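
If you are launching the task from a script rather than a shell, the same idea can be sketched in Python. This assumes a Unix-like system with nice on the PATH; run_niced is a hypothetical helper, and the child here is just a Python interpreter standing in for java ExpensiveTask so that it can report its own niceness.

```python
import os
import subprocess
import sys


def run_niced(command, adjustment=10):
    """Run command under nice(1), adding `adjustment` to its niceness."""
    return subprocess.run(
        ["nice", "-n", str(adjustment)] + list(command),
        capture_output=True,
        text=True,
        check=True,
    )


# os.nice(0) adds nothing and simply returns the current niceness.
current = os.nice(0)

# Stand-in for the expensive task: a child that prints its own niceness.
child = run_niced([sys.executable, "-c", "import os; print(os.nice(0))"], adjustment=5)
child_niceness = int(child.stdout.strip())

print("parent niceness:", current)
print("child niceness:", child_niceness)
```

The child should report a niceness up to 5 higher than the parent, capped at the system maximum.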

April 28, 2009 at 2:46pm

Progressive Enhancement Using Comments

Most methods of progressive enhancement involve setting various DOM elements to display:none and then making them visible from JavaScript, or building new document nodes and inserting them.

In the former case you end up with lots of brittle snippets of code for traversing the DOM and toggling elements. In the latter case you often end up having to write the same markup twice: once in your web application, and once in the JavaScript which enhances your code.

I recently read a blog post by James Padolsey that suggested creating comment nodes with HTML inside and then promoting their contents to real nodes within their parent elements. Neat.

The comments for this article on Hacker News quickly moved to performance, however: as you can’t natively scour the DOM for comments, it is necessary to iterate over every DOM node from JavaScript, checking its type and manipulating it if applicable. An expensive business.

If performance really is an issue, but you like the idea of baking your HTML directly into the page as comments, you can replace the DOM traversal with a single regular expression replace and an innerHTML assignment. Both of these operations are implemented natively by the browser and run at native speed:

document.body.innerHTML = document.body.innerHTML.replace(/(?:<!--\[enhance\]>)|(?:<!\[enhance\]-->)/g, '');
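
To make it clearer what that replace does to the markup, here is the same pattern applied to a string in Python. This is only an illustration of the regular expression, not of the browser side; the sample page content is made up.

```python
import re

# Same alternation as the JavaScript above: match either the opening
# or the closing [enhance] marker so that both can be stripped.
ENHANCE_MARKERS = re.compile(r"(?:<!--\[enhance\]>)|(?:<!\[enhance\]-->)")

page = "<p>Always visible</p><!--[enhance]><p>JavaScript only</p><![enhance]-->"

# Removing the markers turns the commented-out markup into real markup.
promoted = ENHANCE_MARKERS.sub("", page)

print(promoted)  # <p>Always visible</p><p>JavaScript only</p>
```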

Wrapped up into a tidy Rails helper, you can use this to create blocked out elements which will only render if the user has JavaScript enabled:

def enhancement(&block)
  concat("<!--[enhance]>#{capture(&block)}<![enhance]-->")
end

<% enhancement do %>You have JavaScript enabled<% end %>

Obviously it’s not quite as flexible as the DOM-based solution, especially as James seems to be branching out into a full-blown templating system called JSHTML, but I think it works as a speedy way to encapsulate JavaScript-only functionality.

March 9, 2009 at 1:53pm

MySQL Remote Database Transfers

One-liner to remote copy a MySQL database over SSH:

mysqldump [db] | ssh -C [host] 'mysql [db]'
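
If you need to generate that one-liner from a script, a small sketch like this fills in the [db] and [host] placeholders with shell quoting applied. transfer_command is a hypothetical helper, and the database and host names are made-up examples.

```python
import shlex


def transfer_command(database, host):
    """Build the mysqldump-over-SSH pipeline as a shell command string."""
    # The remote command is quoted as a whole so ssh receives it as one argument.
    remote = "mysql " + shlex.quote(database)
    return "mysqldump {} | ssh -C {} {}".format(
        shlex.quote(database), shlex.quote(host), shlex.quote(remote)
    )


command = transfer_command("app_production", "db.example.com")
print(command)  # mysqldump app_production | ssh -C db.example.com 'mysql app_production'
```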

For anything more complicated there is also taps.

March 6, 2009 at 11:39am

Reflection and Introspection Over Modules and Packages In Python

Someone Twitter me if I’m missing something, but I couldn’t find a core way of doing reflection over packages in Python.
In this particular case, I wanted a way to load all the modules in a certain package (a directory with an __init__.py) and automatically add them into a running Twisted service.

To get this working, I created a small module called reflection:

import os
import re
import sys


def camelize(string):
    """Turn a lower-case string with underscores into its camel-case equivalent."""
    return re.sub(r"(?:^|_)(.)", lambda x: x.group(1).upper(), string)


def dir_modules(package_name):
    """Return a list of module objects in the package identified by package_name."""
    modules = []
    load_package(modules.append, package_name)
    return modules


def load_class(module, class_name=None):
    """Load class_name from module, defaulting to the camelized module name.

    module may be a module object or a dotted module name.
    """
    if isinstance(module, str):
        # __import__ returns the top-level package, so fetch the module itself from sys.modules.
        __import__(module)
        module = sys.modules[module]
    return getattr(module, class_name or camelize(module.__name__.split('.')[-1]))


def load_package(function, package_name):
    """Call function with every module under the package directory (os.path.walk is Python 2 only)."""
    os.path.walk(package_name, load_modules, function)


def load_modules(function, package_name, module_names):
    """os.path.walk callback: import each .py file not starting with __ and pass it to function."""
    for module_name in module_names:
        if re.match(r"^(?!__)\w*\.py$", module_name):
            # Sub-directories arrive as paths, so turn separators into dots.
            qualified_module_name = '%s.%s' % (package_name.replace(os.sep, '.'), module_name[0:-3])
            __import__(qualified_module_name)
            function(sys.modules[qualified_module_name])

My Twisted application then uses this code in its initializer:

xmlrpc.XMLRPC.__init__(self)
xmlrpc.addIntrospection(self)
for module in reflection.dir_modules('services'):
    # The handler name and config key are the module's unqualified name.
    name = module.__name__.split('.')[-1]
    klass = reflection.load_class(module)
    if issubclass(klass, xmlrpc.XMLRPC):
        print('adding xmlrpc sub handler from %s' % module.__name__)
        self.putSubHandler(name, klass())
    elif issubclass(klass, internet.TimerService):
        print('initializing timer service from %s' % module.__name__)
        instance = klass(Application.Config[name]['interval'])
        instance.setServiceParent(Service.Application)
        instance.startService()

Now you can drop new modules into the services directory and, provided they are named correctly, they will be auto-loaded into the application on restart.
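
For comparison, here is a rough sketch of the same package walk using pkgutil and importlib from the standard library. This is a different technique from the reflection module above, written for Python 3; iter_package_modules is a hypothetical name, and the json package is used only because it is always available to demonstrate on.

```python
import importlib
import pkgutil


def iter_package_modules(package_name):
    """Import and yield each module directly inside a package, skipping names starting with '_'."""
    package = importlib.import_module(package_name)
    for info in pkgutil.iter_modules(package.__path__):
        if not info.name.startswith("_"):
            yield importlib.import_module("%s.%s" % (package_name, info.name))


# Demonstrated on a standard library package; 'services' would work the same way.
names = sorted(module.__name__ for module in iter_package_modules("json"))
print(names)
```

Unlike the directory walk, this only looks at the top level of the package; pkgutil.walk_packages is the recursive counterpart.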

March 5, 2009 at 4:10pm

Killing Python: Exiting Without Using SystemExit

Usually in Python you exit a script programmatically by raising SystemExit or calling sys.exit. Both methods send an exception hurtling up the call stack, allowing every level of your program to execute finally statements and exit cleanly.

This behaviour changes if you raise SystemExit in a multithreaded application: it kills the calling thread instead. If the calling thread is not the main thread the application will just keep ticking along. Fine in most cases, but sometimes you just need to make the application quit.

To achieve this, you need to call:

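os._exit(status)

with an exit status of your choice. Unlike sys.exit, this ends the process immediately: finally blocks and atexit handlers do not run, and buffered output is not flushed.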
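
The difference is easy to see in a small experiment. The sketch below starts a child Python process twice: once with a worker thread calling sys.exit, once with it calling os._exit. The CHILD script, the run_child helper and the exit status 3 are all made up for the example.

```python
import subprocess
import sys
import textwrap

# Script run in a separate interpreter so that os._exit cannot kill this one.
CHILD = textwrap.dedent("""
    import os, sys, threading

    def worker(hard):
        if hard:
            os._exit(3)   # ends the whole process immediately
        sys.exit(3)       # raises SystemExit, which only ends this thread

    thread = threading.Thread(target=worker, args=(sys.argv[1] == "hard",))
    thread.start()
    thread.join()
    print("main thread still running")
""")


def run_child(mode):
    """Run CHILD with mode 'soft' (sys.exit) or 'hard' (os._exit)."""
    return subprocess.run(
        [sys.executable, "-c", CHILD, mode], capture_output=True, text=True
    )


soft = run_child("soft")
hard = run_child("hard")

print(soft.returncode, repr(soft.stdout))
print(hard.returncode, repr(hard.stdout))
```

In the soft case the main thread carries on and the process exits normally; in the hard case the process ends with status 3 before the main thread prints anything.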

February 13, 2009 at 3:59pm

Creating Static Methods In Python

Static methods can be a little confusing if you come to Python from other languages in which they are first-class citizens. To create a static method you need to pass an existing method through staticmethod():

class Person:
  people = {}
  def __init__(self, name):
    self.name = name
    Person.people[self.name] = self

  def find_by_name(name):
    return Person.people.get(name)

  find_by_name = staticmethod(find_by_name)

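This does the hokey Python magic for you: staticmethod wraps the function in a descriptor stored in the class’s __dict__, so looking it up on the class or on an instance hands back the plain function with no self bound.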

Thankfully, Python 2.4 and later (including 2.5.1, the version which ships with Leopard) allow you to use the slightly more aesthetically pleasing decorator shortcut:

class Person:
  people = {}
  def __init__(self, name):
    self.name = name
    Person.people[self.name] = self

  @staticmethod
  def find_by_name(name):
    return Person.people.get(name)
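
A quick usage sketch of the decorated version, showing that the method can be called on the class or on an instance without self being passed. The names Ada and Grace are made up for the example.

```python
class Person:
    people = {}

    def __init__(self, name):
        self.name = name
        Person.people[self.name] = self

    @staticmethod
    def find_by_name(name):
        return Person.people.get(name)


ada = Person("Ada")

found_via_class = Person.find_by_name("Ada")    # called on the class
found_via_instance = ada.find_by_name("Ada")    # called on an instance, no self passed
missing = Person.find_by_name("Grace")          # unknown names give None

# The class stores a staticmethod wrapper rather than the bare function.
raw_attribute = Person.__dict__["find_by_name"]

print(found_via_class is ada, found_via_instance is ada, missing)
print(type(raw_attribute).__name__)
```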

February 1, 2009 at 5:48pm

Rails 2.2 Templates

TemplateHandlers have been significantly overhauled in Rails 2.2, and these changes are not backwards-compatible with Rails 2.1.

Instead of being responsible for rendering a template, TemplateHandlers should now provide a string of Ruby that will be eval’ed by ActionView further along the rendering chain.

So, where previously a TemplateHandler might have declared a render method:

class DotHandler < ActionView::TemplateHandlers::ERB

  def render(template, local_assigns = {})
    @view.controller.headers["Content-Type"] ||= 'image/png'

    input = super(template)

    output = IO.popen('dot -Tpng', 'r+') do |io|
      io.write(input)
      io.close_write
      io.read
    end

    output
  end
end

ActionView::Template::register_template_handler 'dot', DotHandler
ActionController::Base.exempt_from_layout :dot

TemplateHandlers should now provide a compile method that returns Ruby code for execution later:

class DotHandler < ActionView::TemplateHandler

  def compile(path)
    <<-EOS
      controller.response.content_type ||= Mime::PNG
      #{ActionView::Template.handler_class_for_extension('erb').call(path)}
      @output_buffer = IO.popen("dot -Tpng", 'r+') do |io|
        io.write(@output_buffer)
        io.close_write
        io.read
      end
    EOS
  end

end

ActionView::Template.register_template_handler 'dot', DotHandler
ActionView::Template.exempt_from_layout 'dot'

To chain template handlers, find the appropriate handler with ActionView::Template#handler_class_for_extension and then call it. The code it returns can then be injected into your compiled template.

January 8, 2009 at 11:21am

Setting The Default Java Virtual Machine On Ubuntu

Select the default Ubuntu JVM with update-alternatives:

sudo update-alternatives --config java

December 15, 2008 at 9:19pm

Monit And MySQL

Usually Monit will be happy to watch MySQL with the following configuration:

check process mysqld with pidfile "/var/run/mysqld/mysqld.pid"
  group database
  start program = "/etc/init.d/mysql start"
  stop program = "/etc/init.d/mysql stop"
  if cpu > 60% for 2 cycles then alert
  if cpu > 80% for 10 cycles then restart
  if failed port 3306 protocol mysql then restart

If you are getting connection refused errors, your MySQL database may not be bound to localhost. Check your bind address in /etc/mysql/my.cnf:

bind-address = 0.0.0.0

If MySQL is bound to an external address rather than localhost, point the Monit check at that host in your monitrc:

if failed host 0.0.0.0 port 3306 protocol mysql then restart
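
When tracking down connection refused errors, it can help to probe each candidate address by hand before editing monitrc. The sketch below is a plain TCP connect check, which is less than Monit's protocol mysql test does; can_connect is a hypothetical helper, and the address and port in the usage lines are only examples.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Example probe: is anything listening on the local MySQL port?
    print("127.0.0.1:3306 reachable:", can_connect("127.0.0.1", 3306))
```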