Wednesday 10 June 2009

XML, Objects and Python

I briefly mentioned that I had created two object related schemas for XML in my previous post and then elaborated on XOMS, the mapping schema. This time I'll elaborate on XOS - XML Object Schema. This is a schema I created to allow the specification of classes in XML (although I've called them objects for the XML syntax). I consider this less interesting than the mapping, although the implementation of the XOS processor in python led to some more interesting problems to overcome.

First though, a quick overview of the XML schema. This is simpler than the XOMS schema. Here is a schema for the same data model as seen previously:

<xos>
<xos:object name="Person">
<xos:attribute name="name" type="xos:string" />
<xos:attribute name="address" type="Address" />
</xos:object>
<xos:object name="Address">
<xos:attribute name="address" />
</xos:object>

Again, this is fairly self explanatory. Given this schema, the XOS processor would produce two classes, one called Person, one called Address, with the attributes specified. In most languages, the XOS processor would really be more of a pre-processor - taking an object model and outputting a set of files that define the classes required. In Python though, I could go a step further and process a model at run-time, creating the classes dynamically and registering them for use the same as any other class.

This is where the more interesting stuff came in - dynamically creating classes in python. It turns out that doing this is remarkably easy, with the following code:


def create_object(object_name):
try:
getattr(sys.modules['__main__'], object_name)
except AttributeError:
class BaseObject(object):
pass
BaseObject.__name__ = object_name
BaseObject.__module__ = '__main__'
setattr(sys.modules['__main__'], object_name, BaseObject)

(apologies for the lack of indentation. It should be clear where the indentation should be though)

This surprisingly simple bit of code creates a 'template' object derived from object, assigns it a name and a module and then registers it in the '__main__' module with the same name. It doesn't create attributes on the object, but with python this isn't required as setattr() can add arbitrary attributes to an object or class. I have plans to add in some ability for this in the future though, as I could then provide chunks of python code to do things such as create SQLAlchemy database objects on the fly.

The next step is then allowing these to be defined in arbitrary modules. For this, I needed another function:

def create_module(module_name):
if sys.modules.has_key(module_name):
return
previous_module = sys.modules['__main__']
full_mod_name = ""
for mod_name in module_name.split('.'):
full_mod_name = ".".join([full_mod_name, mod_name])
try:
previous_module = getattr(previous_module, mod_name)
except AttributeError:
mod = type(previous_module)(name=full_mod_name)
sys.modules[full_mod_name] = mod
setattr(previous_module, mod_name, mod)
previous_mod = mod


This is a fairly interesting function too. It first checks that the module being asked for hasn't already been loaded (the first line of the function). If it hasn't been loaded then the function loops through the module name split down by '.' and builds up the fully qualified name through the loop. For each loop iteration, it first checks that the previous module doesn't already have a module with the expected name. If it doesn't it creates a module with the fully qualified name, registers it in the sys.modules dictionary with that name as well, then uses setattr() on the previous module to set it with the individual name. It then sets the new module as the previous module and iterates.

The next step for XOS would be to have an inheritance mechanism modelled in the XML. Some preliminary experiments in python have shown that I'll need to use a metaclass to correctly set the base class for the objects created but I haven't finished this yet.

In the end, what this experiment has shown is that it's possible to re-implement the python object and module creation using Python and your own syntax. More surprisingly, what has been shown is that the actual creation is easy! Less than 20 lines of code to create objects and modules, something that wouldn't even be possible in something like C++ and would require huge amounts of reflection code in C# or Java.

No comments:

Post a Comment