Skip to main content

Posts

Showing posts from July, 2009

Compatibility Table

With the canonical code in place (thanks to open source :) the structure generation goal is much nearer. The first thing to improve is the bond compatibility.


This image shows cuneane (again!) and the bond compatibility table. The table is tricky to calculate, but relatively easy to understand; there is only one bond between atoms of type A - so there is a one in the cell (A, A).
Conversely, no atom of type C is connected to an atom of type A (see this for more detail), so there is a O in both (C, A) and (A, C). Note that the table is not symmetric, as can be seen with (A, B) = 1 and (B, A) = 2. This makes sense, in that an atom of type A is connected to two of type B, and yet an atom of type B is only connected to one of type A.

Crude port of Faulon's C-implementation

I finally did the sensible thing and just ported over the c code. I don't particularly like doing this, as the resulting code is probably quite fragile, and unreadable - java spoken with a c 'accent' usually is.

However, it does reproduce the same signatures as the c code, so that's good. It only took 1 day to port, but 2 days to debug...

Check it out here : http://github.com/gilleain/generation/tree/master

Nicer tree pictures

Here are some pictures of proper trees, or rather proper signatures, to celebrate getting the translator program to compile (it involved replacing some 'local' keywords with 'define's - who knew?).

Also, thought I would post the code for tree layout I'm using:
public int layout(Node node) {
node.y = node.depth * ySep;
if (node.isLeaf()) {
leafCount += 1;
node.x = leafCount * xSep;
return node.x;
} else {
int min = 0;
int max = 0;
for (Node child : node.children) {
int childCenter = layout(child);
if (min == 0) {
min = childCenter;
}
max = childCenter;
}
if (min == max) {
node.x = min;
} else {
node.x = min + (max - min) / 2;
}
return node.x;
}
}
basically, it lays out the leaves first, then returns their ce…

Tree Canonization Simplified

While debugging the methods to make canonical signatures, I learned something about tree isomorphism from various sources, including Prof. Valiente's excellent looking book on trees.

One way of checking isomorphism is canonisation, since two trees are only isomorphic if they have the same canonical form. For simple labelled trees, it looks like there is an almost trivial way to get a canonical string representation. Say we have two trees:

The are rooted, labelled trees. So the conversion to a canonised string proceeds as follows; for each node, lexicographically sort the string form of the labels of its children, and return the concatenated string. In python this looks like:


which...is unreadable. hmmm. Wish there was a better way to get marked-up code into blog posts. Perhaps there is one, and I don't know of it. Anyway, the point is that it is very short.
edit : the code is...

def printSorted(node):
if len(node.children) > 0:
childStrings = [printSorted(child) for child in n…

Square Grids, Cylinders, Spheres, and Toruses

This is straying from the point; but if any graph can be described by canonized trees made from its subgraphs, then what are the properties of very large (regular) graphs? A grid, for example?
Starting with a square, and fusing squares together results in this situation:

two and three fused squares are similar, at the height-two tree level. With four rings, a new type c appears (b becomes b' with three rings). Beyond this point, any number of rings fused in a row like this has the same structure.
Moving to a second dimension of growth, a square grid (Gnn) has the following structure:

On the left are 'snapshots' of the advancing wave of the tree. Looks a lot like a breadth-first search, I suppose. On the right are the trees for each snapshot, with increasing heights. Although this is not shown, 8 of the 20 leaves of the third tree are duplicates. This should be clear from the third snapshot, which has only 12 (filled) circles on the 'wavefront'.
These trees can even be e…

Warning : Abstraction!

This is a throwaway mathematical point, that I am not qualified to make, but it looks like three of the previous examples (diamantane, twistane, and cuneane) have a very abstract connection when colored by signature:

what I mean by this diagram is that diamantane has atoms colored by (a) connected to both other (a) atoms, and to (b) atoms. Its (c) atoms are only connected to (b)s; the arrows could well be double-headed, by the way.
The most complex situation is cuneane, where each 'type' of atom is connected to another in its type and to two in another type. Adamantane would just look like : (a)-(b).
Interesting, but it doesn't get the signature canonization methods debugged any faster...

Adamantane, Diamantane, Twistane

After cubane, the thought occurred to look at other regular hydrocarbons. If only there was some sort of classification of chemicals that I could use look up similar structures. Oh wate, there is.
Anyway, adamantane is not as regular as cubane, but it is highly symmetrical, looking like three cyclohexanes fused together. The vertices fall into two different types when colored by signature:
The carbons with three carbon neighbours (degree-3, in the simple graph) have signature (a) and the degree-2 carbons have signature (b). Atoms of one type are only connected to atoms of another - the graph is bipartite.

Adamantane connects together to form diamondoids (or, rather, this class have adamantane as a repeating subunit). One such is diamantane, which is no longer bipartite when colored by signature:

It has three classes of vertex in the simple graph (a and b), as the set with degree-3 has been split in two. The tree for signature (c) is not shown. The graph is still bipartite according to th…