Skip to main content

Posts

Showing posts from May, 2010

Non-chemical example : Grinberg's graph

While looking for an algorithm to lay out fullerenes as 2D graphs, I came across this paper (PDF). It describes an annealing/spring layout method for Schlegel diagrams. I don't think I can spare the time to implement it at the moment, but one of the graphs in the paper is this:

known as the Grinberg graph.

C60 double bonding networks

C60 (or buckminsterfullerene) has no hydrogens, so it must have quite a few double bonds. I am beginning to understand that bond orders are in some sense a simplification, and I suppose that the bonding is in some way delocalised across the sphere. However, here is a picture of two different bonding patterns:

(click for bigger, as usual). The 'ChEBI' bonding pattern on the left is from the molfile in a ChEBI entry while the 'radial' bonding one on the right is bonded according to schemes from this site which has an interesting graph-theory perspective on fullerenes.
The radial bonding version has a simpler, layered structure like this:

Ok, so that's a slightly comical picture of the slices. What is also nice is the quotient graph for the ChEBI structure:

I didn't color this, but what is great is that it looks like a subgraph of a fullerene! Apart from the loops along the top and bottom, of course.

Fullerene symmetries

Continuing the theme of colored graphs, some of the more interesting examples are fused ring structures, especially those with some symmetries, but not completely symmetrical. Fullerenes fit this description, for example this 26 atom example:

the distribution of colors might look a little odd, but the dark blue atom surrounded by three cyan atoms is actually repeated at the top - which is really the other side of the sphere. These kind of 3D molecules don't lay out very well with the CDK's layout code, so I used JChempaint instead to make a more symmetric 28-fullerene:

This is actually a screengrab of a crude viewer I put together (commit) that takes a molfile and calculates the signatures. Selecting a signature from the right hand list highlights it on the graph. Anyway, it's easier than making images like this by hand:

Except that the little graph at the bottom (which is a quotient graph) can't be drawn by the renderer as it has loop-edges.

Chemicals as colored graphs

The interface between maths and chemistry can be tricky when it comes to terminology - sets (maths) have elements, chemistry has a different kind of element; graphs have colors which are usually just numbers, diagrams of chemicals have colors which usually relate to the element type of the atom, and so on.

So, for maximum confusion, here are two pictures of graphs (that could represent chemical connectivity) colored by equivalence class (determined by signature). The signature trees are also drawn with graphical colors, but these represent the integer colors in the signature, which are not the same as the colors used to indicate equivalence class. Firstly, a structure that the smiles algorithm is meant to have trouble with (but may not exist):


It looks quite strained, so I expect that it may not be possible to synthesise. Another multi-ring system is this one:



I don't even know what this one would be called, even if it did exist. Annoyingly, this structure triggers a bug if the two d…

Orienting a Pyrene diagram

Another example from Symmmetry in Chemistry, of drawing pyrene:



the standard (IUPAC?) orientation on the left doesn't show the symmetries of the molecule as well as the rotated version on the right. The letters indicate equivalent atom positions - oh, and aromatic indicators are missing :)
It seems like there should be a way to discover symmetry axis from the graph - without coordinates. In a similar way that some ring perception algorithms work by reducing the graph:


However, it is not obvious that this would be better than laying out the structure, then using the 2D coordinates to determine symmetries. Also not clear to me is how to choose which vertices to merge, and which to duplicate. On the left hand side of the image above, the fragments have duplicate (a), (d), and (e) vertices, but (e) has also been merged before being duplicated.

1,2-dichlorocyclopropane and a spiran

As I am reading a book called "Symmetry in Chemistry" (H. H. Jaffé and M. Orchin) I thought I would try out a couple of examples that they use. One is 1,2-dichlorocylopropane :

which is, apparently, dissymmetric because it has a symmetry element (a C2 axis) but is optically active. Incidentally, wedges can look horrible in small structures - this is why:

The box around the hydrogen is shaded in grey, to show the effect of overlap. A possible fix might be to shorten the wedge, but sadly this would require working out the bounds of the text when calculating the wedge, which has to be done at render time. Oh well.
Another interesting example is this 'spiran', which I can't find on ChEBI or ChemSpider:
Image again courtesy of JChempaint. I guess the problem marker (the red line) on the N suggests that it is not a real compound? In any case, some simple code to determine potential chiral centres (using signatures) finds 2 in the cyclopropane structure, and 4 in the spiran…

Stuck : Detailed Description

Ok, so this is the detailed version of the previous post.
To recap; the structure generation code is still missing a vital piece - the canonical checking. I have been implementing Jean-Loup Faulon's algorithm for generation, but there is no precise algorithm given for canonical checking. Here the relevant paragraph from the enumeration paper:
"Checking for canonicity is a common procedure of orderly enumeration algorithms, the procedure guarantees that the graphs generated are nonisomorphic. ...To verify that a graph is canonical, one labels the vertices of the graph in all possible ways. The graph is canonical if the initial labeling leads to a list of edges that is lexicographically smaller than the lists obtained with all other labelings. In the present paper, we have implemented two algorithms to verify canonicity, Tarjan tree canonization algorithm if the tested graph is acyclic and McKay’s Nauty technique otherwise" Okay, so I don't understand this for a couple of…

Stuck : The Summary

A couple of people have asked how the structure generation stuff is going, and the short answer is that I am stuck. This post will give a short summary of the problem, and the next will give a much more detailed description.
So the problem is this : given an elemental formula (like C6H12) or a list of fragments plus a formula (like {2 * CH, 2 * CH2, 2 * CH3 : C6H12}) return all possible connected structures.
There are simple ways to do this, such as connecting every atom to every other atom, and removing duplicates. The downside is that this takes forever, because this procedure will make many, many isomorphic copies of each solution. At the final filtering step, an all-v-all comparison would have to be done on these many copies.
A better solution is to check each structure each time a bond is made, to see if it is canonical. Although I know how to do this in theory, it turns out to be more difficult in practice. For simple graphs, I have a solution that seems to work. Chemicals are not …

Debugger