Skip to main content

Posts

Showing posts from June, 2009

Cuneane

While taking a look at the InChI discussion mail archives, I came across a discussion on graphs that are difficult to canonize (so called 'isospectral' graphs - I've heard the term before, but I haven't worked with eigenvalues of adjacency matrices, so did not pay them much attention).
Anyway, one such was this structure, cuneane:

which shows 3D and 2D representation of the molecule (reproduced from the wiki page), and an arbitrary numbering in the center. The three signatures A, B, and C are shown labelled by this numbering.
What's interesting about this particular example is the number of times that the same atom is represented in the signatures. For the signature called 'B', which is rooted at atom 2, the atom numbered 5 appears four times in the lowest layer of the tree. This naturally follows from the large number of rings of different sizes that make up cuneane - two 5-membered rings, two 4-membered rings, and two three-membered rings.

Two pass rendering

So, there was a question on the cdk-devel mailing list about bounding boxes, reactions, and text. An unfortunate consequence of the new design is that the renderer will not calculate bounding boxes that can fully contain the text. Concretely, this is what it would look like (not made in JCP!)


The blue box is the bounds that would be created, which is minimal with respect to the atom centers. The black box is the bounds that should be created, if we respected the text size. The problem is, the size of text is not known until the point it is drawn. Or, more precisely, until we have some sort of GraphicsContext to ask about the width in a particular font.
So, a two-pass system was suggested. When this was mentioned before, I was dismissive - perhaps even rude. Sorry about that Egon, Sam. I still think it is better avoided; in the case of transparency, I don't know why alpha values can't be used for fill colours. I understand there was some SWT problems..?
Anyway, here is a sketch of…

More chemical signature example

A neat and tidy example of generating a structure from atoms and signatures occurred to me : adenine. It is known that this molecule forms readily from HCN under conditions similar to those thought to exist on the early Earth.
Anyway, the point is that the structure is almost exactly 6 C-N units (ignoring hydrogens and multiple bonds). A C-N is very similar to a height-1 signature, and the height-1 signatures for adenine are shown in the figure below:


Each signature (a-e) is mapped to an atom in the original structure, and then a table is shown of the maximum number of compatible bonds possible between each pair of signatures. Note that the table is not symmetric, as the compatibility operation is not commutative. Also, notice that the diagonal is not all 0 - bonds can form between atoms that both have b as a target signature.
Although these target signatures are smaller than the diameter of the adenine molecular graph, they still seem to define it completely. However, they are not guara…

Signature bond compatibility

So, given my previous posts on what Faulon's signatures are, here is an explanation of how they are used in the structure enumeration algorithm that I am almost finished implementing.
The core test in this algorithm is for compatible bonds. Two atoms are only joined if : a) they have compatible target signatures and b) there are less than the target number of bonds already. A target signature here is just a signature that is set on the atom for it to match, like a pattern.
The first of these tests is illustrated here:
Another (overly) complex diagram! But the formula here is a bit difficult to interpret otherwise. In the top left corner is a graph G (slightly resembling hexane without the hydrogens) which is, by convention, composed of vertices (V) and edges (E).
The equation to the right of the graph defines part of the condition for a compatible bond. The tau terms are just target signatures, as shown on the upper right. The tricky term is h-1στ(y)(z) which means 'a signature st…

The Voynich blog?

I apologise to any readers. I do tend to make complex diagrams. Perhaps some distant historian will find them in an archive and a new voynich-style mystery will result..

This does mean something, but you would have to read the previous post to discover what, exactly.

Faulon's Signatures : A Possible Interpretation

Several recent papers by Faulon concern an idea he calls 'signatures'. This post is just a record of what I understood them to be.
Firstly, a signature is a subgraph of a molecular graph. There is a distinction between atomic signatures - which is a tree rooted at a particular atom - and a molecular signature, which is the set of atomic signatures for each atom in a molecule.
A tree is a graph with no cycles, so an atomic signature is not just a subgraph. Like a path, a signature has a length - or rather a height. Here is a picture of signatures of heights 1-4 for a fused ring structure:

The graph G on the left has one of its atoms labelled (a), and each of the trees in the center is a signature rooted at that atom. On the right, is the simple string form of the tree, as a nested list. I should point out that the signatures in these images may not be canonical, as I worked them out by hand (as I have not yet fully implemented the canonization algorithm).
Signatures of the same hei…

Automorphism groups and fragment graphs

Structure generation involves not just graph theory, but group theory. Or, I should say, it does in some of the papers I have read. For example, in this paper by J.L.Faulon, there is the sentence:"The two main steps are to compute the orbits of the automorphism group of G and to saturate all the atoms of a chosen orbit
which may well be incomprehensible to many readers, except if the reader is a mathematician.
I am no mathematician, but thanks to some books on groups, I now understand both what an automorphism group is and what an orbit is. On the other hand, I also believe that this definition of how the algorithm works is overly complex. A more simple term might just be "fragment sets" - as it is fairly clear, if not mathematically exact. So, for the fragment graph [CH3, CH3, CH2, CH2, CH, CH] the fragment set is [CH3, CH2, CH].
Anyway, here is a short analysis of the automorphism group of the fragment graph [CH2, CH2]. This first image shows the tiny group of permutatio…