JmolSmilesTest.jar -- A Universal SMILES String comparator


This page uses the new JmolSmilesApplet to compare SMILES strings coming out of JME.
Draw two structures, one on the left and one on the right.
See if you can find an example that matches when it isn't supposed to, or vice versa!

A (target model)

B (search pattern)







JME: http://www.molinspiration.com/jme




Note that the test is deliberately asymmetric: ambiguous stereochemistry on the RIGHT matches defined stereochemistry on the LEFT, but ambiguous stereochemistry on the LEFT does not match defined stereochemistry on the RIGHT. This is what you would want if a student were drawing on the left and we were checking against a key on the right. That is, we test "studentResponse".find("SMILES","answerKey") [Jmol] or find("answerKey", "studentResponse", false, false) [JmolSmiles], not the other way around. We want to find the correct answer IN the student response.

Similarly, if we were searching a database, we would want LEFT to represent "a molecule in the database" and RIGHT to represent "our search". If our search specified a stereochemistry, we would want to skip database structures that do not match it; if our search left the stereochemistry unspecified, we would want it to match any constitutionally correct isomer in the database. That is, we would use "databaseMolecule".find("SMILES","searchMolecule"), which, if you think about it, makes sense.

In general, we want "targetString".find("SMILES","searchString") in JmolApplet or find("searchString", "targetString", false, false) in JmolSmilesApplet.
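
The asymmetry described above can be sketched for a single stereocenter. This is a toy illustration only, not the JmolSmilesApplet code: the function name stereo_label_matches and the use of "R"/"S"/None labels are assumptions made for the example (the real comparison works on atom ordering, not on labels).

```python
from typing import Optional

def stereo_label_matches(target: Optional[str], search: Optional[str]) -> bool:
    """Compare one stereocenter; labels are "R", "S", or None if unspecified."""
    if search is None:
        # An unspecified search center accepts any target configuration.
        return True
    # A specified search center needs the same, specified, target configuration.
    return target == search

print(stereo_label_matches("R", None))   # search ambiguous, target defined
print(stereo_label_matches(None, "R"))   # target ambiguous, search defined
```

Swapping the two arguments changes the answer, which is why the target (student response, database molecule) and the search (answer key, query) have to go in the right positions.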
   

How it works

The idea is really pretty straightforward. The JME applet is used to generate a SMILES string with stereochemistry. The JmolSmilesApplet (the little white dot in the blue square, above) creates two molecular graphs: one from the target SMILES string (from the left panel, the target model), and one from the search SMILES string (from the right panel, the search pattern). Notably, though, these graphs carry no spatial relationships, just topology. First, the algorithm checks for a constitutional match. This is a standard iterative connectivity search that follows the bonding of the search structure to see if it matches the topology inherent in the target.
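
A connectivity search of this kind can be sketched as a small backtracking match. This is a minimal sketch under stated assumptions, not the applet's implementation: the function name constitutional_match and the graph representation (a list of element symbols plus a list of index pairs for bonds) are hypothetical, and bond orders, aromaticity, and hydrogens are ignored.

```python
def constitutional_match(target_atoms, target_bonds, search_atoms, search_bonds):
    """Return a mapping {search index: target index}, or None if no match."""
    def adjacency(n, bonds):
        adj = [set() for _ in range(n)]
        for a, b in bonds:
            adj[a].add(b)
            adj[b].add(a)
        return adj

    t_adj = adjacency(len(target_atoms), target_bonds)
    s_adj = adjacency(len(search_atoms), search_bonds)
    mapping = {}

    def extend(i):
        # All search atoms placed: the search graph was found in the target.
        if i == len(search_atoms):
            return True
        for t in range(len(target_atoms)):
            # Skip target atoms already used or of the wrong element.
            if t in mapping.values() or target_atoms[t] != search_atoms[i]:
                continue
            # Every already-mapped neighbor of search atom i must be bonded to t.
            if all(mapping[n] in t_adj[t] for n in s_adj[i] if n in mapping):
                mapping[i] = t
                if extend(i + 1):
                    return True
                del mapping[i]  # backtrack and try the next target atom
        return False

    return dict(mapping) if extend(0) else None

# Ethanol (C-C-O) as the target, C-O as the search pattern.
print(constitutional_match(["C", "C", "O"], [(0, 1), (1, 2)], ["C", "O"], [(0, 1)]))
```

The returned mapping is what a stereochemistry check would need next: it says which target atom corresponds to each search atom.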

So how then does JmolSmilesApplet check stereochemistry -- and why does it do such an amazing job of matching? The stereocenters are checked one by one using a very fast and relatively simple strategy. Now, we don't have any 3D information, but we do know the "local winding" that is needed based on the target SMILES string. This involves two aspects:
  1. We have to order the atoms in the SEARCH model around the stereocenter based on the winding in the TARGET string, because it is the order in that string that defines the desired stereochemistry. Of course, several orders are equivalent, because in most stereochemical situations, swapping two groups and then swapping two groups again gives an equivalent structure.
  2. Rather than checking strings, we temporarily assign 3D coordinates to the atoms connected to the stereocenter. Just simple coordinates, like (1,0,0) and (0,1,0). Then we carry out a quick geometry winding match, checking to see how various vectors based on two sets of atoms around the stereocenter line up. This works with both double-bond and chirality-center stereochemistry. It is an extraordinarily fast test.
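
The two steps above can be sketched for a tetrahedral center as follows. This is an illustration of the general idea, not the applet's code: the name same_winding, the tetrahedron-corner coordinates, and the use of a signed volume as the "winding" measure are assumptions for the example, and double-bond stereochemistry is not covered.

```python
# Temporary coordinates: four corners of a regular tetrahedron (an assumption;
# any four non-coplanar points would do).
TETRAHEDRON = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

def signed_volume(a, b, c, d):
    """Scalar triple product (b - a) . ((c - a) x (d - a))."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    cross = (v[1] * w[2] - v[2] * w[1],
             v[2] * w[0] - v[0] * w[2],
             v[0] * w[1] - v[1] * w[0])
    return sum(u[i] * cross[i] for i in range(3))

def same_winding(reference_order, candidate_order):
    """True if two neighbor orderings around one center have the same handedness."""
    # Place the neighbors at the corners in the order given by the reference string.
    coords = {atom: TETRAHEDRON[i] for i, atom in enumerate(reference_order)}
    ref = signed_volume(*(coords[atom] for atom in reference_order))
    cand = signed_volume(*(coords[atom] for atom in candidate_order))
    # Same sign means same winding; an odd number of swaps flips the sign.
    return ref * cand > 0

print(same_winding(["F", "Cl", "Br", "H"], ["Cl", "F", "Br", "H"]))  # one swap
print(same_winding(["F", "Cl", "Br", "H"], ["Cl", "F", "H", "Br"]))  # two swaps
```

A single swap of two neighbors inverts the center, while two swaps restore it, which is the equivalence noted in step 1.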
And there you have it! A universal SMILES string comparator!