SMARTS Implementation

14-07-2004 09:48:33

Hello!


I am using the interface for SMARTS-search in order to find substructures. Can you tell me how the SMARTS search is implemented? If you can not tell me the exact details, perhaps you can give me a general description how the search is done?


Thanks for your help

ChemAxon a3d59b832c

14-07-2004 10:49:49

Our algorithm is mainly based on the Ullmann algorithm:





J. R. Ullmann. An algorithm for subgraph isomorphism.


Journal of the ACM, 1(23):31--42, 1976.





Prior to starting the "proper" searching algorithm, JChem's search makes statistics about atom and bond types and rejects matching immediately if the query has more of a type than the target. Furthermore, at exact searching equality is needed.





In smarts searching particularly, there are many query properties which have to be evaluated. If complex logical expressions are involved, the atom or bond expression is converted to an expression tree first.





Explicit and implicit hydrogens complicate things. For example if the query contains explicit hydrogens, fake hydrogens are put on the target molecule in place of the implicit hydrogens. These are then converted back to negative indices (of their neighbour) when a mapping has to be returned like at the call of findNext() and findAll().





You can also find valuable information in the Query Guide.





Please note that full SMARTS support is only available from JChem 2.3.


Pre-release is available here: http://www.chemaxon.com/restricted/jtest/





Best regards


Szabolcs