Searching results with smarts

User 8688ffe688

04-11-2005 01:41:00

We are experiencing some odd behavior that we cannot explain. Could you provide the logic and rationale, and possible options?





Question: Find all compounds in the ACD that have a five-membered aromatic ring with any atom at the 1 position.


Query: select count(*) from acd_chemstruct where jc_contains(smiles,'*1C=CC=C1') = 1


    Query: select count(*) from acd_chemstruct where jc_contains(smiles,'[C,O,S]1C=CC=C1') = 1


    Both queries produce results with no thiophene or furan.





    I'm thinking the smarts query should return thiophene and furan. Is this correct? If not, please explain.


    Query: select count(*) from acd_chemstruct where jc_contains(smiles,'S1C=CC=C1') = 1


    Query: select count(*) from acd_chemstruct where jc_contains(smiles,'O1C=CC=C1') = 1


    When I use the queries above thiophene and furan are returned in the results.





    We would like thiophene and furan to come back in the results when using SMARTS wildcard characters.
    ChemAxon a3d59b832c

    04-11-2005 08:46:10

    Matt,





    The problem is in the aromatization part:





    For chemically accurate search results, JChem handles both the query and database molecules in aromatized form. (See also: http://www.chemaxon.com/jchem/doc/user/Query.html#noteonaromatic )





    During aromatization, the bonds in the ring are converted to aromatic inside JChem. (See attached image.)


    The same is done in the query, as long as every possible value of each list atom in the ring makes the ring aromatic. (First and second row.)





    However, the ring in the third row is not aromatized because the list atom may take a value (carbon) which prevents aromaticity. The situation is the same with the ANY atom (*) that you used.
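
The all-or-nothing rule described above can be illustrated with a toy model. This is only a sketch and not JChem's implementation: the function name, the `LONE_PAIR_DONORS` set, and the `[O,S]` example pattern are assumptions made for illustration. It treats position 1 of a five-membered ring with two C=C bonds as aromatizable only if every allowed atom there is a lone-pair donor.

```python
# Toy model only, not JChem code.
# Assumption: in a five-membered ring with two C=C bonds, position 1 completes
# the aromatic sextet only if it is a lone-pair donor such as O, S or N
# (furan, thiophene, pyrrole). Carbon (cyclopentadiene) does not.
LONE_PAIR_DONORS = {"O", "S", "N"}


def query_ring_is_aromatized(position1_choices):
    """Return True only if every allowed atom at position 1 yields an aromatic ring.

    "*" stands for the ANY atom, which may be carbon, so it blocks aromatization.
    """
    return all(atom in LONE_PAIR_DONORS for atom in position1_choices)


for label, choices in [
    ("S1C=CC=C1", ["S"]),
    ("[O,S]1C=CC=C1", ["O", "S"]),
    ("[C,O,S]1C=CC=C1", ["C", "O", "S"]),
    ("*1C=CC=C1", ["*"]),
]:
    print(label, query_ring_is_aromatized(choices))
```

In this model the last two patterns stay in Kekule form, which is consistent with the observation that they do not retrieve the aromatized furan and thiophene structures.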





    When you are formulating the query, you can use the Edit/Bond/Aromatize function in Marvin to check how aromaticity will be handled by JChem. When such a confusing case appears, I suggest using query bonds like "single or aromatic" or "double or aromatic".





    In the future we intend to solve this situation, because other users also found this behaviour counterintuitive. In this solution both the aromatic and non-aromatic forms will be retrieved for these ambiguous cases.





    Best regards,





    Szabolcs

    User 8688ffe688

    28-11-2005 23:50:39

    Is there any way of retrieving only aromatic compounds? This is the behavior our chemists expect to see. Additionally, after some testing, this is the behavior that we see with ISIS and Daylight.

    ChemAxon a3d59b832c

    29-11-2005 10:50:41

    mpustel wrote:
    Is there any way of retrieving only aromatic compounds? This is the behavior our chemists expect to see.
    Yes. If all bonds of the query are changed to aromatic during drawing, only the aromatic compounds are retrieved.


    (Your SMARTS queries should be written as: c1cc[c,o,s]c1 and c1cc*c1)
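
For reference, the two aromatic patterns can be dropped into the same count query used earlier in the thread. The helper below is a hypothetical convenience for building the SQL text; only the table name, the column name, and the `jc_contains` call are taken from the thread, and the quote check is a simplistic guard rather than proper parameter binding.

```python
# Hypothetical helper: only acd_chemstruct, smiles and jc_contains come from the thread.
def count_query(smarts):
    """Build the count query used in this thread for a given SMARTS pattern."""
    if "'" in smarts:
        raise ValueError("SMARTS pattern must not contain a single quote")
    return (
        "select count(*) from acd_chemstruct "
        f"where jc_contains(smiles,'{smarts}') = 1"
    )


# Aromatic forms suggested above: a list atom and the ANY atom at one ring position.
AROMATIC_QUERIES = [count_query(s) for s in ("c1cc[c,o,s]c1", "c1cc*c1")]

for query in AROMATIC_QUERIES:
    print(query)
```

Since every bond in these patterns is aromatic, they should retrieve only the aromatic compounds, as described above.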
    mpustel wrote:
    Additionally, after some testing, this is the behavior that we see with ISIS and Daylight.
    This surprises me. AFAIK, ISIS does not consider pyrrole-like rings aromatic, which is why it returns all of them for the Kekule query.





    On the other hand, Daylight seems to be working exactly like us:


    http://www.daylight.com/daycgi_tutorials/depictmatch.cgi





    Best regards,


    Szabolcs

    ChemAxon a3d59b832c

    26-10-2006 10:00:43

    JChem 3.1 is out, and we have solved the above issue in this version. We have a new search option called "Vague bond level", and its default value - level 1 - handles the discussed ambiguous five-membered rings. See more details in the documentation below.





    http://www.chemaxon.com/jchem/doc/user/Query.html#vaguebond