User c2ffbfa8f8
30-09-2010 12:45:44
JChem version used: 5.3.4
Target: C1Cc2ccccc2=NC1
Query: [NX2;R]=[C;R]
Example code:
final Molecule target = MolImporter.importMol("C1CN=c2ccccc2C1","smiles");
final Molecule query = MolImporter.importMol("[NX2;R]=[C;R]", "smarts");
final MolSearch ms = new MolSearch();
final MolSearchOptions molSearchOptions = new MolSearchOptions();
molSearchOptions.setVagueBondLevel(MolSearchOptions.VAGUE_BOND_OFF);
ms.setSearchOptions(molSearchOptions);
ms.setTarget(target);
ms.setQuery(query);
System.out.println("HIT: " + ms.getMatchCount());
This matches, but I'd expected it NOT to match because the carbon in the target is aromatic. In this case, should I change the query to [NX2;R]=[!c;C;R], or is there a MolSearch option I can use?
thanks muchly
ChemAxon a3d59b832c
01-10-2010 10:54:11
Hi Derek,
The cause is the SMARTS import: based on valence considerations, it does not set the aliphatic property on the C atom.
(It skips the aliphatic property when aromaticity can already be excluded from the existing bonds and the total valence.)
Your target molecule basically has a valence error. (Pentavalent C - note that it is not possible to dearomatize it.)
So there is no MolSearch option that would prevent this match, as the query input already lacks this information.
But the query you proposed should work fine.
Best regards,
Szabolcs
User c2ffbfa8f8
01-10-2010 13:51:36
Thanks Szabolcs, that makes sense.
User c2ffbfa8f8
20-10-2010 15:44:07
Hi Szabolcs,
I have another structure (this time with correct valence) that I would expect to not match (using the same test):
C1CN=c2ccccc2=C1
Have tried aromatising the target with general aromaticity but this didn't help. Have I missed something?
thanks muchly
Derek
User 73531e86ff
20-10-2010 15:59:27
Here is another example which is causing an issue in a different part of our code. We think it is related to the same general aromatisation as above.
SMILES: C=c1ccc(cc1)=C1CCCCO1
SMARTS: [OH0X2$(*-C=C)]
Similarly, we'd expect the SMARTS NOT to hit the structure above, because the atom I've highlighted in red in the SMARTS is aliphatic and the atom it is matched against in the SMILES is aromatic (with general aromatisation).
ChemAxon 25dcd765a3
21-10-2010 10:47:42
Hi Derek,
In this SMILES
C1CN=c2ccccc2=C1
there are two carbon atoms (with indexes 4 and 9) which have valence 5.
Each of these carbon atoms has two aromatic bonds, which together count for three electrons, and a double bond, which counts for two, giving 5 altogether. A carbon atom cannot have a valence of 5, so this is a misleading representation of the molecule.
I think this is the source of the problem.
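The count described above can be checked with a small standalone Java sketch. It does not use JChem: the class and method names are hypothetical, and it assumes the common convention of counting each aromatic bond as 1.5.

```java
import java.util.List;

// Hypothetical helper, not part of JChem: sums conventional bond orders around one atom.
final class ValenceSketch {
    static final double SINGLE = 1.0;
    static final double AROMATIC = 1.5; // assumption: an aromatic bond counts as 1.5
    static final double DOUBLE = 2.0;
    static final double MAX_CARBON_VALENCE = 4.0;

    // Adds up the bond orders of all bonds on a single atom.
    static double bondOrderSum(List<Double> bondOrders) {
        double sum = 0.0;
        for (double order : bondOrders) {
            sum += order;
        }
        return sum;
    }

    // True when the summed bond orders exceed what a carbon atom can carry.
    static boolean exceedsCarbonValence(List<Double> bondOrders) {
        return bondOrderSum(bondOrders) > MAX_CARBON_VALENCE;
    }

    public static void main(String[] args) {
        // Atom 4 (or 9) in C1CN=c2ccccc2=C1: two aromatic bonds plus one double bond.
        List<Double> fusedCarbon = List.of(AROMATIC, AROMATIC, DOUBLE);
        // An ordinary benzene CH carbon for comparison: two aromatic bonds plus one C-H bond.
        List<Double> benzeneCarbon = List.of(AROMATIC, AROMATIC, SINGLE);

        System.out.println(bondOrderSum(fusedCarbon));          // prints 5.0
        System.out.println(exceedsCarbonValence(fusedCarbon));  // prints true
        System.out.println(bondOrderSum(benzeneCarbon));        // prints 4.0
        System.out.println(exceedsCarbonValence(benzeneCarbon)); // prints false
    }
}
```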
However, some aromatic representations of molecules do allow such chemically strange (let's say, unacceptable to a chemist) forms. To handle this situation, the SMARTS importer will mark these atoms with the aliphatic flag.
It will most probably be ready in Marvin 5.4.1.
User 7c177bab3b
22-10-2010 09:08:12
Please could you clarify, do you mean "mark with aromatic flag"?
Taking the Kekule structure through standardize generates the SMILES with "aromatic" carbons:
> echo 'C1CN=C2C=CC=CC2=C1' | standardize -c "aromatize"
C1CN=c2ccccc2=C1
and we would expect these to be treated as such in SMARTS matching, i.e. N=[C;R] should not match.
ChemAxon a3d59b832c
22-10-2010 09:53:53
Hi all,
Please could you clarify, do you mean "mark with aromatic flag"?
It means that the SMARTS import will set the appropriate query property on the atom: (A), for aliphatic, in this case.
(See: http://www.chemaxon.com/jchem/doc/user/query_features.html#atprop )
I attach below two pictures for the current and the future representations of SMARTS N=[C;R].
I confirm that with this change, the query N=[C;R] will not match C1CN=c2ccccc2=C1, and indeed all the other query-target pairs in this topic will be fixed.
Furthermore, a workaround exists that works in the current version as well: Add the recursive SMARTS $(A) to the atoms in question, for example:
[N$(A)]=[C$(A);R]
[OH0X2$(*-C=[C;$(A)])]
[NX2;R;$(A)]=[C;R;$(A)]
(It may not be necessary on the N; I just included it there as well for safety.)
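In string terms, the workaround above amounts to AND-ing $(A) onto a bracketed atom expression with the low-priority ";" operator. Here is a minimal Java sketch of that rewrite, with no JChem dependency. The names AliphaticGuard and guard are hypothetical, and it assumes the input is a single bracketed atom such as [C;R]; a bare atom like C would first have to be written as [C].

```java
// Hypothetical helper, not part of JChem: appends the recursive SMARTS $(A)
// to one bracketed atom expression so that it only matches aliphatic atoms.
final class AliphaticGuard {

    static String guard(String bracketAtom) {
        if (bracketAtom.length() < 3
                || !bracketAtom.startsWith("[")
                || !bracketAtom.endsWith("]")) {
            throw new IllegalArgumentException(
                    "expected a bracketed SMARTS atom, got: " + bracketAtom);
        }
        // Strip the outer brackets, then AND the whole expression with $(A).
        String inner = bracketAtom.substring(1, bracketAtom.length() - 1);
        return "[" + inner + ";$(A)]";
    }

    public static void main(String[] args) {
        // Rebuilds the guarded form of the original query [NX2;R]=[C;R].
        String query = guard("[NX2;R]") + "=" + guard("[C;R]");
        System.out.println(query); // prints [NX2;R;$(A)]=[C;R;$(A)]
    }
}
```

The resulting string can then be passed to the SMARTS import in place of the original query.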
You can also try these here: http://www.chemaxon.com/jchem/examples/sss/index.jsp
Best regards,
Szabolcs
ChemAxon a3d59b832c
24-01-2011 10:02:41
We have fixed the above issue. None of the above SMARTS searches will match in the coming JChem 5.4.1 version.
5.4.1 will be released later this month.
Best regards,
Szabolcs
User 73531e86ff
24-01-2011 10:30:31
Many thanks for the status update. I will install 5.4.1 and re-run all our unit tests when it is available.
Cheers,
Shane