Hi,
Monday is a national holiday in Hungary, but let me try to give a partial answer before the experts can comment.
Q1: Is JChem able to tell that the following two entries are the same structure?
C[C@@H](Cl)[C@H](C)Cl |&1:1,3|
C[C@@H](Cl)[C@H](C)Cl
It depends on the definition of "same".
The first structure contains 2 chiral centers in the same AND group, which represents a mixture of enantiomers.
From the query guide:
http://www.chemaxon.com/jchem/doc/user/query_stereochemistry.html#mdl_enhanced_stereo
Stereogenic centers belonging to an ANDn group (e.g. AND1) represents a mixture of two enantiomers: the structure as drawn AND the epimer in which the stereogenic centers have the opposite configuration. (Note, that it is not a racemic mixture, but a mixture of the enantiomers of any ratio. Of course, a 1:1 mixture (racemic mixture) is included in this sense.)
On the other hand, the second structure has unlabeled stereo centers, and since SMILES is treated as absolute stereo, they are handled as if they had the ABS label, so this is not a mixture:
Stereogenic centers belonging to ABS represent absolute stereochemistry, i.e. chirality. (All unlabeled stereo centers are also thought to belong to the ABS group by default. Unlabelled stereo centers may be interpreted as an independent AND group only if (1) chiral flag is not set AND (2) the absolute stereo search options (Query/TargetAbsoluteStereo, AbsoluteStereo) are set to false. See the following sections for further explanation.)
For non-default behavior one can control the stereo matching with the following option:
http://www.chemaxon.com/jchem/doc/dev/cartridge/cartapi.html#jc_compare_stereoSearchType
Q2: Why are the results different?
select jc_equals('C[C@@H](Cl)[C@H](C)Cl','C[C@@H](Cl)[C@H](C)Cl |&1:1,3|') from dual;
returns 0; # so it seems the answer to Q1 is no.
select jc_compare('C[C@@H](Cl)[C@H](C)Cl','C[C@@H](Cl)[C@H](C)Cl |&1:1,3|', 't:e') from dual;
returns 1;
Because the search modes are different.
jc_equals performs a Duplicate search (formerly Perfect); its equivalent in jc_compare is "t:d" (formerly "t:p").
The second query uses Full structure search ("t:f"), formerly Exact ("t:e").
Full structure search applies the same matching rules as substructure search (the query and target sides may be treated differently, so the order is important), while Duplicate search requires exactly the same stereo properties on both sides (exact stereo matching).
From the same document:
The exact stereo option means that all stereo information should be the same in the query and target ("all stereo info is exactly the same"). It mainly has an effect when the query has no stereo information: it only matches non-stereo target. Similarly, a query with a wiggly tetrahedral center will only match wiggly tetrahedral center, and not specific R and S configurations.
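If you want to see the search modes side by side on your own installation, a query along the following lines might help. It is only a sketch that I have not run here: it reuses the two structures from your question and the "t:d" and "t:f" options mentioned above, and the column aliases are just names I made up for readability. If the equivalence described above holds, the first two columns should agree with each other.

```sql
-- Sketch only (not run here): the same pair of structures under three calls.
-- The column aliases are illustrative names, not part of the cartridge API.
select
  jc_equals('C[C@@H](Cl)[C@H](C)Cl','C[C@@H](Cl)[C@H](C)Cl |&1:1,3|') as equals_result,
  jc_compare('C[C@@H](Cl)[C@H](C)Cl','C[C@@H](Cl)[C@H](C)Cl |&1:1,3|', 't:d') as duplicate_result,
  jc_compare('C[C@@H](Cl)[C@H](C)Cl','C[C@@H](Cl)[C@H](C)Cl |&1:1,3|', 't:f') as full_result
from dual;
```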
Q3: What's the difference between target and query below? Could you point me to the document that describes the difference between target and query in exact/full structure search?
select jc_compare('C[C@@H](Cl)[C@H](C)Cl','C[C@@H](Cl)[C@H](C)Cl |&1:1,3|', 't:f') from dual;
returns 1;
select jc_compare('C[C@@H](Cl)[C@H](C)Cl |&1:1,3|','C[C@@H](Cl)[C@H](C)Cl', 't:f') from dual;
returns 0;
For Full structure search, as well as for Substructure search, the order of the query and target is important (but not for Duplicate search, where the order does not matter).
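The order independence of Duplicate search could be checked the same way, by swapping the two arguments under "t:d". Again, this is only a sketch built from the structures and the option string already used in this thread, and I have not run it here. Based on the description above, both statements should give the same result, unlike the "t:f" pair.

```sql
-- Sketch only (not run here): Duplicate search with the arguments in both orders.
select jc_compare('C[C@@H](Cl)[C@H](C)Cl','C[C@@H](Cl)[C@H](C)Cl |&1:1,3|', 't:d') from dual;
select jc_compare('C[C@@H](Cl)[C@H](C)Cl |&1:1,3|','C[C@@H](Cl)[C@H](C)Cl', 't:d') from dual;
```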
Apart from stereo configurations and query features, implicit/explicit H atoms, for example, also have different meanings on the query and target sides.
Again, the relevant parts of the Query Guide give guidance.
For stereo matching:
http://www.chemaxon.com/jchem/doc/user/query_stereochemistry.html#tetrahedral_stereo
General link to the query guide:
http://www.chemaxon.com/jchem/doc/user/Query.html
In this particular case the mixture "contains" the pure epimer, while the reverse is not true, which is why the first query above returns 1 and the swapped one returns 0.
Please let us know if you have further questions.
Best regards,
Szilard