SMILES with valence-5 nitrogen and implicit hydrogen

User 73531e86ff

18-03-2009 10:45:26

Hi,





We have come across an interesting case of a nitrogen with valence 5 which is being dealt with differently between our current vendor and ChemAxon.





The simplest case is C#NC, which we're trying to match with the SMARTS [#7v5]. Our current vendor recognises that the nitrogen has an implicit hydrogen, so the match is positive.





However, when we load this SMILES into ChemAxon it does not match the SMARTS. We found that mol.isValenceError() was returning true for this example. When we changed the SMILES to C#[NH]C, the match was positive again.





So, my question is: Is there a way to have the implicit hydrogen added automatically in this example?





I took a look at the Daylight page that defines SMILES, and it says:
Quote:



Elements in the "organic subset" B, C, N, O, P, S, F, Cl, Br, and I may be written without brackets if the number of attached hydrogens conforms to the lowest normal valence consistent with explicit bonds. "Lowest normal valences" are B (3), C (4), N (3,5), O (2), P (3,5), S (2,4,6), and 1 for the halogens.


This implies that, since the nitrogen in my example already has a total bond order of 4 (one triple bond plus one single bond), its lowest normal valence consistent with the explicit bonds is 5, so it should be assumed to have one implicit hydrogen.
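To make that reading concrete, here is a minimal sketch of the rule as I understand it from the quoted text. The function name `daylight_implicit_h` and the table layout are my own assumptions for illustration, not any toolkit's API, and it only covers unbracketed, uncharged atoms.

```python
# Normal valences as listed in the Daylight text quoted above.
NORMAL_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,), "P": (3, 5),
    "S": (2, 4, 6), "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}

def daylight_implicit_h(element, bond_order_sum):
    """Implicit H count under the 'lowest normal valence' reading (hypothetical helper)."""
    # Valences are stored in ascending order, so the first one that
    # accommodates the explicit bonds is the lowest consistent valence.
    for valence in NORMAL_VALENCES[element]:
        if valence >= bond_order_sum:
            return valence - bond_order_sum
    # The explicit bonds already exceed every normal valence: no hydrogens.
    return 0

# Nitrogen in C#NC: one triple bond (3) plus one single bond (1).
print(daylight_implicit_h("N", 3 + 1))  # prints 1
```

Under this reading the nitrogen in C#NC gets one implicit hydrogen, which gives a total valence of 5 and explains why [#7v5] matches with our current vendor.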

ChemAxon 25dcd765a3

19-03-2009 08:58:29

Hi,





I will examine the problem further, but at first sight it seems that you are right, so we will fix this bug ASAP.





Andras

ChemAxon 25dcd765a3

23-03-2009 17:55:58

Hi,





We have examined the problem more deeply and found the following:





Moreover, nitrogen cannot have a valence of 5. You may object with the example of the nitro group, which is conventionally written as -N(=O)=O. But chemically this form is not valid, as the nitro group has the following form: [N+]([O-])=O.





So I think I have to conclude that your current vendor is chemically mistaken.





Please let me know your opinion.





Andras

User 73531e86ff

26-03-2009 13:49:22

Hi,





The problem is that many of our structures will be in this form. Plus, if it is not valid, I don't understand why an import error is not thrown.





Are you saying that you are complying with your own SMILES specification and not Daylight's?





The other odd thing is that, as you say, nitro groups such as CN(=O)=O are successfully imported and give no valence error. Surely it should be the same for both C#[NH]C and C#NC (the latter having an implicit hydrogen), which both import successfully but DO give a valence error with mol.hasValenceError(). It is not consistent.





Cheers,





Shane

ChemAxon 25dcd765a3

27-03-2009 23:15:15

Hi,


Quote:
The problem is that many of our structures will be in this form.
I think this should not be a problem. The nitrogen in C#NC has a valence of 4, so it should match the SMARTS [#7v4].


Quote:
Plus, if it is not valid, I don't understand why an import error is not thrown.
No import error is thrown; there is only a valence error on the nitrogen atom after import.


For database import, we strongly suggest standardizing molecules from different sources into standard representational forms (one user may use the C#NC form while another uses the [C-]#[N+]C form, but the former is chemically incorrect). So in this case the isocyanide group, which may conventionally be written as C#NC, is converted to the chemically correct [C-]#[N+]C.





In JChem, this standardization step is done by Standardizer.


Quote:
Are you saying that you are complying with your own SMILES specification and not Daylight's
The SMILES specification is the same. The difference lies in how the implicit hydrogen count is assigned. If the hydrogen count is not specified explicitly, it is calculated by our algorithm.





Taking the previous molecule as an example, from a chemical point of view the nitrogen in the SMILES C#NC will not have an implicit hydrogen. At the end of the SMILES import process, the implicit hydrogen count is calculated for the nitrogen. Our code finds that this atom cannot have an implicit hydrogen and also that it has a valence problem (the positive charge is missing).
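The behaviour described above can be illustrated with a rough sketch. This is NOT the actual JChem algorithm: the table `NITROGEN_MAX_VALENCE` and the function `check_nitrogen` are hypothetical names, and the rule (neutral nitrogen allows a total bond order of 3, positively charged nitrogen allows 4) is an assumption chosen only to reproduce the outcomes reported in this thread.

```python
# Hypothetical rule table for illustration only; not ChemAxon's real code.
# Maximum total bond order allowed for nitrogen, keyed by formal charge.
NITROGEN_MAX_VALENCE = {0: 3, 1: 4}

def check_nitrogen(bond_order_sum, charge=0):
    """Return (implicit_h_count, has_valence_error) for one nitrogen atom."""
    allowed = NITROGEN_MAX_VALENCE[charge]
    if bond_order_sum > allowed:
        # Too many bonds: no room for hydrogens, and the atom is flagged.
        return 0, True
    # Otherwise fill the remaining valence with implicit hydrogens.
    return allowed - bond_order_sum, False

# Neutral nitrogen in C#NC: triple bond (3) plus single bond (1).
print(check_nitrogen(3 + 1, charge=0))  # prints (0, True)

# Charged nitrogen in [C-]#[N+]C: same bonds, but +1 formal charge.
print(check_nitrogen(3 + 1, charge=1))  # prints (0, False)
```

With this simplified rule the neutral form gets no implicit hydrogen and is flagged, while the charge-separated form passes. The sketch does not model the special handling of the nitro group.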


Quote:
The other odd thing is that, as you say, nitro groups such as CN(=O)=O are successfully imported and give no valence error. Surely it should be the same for both C#[NH]C and C#NC (the latter having an implicit hydrogen), which both import successfully but DO give a valence error with mol.hasValenceError(). It is not consistent.
Mathematically you are right, if we allow nitrogen to have a valence of 5.





But as a chemist, I'm sure that a nitrogen atom cannot have a valence of 5.








Why do we accept nitrogen with a valence of 5 in the nitro group and not elsewhere? Because the nitro group is still written and accepted in the chemically incorrect CN(=O)=O form (this is the one and only case). Many chemists use this form for the sake of simplicity, but they know that in the correct form the nitrogen has a positive charge and one oxygen has a negative charge.

















Andras