Invalid SMILES

User f564ccf382

04-01-2008 12:46:12

To whom it may concern,

I have a program that generates SMILES strings, and I use the MarvinBeans property calculator to determine whether they are valid: if MarvinBeans returns properties for a SMILES, I take it to be valid. I also use MSKETCH as a check: if the structure is depicted, I assume that it is valid. However, I have also checked my SMILES on the Molinspiration and Daylight websites, and in some cases where MarvinBeans and MSKETCH gave me a 'valid' result, both Daylight and Molinspiration returned an invalid result (which I have confirmed with pen and paper).





Here are two examples of SMILES that MarvinBeans/MSKETCH wrongly validated:





c1nnnsc2nnsc2sc1


c1nnnsc2nnsc2oc1





I would be very grateful if you could help me as the MarvinBeans calculator is a crucial part of my research and I need to be able to stand over my results.





Kind Regards,


Miriam O'Riordan

ChemAxon 25dcd765a3

04-01-2008 13:43:15

Hi,





I don't really understand how you validate the SMILES.

Let me first ask: how would you decide which of these is a valid molecule?


c1cc1


c1ccc1


c1cccc1


c1ccccc1


Which properties are calculated with MarvinBeans?


Msketch will depict all of the above, but you can see that only benzene is a correct structure.

However, the above SMILES are all syntactically valid.
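
To illustrate what "syntactically valid" can mean here, the following is a minimal sketch of a partial syntax check. It only verifies that parentheses balance and that every ring-closure digit is opened and closed. The class and method names are hypothetical, two-digit `%nn` closures and atom symbols are not checked, and this is not how Marvin or Daylight parse SMILES.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: a partial SMILES syntax check, not a full parser. */
class SmilesSyntaxSketch {

    /** True if parentheses balance and each ring-closure digit appears in pairs. */
    static boolean ringClosuresBalanced(String smiles) {
        Set<Integer> open = new HashSet<>();
        int depth = 0;
        for (int i = 0; i < smiles.length(); i++) {
            char ch = smiles.charAt(i);
            if (ch == '[') {
                // Skip bracket atoms such as [nH]; digits inside are not ring closures.
                i = smiles.indexOf(']', i);
                if (i < 0) return false; // unterminated bracket atom
            } else if (ch == '(') {
                depth++;
            } else if (ch == ')') {
                if (--depth < 0) return false; // closing without opening
            } else if (ch >= '0' && ch <= '9') {
                int label = ch - '0';
                // First occurrence opens the ring bond, second occurrence closes it.
                if (!open.remove(label)) open.add(label);
            }
        }
        return depth == 0 && open.isEmpty();
    }

    public static void main(String[] args) {
        String[] examples = {"c1cc1", "c1ccc1", "c1cccc1", "c1ccccc1", "c1cc"};
        for (String s : examples) {
            System.out.println(s + " -> " + ringClosuresBalanced(s));
        }
    }
}
```

All four ring examples above pass this check, as does c1nnnsc2nnsc2sc1, while a truncated string such as c1cc does not. Passing says nothing about whether the structure makes chemical sense.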





Andras

User f564ccf382

04-01-2008 14:23:30

Hi Andras,





Thanks for replying so quickly to my query. MarvinBeans is used to calculate the following properties:





Number of Hydrogen Bond Acceptors

Number of Hydrogen Bond Donors

Number of Rotatable Bonds


Molecular Weight


logP





To validate my SMILES, I used MarvinBeans to calculate the above properties, and if I received a set of values I assumed that the SMILES was valid.





Regards,


Miriam

ChemAxon 25dcd765a3

04-01-2008 17:10:48

Dear Miriam,





Let's say that a SMILES is valid if it is syntactically valid, and adequate if it corresponds to a chemically valid structure.

By that definition, all of your examples and mine are valid SMILES.

The next question is what we mean by a chemically valid structure.

How about radicals? Do you want to accept them?

And how do we check which SMILES strings are adequate?

How did you choose these properties, and why these in particular?

I think all of them can also be calculated for radicals, for example.





But let me focus on your SMILES examples.

All of your examples and mine involve aromatic structures.

I think that in these (aromatic) cases all you need to do is dearomatize the structure.

If dearomatization fails, then you should reject the structure.
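
To illustrate the idea of "dearomatization fails" without relying on any toolkit, here is a minimal sketch that tries to give every aromatic carbon exactly one double bond (a perfect matching over the ring bonds, found by backtracking). It is a toy stand-in, not Marvin's algorithm: the class and method names are hypothetical, and it assumes every atom is an aromatic carbon that needs one double bond, so heteroatoms such as n, s and o are not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy check: can every aromatic carbon get exactly one double bond? */
class KekuleSketch {

    /** Bond list of a single ring with n atoms: 0-1, 1-2, ..., (n-1)-0. */
    static List<int[]> ring(int n) {
        List<int[]> bonds = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            bonds.add(new int[] {i, (i + 1) % n});
        }
        return bonds;
    }

    /** True if the bonds admit a pairing that covers every atom exactly once. */
    static boolean canKekulize(int atomCount, List<int[]> bonds) {
        return match(new boolean[atomCount], bonds);
    }

    private static boolean match(boolean[] used, List<int[]> bonds) {
        // Find the first atom that has no double bond yet.
        int atom = -1;
        for (int i = 0; i < used.length; i++) {
            if (!used[i]) { atom = i; break; }
        }
        if (atom < 0) return true; // every atom is paired
        // Try pairing it with each unpaired neighbour; backtrack on failure.
        for (int[] b : bonds) {
            int other = b[0] == atom ? b[1] : (b[1] == atom ? b[0] : -1);
            if (other < 0 || other == atom || used[other]) continue;
            used[atom] = true;
            used[other] = true;
            if (match(used, bonds)) return true;
            used[atom] = false;
            used[other] = false;
        }
        return false;
    }

    public static void main(String[] args) {
        for (int n = 3; n <= 6; n++) {
            System.out.println(n + "-membered ring: " + canKekulize(n, ring(n)));
        }
    }
}
```

Under these assumptions the 3- and 5-membered all-carbon rings (c1cc1, c1cccc1) are rejected, while the 4- and 6-membered rings pass. Note that c1ccc1 passes because alternating double bonds can be drawn for it, so a check of this kind is a necessary screen rather than proof that a structure is chemically sensible.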





Andras

User f564ccf382

04-01-2008 18:03:34

Hi Andras,





What do you mean by dearomatising the SMILES? Do you mean instead of using c1ccccc1 to represent Benzene I should use C1=CC=CC=C1?





Kind regards,


Miriam

ChemAxon 25dcd765a3

04-01-2008 20:15:37

Hi Miriam,





Your program generated the SMILES from a molecule that is in aromatic form.

Take your (aromatic) molecule and dearomatize it.

If dearomatization succeeds, then the molecule is valid, like this:


Code:

boolean v = mol.dearomatize();
if (v) {
    // your molecule is valid
    // you may aromatize it to get back the original form
    mol.aromatize();
} else {
    // the mol is not valid
}








So you can use c1ccccc1 to represent benzene: it can be dearomatized and then aromatized again, so it is a valid molecule.





Andras

ChemAxon 25dcd765a3

05-01-2008 19:43:48

Sorry, I forgot that mol.dearomatize() returns void, so you have to check whether the molecule still has aromatic atoms after dearomatization.

If it does, then the dearomatization was not successful.





Andras

ChemAxon 25dcd765a3

05-01-2008 19:59:05

From Marvin 5.0.1 onward, mol.dearomatize() will return a boolean value.





Andras