conditional search in SMILES strings

User 941c2467a3

21-02-2008 19:43:06

Dear Chemaxon,





I am having trouble querying SMILES strings for some special patterns. Here are two examples:





First, I want to find compounds containing a halogen (F, Cl, Br, I) that has more than one bond, for example:


C1CCC(CC1)C2CCC[Cl]C2


c1ccc(cc1)C2=CC=ClC=Cl2


c1ccc(cc1)C2=CCl=CC=Cl2


c1ccc(cc1)C2=ClC=CC=Cl2


C1=CC=C(Cl=C1)C2=CC=CC=Cl2


C1CCC([Cl]C1)C2CCCC[Cl]2


c1cc[cH+]cc1


C1=C(Cl=ClCl=Cl1)C2=ClCl=ClCl=Cl2


[Cl]1[Cl][Cl]C([Cl][Cl]1)C2[Cl][Cl][Cl][Cl][Cl]2





Second, I want to find compounds containing an O with more than two bonds, for example:


CCCCCCCCCCCCO[O]([O-])(=O)=O


CCCCCCCCCCC([o-])=O





With jcsearch, I can run a group of similar command lines to find what I want:


jcsearch -q "*-Cl-*" -f smiles "C1CCC(CC1)C2CCC[Cl]C2"


jcsearch -q "*=Cl" -f smiles "c1ccc(cc1)C2=CC=ClC=Cl2"


etc.





But I'd like to know whether I can do the whole thing with a single command line. If so, that would be great!





Thanks.





Jeff

ChemAxon 42004978e8

22-02-2008 07:22:39

Hello Jeff,





You can use SMARTS atom properties for this purpose.


http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html


In this case: Xn (total connections) or Dn (degree, i.e. explicit connections).


So a search for molecules with Cl looks like:


jcsearch -q "[Cl;X2,X3,X4,X5]" -f smiles "c1ccc(cc1)C2=CC=ClC=Cl2"





This searches for a Cl with 2, 3, 4 or 5 connections.
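As a rough illustration of how such a query string is put together, here is a minimal Python sketch. The helper name `connection_query` is hypothetical (it is not part of jcsearch or any ChemAxon API); it only builds the SMARTS text and the argument list, and does not run the search.

```python
def connection_query(symbol, min_conn, max_conn):
    """Build a SMARTS atom such as [Cl;X2,X3,X4,X5].

    The ';' is the low-precedence AND of SMARTS, the ',' is OR, so the
    result reads: element `symbol` AND (Xmin OR ... OR Xmax).
    """
    counts = ",".join(f"X{n}" for n in range(min_conn, max_conn + 1))
    return f"[{symbol};{counts}]"


query = connection_query("Cl", 2, 5)

# Argument list mirroring the jcsearch call quoted above.
# It is only assembled here, not executed.
argv = ["jcsearch", "-q", query, "-f", "smiles", "c1ccc(cc1)C2=CC=ClC=Cl2"]

print(query)
```

Changing the range (for example 3 to 6) gives the analogous query for "more than two" connections; whether X or D is the right property depends on whether implicit hydrogens should be counted.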


If you would like a single query for all halogens, you can define a molecule that contains a single atom which is an R group, then define the R group as one of the halogens possessing the query properties.





For the definition of the R group, see the attached file, testR.mrv, in Marvin.





jcsearch -q testR.mrv -f smiles "c1ccc(cc1)C2=CC=ClC=Cl2"
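A possible alternative to the R-group file, offered only as a sketch: standard SMARTS allows an atom list inside one bracket atom, and because ',' binds more tightly than ';', the element alternatives and the connection counts can be combined in a single query string. The helper `atom_list_query` below is a hypothetical name, and it is an assumption (not tested here) that jcsearch returns the same hits for this string as for testR.mrv.

```python
HALOGENS = ("F", "Cl", "Br", "I")


def atom_list_query(symbols, counts):
    """Build a SMARTS atom of the form [A,B,...;Xi,Xj,...].

    Read as: (A OR B OR ...) AND (Xi OR Xj OR ...), following the
    SMARTS operator precedence where ',' binds tighter than ';'.
    """
    elements = ",".join(symbols)
    connections = ",".join(f"X{n}" for n in counts)
    return f"[{elements};{connections}]"


# Halogens with 2 to 5 connections, matching the range used for Cl above.
halogen_query = atom_list_query(HALOGENS, range(2, 6))
print(halogen_query)
```

The resulting string would be passed to `-q` in place of testR.mrv. The R-group file remains the approach suggested in this reply.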





I hope this helps.





Bye,


Robert

User 941c2467a3

22-02-2008 08:21:49

Thanks Robert! That really helps!





Regards,


Jeff