Removing Alkyl Groups

User 8011b7f284

03-07-2012 14:16:12

Hi,


This is my first post to the forums, so thanks in advance for all of your help.


I have two databases of molecules stored as SMILES, one with about 1000 molecules and the other with several million.  I would like to automate a process that removes alkyl tails from the molecules and generates the new SMILES for the molecules.  Is it possible to do that?


The database is coded in Python with a Django interface, in case that's relevant.


Thanks


Laszlo

ChemAxon e08c317633

03-07-2012 14:33:22

Hi,


Could you post a few input and expected output examples here? For example, what do you expect for these inputs?


CCCCCCCCC
CCCF
CCCCC(C)CC
CCCCCCCCC1CC(CCC(C)CC)CC(CCC)C1CCCCC

Zsolt


 

User 8011b7f284

03-07-2012 14:54:10

Hi Zsolt,


 


I suppose I should clarify that the molecules in my database largely consist of conjugated systems/rings with alkyl tails, for example:


 CCCCCCC1=CC2=CC3=CC=C(C=C3C=C2C=C1)C1=CC2=CC3=C(C=C(CCCCCC)C=C3)C=C2C=C1


or


CCCCCCCCOC(=O)C1=CC2=CSC(C3=CC4=C(OCC(CC)CCCC)C5=C(C=CS5)C(OCC(CC)CCCC)=C4S3)=C2S1


which would become: 


C1=CC2=CC3=CC=C(C=C3C=C2C=C1)C1=CC2=CC3=C(C=CC=C3)C=C2C=C1


and


COC(=O)C1=CC2=CSC(C3=CC4=C(OC)C5=C(C=CS5)C(OC)=C4S3)=C2S1


The molecules you gave are all alkyl.  For numbers 1, 3, and 4, I guess I wouldn't want to keep anything, and for 2  I would keep CF.


Thanks for your help!


Laszlo

ChemAxon d76e6e95eb

03-07-2012 16:43:21

I would try removing terminal CH3 groups with Standardizer, using a quasi-loop. This is the transformation scheme:


[CH3:1][#6:2]>>[H:3][#6:2]


 


Test with your second input:


$ standardize -c '[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:
1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..
[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6
:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:
3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]
>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][
#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH
3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]
..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][
#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[
H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:
2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1
][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[
CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:
2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3
][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>
>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#
6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]..[CH3:1][#6:2]>>[H:3][#6:2]' 'CCC
CCCCCOC(=O)C1=CC2=CSC(C3=CC4=C(OCC(CC)CCCC)C5=C(C=CS5)C(OCC(CC)CCCC)=C4S3)=C2S1
'
COC(=O)C1=CC2=CSC(C3=CC4=C(OC)C5=C(C=CS5)C(OC)=C4S3)=C2S1


 


In this test, I apply the same terminal carbon removal 50 times, so it removes alkyl chains of up to 50 carbon atoms. See more details here.
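Since the database side is in Python, the long configuration string does not have to be typed by hand. The following is only a sketch: it assumes nothing beyond what the command above shows (the `-c` option and `..` as the separator between actions), and the helper names `build_config` and `build_command` are made up for illustration.

```python
import shlex

# The single transform from the post above: a methyl bound to a carbon
# is replaced by a hydrogen.
TRANSFORM = "[CH3:1][#6:2]>>[H:3][#6:2]"


def build_config(passes=50):
    """Chain the same transform `passes` times, separated by '..'."""
    return "..".join([TRANSFORM] * passes)


def build_command(smiles, passes=50):
    """Return the argument list for one console call (hypothetical helper).

    The list mirrors the command shown above: standardize -c <config> <smiles>.
    It is only built here, not executed.
    """
    return ["standardize", "-c", build_config(passes), smiles]


if __name__ == "__main__":
    # Print a shell-quoted command for a short chain, for inspection.
    print(shlex.join(build_command("CCCF", passes=3)))
```

The argument list could be passed to `subprocess.run` on a machine where the `standardize` script is on the PATH; for several million molecules, starting one process per molecule is likely to be slow, so that part is left out of the sketch.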

User 8011b7f284

05-07-2012 18:23:37

Hi,


 


I'm a little confused about where to enter the commands in Standardizer.  When I start it up and give Standardizer the input file, it then gives me a list of things to do (such as aromatize, remove explicit hydrogens, etc.).  Where can I enter the command you gave me?


 


Sorry for my confusion.


 


Thanks,


Laszlo

ChemAxon d76e6e95eb

05-07-2012 19:49:32

What I sent you is a command for a console.


If you prefer the graphical user interface, you just need to select the "Transform" action in the available actions list on the left, and add it to the list on the right.


Then select that transform on the right and paste this transformation into its scheme editor:


[CH3:1][#6:2]>>[H:3][#6:2]


Now you have a transform to convert methyls to hydrogens. If you add the same transform 30 times to your action list, it will remove alkyl chains up to 30 carbon atoms.


To save you some time, I have created the XML configuration; you can load it on that configuration page and run it directly. Please find it attached.
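To see why adding the transform N times strips tails of up to N carbons, here is a toy model in plain Python. It is not Standardizer and does not parse SMILES: it ignores hydrogens and bond orders and simply assumes that a carbon with exactly one heavy-atom neighbour, itself a carbon, is a terminal methyl. The function name and the atom labels are invented for the example.

```python
def strip_alkyl_tails(elements, bonds, max_passes=30):
    """Remove terminal carbons bound to carbon, one layer per pass.

    elements: dict mapping atom id -> element symbol (heavy atoms only)
    bonds:    iterable of (atom id, atom id) pairs
    Each pass plays the role of one copy of the transform in the action list.
    """
    elements = dict(elements)
    neighbors = {atom: set() for atom in elements}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    for _ in range(max_passes):
        # Terminal "methyls": a carbon whose only neighbour is a carbon.
        terminal = [
            atom for atom, nbrs in neighbors.items()
            if elements[atom] == "C"
            and len(nbrs) == 1
            and elements[next(iter(nbrs))] == "C"
        ]
        if not terminal:
            break  # nothing left to strip, further passes would be no-ops
        for atom in terminal:
            for other in neighbors.pop(atom):
                neighbors[other].discard(atom)
            del elements[atom]
    return elements


# A six-membered carbon ring (r0..r5) carrying a hexyl tail (t0..t5) on r0.
atoms = {f"r{i}": "C" for i in range(6)}
atoms.update({f"t{i}": "C" for i in range(6)})
ring_bonds = [(f"r{i}", f"r{(i + 1) % 6}") for i in range(6)]
tail_bonds = [("r0", "t0")] + [(f"t{i}", f"t{i + 1}") for i in range(5)]

full = strip_alkyl_tails(atoms, ring_bonds + tail_bonds, max_passes=30)
short = strip_alkyl_tails(atoms, ring_bonds + tail_bonds, max_passes=3)
```

With enough passes only the ring atoms remain in `full`; with three passes, three tail carbons are still present in `short`. A carbon attached to an oxygen is never removed in this model, which is consistent with the `OC` groups kept in the Standardizer output earlier in the thread.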

User 8011b7f284

09-07-2012 14:56:26

Hi Gyuri,


 


It works great.  Thanks for your help!


 


Best,


Laszlo