User 7f33ec9a5c
12-11-2012 20:44:24
Hi,
We are using jcf.standardize to standardize SMILES. I am finding that I need to call it twice, using the output of the first call as the input to the second, to make sure that the output SMILES are in a stable, standard form.
Our conversion options are:
sep=~ config:removeexplicitH..[NX3+:1](=[O:2])-[O-;X1:3]>>[N:1](=[O:2])=[O:3]..[SX4+:1](=[O:2])-[O-;X1:3]>>(=[O:2])=[O:3]..[CX1H0:1]=[NX2H0:2]>>[C-:1]#[N+:2]..[NX1H0:1]#[NX2H0:2]=[N:3]>>[N-:1]=[N+:2]=[N:3]..[P+:1][O-:2]>>[P:1]=[O:2]..[n:1]-[O-:2]>>[n:1]=[O:2]..tautomerize..aromatize~outFormat:smiles:u
Each example below has 4 SMILES; the input is from a chemical vendor's catalog.
1) Input
2) Output from standardizing #1
3) Output from standardizing #2
4) Output from standardizing #3
In the examples below, you can see that standardizing the first output gives a different output: the double-bond stereo marks (/ and \) change or disappear between #2 and #3.
Could you please take a close look at our options and tell us whether this is a jc_convert issue, whether we need to add more options to fix this behavior, or whether we just need to run jc_convert multiple times for these cases?
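To make the "run it multiple times" workaround concrete, here is a minimal Python sketch of the loop we have in mind. It is only an illustration: `standardize` is a hypothetical callable that you would write around your own jcf.standardize or jc_convert call, and `max_passes` is an arbitrary limit. The loop also checks for repeats, because in a few of the examples below output #4 is identical to output #2 rather than to #3, so simply running twice may not always settle.

```python
from typing import Callable, List, Tuple


def standardize_until_stable(
    smiles: str,
    standardize: Callable[[str], str],  # hypothetical wrapper around the real call
    max_passes: int = 10,               # arbitrary safety limit (assumption)
) -> Tuple[str, List[str], str]:
    """Re-apply `standardize` until the output stops changing.

    Returns (last_output, history, status), where status is one of:
      "stable"     - a pass returned its own input unchanged
      "cycle"      - a pass returned a string already seen earlier
      "max_passes" - neither happened within max_passes passes
    """
    history = [smiles]
    current = smiles
    for _ in range(max_passes):
        nxt = standardize(current)
        if nxt == current:
            # Fixed point: another pass would give the same string.
            return current, history, "stable"
        if nxt in history:
            # The output alternates between forms and will never settle.
            history.append(nxt)
            return nxt, history, "cycle"
        history.append(nxt)
        current = nxt
    return current, history, "max_passes"
```

A record that comes back as "cycle" or "max_passes" would be one to report rather than store, since no number of extra passes gives a single standard form for it.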
---------------------
CCCCCCCCCOC(=O)C1=CC2=C(C=C1)N(CC)\C(=C\C=C\C1=[N+](CCCS([O-])(=O)=O)C3=C(C=CC(=C3)C(=O)OCCCCCCCCC)N1CC)N2CCCS([O-])(=O)=O
CCCCCCCCCOC(=O)c1ccc2N(CC)\C(=C\C=C\c3n(CC)c4ccc(cc4[n+]3CCCS([O-])(=O)=O)C(=O)OCCCCCCCCC)N(CCCS([O-])(=O)=O)c2c1
CCCCCCCCCOC(=O)c1ccc2N(CC)\C(=C/C=C/c3n(CC)c4ccc(cc4[n+]3CCCS([O-])(=O)=O)C(=O)OCCCCCCCCC)N(CCCS([O-])(=O)=O)c2c1
CCCCCCCCCOC(=O)c1ccc2N(CC)\C(=C/C=C/c3n(CC)c4ccc(cc4[n+]3CCCS([O-])(=O)=O)C(=O)OCCCCCCCCC)N(CCCS([O-])(=O)=O)c2c1
---------------------
---------------------
.CCN1\C(S\C(=C\NC2=CC=CC=C2)C1=O)=C/C1=[N+](CC)C(=CS1)C1=CC=CC=C1
.CCn1\c(=C\c2scc(-c3ccccc3)[n+]2CC)s\c(=C\Nc2ccccc2)c1=O
.CCn1\c(=C\c2scc(-c3ccccc3)[n+]2CC)s\c(=C/Nc2ccccc2)c1=O
.CCn1\c(=C\c2scc(-c3ccccc3)[n+]2CC)s\c(=C/Nc2ccccc2)c1=O
---------------------
---------------------
CN1C(=S)S\C(=C\C2=C(OCCSC3=CC=C(C)C=C3)C=CC=C2)C1=O
CN1C(=S)S\C(=C\c2ccccc2OCCSc2ccc(C)cc2)C1=O
CN1C(=S)S\C(=C/c2ccccc2OCCSc2ccc(C)cc2)C1=O
CN1C(=S)S\C(=C/c2ccccc2OCCSc2ccc(C)cc2)C1=O
---------------------
---------------------
CN1C(=S)S\C(=C\C2=C(OCCOC3=CC=CC(C)=C3)C=CC(Cl)=C2)C1=O
CN1C(=S)S\C(=C\c2cc(Cl)ccc2OCCOc2cccc(C)c2)C1=O
CN1C(=S)S\C(=C/c2cc(Cl)ccc2OCCOc2cccc(C)c2)C1=O
CN1C(=S)S\C(=C/c2cc(Cl)ccc2OCCOc2cccc(C)c2)C1=O
---------------------
---------------------
CCC1=CC(OCCOC2=C(\C=C3\SC(=S)N(C)C3=O)C=C(Br)C=C2)=CC(C)=C1
CCc1cc(C)cc(OCCOc2ccc(Br)cc2\C=C2\SC(=S)N(C)C2=O)c1
CCc1cc(C)cc(OCCOc2ccc(Br)cc2\C=C2/SC(=S)N(C)C2=O)c1
CCc1cc(C)cc(OCCOc2ccc(Br)cc2\C=C2/SC(=S)N(C)C2=O)c1
---------------------
---------------------
CCOC1=C(\C=C2\C(=O)NC(=S)N(C2=O)C2=CC=C(Cl)C=C2)C2=CC=CC=C2C=C1
CCOc1ccc2ccccc2c1\C=C1/C(=O)NC(=S)N(C1=O)c1ccc(Cl)cc1
CCOc1ccc2ccccc2c1\C=C1\C(=O)NC(=S)N(C1=O)c1ccc(Cl)cc1
CCOc1ccc2ccccc2c1\C=C1\C(=O)NC(=S)N(C1=O)c1ccc(Cl)cc1
---------------------
---------------------
[O-][N+](=O)C1=C(C=CC=C1)C1=CC=C(O1)\C=C1/C(=O)NC(=S)N(C1=O)C1=CC=CC(Br)=C1
Brc1cccc(c1)N1C(=S)NC(=O)\C(=C/c2ccc(o2)-c2ccccc2N(=O)=O)C1=O
Brc1cccc(c1)N1C(=S)NC(=O)C(=Cc2ccc(o2)-c2ccccc2N(=O)=O)C1=O
Brc1cccc(c1)N1C(=S)NC(=O)C(=Cc2ccc(o2)-c2ccccc2N(=O)=O)C1=O
---------------------
---------------------
[O-][N+](=O)C1=CC=C(O1)\C=C1\C(=O)NC(=O)N(C1=O)C1=CC=C(Cl)C=C1
Clc1ccc(cc1)N1C(=O)NC(=O)\C(=C\c2ccc(o2)N(=O)=O)C1=O
Clc1ccc(cc1)N1C(=O)NC(=O)C(=Cc2ccc(o2)N(=O)=O)C1=O
Clc1ccc(cc1)N1C(=O)NC(=O)C(=Cc2ccc(o2)N(=O)=O)C1=O
---------------------
---------------------
[O-][N+](=O)C1=CC=C(C=C1)C1=CC=C(O1)\C=C1\C(=O)NC(=S)N(C1=O)C1=CC=C(Br)C=C1
Brc1ccc(cc1)N1C(=S)NC(=O)\C(=C\c2ccc(o2)-c2ccc(cc2)N(=O)=O)C1=O
Brc1ccc(cc1)N1C(=S)NC(=O)C(=Cc2ccc(o2)-c2ccc(cc2)N(=O)=O)C1=O
Brc1ccc(cc1)N1C(=S)NC(=O)C(=Cc2ccc(o2)-c2ccc(cc2)N(=O)=O)C1=O
---------------------
---------------------
[O-][N+](=O)C1=CC=CC(=C1)C1=CC=C(O1)\C=C1\C(=O)NC(=S)N(C1=O)C1=CC=CC(Br)=C1
Brc1cccc(c1)N1C(=S)NC(=O)\C(=C\c2ccc(o2)-c2cccc(c2)N(=O)=O)C1=O
Brc1cccc(c1)N1C(=S)NC(=O)C(=Cc2ccc(o2)-c2cccc(c2)N(=O)=O)C1=O
Brc1cccc(c1)N1C(=S)NC(=O)C(=Cc2ccc(o2)-c2cccc(c2)N(=O)=O)C1=O
---------------------
---------------------
O=C1N(C2CCCC2)\C(S\C1=C\C1=CN(CC2=CC=CC=C2C#N)C2=C1C=CC=C2)=N/C1=CC=CC=C1
O=C1N(C2CCCC2)\C(S\C1=C\c1cn(Cc2ccccc2C#N)c2ccccc12)=N/c1ccccc1
O=C1N(C2CCCC2)\C(S\C1=C/c1cn(Cc2ccccc2C#N)c2ccccc12)=N/c1ccccc1
O=C1N(C2CCCC2)\C(S\C1=C\c1cn(Cc2ccccc2C#N)c2ccccc12)=N/c1ccccc1
---------------------
---------------------
CC1=C(\C=C2\S\C(=N\C3=CC=CC=C3)N(C3CCCC3)C2=O)C2=C(C=CC=C2)N1CC1=CC=CC=C1C#N
Cc1c(\C=C2/S\C(=N\c3ccccc3)N(C3CCCC3)C2=O)c2ccccc2n1Cc1ccccc1C#N
Cc1c(\C=C2\S\C(=N\c3ccccc3)N(C3CCCC3)C2=O)c2ccccc2n1Cc1ccccc1C#N
Cc1c(\C=C2/S\C(=N\c3ccccc3)N(C3CCCC3)C2=O)c2ccccc2n1Cc1ccccc1C#N
---------------------
---------------------
CCOC(=O)C1=C(C)N=C2S\C(=C/C3=CC(Cl)=C(OCC4=C5C=CC=CC5=CC=C4)C(OC)=C3)C(=O)N2C1C1=CC=CC=C1
CCOC(=O)C1=C(C)N=c2s\c(=C\c3cc(Cl)c(OCc4cccc5ccccc45)c(OC)c3)c(=O)n2C1c1ccccc1
CCOC(=O)C1=C(C)N=c2s\c(=C/c3cc(Cl)c(OCc4cccc5ccccc45)c(OC)c3)c(=O)n2C1c1ccccc1
CCOC(=O)C1=C(C)N=c2s\c(=C/c3cc(Cl)c(OCc4cccc5ccccc45)c(OC)c3)c(=O)n2C1c1ccccc1
---------------------
---------------------
CC1=C(C(N2C(=O)\C(SC2=N1)=C\C1=CN(CC2=C3C=CC=CC3=CC=C2)C2=C1C=CC=C2)C1=CC=CC=C1)C(=O)NC1=CC=CC=C1
CC1=C(C(c2ccccc2)n2c(=N1)s\c(=C/c1cn(Cc3cccc4ccccc34)c3ccccc13)c2=O)C(=O)Nc1ccccc1
CC1=C(C(c2ccccc2)n2c(=N1)s\c(=C\c1cn(Cc3cccc4ccccc34)c3ccccc13)c2=O)C(=O)Nc1ccccc1
CC1=C(C(c2ccccc2)n2c(=N1)s\c(=C/c1cn(Cc3cccc4ccccc34)c3ccccc13)c2=O)C(=O)Nc1ccccc1
---------------------