User 9627e6d784
05-12-2008 15:52:47
dear chemaxon users,
i am new to chemaxon and its provided software packages and currently face the following problem:
i have different libraries and want to compare them. however, i am ONLY interested in comparison of
structural scaffolds of the different libraries.
so for example:
given:
- imagine i have libraries A and B
- library A provides me with x different structural scaffolds
- library B provides me with y different structural scaffolds
wanted:
- how can i compare the x scaffolds from lib. A with the y scaffolds of lib. B ?
- how can i extract ONLY the scaffold structures, NOT the whole structures?
- how am i able - if i am at all - to extract these comparison results ?
any suggestion or help is much appreciated.
thanks a lot in advance
greetings,
JL
ChemAxon efa1591b5a
09-12-2008 10:20:12
Hi JL,
you may try LibraryMCS to extract the scaffolds from your library:
http://www.chemaxon.com/shared/libMCS/.
You can use the compr command-line batch program to do the comparison
(http://www.chemaxon.com/jchem/doc/user/Compr.html). However, you have not specified how you envisioned the actual comparison. Compr is based on similarity calculations using 2D fingerprints and performs pairwise comparison. Is that suitable for your needs? Or would you prefer comparing scaffolds directly as graph structures?
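If exact identity of scaffolds is all that is needed, a crude alternative to both options is plain text comparison. The sketch below is not compr and does no graph matching: it assumes the scaffolds of each library are already available as one SMILES per line, and that both files were written by the same tool with the same settings, so that identical scaffolds give identical strings. The function name compare_scaffolds and the file names are made up for illustration.

```shell
# Compare two scaffold lists by exact string match of their SMILES.
# ASSUMPTION: one SMILES per line, written identically in both files.
# usage: compare_scaffolds A.smi B.smi
compare_scaffolds() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    # comm needs sorted, duplicate-free input; fix the locale so that
    # sort and comm agree on the ordering.
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    echo "shared:";    LC_ALL=C comm -12 "$a" "$b"
    echo "only in A:"; LC_ALL=C comm -23 "$a" "$b"
    echo "only in B:"; LC_ALL=C comm -13 "$a" "$b"
    rm -f "$a" "$b"
}
```

Calling compare_scaffolds libA_scaffolds.smi libB_scaffolds.smi then lists the shared scaffolds and those unique to each library. Scaffolds that are merely similar, or the same structure written as a different SMILES string, are not detected this way.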
HTH, regards,
Miklos
User 9627e6d784
10-12-2008 07:53:32
Hi Miklos and thanks a bunch for your reply.
i have some further questions:
1.) it is not clear to me how i can extract the scaffolds after using 'compr' or
'libmcs'.
1.A.) is there a special flag to set ?
2.) can i have only the scaffolds extracted to a file that i can look at with a viewer afterwards ?
3.) However, how can I compare the scaffolds directly as graphs ?
3.A.) How can I extract these scaffold graphs (see 2.)) ?
Greetings and many many thanks,
JL
User 9627e6d784
12-12-2008 15:01:11
Hi again,
has any of you chemaxon experts had some time to look through my questions ?
it would be of great help for my work.
so have a nice weekend,
greetings,
joern
User 9627e6d784
02-01-2009 18:14:42
hi there and a happy and healthy 2009,
thanks a bunch for your answers. however, i am still facing some problems concerning the batch version of libmcs (v.5.1.4 of JChem) on a SUSE 11.1 system.
whenever I enter:
libmcs use_this_sdf_file.sdf -v -e -o CSV output.csv
on the unix command line, the error message is as follows:
Reading structures from use_this_sdf_file.sdf ... chemaxon.formats.MolFormatException: Cannot recognize format (?)
Unrecognized file contents:
use_this_sdf_file.sdf
at chemaxon.formats.recognizer.RecognitionSubsystem.getFormat(RecognitionSubsystem.java:203)
at chemaxon.formats.MolInputStream.initTextFormat(MolInputStream.java:235)
at chemaxon.formats.MolInputStream.init(MolInputStream.java:134)
at chemaxon.formats.MolInputStream.<init>(MolInputStream.java:116)
at chemaxon.formats.MolInputStream.<init>(MolInputStream.java:58)
at chemaxon.clustering.JKlustorImport.getMolImporter(JKlustorImport.java:685)
at chemaxon.clustering.JKlustorImport.readStructures(JKlustorImport.java:584)
at chemaxon.clustering.LibraryMCS.main(LibraryMCS.java:1405)
has anyone any idea of what is wrong here ?
a quick answer is much appreciated since i cannot go on with my work otherwise
greetings
JL
ChemAxon efa1591b5a
05-01-2009 14:36:19
Hi,
I did try version 5.1.4 and it worked all right for me. Did you try to open your SDFile with mview? Did that work?
Thanks
Miklos
User 9627e6d784
05-01-2009 14:44:12
hi miklos, I tried mview and it worked. afterwards i tried libmcs with another sdf file and it also worked fine for me. strange, since the graphical frontend of libmcs opens the very file that did not work with the command-line tool.
thanks
jl
User 9627e6d784
05-01-2009 15:02:47
hi,
and sorry to ask this question again:
however, i am still not able to extract the scaffolds of the molecules after clustering.
i am just interested in scaffolds of the structures, not the whole sidechains etc.
Imagine I have a compound library with 1000 structures.
I want to cluster them using libmcs. To do so I used the following command:
libmcs this_is_the_sdf_file.sdf -v -f -o blah.sdf
After the clustering I want to look only at the scaffolds, i.e. only at those nodes in the clustering tree that depict scaffold structures. Thus, I used mview, but there is no clustering visible. Instead, all the molecules that were originally in the input sdf file for libmcs also occur in the output file of libmcs.
Can you tell me what mistake I made ? If you imagine the clustering as a tree, I only want to extract the molecular scaffolds that are depicted by internal nodes of that tree !
Greetings
JL
User 9627e6d784
07-01-2009 17:02:46
dear users,
please find a screenshot of the libmcs GUI attached to this post.
in the middle of the picture one can see the scaffolds of my structures for each cluster at a certain level in the tree.
how can i extract exactly this information from the command line tool of libmcs ?
hope this helps to clarify my problem.
greetings
jl
ChemAxon efa1591b5a
08-01-2009 10:55:43
Hi,
thanks, it's clear.
There's no parameter available that can produce the exact output you need. Instead, the entire hierarchy is saved in a flat (i.e. not hierarchical) file (either SMILES in CSV, or SDFile).
However, hierarchy information is preserved in either output file type. All you need is some further processing of the text output of LibMCS. If you work in UNIX/Linux (including MacOS) environment then it's pretty straightforward. Under MSWin it's less convenient - though doable.
I reckon the better option for you is the CSV output, in which the SMILES representation of each scaffold/molecule is written on its own line, going through the dendrogram from the top down. Each SMILES is followed by two values, basically the 'co-ordinates' of the corresponding tree node: the level first, then the position on that level counted from left to right. These values are comma separated.
Example:
Code: |
CNc1cccc(F)c1C,1,1
C\C(C)=C\Nc1cccc(F)c1C#N,2,1
CCOC(=O)C(=CNc1cccc(F)c1C#N)C#N,3,1
CC(=O)C(=C/Nc1cccc(F)c1C#N)\C(C)=O,3,2
Cc1onc(c1C(=O)Nc1cccc(F)c1C(=O)Nc1ccc(F)cc1)-c1c(Cl)cccc1Cl,2,2
Cc1onc(c1C(=O)Nc1cccc(F)c1C(=O)Nc1ccc(F)cc1)-c1c(Cl)cccc1Cl,3,3
Cc1nc2cccc(F)c2c(=O)o1,1,2
Cc1nc2cccc(F)c2c(=O)o1,2,3
C\C(C)=C\c1nc2cccc(F)c2c(=O)o1,3,4
...
|
To get all structures on the second level of the hierarchy, simply use something like the commands below:
Code: |
libmcs infile -o CSV outfile
fgrep ",2," outfile
|
This is the simplest one - surely, there are more sophisticated ways of processing the required output.
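One slightly more elaborate way, as a rough sketch only: the two awk helpers below assume that every data line has exactly the three fields shown in the example (SMILES, level, position) and that the SMILES itself contains no comma. The second helper additionally assumes that the lines are written parent first, children right after, as the example suggests, so that a node is internal exactly when the next line is on a deeper level. The function names nodes_at_level and internal_nodes are invented for illustration and are not part of LibMCS.

```shell
# Print the SMILES of all nodes on one level of the hierarchy.
# ASSUMPTION: lines look like "SMILES,level,position".
# usage: nodes_at_level LEVEL FILE
nodes_at_level() {
    awk -F, -v lvl="$1" 'NF == 3 && $2 == lvl { print $1 }' "$2"
}

# Print the SMILES of all internal nodes (nodes that have children).
# ASSUMPTION: parents are written directly before their children, so a
# node is internal if the line after it has a larger level number.
# usage: internal_nodes FILE
internal_nodes() {
    awk -F, '
        NF == 3 {
            level = $2 + 0
            if (seen && level > prev_level) print prev_smiles
            prev_smiles = $1; prev_level = level; seen = 1
        }' "$1"
}
```

For instance, nodes_at_level 2 outfile gives the same nodes as the fgrep line but prints only the SMILES column, and internal_nodes outfile lists every scaffold that has at least one node below it. A cluster with a single member shows up with the same SMILES as its molecule, as in the example output.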
HTH
regards,
Miklos
User 9627e6d784
09-01-2009 09:43:23
Hi Miklos and thanks a bunch for your efforts,
is there any option to search a compound library ONLY for aliphatic systems, i.e. I want to know if there are compounds in the library that do not possess any rings in their structure.
Thanks for an answer and greetings from out of the snow,
JL
ChemAxon efa1591b5a
14-01-2009 08:35:17
Hi,
I'm afraid there's no such option available. These structures can be removed in a pre-processing step, for instance with the help of the evaluate batch program.
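If the library is at hand as SMILES, a purely textual pre-filter is also possible. The sketch below does not use evaluate. It relies only on the fact that SMILES encodes ring bonds through ring-closure digits (or %nn), so a SMILES with no digit or % outside square brackets describes an acyclic structure. It assumes one SMILES per line, optionally followed by comma-separated columns; the function name acyclic_smiles is made up for illustration.

```shell
# Print only those SMILES that contain no ring-closure digits.
# Digits inside [...] (isotopes, charges, H counts) are ignored.
# ASSUMPTION: the SMILES is the first comma-separated field of each line.
# usage: acyclic_smiles FILE
acyclic_smiles() {
    awk -F, '
    {
        s = $1; in_bracket = 0; ring = 0
        for (i = 1; i <= length(s); i++) {
            c = substr(s, i, 1)
            if (c == "[") in_bracket = 1
            else if (c == "]") in_bracket = 0
            else if (!in_bracket && c ~ /[0-9%]/) ring = 1
        }
        if (!ring && s != "") print s
    }' "$1"
}
```

The lines that are not printed are the ring-containing structures, so the same test with the condition reversed gives the input for the clustering.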
Regards,
Miklos