comparison of structural scaffolds from different libraries

User 9627e6d784

05-12-2008 15:52:47

dear chemaxon users,





i am new to chemaxon and its software packages and currently face the following problem:


i have different libraries and want to compare them. however, i am ONLY interested in comparing the structural scaffolds of the different libraries.


so for example:





given:


- imagine i have libraries A and B


- library A provides me with x different structural scaffolds


- library B provides me with y different structural scaffolds





wanted:


- how can i compare the x scaffolds from lib. A with the y scaffolds of lib. B ?


- how can i extract ONLY the scaffold structures, NOT the whole structures?


- how am i able - if i am at all - to extract these comparison results ?





any suggestion or help is much appreciated.





thanks a lot in advance


greetings,


JL

ChemAxon efa1591b5a

09-12-2008 10:20:12

Hi JL,





you may try LibraryMCS to extract the scaffolds from your library, http://www.chemaxon.com/shared/libMCS/.


You can use the compr command line batch program to do the comparison (http://www.chemaxon.com/jchem/doc/user/Compr.html); however, you have not specified how you envision the actual comparison. Compr is based on similarity calculations using 2D fingerprints and performs pairwise comparison. Is that suitable for your needs? Or would you prefer comparing scaffolds directly as graph structures?





HTH, regards,


Miklos

User 9627e6d784

10-12-2008 07:53:32

Hi Miklos and thanks a bunch for your reply.





i have some further questions:





1.) it is not clear to me how i can extract the scaffolds after using 'compr' or 'libmcs'.


1.A.) is there a special flag to set ?


2.) can i have only the scaffolds extracted to a file that i can look at with a viewer afterwards ?


3.) However, how can I compare the scaffolds directly as graphs ?


3.A.)How can I extract these scaffold graphs (see 2.)) ?





Greetings and many many thanks,


JL

User 9627e6d784

12-12-2008 15:01:11

Hi again,





did any of you chemaxon experts have some time to look through my questions ?


it would be of great help for my work.





so have a nice weekend,





greetings,


joern

ChemAxon efa1591b5a

15-12-2008 08:33:18

Hi,
Quote:
1.) it is not clear for me how i can extract the scaffolds after using 'compr' or


'libmcs'.
You can use libmcs to extract the scaffolds. Would you like to use a command line tool or a desktop application? Both are available.


Quote:
1.A.) is there a special flag to set ?
It depends: in the command line tool one has to specify the -o option along with the CSV format, while in the desktop application the selected structures can be saved (e.g. select the top level clusters).
Quote:
2.) can i have only the scaffolds extracted to a file that i can look at with a viewer afterwards ?



Yes, see above.
Quote:
3.) However, how can I compare the scaffolds directly as graphs ?


Scaffolds are saved as ordinary molecules (in SDF or in SMILES). The compr tool compares molecules, so it's quite straightforward.
Quote:
3.A.)How can I extract these scaffold graphs (see 2.)) ?
See 1. Just save them in an output file.
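
For a direct comparison of the saved scaffolds, here is a minimal Python sketch. It assumes each library's scaffolds were saved as a SMILES file with one structure per line, and that both files were written by the same tool, so that identical scaffolds get identical SMILES strings. Strings that differ do not prove that the structures differ. The names `read_smiles` and `compare_scaffolds` are made up for this illustration and are not part of compr or any other ChemAxon tool.

```python
from typing import Dict, Iterable, Set


def read_smiles(lines: Iterable[str]) -> Set[str]:
    """Collect the first whitespace-separated token of every non-empty line."""
    smiles = set()
    for line in lines:
        line = line.strip()
        if line:
            smiles.add(line.split()[0])
    return smiles


def compare_scaffolds(lib_a: Set[str], lib_b: Set[str]) -> Dict[str, Set[str]]:
    """Split two scaffold sets into shared and library-specific scaffolds.

    Assumption: exact string equality of SMILES stands in for structural identity.
    """
    return {
        "shared": lib_a & lib_b,
        "only_a": lib_a - lib_b,
        "only_b": lib_b - lib_a,
    }


# Toy in-memory data standing in for two scaffold files.
scaffolds_a = read_smiles(["CNc1cccc(F)c1C", "Cc1nc2cccc(F)c2c(=O)o1"])
scaffolds_b = read_smiles(["Cc1nc2cccc(F)c2c(=O)o1", "c1ccccc1"])
result = compare_scaffolds(scaffolds_a, scaffolds_b)
for key in ("shared", "only_a", "only_b"):
    print(key, sorted(result[key]))
```

For real files, pass an open file object instead of a list, e.g. `read_smiles(open("scaffolds_A.smi"))`; that file name is hypothetical.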





HTH


Miklos

User 9627e6d784

19-12-2008 11:09:28

Quote:



1.) it is not clear for me how i can extract the scaffolds after using 'compr' or


'libmcs'.


You can use libmcs to extract the scaffolds. Would you like to use a command line tool or a desktop application? Both are available.


I prefer using the command line libmcs. however, i cannot find how exactly to specify the output. do you mean something like:


libmcs ... -o outfile.csv


or how does libmcs know to write out a csv file?


is there a special manual where i can see all the options that are available for libmcs?





greetings,





jl

User 9627e6d784

19-12-2008 11:13:35

Quote:
Quote:





1.) it is not clear for me how i can extract the scaffolds after using 'compr' or


'libmcs'.


You can use libmcs to extract the scaffolds. Would you like to use a command line tool or a desktop application? Both are available.





I prefer using the command line libmcs. however, i cannot find how exactly to specify the output. do you mean something like:


libmcs ... -o outfile.csv


or how does libmcs know to write out a csv file?




All right, I found it myself.


however, it remains unclear how to specify that only the scaffolds should be extracted.





cheers,


jl





ChemAxon efa1591b5a

19-12-2008 11:31:48

Hi,
Quote:
I prefer using the command line libmcs. however, i cannot find how exactly to specify the output. do you mean something like:


libmcs ... -o outfile.csv


or how does libmcs know to write out a csv file?


is there a special manual where i can see all the options that are available for libmcs?


To get a CSV output you need to use the -o CSV filename.csv option.


Run libmcs -h to get the list of all options.





At present the CSV output is rather specific and may not satisfy your particular needs (it was developed to meet a particular user request). We are exploring possible options and usage scenarios, mainly from an integration point of view. Many need a CSV output for a loose Spotfire integration, but the picture is not clear enough; the ways of use are very diverse. Can you share your expectations with us?


In the long term (by version 1.0) saving and exporting various file formats along with handy options will be supported (supposedly in JChem version 5.3).





I'm afraid there is no manual yet, but that will surely be provided when LibraryMCS reaches version 1.0.


Regards


Miklos

User 9627e6d784

02-01-2009 18:14:42

hi there and a happy and healthy 2009,





thanks a bunch for your answers. however, i am still facing some problems with the batch version of libmcs (v.5.1.4 of JChem) on a SUSE 11.1 system.





whenever I enter:





libmcs use_this_sdf_file.sdf -v -e -o CSV output.csv





on the unix command line, the error message is as follows:





Reading structures from use_this_sdf_file.sdf ... chemaxon.formats.MolFormatException: Cannot recognize format (?)
Unrecognized file contents:
use_this_sdf_file.sdf
	at chemaxon.formats.recognizer.RecognitionSubsystem.getFormat(RecognitionSubsystem.java:203)
	at chemaxon.formats.MolInputStream.initTextFormat(MolInputStream.java:235)
	at chemaxon.formats.MolInputStream.init(MolInputStream.java:134)
	at chemaxon.formats.MolInputStream.<init>(MolInputStream.java:116)
	at chemaxon.formats.MolInputStream.<init>(MolInputStream.java:58)
	at chemaxon.clustering.JKlustorImport.getMolImporter(JKlustorImport.java:685)
	at chemaxon.clustering.JKlustorImport.readStructures(JKlustorImport.java:584)
	at chemaxon.clustering.LibraryMCS.main(LibraryMCS.java:1405)








has anyone any idea of what is wrong here ?





a quick answer is much appreciated since i cannot go on with my work otherwise





greetings


JL

ChemAxon efa1591b5a

05-01-2009 14:36:19

Hi,





I did try version 5.1.4 and it worked all right for me. Did you try to open your SDFile with mview? Did that work?





Thanks


Miklos

User 9627e6d784

05-01-2009 14:44:12

hi miklos, I tried mview and it worked. afterwards i tried libmcs with another sdf file and it also worked fine for me. strange, since the graphical frontend of libmcs opens the file that did not work with the command line tool.





thanks


jl

User 9627e6d784

05-01-2009 15:02:47

hi,


and sorry to ask this question again:





however, i am still not able to extract the scaffolds of the molecules after clustering.


i am just interested in the scaffolds of the structures, not the whole structures with their side chains etc.


Imagine I have a compound library with 1000 structures.


I want to cluster them using libmcs. To do so I used the following command:


libmcs this_is_the_sdf_file.sdf -v -f -o blah.sdf


After the clustering I want to look only at the scaffolds, i.e. only at the nodes in the clustering tree that depict scaffold structures. Thus, I used mview, but there is no clustering: all the molecules that were originally in the input sdf file for libmcs also occur in the output file of libmcs.


Can you tell me what mistake I made? If you imagine the clustering as a tree, I only want to extract the molecular scaffolds that are depicted by internal nodes of that tree!





Greetings


JL

User 9627e6d784

07-01-2009 17:02:46

dear users,





please find a screenshot of the libmcs GUI attached to this post.





in the middle of the picture one can see the scaffolds of my structures for each cluster at a certain level in the tree.





how can i extract exactly this information from the command line tool of libmcs ?





hope this helps to clarify my problem.


greetings


jl

ChemAxon efa1591b5a

08-01-2009 10:55:43

Hi,





thanks, it's clear.





There's no parameter available that can produce the exact output you need. Instead, the entire hierarchy is saved in a flat (i.e. not hierarchical) file (either SMILES in CSV, or SDFile).


However, hierarchy information is preserved in either output file type. All you need is some further processing of the text output of LibMCS. If you work in UNIX/Linux (including MacOS) environment then it's pretty straightforward. Under MSWin it's less convenient - though doable.





I reckon the better option for you is the CSV output, in which the SMILES representations of the scaffolds/molecules are written one per line, from the top of the dendrogram downwards. Each SMILES is followed by two values, basically the 'co-ordinates' of the corresponding tree node: the level first, then the position on that level from left to right. These values are comma separated.





Example:





Code:
CNc1cccc(F)c1C,1,1
C\C(C)=C\Nc1cccc(F)c1C#N,2,1
CCOC(=O)C(=CNc1cccc(F)c1C#N)C#N,3,1
CC(=O)C(=C/Nc1cccc(F)c1C#N)\C(C)=O,3,2
Cc1onc(c1C(=O)Nc1cccc(F)c1C(=O)Nc1ccc(F)cc1)-c1c(Cl)cccc1Cl,2,2
Cc1onc(c1C(=O)Nc1cccc(F)c1C(=O)Nc1ccc(F)cc1)-c1c(Cl)cccc1Cl,3,3
Cc1nc2cccc(F)c2c(=O)o1,1,2
Cc1nc2cccc(F)c2c(=O)o1,2,3
C\C(C)=C\c1nc2cccc(F)c2c(=O)o1,3,4
...








To get all structures on the second level of the hierarchy, simply use something like the commands below:





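Code:
# save the whole hierarchy as CSV (SMILES,level,position)
libmcs infile -o CSV outfile
# keep only the rows whose level field is 2
fgrep ",2," outfile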








This is the simplest one - surely, there are more sophisticated ways of processing the required output.
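
One of those more sophisticated ways could look like the Python sketch below. It parses rows of the form SMILES,level,position as described above and picks out either one level or the internal nodes of the tree. Two assumptions are not confirmed anywhere in this thread: that the rows are written depth first, as the example listing suggests (so a row is an internal node exactly when the next row sits on a deeper level), and that the SMILES field itself contains no comma. The names `Node`, `parse_rows`, `at_level` and `internal_nodes` are hypothetical helpers, not part of libmcs.

```python
from typing import Iterable, List, NamedTuple


class Node(NamedTuple):
    smiles: str
    level: int
    position: int


def parse_rows(lines: Iterable[str]) -> List[Node]:
    """Parse 'SMILES,level,position' rows; skip anything that does not fit."""
    nodes = []
    for line in lines:
        # Split from the right so only the last two fields are cut off.
        parts = line.strip().rsplit(",", 2)
        if len(parts) != 3:
            continue
        smiles, level, position = parts
        if not (level.isdigit() and position.isdigit()):
            continue
        nodes.append(Node(smiles, int(level), int(position)))
    return nodes


def at_level(nodes: List[Node], level: int) -> List[Node]:
    """Same selection as the fgrep call, but on the parsed level field."""
    return [n for n in nodes if n.level == level]


def internal_nodes(nodes: List[Node]) -> List[Node]:
    """Assumes depth-first row order: a node has children if the next row is deeper.

    The last row has no successor and is therefore always treated as a leaf.
    """
    return [cur for cur, nxt in zip(nodes, nodes[1:]) if nxt.level > cur.level]


# The example rows from this post; a raw string keeps the backslashes intact.
SAMPLE = r"""
CNc1cccc(F)c1C,1,1
C\C(C)=C\Nc1cccc(F)c1C#N,2,1
CCOC(=O)C(=CNc1cccc(F)c1C#N)C#N,3,1
CC(=O)C(=C/Nc1cccc(F)c1C#N)\C(C)=O,3,2
Cc1onc(c1C(=O)Nc1cccc(F)c1C(=O)Nc1ccc(F)cc1)-c1c(Cl)cccc1Cl,2,2
Cc1onc(c1C(=O)Nc1cccc(F)c1C(=O)Nc1ccc(F)cc1)-c1c(Cl)cccc1Cl,3,3
Cc1nc2cccc(F)c2c(=O)o1,1,2
Cc1nc2cccc(F)c2c(=O)o1,2,3
C\C(C)=C\c1nc2cccc(F)c2c(=O)o1,3,4
""".splitlines()

nodes = parse_rows(SAMPLE)
for node in internal_nodes(nodes):
    print(node.level, node.position, node.smiles)
```

To run it on a real output file, replace `SAMPLE` with an open file object, for instance `parse_rows(open("outfile"))`. If the depth-first assumption does not hold for your output, `internal_nodes` will give wrong results, while `at_level` is unaffected.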








HTH





regards,


Miklos

User 9627e6d784

09-01-2009 09:43:23

Hi Miklos and thanks a bunch for your efforts,








is there any option to search a compound library ONLY for aliphatic systems, i.e. I want to know if there are compounds in the library that do not possess any rings in their structure.





Thanks for an answer and greetings from out of the snow,





JL

ChemAxon efa1591b5a

14-01-2009 08:35:17

Hi,


I'm afraid there's no such option available. These structures can be removed in a pre-processing step, for instance with the help of the evaluate batch program.
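
If the library is available as SMILES, a plain text check is another way to do such a pre-processing step. The Python sketch below is not the evaluate program and uses no ChemAxon API; it relies only on the SMILES convention that ring closures are written as digits (or `%nn`) outside square brackets, while digits inside brackets mean isotopes, hydrogen counts or charges. It assumes plain SMILES input without extensions, and the function name `is_acyclic` is made up for this illustration.

```python
import re

# Bracket atoms such as [13CH3] or [NH4+] may contain digits that are not ring closures.
_BRACKET_ATOM = re.compile(r"\[[^\]]*\]")


def is_acyclic(smiles: str) -> bool:
    """Return True if the SMILES string contains no ring-closure labels."""
    stripped = _BRACKET_ATOM.sub("", smiles)
    return not any(ch.isdigit() or ch == "%" for ch in stripped)


# Toy library: two ring-free structures and two ring systems.
library = ["CCO", "c1ccccc1", "CC(C)C[13CH3]", "Cc1nc2cccc(F)c2c(=O)o1"]
acyclic = [s for s in library if is_acyclic(s)]
cyclic = [s for s in library if not is_acyclic(s)]
print("no rings:", acyclic)
print("with rings:", cyclic)
```

The `cyclic` list is what you would then feed to libmcs if the ring-free compounds are to be left out.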


Regards,


Miklos