chemical database

User 4cd313c631

14-05-2008 08:19:45

Hi everybody, Camelia here. I have a question to ask.


For example, I have a very large SDF chemical database whose records look like this:





CPSS- 0804941117





13 14 0 0 0 0 0 0 0 0 0


0.8400 -0.1600 0.0000 N 0 0 0 0 0 0 0 0


1.4800 0.4300 0.0000 N 0 0 0 0 0 0 0 0


0.0900 0.2700 0.0000 N 0 0 0 0 0 0 0 0


1.1100 1.2100 0.0000 C 0 0 0 0 0 0 0 0


0.2700 1.1200 0.0000 C 0 0 0 0 0 0 0 0


0.8400 -1.0300 0.0000 C 0 0 0 0 0 0 0 0


1.5300 1.9900 0.0000 C 0 0 0 0 0 0 0 0


1.0700 2.7400 0.0000 Cl 0 0 0 0 0 0 0 0


1.5900 -1.4600 0.0000 C 0 0 0 0 0 0 0 0


0.0800 -1.4600 0.0000 C 0 0 0 0 0 0 0 0


1.5900 -2.3300 0.0000 C 0 0 0 0 0 0 0 0


0.0700 -2.3200 0.0000 C 0 0 0 0 0 0 0 0


0.8400 -2.7600 0.0000 C 0 0 0 0 0 0 0 0


2 1 1 0 2 0 0


3 1 1 0 2 0 0


4 2 2 0 2 0 0


5 3 2 0 2 0 0


6 1 1 0 2 0 0


7 4 1 0 2 0 0


8 7 1 0 2 0 0


9 6 1 0 1 0 0


10 6 2 0 1 0 0


11 9 2 0 1 0 0


12 10 1 0 1 0 0


13 12 2 0 1 0 0


4 5 1 0 2 0 0


13 11 1 0 1 0 0


> <Sample Ref.>


OC101-12





> <Melting Point>


41.00 - 43.00





> <B1 Record No.>


304





> <ID>


304





$$$$





I need to remove the data from "Sample Ref" till "$$$$" using either Java or C programming. This is the data for just one compound, and we have to do the same for millions of them. Does anybody have an idea of how we can go about doing this?





Thank you,


camelia :)

ChemAxon fa971619eb

14-05-2008 08:24:24

I moved this topic from the Instant JChem forum to this forum as it is more relevant here.





Tim

ChemAxon 7c2d26e5cf

14-05-2008 11:15:28

Quote:
I need to remove the data from "Sample Ref" till "$$$$" using either Java or C programming.
I have two tips on how to do it.


1. Write a simple program that reads the SD file and writes each line to an output file, excluding the lines that belong to the property (data) block.


That is, for each record, skip everything between the first occurrence of ">" and "$$$$" (but preserve the "$$$$" line itself, because it is the record separator).


You can write it in Java, C or in any script language.
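As a rough illustration of this first tip, here is a minimal Java sketch. The class name SdfPropertyStripper and the method stripProperties are made up for this example. It assumes that no line of the structure block itself starts with ">", and it reads the file as ISO-8859-1 only so that unexpected bytes do not abort the run; adjust both assumptions to your data. It streams line by line, so the size of the file should not matter much.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class SdfPropertyStripper {

    /**
     * Copies SD file content line by line, dropping the data block of each
     * record: everything from the first line starting with ">" up to, but
     * not including, the "$$$$" record separator.
     */
    public static void stripProperties(BufferedReader in, Writer out) throws IOException {
        boolean inDataBlock = false;
        String line;
        while ((line = in.readLine()) != null) {
            if (line.startsWith("$$$$")) {
                // Record separator: always keep it and start a new record.
                inDataBlock = false;
                out.write(line);
                out.write('\n');
            } else if (inDataBlock) {
                // Inside the data block: skip the line.
            } else if (line.startsWith(">")) {
                // First data header of this record (assumption: structure
                // lines never start with ">"). Skip it and what follows.
                inDataBlock = true;
            } else {
                // Structure (molfile) line: copy unchanged.
                out.write(line);
                out.write('\n');
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java SdfPropertyStripper <input.sdf> <output.sdf>");
            return;
        }
        // ISO-8859-1 maps every byte to a character, so odd bytes do not fail the read.
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.ISO_8859_1)) {
            stripProperties(in, out);
        }
    }
}
```

You would run it as "java SdfPropertyStripper input.sdf output.sdf" (both file names are placeholders). It writes to a new file rather than changing the input, so you can compare a few records before discarding the original.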





2. Import each molecule from the SD file with MolImporter, remove the properties from the resulting Molecule object, and export the result to an SD file. You can remove a property by setting its value to null.


http://www.chemaxon.com/marvin/help/developer/beans/api/chemaxon/struc/Molecule.html#setProperty(java.lang.String,%20java.lang.String)





I recommend the first solution because it is faster than the second.