Do cached compounds resulting from a JChemSearch ever expire?

User 5787a55225

18-08-2005 18:27:07

Hi, there.





It appears that JChemSearch caches the compounds during the first search. However, some time after that first search (about 3 hours, say), the cached compounds are gone! If I repeat exactly the same search, I have to wait just as long as I did the first time.





Did I miss something terribly here?





Thanks,





Donald Chen

ChemAxon 9c0afc9aaf

19-08-2005 10:01:54

Hi Donald,





Please let me know the following:





- In what environment do you get this error ? (standalone / web application, JSP example app., etc.)


- Do you have any error messages on the standard output ?


- What JChem version do you use ?





Best regards,





Szilard

User 5787a55225

19-08-2005 14:03:59

There is no error in the case I am reporting. It is the sluggishness of the search that I am complaining about.





It is understandable that the very first search is sluggish (in my case, a similarity search took about 9 minutes). A search that immediately follows it is much faster. However, if a subsequent search happens hours after the first one, it is as slow as the first search (it takes 9 minutes again).





This frequently observed behavior indicates that the cache does not hold its content reliably and stops being useful when the subsequent search comes hours after the cache was loaded.





My testing environment is WinXP Pro, Tomcat 5.5.


Again, there is no error message. Did I even mention the "error" in my first post?





My question is: what is the mechanism of the cache, and how can I keep the cache from losing its content?





Is there any document describing how to use the cache?





Or am I the only one concerned about the cache's performance?





Donald

User 5787a55225

19-08-2005 14:19:06

WinXPPro


Tomcat5.5


JDK1.5


JChem3.02

ChemAxon 9c0afc9aaf

19-08-2005 14:33:31

Hi,





9 minutes for a similarity search seems to be very long.


For example, for 1 million structures it should take less than a second on any average computer.





I suspect you are not using the structure cache at all.


The change in speed may be explained by the caching of the DB or the OS, but not by JChem's structure cache.





If you are using the JSP example application that is included in the JChem package, you should include the following line in configuration:





useStructureCache=true





You can edit the configuration via setup.jsp or by manually editing the file





<user_home>\chemaxon\.jchemsite





If you are using JChemSearch from API, you should enable structure caching by calling the following method:





http://www.chemaxon.com/jchem/doc/api/chemaxon/jchem/db/JChemSearch.html#setStructureCaching(boolean)





Please make sure you allow enough memory for Tomcat to accommodate the cache. Start with 100MB and add 100MB for every 1 million structures.


(you should consider the sum of structures in all tables).
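The sizing rule above can be turned into a quick calculation. The following is only a sketch of that rule of thumb; the class and method names are hypothetical and are not part of JChem.

```java
// Hypothetical helper (not a JChem class): applies the rule of thumb
// "start with 100MB and add 100MB for every 1 million structures",
// counting the structures of all tables together.
final class CacheHeapEstimate {
    static final long BASE_MB = 100;
    static final long MB_PER_MILLION = 100;

    /** Returns the suggested heap size in MB for the given per-table structure counts. */
    static long recommendedHeapMb(long... structuresPerTable) {
        long total = 0;
        for (long n : structuresPerTable) {
            total += n;
        }
        // Round the per-million part up so that partial millions are covered.
        long extraMb = (total * MB_PER_MILLION + 999_999) / 1_000_000;
        return BASE_MB + extraMb;
    }

    public static void main(String[] args) {
        // One table with 1 million structures.
        System.out.println(recommendedHeapMb(1_000_000) + " MB"); // prints "200 MB"
    }
}
```

For a single table of about 1 million structures this gives 200 MB, so it is worth checking the heap actually granted to Tomcat against that figure.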





By default JChem will only drop tables from the cache if there was no search on them in the past 96 hours.


You can change this setting by calling:


http://www.chemaxon.com/jchem/doc/api/chemaxon/jchem/db/JChemSearch.html#setCacheExpirationTime(double)
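To make the expiration rule concrete, here is a small model of idle-time expiration. It is a sketch only: the class is hypothetical, it does not call JChem, and it assumes the expiration value is expressed in hours (as in the "96 hours" default mentioned above).

```java
// Hypothetical model (not JChem code): a table is dropped from the cache
// only if no search touched it for longer than the expiration time.
final class IdleExpiryModel {
    /** Assumption: expirationHours is measured in hours, like the 96-hour default. */
    static boolean isExpired(long lastSearchMillis, long nowMillis, double expirationHours) {
        long limitMillis = Math.round(expirationHours * 60 * 60 * 1000);
        return nowMillis - lastSearchMillis > limitMillis;
    }

    public static void main(String[] args) {
        // An idle gap of 4 h 49 min 17 s, far below the 96-hour default.
        long gapMillis = ((4 * 60 + 49) * 60 + 17) * 1000L;
        System.out.println(isExpired(0L, gapMillis, 96.0)); // prints "false"
    }
}
```

Under this model, a gap of a few hours between searches would not by itself empty the cache, so a slow search after such a gap points to some other cause.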








Best regards,





Szilard

User 5787a55225

19-08-2005 15:03:32

Quote:
9 minutes for a similarity search seems to be very long.


For example, for 1 million structures it should take less than a second on any average computer.
We have a roughly similar number of structures to search against, and my test PC is probably an average computer.
Quote:
I suspect you are not using the structure cache at all.


The change in speed may be explained by the caching of the DB or the OS, but not by JChem's structure cache.


The performance pattern of my search tests is repeatable, and I am sure it is the JChem cache. I am definitely using the cache: otherwise, how could a search immediately following the first one give me the result within a second?





Again, the cache appears not to hold its content: otherwise, why would we experience the sluggishness when the subsequent search happens hours after the first one?


Quote:
If you are using the JSP example application that is included in the JChem package, you should include the following line in configuration:





useStructureCache=true





You can edit the configuration via setup.jsp or by manually editing the file





<user_home>\chemaxon\.jchemsite
I am not using JSP directly.
Quote:
If you are using JChemSearch from API, you should enable structure caching by calling the following method:





http://www.chemaxon.com/jchem/doc/api/chemaxon/jchem/db/JChemSearch.html#setStructureCaching(boolean)





Please make sure you allow enough memory for Tomcat to accommodate the cache. Start with 100MB and add 100MB for every 1 million structures.


(you should consider the sum of structures in all tables).





By default JChem will only drop tables from the cache if there was no search on them in the past 96 hours.
It looks to me as if JChem drops tables from the cache if there was no search on them in the past 1 hour.








Anyway, I use the JChemSearch API directly, and I indeed use the caching:





Code:
searcher.setStructureCaching(true);



, and I have allocated 256 MB of memory to Tomcat (plenty for 1 million structures).





What else could contribute to the unstable performance of the caching?





Donald

ChemAxon 9c0afc9aaf

19-08-2005 15:50:33

Hi,





I suggest you enable log messages on the standard error:





http://www.chemaxon.com/jchem/doc/api/chemaxon/jchem/db/JChemSearch.html#setInfoToStdError(boolean)





This will produce informative error messages that can be found in Tomcat's log file.


Please post the relevant part of the log here, that includes the search with the slowdown.
Quote:
I am not using JSP directly.
Could you tell me how you use Tomcat?
Are you writing a servlet then?





Did you reload the context of your application in Tomcat or did you perform any administrative operations on Tomcat between the searches ?





Best regards,





Szilard

User 5787a55225

19-08-2005 16:35:30

I use Tomcat as the app server hosting a web service (via Apache Axis) that I am testing.
The web service is basically a Java class that exposes the search functionality. I do not know exactly how Axis works with Tomcat, but JSP, per se, is not involved, at least not directly.





Code:
setInfoToStdError(true);






has already been set from the beginning. And the following are the pertinent logs:
Quote:
Thu Aug 18 08:31:09 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: JCHEM.STRUC


Query: foo_bar_version1 <-- this is a dummy SMILES, for IP reasons


Screened: 63


Hits: 62


Cache loading: 184092 ms


Cache size (this table / total): 110.56 / 110.56 MBytes


Total time: 1219 ms Screening: 47 ms


Processing threads: 2


Current / peak / maximum searches per minute: 1 / 1 / 20
This is the very first search on Aug. 18th, and it took 184092 ms to load the cache.
And the cache was really kicking in.
Quote:



Thu Aug 18 08:34:43 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: JCHEM.STRUC


Query: foo_bar_version2 <-- this is a dummy SMILES, for IP reasons


Screened: 4835


Hits: 1384


Total time: 1516 ms Screening: 62 ms


Processing threads: 2


Current / peak / maximum searches per minute: 2 / 2 / 20


This is the second search, not long after the first one. As you can see, it did not do any caching. I guess the cached data was still held and effective.
Quote:



Thu Aug 18 08:48:16 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: JCHEM.STRUC


Query: foo_bar_version1


Screened: 63


Hits: 62


Cache loading: 182561 ms


Cache size (this table / total): 110.56 / 110.56 MBytes


Total time: 766 ms Screening: 47 ms


Processing threads: 2


Current / peak / maximum searches per minute: 1 / 1 / 20
This is the third search, also not long after the first one. As you can see, it loaded the cache again, as if the cached data from the first search were gone. This really puzzled me.
Quote:



Thu Aug 18 08:58:40 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: JCHEM.STRUC


Query: foo_bar_version2


Screened: 4835


Hits: 1384


Total time: 1938 ms Screening: 78 ms


Processing threads: 2


Current / peak / maximum searches per minute: 1 / 1 / 20
This is the fourth search, not long after searches #3 and #2. As you can see, no caching happened, which indicates the cached data was still effective.
Quote:



Thu Aug 18 09:08:31 EDT 2005


Search mode: SIMILARITY


Structure table: JCHEM.STRUC


Query: foo_bar_version1


Screened: 2


Hits: 2


Total time: 1281 ms Screening: 1281 ms


Processing threads: 2


Current / peak / maximum searches per minute: 1 / 1 / 20


Search #5: again, no caching kicked in --> the cached data was still good.
Quote:



Thu Aug 18 09:08:49 EDT 2005


Search mode: SIMILARITY


Structure table: JCHEM.STRUC


Query: foo_bar_version2


Screened: 10008


Hits: 10008


Total time: 859 ms Screening: 859 ms


Processing threads: 2


Current / peak / maximum searches per minute: 2 / 2 / 20


Search #6: again, no caching kicked in --> the cached data was still good.
Quote:



Thu Aug 18 13:58:06 EDT 2005


Search mode: SIMILARITY


Structure table: JCHEM.STRUC


Query: foo_bar_version2


Screened: 10008


Hits: 10008


Cache update: 290455 ms


Cache size (this table / total): 110.56 / 110.56 MBytes


Total time: 3735 ms Screening: 3735 ms


Processing threads: 2


Current / peak / maximum searches per minute: 3 / 3 / 20
Here comes the killer search #7, which happened about 5 hours after search #6. As you can see, the caching kicked in --> the originally cached data was no longer good.
Quote:



Thu Aug 18 14:04:19 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: JCHEM.STRUC


Query: foo_bar_version3


Screened: 5


Hits: 3


Total time: 157 ms Screening: 62 ms


Processing threads: 2


Current / peak / maximum searches per minute: 4 / 5 / 20


Search #8 looks good, since no lengthy caching was involved.
Quote:



Thu Aug 18 14:04:25 EDT 2005


Search mode: SIMILARITY


Structure table: JCHEM.STRUC


Query: foo_bar_version3


Screened: 1861


Hits: 1861


Total time: 843 ms Screening: 843 ms


Processing threads: 2


Current / peak / maximum searches per minute: 3 / 5 / 20
Search #9 looks good, since no lengthy caching was involved.
Quote:



Thu Aug 18 14:38:35 EDT 2005


Search mode: SIMILARITY


Structure table: JCHEM.STRUC


Query:foo_bar_version3


Screened: 1861


Hits: 1861


Cache update: 215983 ms


Cache size (this table / total): 110.56 / 110.56 MBytes


Total time: 1688 ms Screening: 1688 ms


Processing threads: 2


Current / peak / maximum searches per minute: 2 / 5 / 20
Search #10 is, again, a killer search. It looks bad, since lengthy caching was involved, 34 minutes after search #9.








Now, judging from the log items quoted above, the caching performs in a quite unstable way. It seems to me that it cannot even hold the cached data for more than 30 minutes.
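A quick way to spot the slow searches in a long log is to scan for the "Cache loading" and "Cache update" lines. The sketch below does that with a regular expression; the class is a hypothetical helper, and it assumes only the line format visible in the log excerpts quoted above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of JChem): picks out the cache timing lines
// from search log output of the form shown in the excerpts above.
final class CacheLogScan {
    private static final Pattern CACHE_LINE =
            Pattern.compile("^Cache (loading|update):\\s*(\\d+)\\s*ms$");

    /** Returns the cache time in ms, or -1 if the line is not a cache timing line. */
    static long cacheMillis(String line) {
        Matcher m = CACHE_LINE.matcher(line.trim());
        return m.matches() ? Long.parseLong(m.group(2)) : -1L;
    }

    /** Sums the cache loading/update times found in the given lines. */
    static long totalCacheMillis(List<String> lines) {
        long total = 0;
        for (String line : lines) {
            long ms = cacheMillis(line);
            if (ms >= 0) {
                total += ms;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> entry = Arrays.asList(
                "Search mode: SIMILARITY",
                "Cache update: 290455 ms",
                "Cache size (this table / total): 110.56 / 110.56 MBytes",
                "Total time: 3735 ms Screening: 3735 ms");
        System.out.println(totalCacheMillis(entry)); // prints "290455"
    }
}
```

Note that the reported "Total time" does not include the cache time, so the cache lines are the ones to watch when a search feels slow.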





I am glad to have the logs available as evidence.





BTW, the test machine is a dedicated machine, and no one other than me issued searches via Tomcat on that machine.





Don

ChemAxon 9c0afc9aaf

19-08-2005 18:06:13

Hi,





This is a weird issue indeed, as so far none of our clients have had similar problems with the structure cache.





Your third search (with "cache loading") can be easily explained: the Java Virtual Machine (JVM) has been restarted, that's why the cache had to be loaded again.


Probably you have restarted Tomcat at this point.


The clear indication of this is that the peak number of searches dropped back to 1.





What puzzles me is the "cache update" in some of your subsequent searches.





What kind of modification(s) were made to the structure table at this point (insert/update/delete) ?


How were these modifications performed (API / jcman)?


Were these modifications performed with exactly the same JChem version (please check, if needed)?





Szilard

User 5787a55225

19-08-2005 20:43:20

Quote:



Your third search (with "cache loading") can be easily explained: the Java Virtual Machine (JVM) has been restarted, that's why the cache had to be loaded again.


Probably you have restarted Tomcat at this point.


The clear indication of this is that the peak number of searches dropped back to 1.
Most likely, I had restarted Tomcat right before search #3. So the re-caching that happened in search #3 is nothing to worry about.





As for search #7 and search #10, I did not restart Tomcat.


Quote:
What kind of modification(s) were made to the structure table at this point (insert/update/delete) ?
There were 4 structures inserted into the structure table between search #6 and search #7. Those insertions were issued from a different Tomcat on a different machine. Does that affect the cached data on my local machine? If it does, why?
Quote:
How were these modifications performed (API / jcman)?
Via API. Again, not from my Tomcat and not from my PC.
Quote:
Were these modifications performed with exactly the same JChem version (please check, if needed)?
I have checked: those modifications were performed with exactly the same JChem version (3.0.2) as the one my searches depend on.





Don

ChemAxon 9c0afc9aaf

20-08-2005 12:28:02

Hi Donald,
Quote:
There were 4 structures inserted into the structure table between search #6 and search #7. Those insertions were issued from a different Tomcat on a different machine. Does that affect the cached data on my local machine? If it does, why?
Yes, it does.


If the content of the table has changed the structure cache must be updated accordingly, so we can get correct search results.





The problem in your case is the unusually slow update of the cache.


Typically the update time after the import of a small number of "unidentified structures" (see later, method 2 of UpdateHandler usage) should be around 7-8 times faster than the cache loading time. In your case it's even slower for an unknown reason.


I have performed tests in a similar scenario, and the updates were always much faster for me:


For 1 million structures the cache loading was around 150 seconds, while the cache update only took about 20 seconds.








Some supplementary information about UpdateHandler (not directly related to the slow cache update in your system):





You may argue that the 20-second update time seems long after the insert of only a handful of structures.


Currently there are 2 ways of inserting structures with UpdateHandler


(I must confess our documentation is scarce on this issue)





1. Inserting individual structures. (A typical scenario when the chemist draws a structure in a GUI and inserts it.)


In this case only a single structure is inserted by an UpdateHandler object with UpdateHandler.execute(true), and the UpdateHandler is closed without inserting any more structures.


In this case the execute(true) ensures that we determine the cd_id of the inserted structure, and because we have only one structure we store the cd_id of the new structure in the property table.


(of course this single insert can be repeated many times)


When the next search begins, the cache can be updated rapidly, since we know which are the new structures





2. Several structures are inserted with one UpdateHandler (e.g. during the import of structure files). In this case we must get all the cd_id values from Oracle and determine which ones are new. This can take some time (but still should be almost 10x faster than on your system).
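The difference between the two cases can be illustrated with a toy model. This is not JChem's implementation (its internals are not documented in this thread); the class, the method names, and the use of plain integer sets for cd_id values are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Toy model (not JChem code) of the two cache update cases described above.
final class CacheUpdateModel {
    /**
     * Case 1: the ids of the new structures were recorded at insert time,
     * so only those ids need to be added to the cache.
     */
    static Set<Integer> idsToLoadWhenRecorded(Set<Integer> recordedNewIds) {
        return new TreeSet<>(recordedNewIds);
    }

    /**
     * Case 2: nothing was recorded, so every id in the table must be fetched
     * and compared with the ids already held in the cache.
     */
    static Set<Integer> idsToLoadByComparison(Set<Integer> tableIds, Set<Integer> cachedIds) {
        Set<Integer> missing = new TreeSet<>(tableIds);
        missing.removeAll(cachedIds);
        return missing;
    }

    public static void main(String[] args) {
        Set<Integer> cached = new TreeSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> table = new TreeSet<>(Arrays.asList(1, 2, 3, 4038));
        System.out.println(idsToLoadByComparison(table, cached)); // prints "[4038]"
    }
}
```

In this model, case 2 costs time proportional to the size of the whole table even when only a handful of rows are new, which matches the explanation that an update after a bulk insert is slower than one after a single recorded insert.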









Please confirm with tests whether the insert of 4 (or fewer) structures that you mentioned is sufficient to produce such an unreasonably slow cache update.


If it is, please illustrate the API usage for the inserts with code snippets.


Please also tell me about your Oracle and JDBC driver version.





Best regards,





Szilard

User 5787a55225

22-08-2005 17:30:51

Szilard,





Thanks for the information. Looks like the caching does affect the search performance in my case.





We did another preliminary test using a different JChem jar (3.0.14). This time, the caching triggered by the insertion of a new structure did not cost as much as in JChem 3.0.2. Under 3.0.14, the search appeared to perform only a cache update, which took just 485 ms.





Under JChem 3.0.2, by contrast, the cache update took 290455 ms for the 4 insertions (see previous post, search #7).





Could you try JChem 3.0.2 on your side to see if it makes a difference?





Don

User d68ef9d5a9

22-08-2005 18:05:19

Hi Szilard,





The behavior on JChem 3.0.2 is quite interesting. It looks like the incremental caching feature is not there anymore. I remember our discussion about this feature in September 2004 in our private communications, and I assume that you put this feature in the new version 3.0.2 released Dec. 2004. What Don has shown here is that the incremental caching seems not to be there at all. I have repeated the test and found that the re-caching occurs when a new compound has been inserted into the structure table. The re-caching takes a similar amount of time to the initial caching. When the same activity was repeated on version 3.0.14, there was no re-caching, and the search went through very quickly.





I just wonder if you can confirm this.





Thank you for your time.





Ben Li

ChemAxon 9c0afc9aaf

22-08-2005 20:12:28

Hi Ben,





Incremental caching was introduced in version 3.0 (Dec 1, 2004).





For more details on changes between versions please see:





http://www.chemaxon.com/jchem/changes.html





This problem doesn't seem to be version specific though, it's rather an error of low probability.


I could reproduce it once with version 3.0.14 (but interestingly not with 3.0.2)





It seems that the information on the state of the cache can become inconsistent in rare cases (which may be not so rare in your application), and in such cases the cache reloads itself (rather than providing incorrect search results).





We will investigate this issue, and let you know if a new version is available with a fix.





Best regards,





Szilard

User 5787a55225

22-08-2005 20:40:57

Szilard,





In our tests with 3.0.2, we consistently reproduce the very sluggish cache update whenever insertions are made to the structure table; in fact, it is a sure case rather than a rare one. With 3.0.14, we have not seen such a sluggish cache update yet.





Do you have any publicly available document about how on earth the caching is achieved? Such a document may help us join our efforts to address this some-think-rare, some-think-sure bug.





I am surprised that I am the only one to raise this kind of issue.





I hope your engineers can find the cause of this issue and get it fixed as quickly as possible, regardless of whether it happens only with low probability. Our users demand fast and stable searches.





Thanks,





Donald

User 5787a55225

22-08-2005 21:08:03

Szilard,





I am not sure how the usage of the JChem API for insertion is related to the issue being discussed, but I am providing the code snippet anyway, as you asked for in one of your previous posts.





Code:



    public int insertCompound(Object ob) throws RepositoryException {
        CompoundBean bean = (CompoundBean) ob;
        String sdf = bean.getStructure();
        if (sdf == null || sdf.trim().length() == 0) {
            throw new RepositoryException(getMyName()
                    + ": cannot get valid cssdf ."
                    + sdf);
        }
        UpdateHandler uh = null;
        try {
            if (!this.isConnected()) {
                throw new SQLException("bad connection");
            }

            MolHandler mh = new MolHandler(sdf);
            mh.aromatize(MoleculeGraph.AROM_DAYLIGHT);
            Molecule mole = mh.getMolecule();

            int chiralCount = JChemTools.getChiralCount(mole);
            uh = new UpdateHandler(connectionHandler,
                    UpdateHandler.INSERT,
                    "JCHEM.STRUCTURE",
                    "IUPAC_NAME, STRUCTURE_ID, TYPE, MONO_ISO_MW, CHIRAL_COUNT");

            uh.setValuesForFixColumns(mole.toFormat("mol"));
            uh.setValueForAdditionalColumn(1, bean.getName(), Types.VARCHAR);
            uh.setValueForAdditionalColumn(2, new Integer(bean.getId()), Types.INTEGER);
            uh.setValueForAdditionalColumn(3, bean.getType(), Types.VARCHAR);
            if (bean.getMonoIsoMW() == null) {
                bean.setMonoIsoMW();
            }
            uh.setValueForAdditionalColumn(4, bean.getMonoIsoMW(), Types.FLOAT);
            uh.setValueForAdditionalColumn(5, new Integer(chiralCount), Types.INTEGER);
            uh.setDuplicateFiltering(true);
            int lastId = uh.execute(true);
            if (lastId < 0) {
                System.out.println("lastId=" + lastId);
                lastId = (-1) * lastId;
            }
            return lastId;
        } catch (SQLException se) {
            throw new RepositoryException(getMyName()
                    + ": Sql error caught in insert method." + se.getMessage());
        } catch (Exception e) {
            throw new RepositoryException(getMyName()
                    + ": Unknown error caught in insert method."
                    + e.getMessage());
        } finally {
            if (uh != null)
                uh.close_NE();
        }
    }








Hope the code snippets will be helpful,





Donald

ChemAxon 9c0afc9aaf

23-08-2005 12:47:07

Hi,
Quote:
In our tests with 3.0.2, we consistently reproduce the very sluggish cache update whenever insertions are made to the structure table; in fact, it is a sure case rather than a rare one. With 3.0.14, we have not seen such a sluggish cache update yet.
BTW why did you downgrade to 3.0.2 ?


3.0.2 is a pretty old version (Dec 7 2004), there have been several improvements and bugfixes since then.


Do you have some special reason for downgrading ?
Quote:
Do you have any publicly available document about how on earth the caching is achieved? Such a document may help us join our efforts to address this some-think-rare, some-think-sure bug.
No we don't have such a document.


I do understand that this is a frequent bug for you, I just said it should be rare in general, as nobody else has reported it yet.





Of course we treat this issue with high priority.





Meanwhile it would be interesting if you could test if the same error occurs with a freshly created table (if this doesn't take too long for you).





Best regards,





Szilard

User d68ef9d5a9

23-08-2005 13:08:36

Hi Szilard,





Our database table was created with JChem 2.33. Since then, we have upgraded once, to 3.0.2. It has been our plan to upgrade to the newest version. Because of some uncertainty in the new versions, we normally have to test a new version substantially before we push the button to change our production system. So there is no downgrade on our side; we are just very cautious in the process.





The re-caching behavior seems very repeatable to us. The same application in our development zone, which uses 3.0.14, doesn't have such a problem. I was very surprised when I learned that you actually reproduced the re-caching scenario on version 3.0.14.





Anyway, I appreciate your help and time on this. I would appreciate it very much if you could share your findings with us (publicly or privately), so that we can look into the problem on our side, too.





I have created a new table in version 3.0.2, and we will run the test against the new table. Don or I will keep you posted about our findings.





Again, thank you for your effort.





Ben Li

ChemAxon 9c0afc9aaf

23-08-2005 13:45:22

Donald,





Thank you for the attached code.


Hopefully it will help us to reproduce the problem with higher probability.





So far only one thing caught my eye:





Using UpdateHandler.close_NE() without checking if there was an error will hide any exceptions that may occur during the closing process.


It's during the close process that information about changes is saved for future cache updates.
I suggest you call close(), catch the SQLException if present, and print the stack trace. Afterwards you may also rethrow it as a RepositoryException of yours.


I'm not saying there is an exception now, but it's better not to swallow potential exceptions in any case.
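The general pattern can be sketched as follows. To keep the example self-contained, it uses a hypothetical Handler interface as a stand-in for the real handler class; only the shape of the error handling is the point here, not the JChem API.

```java
import java.sql.SQLException;

// Sketch of "do not swallow exceptions thrown while closing".
// Handler is a hypothetical stand-in, not the JChem UpdateHandler.
final class CloseDemo {
    interface Handler {
        int execute() throws SQLException;
        void close() throws SQLException;
    }

    /** Runs the insert and closes the handler, letting a close failure surface. */
    static int insertAndClose(Handler handler) throws SQLException {
        int id;
        try {
            id = handler.execute();
        } catch (SQLException e) {
            // The insert failed: still try to close, but keep the original error first.
            try {
                handler.close();
            } catch (SQLException closeError) {
                e.addSuppressed(closeError);
            }
            throw e;
        }
        // The insert succeeded: a failure during close is reported to the caller.
        handler.close();
        return id;
    }

    public static void main(String[] args) {
        Handler failingClose = new Handler() {
            public int execute() { return 42; }
            public void close() throws SQLException {
                throw new SQLException("close failed");
            }
        };
        try {
            insertAndClose(failingClose);
        } catch (SQLException e) {
            e.printStackTrace(); // the close problem is now visible in the log
        }
    }
}
```

In application code the caught SQLException could then be wrapped in the application's own exception type, as suggested above.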








Best regards,





Szilard

ChemAxon 9c0afc9aaf

23-08-2005 13:54:49

Hi Ben,
Quote:
I was very surprised when I learned that you actually reproduced the re-caching scenario on version 3.0.14.
I was surprised too :) Actually it was on a test table, which I accessed in some unconventional ways (trying to investigate the problem), so the cause of that unnecessary cache reload might even be different.


(I could not investigate that case further, because afterwards it always worked OK)





I will let you know if I find out something.





Best regards,





Szilard

User 5787a55225

23-08-2005 14:06:38

Szilard,
Quote:
I do understand that this is a frequent bug for you, I just said it should be rare in general, as nobody else has reported it yet.
Thanks for understanding. The fact that nobody except us has reported this bug does not necessarily mean that it is rare in general, because it is possible that we are the only ones who are seriously banging on JChemSearch.





Do you think the JDK version could play a role in this issue?





As Ben mentioned, please share your findings with us, whether they are conclusive or not, by posting or by email.





Thanks,





Donald

ChemAxon 9c0afc9aaf

23-08-2005 16:17:59

Quote:
it is possible that we are the only ones who are seriously banging on JChemSearch.
I don't think so :)


Maybe not a lot of other users use 3.0.2 though.
Quote:



Do you think the JDK version could play a role in this issue?
It's not likely.





BTW when do you plan to switch to 3.0.14 ?


This version is much more stable than 3.0.2 in several ways.





Best regards,





Szilard

User 5787a55225

23-08-2005 17:22:05

Quote:



BTW when do you plan to switch to 3.0.14 ?


This version is much more stable than 3.0.2 in several ways.


I hope upgrading to 3.0.14 will magically bury the discussed issue.


I am not sure when we will do the upgrade.





Don

User 5787a55225

24-08-2005 15:09:59

Szilard,





In order to find out what could cause the re-caching, we created a small structure table (~4K structures) in our development area, based on JChem 3.0.2. We then used the following script to test the search:


Code:



import java.util.*;
import java.sql.*;

import com.nrgn.core.*;
import chemaxon.jchem.db.*;
import chemaxon.struc.*;
import chemaxon.util.*;
import chemaxon.formats.*;
import chemaxon.marvin.modules.*;

public class TestSearch2 {
    public static void main(String[] args) {
        try {
            String structureTable = "TEST_STRUCTURE302";
            ConnectionHandler coreCH = new ConnectionHandler();
            coreCH.setDriver("oracle.jdbc.driver.OracleDriver");
            coreCH.setUrl("jdbc:oracle:thin:@jar_bar_tooo.com:1521:foo_bar");
            coreCH.setLoginName("testJChem");
            coreCH.setPassword("gotcha");
            coreCH.connect();

            MolHandler mh = new MolHandler(args[0]);
            Molecule mole = mh.getMolecule();
            mole.aromatize(MoleculeGraph.AROM_CHEMAXON);

            //********************
            //  First Search
            //********************
            JChemSearch coreSearcher = new JChemSearch();
            coreSearcher.setConnectionHandler(coreCH);
            coreSearcher.setStructureTable(structureTable);
            coreSearcher.setStructureCaching(true);
            coreSearcher.setSearchType(JChemSearch.SUBSTRUCTURE);
            coreSearcher.setInfoToStdError(true);
            coreSearcher.setMaxResultCount(200);
            coreSearcher.setMaxTime(6000000);
            coreSearcher.setWaitingForResult(true);
            coreSearcher.setQueryStructure(mole.toFormat("mol"));
            coreSearcher.run();

            if (coreSearcher.getResultCount() > 0) {
                int counter = coreSearcher.getResultCount();
                System.out.println("First search, I got " + counter + " structures.");
            }

            //**********************
            //  Insert a structure
            //**********************
            MolHandler mh2 = new MolHandler(args[1]);
            Molecule mole2 = mh2.getMolecule();
            UpdateHandler coreUH = new UpdateHandler(coreCH,
                                                     UpdateHandler.INSERT,
                                                     structureTable,
                                                     "id");
            coreUH.setValuesForFixColumns(mole2.toFormat("mol"));
            coreUH.setValueForAdditionalColumn(1, "DL", Types.VARCHAR);
            int key = coreUH.execute(true);
            System.out.println("\nCompound inserted, Id = " + key);

            //********************
            //   Second Search
            //********************
            JChemSearch coreSearcher2 = new JChemSearch();
            coreSearcher2.setConnectionHandler(coreCH);
            coreSearcher2.setStructureTable(structureTable);
            coreSearcher2.setStructureCaching(true);
            coreSearcher2.setSearchType(JChemSearch.SUBSTRUCTURE);
            coreSearcher2.setInfoToStdError(true);
            coreSearcher2.setMaxResultCount(200);
            coreSearcher2.setMaxTime(6000000);
            coreSearcher2.setWaitingForResult(true);
            coreSearcher2.setQueryStructure(mole.toFormat("mol"));
            coreSearcher2.run();
            if (coreSearcher2.getResultCount() > 0) {
                int counter = coreSearcher2.getResultCount();
                System.out.println("Second search, I got " + counter + " structures");
            }
            coreCH.close();
        } catch (Exception e) {
            System.out.println(e.toString());
        }
    }
}






Then I ran the script from command line, and got the following output:
Quote:



D:\dev\java>java TestSearch2 "c1cc(Cl)c(Cl)cc1" "O=C(c1cc(Cl)c(Cl)cc1)c2ccnc2"





Wed Aug 24 10:28:46 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: NRGN_CORE.TEST_STRUCTURE302


Query: Clc1ccccc1Cl


Screened: 58


Hits: 42


Cache loading: 1078 ms


Cache size (this table / total): 0.31 / 0.31 MBytes


Total time: 782 ms Screening: 0 ms


Processing threads: 2


Current / peak / maximum searches per minute: 1 / 1 / 20





First search, I got 42 structures.





Compound inserted, Id = 4038





Wed Aug 24 10:28:48 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: NRGN_CORE.TEST_STRUCTURE302


Query: Clc1ccccc1Cl


Screened: 58


Hits: 42


Total time: 109 ms Screening: 0 ms


Processing threads: 2


Current / peak / maximum searches per minute: 2 / 2 / 20





Second search, I got 42 structures


As you can tell, the first search loaded the cache, while the second search did not trigger the cache update I anticipated (since a new structure had just been inserted). The behavior of the second search puzzled me: not only did it skip the cache update, it also did not include the newly inserted structure. (I expected it to find 43 structures.)





Then I immediately gave the script another test run and got the following output:
Quote:



D:\dev\java>java TestSearch2 "c1cc(Cl)c(Cl)cc1" "O=C(c1cc(Cl)c(Cl)cc1)c2ccnc2"





Wed Aug 24 10:28:54 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: JCHEM.TEST_STRUCTURE302


Query: Clc1ccccc1Cl


Screened: 59


Hits: 43


Cache loading: 1079 ms


Cache size (this table / total): 0.31 / 0.31 MBytes


Total time: 780 ms Screening: 0 ms


Processing threads: 2


Current / peak / maximum searches per minute: 1 / 1 / 20





First search, I got 43 structures.





Compound inserted, Id = 4039





Wed Aug 24 10:28:56 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: JCHEM.TEST_STRUCTURE302


Query: Clc1ccccc1Cl


Screened: 59


Hits: 43


Total time: 125 ms Screening: 0 ms


Processing threads: 2


Current / peak / maximum searches per minute: 2 / 2 / 20





Second search, I got 43 structures


This time, as you can see, the first search, as expected, picked up the structure inserted during the last run, and the output shows 43. However, the second search, not very surprisingly, failed to pick up the new insertion made during this run.





Did I terribly miss something here?





Also, I suspected the code for the second search might be wrong, so I tried the following 2 versions:


Code:



              //************************


              //   Second Search: version 2


              //************************


              coreSearcher.setConnectionHandler(coreCH);


              coreSearcher.setStructureTable(structureTable);


              coreSearcher.setStructureCaching(true);


              coreSearcher.setSearchType(JChemSearch.SUBSTRUCTURE);


              coreSearcher.setInfoToStdError(true);


              coreSearcher.setMaxResultCount(200);


              coreSearcher.setMaxTime(6000000);


              coreSearcher.setWaitingForResult(true);


              coreSearcher.setQueryStructure(mole.toFormat("mol"));


              coreSearcher.run();


              if (coreSearcher.getResultCount()>0){


                  int counter = coreSearcher.getResultCount();


                  System.out.println("Second search, I got "+counter+" structures");


              }


              coreCH.close();


           } catch(Exception e){


                 System.out.println(e.toString());








Code:



              //************************


              //   Second Search: version 3


              //************************


              coreSearcher = new JChemSearch();


              coreSearcher.setConnectionHandler(coreCH);


              coreSearcher.setStructureTable(structureTable);


              coreSearcher.setStructureCaching(true);


              coreSearcher.setSearchType(JChemSearch.SUBSTRUCTURE);


              coreSearcher.setInfoToStdError(true);


              coreSearcher.setMaxResultCount(200);


              coreSearcher.setMaxTime(6000000);


              coreSearcher.setWaitingForResult(true);


              coreSearcher.setQueryStructure(mole.toFormat("mol"));


              coreSearcher.run();


              if (coreSearcher.getResultCount()>0){


                  int counter = coreSearcher.getResultCount();


                  System.out.println("Second search, I got "+counter+" structures");


              }


              coreCH.close();


           } catch(Exception e){


                 System.out.println(e.toString());








Sadly, those two versions did not help a bit.





There must be something going on here that we are not aware of.





Please advise.





Don

ChemAxon 9c0afc9aaf

24-08-2005 15:41:15

Quote:
As you can tell, the first search loaded the cache, while the second search did not trigger the cache update I anticipated (since a new structure had just been inserted). The behavior of the second search puzzled me: not only did it skip the cache update, it also did not include the newly inserted structure. (I expected it to find 43 structures.)
This is because you have called UpdateHandler.close() after your search.


This method stores information about the updates, so only searches after the close() call will notice that the cache has to be refreshed.
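The effect described here can be illustrated with a toy model. This is only a sketch under assumptions: `Table`, `Updater`, and `CachedSearcher` are hypothetical stand-ins and not JChem classes, and the only behavior modeled is that the change notice becomes visible to the cache when `close()` is called.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not JChem code): a cache refreshes only after the updater is closed. */
class CloseBeforeSearchDemo {

    /** Hypothetical table: rows plus a counter of published change notices. */
    static class Table {
        final List<String> rows = new ArrayList<>();
        int publishedChanges = 0;
    }

    /** Hypothetical updater: rows are written at once, the notice only on close(). */
    static class Updater {
        private final Table table;
        private boolean dirty = false;

        Updater(Table table) { this.table = table; }

        void insert(String structure) {
            table.rows.add(structure);
            dirty = true;
        }

        void close() {
            if (dirty) {
                table.publishedChanges++;
                dirty = false;
            }
        }
    }

    /** Hypothetical searcher that keeps an in-memory copy of the table. */
    static class CachedSearcher {
        private final List<String> cache = new ArrayList<>();
        private int seenChanges = -1; // -1 means the cache was never loaded

        int countRows(Table table) {
            // Refresh only when a change notice has been published.
            if (seenChanges != table.publishedChanges) {
                cache.clear();
                cache.addAll(table.rows);
                seenChanges = table.publishedChanges;
            }
            return cache.size();
        }
    }

    /** Returns {first count, second count} for one run of the test program. */
    static int[] run(boolean closeBeforeSecondSearch) {
        Table table = new Table();
        for (int i = 0; i < 42; i++) {
            table.rows.add("structure" + i);
        }
        CachedSearcher searcher = new CachedSearcher();
        int first = searcher.countRows(table);

        Updater updater = new Updater(table);
        updater.insert("new structure");
        if (closeBeforeSecondSearch) {
            updater.close();
        }
        int second = searcher.countRows(table);
        updater.close(); // a late close() does not help the search that already ran
        return new int[] {first, second};
    }

    public static void main(String[] args) {
        int[] late = run(false);
        int[] early = run(true);
        System.out.println("close() after search:  " + late[0] + " / " + late[1]);
        System.out.println("close() before search: " + early[0] + " / " + early[1]);
    }
}
```

In this model the run that closes the updater only after the second search reproduces the 42 / 42 counts from the log above, while closing it first lets the second search see the new row.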





Best regards,





Szilard

User 5787a55225

24-08-2005 15:54:18

Thanks a lot, your advice helped to address the problem.





Don

User 5787a55225

24-08-2005 18:04:57

Szilard,





I have two logs, shown below, in which both searches involve a cache update.
Quote:



Wed Aug 24 13:54:22 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: JCHEM.TEST_STRUCTURE302


Query: Clc1ccccc1Cl


Screened: 53


Hits: 37


Cache update: 6281 ms


Cache size (this table / total): 0.31 / 0.31 MBytes


Total time: 3406 ms Screening: 47 ms


Processing threads: 2


Current / peak / maximum searches per minute: 1 / 1 / 20


This one spent 6281 ms on the update (there was an insertion right before this search).
Quote:



Wed Aug 24 13:54:56 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: JCHEM.TEST_STRUCTURE302


Query: Clc1ccccc1Cl


Screened: 54


Hits: 38


Cache update: 329 ms


Cache size (this table / total): 0.31 / 0.31 MBytes


Total time: 874 ms Screening: 0 ms


Processing threads: 2


Current / peak / maximum searches per minute: 2 / 2 / 20


This one spent 329 ms on the update (there was an insertion right before this search, too).





I do not understand why the performance difference is so large. The only difference between the two searches that I am aware of is that the first search seems to be based on "stale" cached content (the previous search happened about 2 hours earlier).





Both test searches were run on 3.0.2. We are starting to think the unstable performance is probably not related to the JChem version.





Thanks,





Don

ChemAxon 9c0afc9aaf

24-08-2005 18:26:40

Hi,





While looking through the source code I have found some suspicious parts.





Please download the modified jchem.jar of version 3.0.2 from here:





http://www.chemaxon.com/download.php?d=/data/download/jchem/302mod1/jchem.jar





You can safely switch to it even in a live system; apart from a minor change in caching it is identical to your current version.





Please let me know if it makes any difference.


(don't forget to restart Tomcat after changing the jar file)





Best regards,





Szilard

ChemAxon 9c0afc9aaf

24-08-2005 18:41:03

PS: in a synthetic test I could reproduce the problem, and the new code seems to make a difference here.





Sz.

User 5787a55225

24-08-2005 22:35:16

Sz,





The modified 3.0.2 seems to help quite a bit in our first test; see the logs below:
Quote:
Wed Aug 24 16:34:00 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: JCHEM.STRUCTURE


Query: yada_yade_smiles


Screened: 85301


Hits: 26393


Cache loading: 183143 ms


Cache size (this table / total): 110.56 / 110.56 MBytes


Total time: 11937 ms Screening: 62 ms


Processing threads: 2


Current / peak / maximum searches per minute: 1 / 1 / 20
The recaching was due to the restart of Tomcat, so no concern here. Then we inserted a structure and did another search immediately after the insertion.
Quote:
Wed Aug 24 16:45:18 EDT 2005


Search mode: SUBSTRUCTURE


Structure table: JCHEM.STRUCTURE


Query: yada_yade_smiles


Screened: 85302


Hits: 26394


Cache update: 1359 ms


Cache size (this table / total): 110.56 / 110.56 MBytes


Total time: 10735 ms Screening: 78 ms


Processing threads: 2


Current / peak / maximum searches per minute: 1 / 1 / 20
As expected, the cache update kicked in. This indicates the modified 3.0.2 works, because the search after an insertion did not trigger a recaching (while the original 3.0.2 did).





Unfortunately, before we could declare complete success, I did another test run. This time I ran a similar search about 1 hour after the last one. During this hour, a hundred or so structures were inserted into the structure table. Here is the log:
Quote:
Wed Aug 24 18:08:54 EDT 2005


Search mode: SIMILARITY


Structure table: JCHEM.STRUCTURE


Query: foo_bar_smiles


Screened: 1861


Hits: 1861


Cache update: 178173 ms


Cache size (this table / total): 110.56 / 110.56 MBytes


Total time: 907 ms Screening: 907 ms


Processing threads: 2


Current / peak / maximum searches per minute: 2 / 2 / 20
To me, this log reflects two facts:


    1. The search only triggered a cache update, which is good.





    2. The cache update took a long time (178173 ms), which is almost as costly as the first full caching (183143 ms).
I think we are closer to success, but we probably still have a few miles to cover in terms of settling this bug.





What do you think?





Donald

ChemAxon 9c0afc9aaf

25-08-2005 07:36:26

Hi,





Your lengthy cache update in fact included a full cache reload (the time clearly indicates this).


Currently we only write "loading" the very first time, but probably we will be more informative in the future, and write "reload" in the case of such updates.





I believe the cause of the current problem is the use of multiple Tomcat instances on the same table.


Did you perform a search from the other Tomcat before the lengthy reload ?





Using multiple Tomcat instances means caching the structure table multiple times.


This is a waste of resources.


Apart from duplicating the cache in RAM it can also have an impact on the performance : updating multiple caches increases the load on the DB.


For this reason it is not a general practice, so this can also explain why you are experiencing this problem and others are not.





Detailed description of the process:





- The cache is up-to-date in Tomcat1


- let's say 100 individual inserts or updates are performed (doesn't matter where), information about the changes is stored in the property table.


- a search is performed in Tomcat2. Because of the changes this starts with a cache update here (this will be quick). To prevent the property table growing into infinity, old update logs are deleted after the update. Only a limited number of update logs are kept, which is currently quite a low limit (30).


- a search is performed in Tomcat1. This instance would need to know the nature of the last 100 updates, however not all of them are present any more. This triggers a full cache reload.
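The four steps above can be played through in a small simulation. This is a sketch under assumptions and not JChem's implementation: `SharedLog` stands in for the update logs in the property table, `CacheInstance` for the cache inside one Tomcat, and all names are hypothetical. Only the pruning rule (keep the newest N logs after a cache update) is taken from the description.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy simulation (not JChem code) of two caches sharing one pruned update log. */
class UpdateLogDemo {

    /** Hypothetical stand-in for the update logs kept in the property table. */
    static class SharedLog {
        private final int logsToKeep;
        private final Deque<Long> entries = new ArrayDeque<>();
        private long newestId = 0;

        SharedLog(int logsToKeep) { this.logsToKeep = logsToKeep; }

        /** One individual insert or update adds one log entry. */
        void recordChange() { entries.addLast(++newestId); }

        /** Drops the oldest entries so that at most logsToKeep remain. */
        void prune() {
            while (entries.size() > logsToKeep) {
                entries.removeFirst();
            }
        }

        long newestId() { return newestId; }

        /** True if every change made after lastSeenId is still in the log. */
        boolean coversChangesAfter(long lastSeenId) {
            return !entries.isEmpty() && entries.peekFirst() <= lastSeenId + 1;
        }
    }

    /** Hypothetical stand-in for the structure cache of one Tomcat instance. */
    static class CacheInstance {
        private long lastSeenId = 0;

        /** Returns "none", "update" or "reload" for the search that triggers it. */
        String syncBeforeSearch(SharedLog log) {
            if (log.newestId() == lastSeenId) {
                return "none";
            }
            String action = log.coversChangesAfter(lastSeenId) ? "update" : "reload";
            lastSeenId = log.newestId();
            log.prune(); // old logs are deleted right after the cache update
            return action;
        }
    }

    /** Changes happen, then Tomcat2 searches, then Tomcat1, then Tomcat2 again. */
    static String[] simulate(int logsToKeep, int changes) {
        SharedLog log = new SharedLog(logsToKeep);
        CacheInstance tomcat1 = new CacheInstance();
        CacheInstance tomcat2 = new CacheInstance();
        for (int i = 0; i < changes; i++) {
            log.recordChange();
        }
        String tomcat2First = tomcat2.syncBeforeSearch(log);
        String tomcat1First = tomcat1.syncBeforeSearch(log);
        String tomcat2Again = tomcat2.syncBeforeSearch(log);
        return new String[] {tomcat2First, tomcat1First, tomcat2Again};
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", simulate(30, 100)));
        System.out.println(String.join(", ", simulate(200, 100)));
    }
}
```

With a limit of 30 and 100 individual changes, the model gives a quick update in Tomcat2 and a full reload in Tomcat1. Raising the limit above the number of changes lets both instances update incrementally.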





Probably we could apply some workarounds here, but the healthiest solution would be the use of a single cache instance.





Please tell me why you need multiple Tomcat instances.


Is this setup an integral part of your system design, or is the other instance just for testing?





All the best,





Szilard

User 5787a55225

25-08-2005 14:11:43

Szilard,
Quote:
Your lengthy cache update in fact included a full cache reload (the time clearly indicates this).
Even though the log shows [Cache Update]? If the lengthy cache update is actually a cache reload, then the [Cache Update] in the log is a bit misleading, isn't it?
Quote:
Currently we only write "loading" the very first time, but probably we will be more informative in the future, and write "reload" in the case of such updates.
That will be less misleading. But I still do not fully understand the need for a reload; I thought incremental updates would be sufficient.
Quote:
I believe the cause of the current problem is the use of multiple Tomcat instances on the same table.
What makes you believe that? Did you test with multiple Tomcat instances to verify it?
Quote:
Did you perform a search from the other Tomcat before the lengthy reload ?
I ran searches from my own PC against the table. It is possible that someone else performed a search against the same table from the other Tomcat before the lengthy reload happened on my Tomcat.
Quote:
Using multiple Tomcat instances means caching the structure table multiple times.


This is a waste of resources.
Does the caching actually happen on the Oracle side or the Tomcat side? Let's say we have one Oracle instance holding the structure table and two Tomcat instances: one for regular users and one for a tester like myself. The searches issued by the regular users are totally independent of the searches issued by the tester. Does this scenario mean that there are two entities of cached structures: one on the Tomcat for regular users, another on the Tomcat for the tester? (This is what I have thought up till now.) Or does it mean there is only one entity holding the cached structures, sitting somewhere in Oracle or wherever?





If the cached structures are held in a single entity no matter how many Tomcat instances access it, then caching the structure table multiple times from different Tomcats is a waste of resources. But why would searches issued from different Tomcats trigger recaching? Wouldn't it be nicer if the caching headquarters were smart enough to cache only when necessary?
Quote:
Apart from duplicating the cache in RAM it can also have an impact on the performance : updating multiple caches increases the load on the DB.
Szilard, I am as confused as before. From this quoted statement, it seems to me that each Tomcat instance holds its own cached structures. Can I assume those different entities holding the cached structures are independent and never interfere with each other?
Quote:
For this reason it is not a general practice, so this can also explain why you are experiencing this problem and others are not.
It is still not clear to me why my searches experience lengthy recaching.
Quote:
Detailed description of the process:





- The cache is up-to-date in Tomcat1
I am with you on this.
Quote:
- let's say 100 individual inserts or updates are performed (doesn't matter where), information about the changes is stored in the property table.
Wait a second, what is the property table? Is it something in Oracle, or something stored in RAM as some kind of object?
Quote:
- a search is performed in Tomcat2. Because of the changes this starts with a cache update here (this will be quick).
okay, I am with you on this.
Quote:
To prevent the property table growing into infinity, old update logs are deleted after the update. Only a limited number of updates logs are kept, which is currently quite a low limit (30).
Aha, interesting! It sounds like the mysterious property table is a centralized piece with only one instance no matter how many Tomcats are in use. What do you mean by "old update logs"? Are those the logs we have seen in the Tomcat logs? Could you give a bit more detail on the update logs?
Quote:
- a search is performed in Tomcat1. This instance would need to know the nature of the last 100 updates, however not all of them are present any more. This triggers a full cache reload.
This really sounds to me as if the update logs are the determining factor in whether a new search needs a recaching, a cache update, or neither. Three questions here:
    1). Why does it need to refer to the last 100 updates? Why not 20 or 200?


    Is the number 100 just an arbitrary, handy guess?





    2). Does a new search issued from Tomcat2 need to refer to the logs of the last 100 updates in order to determine whether to trigger recaching, cache updating, or neither?





    3). It is becoming clearer to me that the update logs are stored in some centralized place, so that searches from different Tomcats all have access to them. If my understanding is correct, wouldn't it make more sense to bin those logs into different folders (each Tomcat instance having its own dedicated version of the logs), so that caching/recaching is truly independent across the different Tomcat instances? That is not hard to implement, is it?


    There are surely real-world cases in which multiple servers need to be set up for whatever reason.
Any large company that has to serve a large audience usually sets up multiple redundant servers that serve content from the same data stored in a database.


Currently, my employer is not such a case. But big pharma companies like the one that created Viagra may need duplicated servers.
Quote:
Probably we could apply some workarounds here, but the healthiest solution would be the use of a single cache instance.
You really sound as if you believe that the unnecessary recaching is caused by multiple Tomcat instances. But at the company I work for, we have experienced sluggish searches (probably due to the recaching) even when only one Tomcat instance was in use. After all, we only set up the second Tomcat instance about 4 days ago.





If the healthiest solution is to use a single cache instance (which implies using a single Tomcat instance), it limits the options for clients who want to expand their systems. I think the ultimate solution to this unnecessary-recaching issue is to work out a cache mechanism that covers the case where multiple servers are used against a single structure table.
Quote:
Please tell me why do you need multiple Tomcat instances?
Right now the reason, as I mentioned before, is to run experiments against the production db, hoping not to interfere with the production server.
Quote:
Is this setup an integral part of your system design, or the other instance is just for testing ?
It is purely for testing right now. But who knows, setting up multiple servers may be necessary for the cases I mentioned above.





Thanks for reading this far,





Don

ChemAxon 9c0afc9aaf

25-08-2005 15:44:48

Hi,
Quote:
That will be less misleading. But I still do not fully understand the need of reload, I thought incremental updates would be sufficient.
Currently the cache uses a very compact storage method to minimize the memory footprint. This unique storage method doesn't allow freeing up the space previously allocated for deleted or updated structures, so when they reach a certain percentage (after a large number of deleted / updated structures), the cache reloads.


So far our customers like the low memory footprint, and the reload is not a problem, since modification / deletion of the structures is very rare.
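The trade-off can be pictured with a toy append-only store. This is a sketch under assumptions, not the actual JChem cache: the class name, the slot layout, and the 25 % threshold used in the example are all hypothetical, and the "reload" here simply compacts the list instead of re-reading the database.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not JChem code): compact storage that reloads when too much space is dead. */
class CompactCacheDemo {
    private final List<String> slots = new ArrayList<>(); // append-only storage
    private final double reloadThreshold;                 // hypothetical dead-space limit
    private int deadSlots = 0;
    private int reloadCount = 0;

    CompactCacheDemo(double reloadThreshold) {
        this.reloadThreshold = reloadThreshold;
    }

    void insert(String structure) {
        slots.add(structure);
    }

    /** Marks a slot as dead; its space is not reclaimed until the next reload. */
    void delete(int slot) {
        if (slots.get(slot) == null) {
            return; // already dead
        }
        slots.set(slot, null);
        deadSlots++;
        if ((double) deadSlots / slots.size() > reloadThreshold) {
            reload();
        }
    }

    /** Rebuilds the storage without dead slots (stands in for a full cache reload). */
    private void reload() {
        slots.removeIf(s -> s == null);
        deadSlots = 0;
        reloadCount++;
    }

    int storageSize() { return slots.size(); }

    int reloadCount() { return reloadCount; }

    public static void main(String[] args) {
        CompactCacheDemo cache = new CompactCacheDemo(0.25);
        for (int i = 0; i < 100; i++) {
            cache.insert("structure" + i);
        }
        for (int i = 0; i < 26; i++) {
            cache.delete(i);
        }
        System.out.println("reloads: " + cache.reloadCount()
                + ", slots in use: " + cache.storageSize());
    }
}
```

Deletions are cheap until the dead share crosses the threshold, at which point one expensive rebuild reclaims all of the wasted space at once.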
Quote:
What makes you have that belief? Did you test to use multiple Tomcat instances to verify the belief?
Logical deduction from the source code. No.
Quote:
Does the caching actually happen on the Oracle side or the Tomcat side?
It happens in the Tomcat instances.
Quote:
Does this scenario mean that there are two entities of cached structures: one on the Tomcat for regular users, another on the Tomcat for the tester?
Yes.
Quote:
But why would searches issued from different Tomcats trigger recaching? Wouldn't it be nicer if the caching headquarters were smart enough to cache only when necessary?
To do this all changes to the table should be stored permanently.


It would produce an ever-increasing number of logs.
Quote:
Szilard, I am as confused as before. From this quoted statement, it seems to me that each Tomcat instance holds its own cached structures. Can I assume those different entities holding the cached structures are independent and never interfere with each other?
As I have explained before, normal cache updates can also take some time and resources. For example: if you need to update the cache in 10 Tomcat instances, they will fetch the updates 10 times.





Quote:
Wait a second, what is the property table? Is it something on the Oracle or something stored in RAM in the format of some object?
It is a table in Oracle, it stores information about the JChem structure tables. More than one property table may be used, each defines a distinct JChem environment.


Quote:
What do you mean by "old update logs"?
Each time you execute UpdateHandler.close() a new line is inserted into the property table about the nature of the performed changes. This is an update log. We do not let their number increase to infinity.


Quote:
1). Why does it need to refer to the last 100 updates? Why not 20 or 200?


Is the number 100 just an arbitrary, handy guess?
Because in the imaginary situation 100 structures were inserted individually (I could have chosen any other number), storing 100 change logs (a mass import only stores 1 change log). Tomcat2 needs to know all the changes since the last update, otherwise a full reload is needed.


Quote:



2). Does a new search issued from Tomcat2 need to refer to the logs of the last 100 updates in order to determine whether to trigger recaching, cache updating, or neither?
Yes.
Quote:
It is becoming clearer to me that the update logs are stored in some centralized place, so that searches from different Tomcats all have access to them. If my understanding is correct, wouldn't it make more sense to bin those logs into different folders (each Tomcat instance having its own dedicated version of the logs), so that caching/recaching is truly independent across the different Tomcat instances? That is not hard to implement, is it?
Sorry, but the logs must be centralized. Otherwise one Tomcat would not know what changes were made by the other.
Quote:
You really sound as if you believe that the unnecessary recaching is caused by multiple Tomcat instances. But at the company I work for, we have experienced sluggish searches (probably due to the recaching) even when only one Tomcat instance was in use. After all, we only set up the second Tomcat instance about 4 days ago.


Yes, but I have sent you a modified jchem.jar only yesterday.


That one should not have these reloads with one Tomcat.


Quote:
If the healthiest solution is to use a single cache instance (which implies using a single Tomcat instance), it limits the options for clients who want to expand their systems. I think the ultimate solution to this unnecessary-recaching issue is to work out a cache mechanism that covers the case where multiple servers are used against a single structure table.
Of course we always try to improve our products.


I guess we will offer an option, how many cache updates to preserve (even unlimited). This way one server would not delete those updates which are still needed for the other.


(probably we will rather store these updates in dedicated table(s) though, not in the property table, for better performance)





I will let you know about these improvements.





Until then could you cope with using one Tomcat for one database ?


(Probably it's much safer anyway not to experiment on live data ;) )





Best regards,





Szilard

User 5787a55225

25-08-2005 16:42:28

SZ,
Quote:
2). Does a new search issued from Tomcat2 need to refer to the logs of the last 100 updates in order to determine whether to trigger recaching, cache updating, or neither?





Yes.
I think I did not make my question clear. What I was trying to ask about is the following scenario (in chronological order):





1). an insertion happens --> an update log is put into the property table.





2). a search is issued from T2 --> this triggers the cache update on T2 and the removal of the update log from the property table.





3). a search is issued from T1 --> since it cannot find any update logs in the property table, it does a recaching.





4). Here is my question --> a search is issued from T2 again: will T2 perform a recaching, a cache update, or neither?
Quote:
Yes, but I have sent you a modified jchem.jar only yesterday.


That one should not have these reloads with one Tomcat.
That is good news. Does 3.0.14 have this issue addressed too?
Quote:
I guess we will offer an option, how many cache updates to preserve (even unlimited). This way one server would not delete those updates which are still needed for the other.


(probably we will rather store these updates in dedicated table(s) though, not in the property table, for better performance)


I would love to see these steps, which will give us a bit of power to customize the caching according to our needs.





Specifically, we would love a way to determine how long the update logs are kept. We would also love a way to determine how many updates are referred to when deciding between recaching, cache updating, or nothing for a new search. (As you mentioned, the current fixed quota is 100.) In other words, we want to have control over this number.
Quote:
(Probably it's much safer anyway not to experiment on live data ;) )
Don't you agree playing with live data is a bit more exciting, not to mention more revealing? ;-)





Thanks,





Don

User d68ef9d5a9

25-08-2005 16:53:12

Hi Szilard,





Thank you very much for the details of caching update and re-caching. This information is very helpful for us to better understand the behavior of our heavily-transacted tomcat in our production.





Based on this knowledge, I have this question





On a single Tomcat, if there are more than 100 updates between two searches, does the second search initiate a recaching instead of a cache update, because the update log only contains the last 100 updates? In our environment, we open and close an UpdateHandler for each individual compound, so it is very possible that more than 100 compounds are created or modified between two searches. If this is true, then we should probably consider modifying our insert/update mechanism to reduce the number of UpdateHandler instances.





Best regards,





Ben Li

ChemAxon 9c0afc9aaf

25-08-2005 20:10:33

Don, Ben,
Quote:
Here is my question --> a search is issued from T2 again: will T2 perform a recaching, a cache update, or neither?
Neither, as the cache in T2 is already up-to-date.


Quote:
That is good news. Does 3.0.14 have this issue addressed too?
No, but any further JChem releases will.


If it's required I can prepare a modified jar of any existing version for you.


Quote:
(As you mentioned, the current fixed quota is 100.)
Sorry I was probably not quite clear on this.


The current quota is 30.


(which I agree is very low, but we were not really considering multiple Tomcats + many individual updates + rare searches at that time)


100 was just an example scenario for a number that is bigger than 30.


Also, I can prepare a modified jar for you if needed.


Quote:
On a single Tomcat, if there are more than 100 updates between two searches, does the second search initiate a recaching instead of a cache update, because the update log only contains the last 100 updates?
Using a single Tomcat there is no limit on the remembered updates.


(old updates are only deleted right after cache update)





Best regards,





Szilard

User 5787a55225

26-08-2005 14:23:05

Szilard,





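Please provide us the enhanced 3.0.14, which contains the following features:


1). The bug fix as you did in the enhanced 3.02


2). An enhancement which gives us the control on how many update logs could be kept before being removed.


3). An enhancement which addresses the "cache-update-which-should-be-cache-loading" log item in the search log.


4). Separate update logs from the property table, to boost the performance (as you mentioned in your post).


By the way, does the newly released 3.1 have all the above requested features incorporated?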





Thanks,





Don

ChemAxon 9c0afc9aaf

26-08-2005 16:42:08

Hi,
Quote:



2). An enhancement which gives us the control on how many update logs could be kept before being removed.
To tell the truth I was originally thinking about changing 1 constant to a more reasonable value.


Changing the API requires a new release by policy, so I think we should rather put this change into JChem 3.0.15.


(Changing previous releases can cause confusion anyway.)


We will release 3.0.15 early next week.


Quote:
4). Separate update logs from the property table, to boost the performance (as you mentioned in your post).
As this requires major changes in the code, this will only be available in the 3.1.x series.





Best regards,





Szilard

User 5787a55225

26-08-2005 17:29:27

Quote:
2). An enhancement which gives us the control on how many update logs could be kept before being removed.





To tell the truth I was originally thinking about changing 1 constant to a more reasonable value.


Changing the API requires a new release by policy, so I think we should rather put this change into JChem 3.0.15.


(Changing previous releases can cause confusion anyway.)


We will release 3.0.15 early next week.
We will be looking forward to 3.0.15 then. Thanks.
Quote:
Quote:


4). Separate update logs from the property table, to boost the performance (as you mentioned in your post).





As this requires major changes in the code, this will only be available in the 3.1.x series.
Again, does the newly released 3.1 have some of the enhancements we requested?





Thanks,





Donald

ChemAxon 9c0afc9aaf

26-08-2005 18:17:58

Hi,
Quote:
Again, does the newly released 3.1 have some of the enhancements we requested?
Sorry, I forgot this part.





No, it has none of these so far.


The same improvements (1-3 in your list) will appear in JChem 3.1.1 as in JChem 3.0.15.





We also expect to release JChem 3.1.1 in two weeks' time or sooner.





Szilard

User 5787a55225

26-08-2005 19:19:47

Quote:



Quote:


Again, does the newly released 3.1 have some of the enhancements we requested?








Sorry, I forgot this part.





No, it has none of these so far.


The same improvements (1-3 in your list) will appear in JChem 3.1.1 as in JChem 3.0.15.


Szilard, we will be expecting the 3.1.1 as well as 3.0.15.





thanks,





Don

ChemAxon 9c0afc9aaf

31-08-2005 17:34:40

Hi,





JChem 3.0.15 has just been released, and can be downloaded from the following location:





http://www.chemaxon.com/download.php?d=/data/download/jchem





Regarding the cache issues:
Quote:
1). The bug fix as you did in the enhanced 3.02
Included.
Quote:
2). An enhancement which gives us the control on how many update logs could be kept before being removed.
Partly included (GUI support in jcman to change the value will be available only from 3.1.1).


We have decided to make this an option, which is stored in the property table (e.g. "JChemProperties")


You can specify an "update.logs.to.keep" property in the table, which will


inform the cache how many updates should be kept.


The value should be in the Integer range (less than 2^31-1).


You have to insert / edit this property via your favorite DB manipulation tool (Oracle Console, TOAD, SQL+Worksheet, etc.)





You should restart your application after changing this value.
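Before writing a value into the property table, it may be worth checking that it really fits the stated Integer range. The helper below is a hypothetical sketch and not part of JChem: the class and method names are made up, rejecting negative values is an assumption, and the upper bound used is simply Java's `Integer.MAX_VALUE`.

```java
/** Hypothetical helper (not part of JChem) for checking an "update.logs.to.keep" value. */
class UpdateLogsToKeepDemo {

    /** Parses the raw property text and rejects values outside the non-negative int range. */
    static int parseLogsToKeep(String raw) {
        long value;
        try {
            value = Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("not a whole number: " + raw, e);
        }
        // Assumption: negative counts make no sense, and the value must fit in an int.
        if (value < 0 || value > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("outside the int range: " + raw);
        }
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(parseLogsToKeep(" 500 "));
    }
}
```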
Quote:



3). An enhancement which addresses the "cache-update-which-should-be-cache-loading" log item in the search log.
Included, "Cache reload" is written when the cache is reloaded.





We also plan to implement other changes in the caching system later (separate tables for update logs), these are expected from 3.1.1.








All the best,





Szilard

User 5787a55225

31-08-2005 17:43:46

Thanks, Szilard.





3.1.1 will be available within a couple of weeks, right?





Thanks again,





Don

ChemAxon 9c0afc9aaf

31-08-2005 17:45:00

Quote:
3.1.1 will be available within a couple of weeks, right?
Yes.





Szilard

User 5787a55225

05-10-2005 16:03:26

Szilard,





Long time no "see". Any progress on the caching enhancements? It seems to me the final enhancement is nowhere in sight yet. Please do not forget my request for it.





We installed V3.1.2pre2 in production, and it seems to me that this version drops the bug fix you put together in V3.02en (see below). Could you put the bug fix from V3.02en into V3.1.2pre2?





Thanks,





Don
Quote:
Hi,

JChem 3.0.15 has just been released, and can be downloaded from the following location:

http://www.chemaxon.com/download.php?d=/data/download/jchem

Regarding the cache issues:
Quote:
1). The bug fix as you did in the enhanced 3.02
Included.
Quote:
2). An enhancement which gives us control over how many update logs are kept before being removed.
Partly included (GUI support in jcman to change the value will be available only from 3.1.1).

We have decided to make this an option, which is stored in the property table (e.g. "JChemProperties").

You can specify an "update.logs.to.keep" property in the table, which tells the cache how many update logs should be kept.

The value should be in the Integer range (at most 2^31-1).

You have to insert or edit this property with your favorite DB manipulation tool (Oracle Console, TOAD, SQL+Worksheet, etc.).

You should restart your application after changing this value.
Quote:
3). An enhancement which addresses the "cache-update-which-should-be-cache-loading" log item in the search log.
Included, "Cache reload" is written when the cache is reloaded.

We also plan to implement other changes in the caching system later (separate tables for update logs); these are expected in 3.1.1.


ChemAxon 9c0afc9aaf

05-10-2005 16:47:11

Hi Don,





Sorry for not getting back to you earlier.


(I notified your colleague about the new release, but forgot to notify you separately.)


We have implemented all of the quoted improvements in JChem 3.1.1.





As we have changed the code, the bugfix in 3.0.2 is not relevant any more, and should not be present.


(I rather think there is another problem; see below.)





Since 3.1.1, every table has a corresponding log table where the changes are stored and from which they can be retrieved efficiently.





I have also added a new option ("number of update logs to keep") in the jcman GUI, where you can control how many update logs should be kept for another JVM (Tomcat).


http://www.chemaxon.com/jchem/doc/admin/#options


I think this value is very low in your case. Please try to increase it.


Since the logs are stored in a separate table, even very high values should not cause performance problems.


You should restart Tomcat after modifying this setting.
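
To illustrate why a low value can hurt, here is a toy model of a bounded update log. It assumes that a cache in another JVM can catch up incrementally only while every change it has missed is still in the log, and otherwise has to do a full "Cache reload". This is a simplified reading of the behaviour discussed in this thread, not JChem's actual code, and the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a bounded update log; not JChem's actual implementation. */
final class UpdateLogModel {
    private final int logsToKeep;                        // "number of update logs to keep"
    private final Deque<Long> retainedIds = new ArrayDeque<>();
    private long lastId = 0;                             // id of the newest update so far

    UpdateLogModel(int logsToKeep) {
        this.logsToKeep = logsToKeep;
    }

    /** Records one table change and drops the oldest entries beyond the limit. */
    void recordUpdate() {
        retainedIds.addLast(++lastId);
        while (retainedIds.size() > logsToKeep) {
            retainedIds.removeFirst();
        }
    }

    /**
     * A cache that has applied all updates up to lastSeenId can catch up
     * incrementally only if every later update is still retained.
     */
    boolean needsFullReload(long lastSeenId) {
        if (lastSeenId >= lastId) {
            return false;                                // already up to date
        }
        Long oldest = retainedIds.peekFirst();
        // The first missed update (lastSeenId + 1) must still be in the log.
        return oldest == null || oldest > lastSeenId + 1;
    }
}
```

In this model, with a limit of 3 and five recorded updates, a cache that last saw update 2 can still catch up, while one that last saw update 1 has to reload everything. A higher limit widens that window.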


Please let me know if this solves your problem.





Best regards,





Szilard

User 5787a55225

01-11-2005 21:30:24

Szilard,





What is the default log size ("number of update logs to keep") in JChem 3.1.1? We went into JChemManager to adjust the size and noticed that the GUI displays "1000". It seems the size is already 1000, which is sufficiently large for us. Does the displayed "1000" mean the current size is indeed 1000?





Thanks,





Don

ChemAxon 9c0afc9aaf

02-11-2005 08:15:45

Hi,
Quote:
What is the default log size ("number of update logs to keep") in JChem 3.1.1?
1000


Quote:
Does the displayed "1000" mean the current size is indeed 1000?
Yes.


(Please be aware that after changing this value you should restart Tomcat.)





You can safely raise this limit much higher if needed. With the new caching system this should not add significant overhead.





Best regards,





Szilard