User 870ab5b546
20-12-2006 21:03:46
Hi folks,
We have been doing load testing of ACE, and we have found that one particular kind of question with one particular kind of response causes ACE to founder at heavy loads.
Here is the code that we suspect is at the root of our problems:
response is:
(LNE stands for lone electrons -- it gives the number of unshared electrons attached to particular atoms. In the example above, there are 2 atoms with unshared electrons; atom 6 has two, and atom 9 has two.)
answer is:
It appears to me that because of the tert-butyl group in the two compounds, JChem has to run through an awfully large number of permutations to realize that the two structures can't be matched, causing us to experience a dramatic slowdown at high loads. Do you have any ideas how we can do this more efficiently?
We have been doing load testing of ACE, and we have found that one particular kind of question with one particular kind of response causes ACE to founder at heavy loads.
Here is the code that we suspect is at the root of our problems:
Code: |
public static boolean matchExact_Lewis(String response, String answer) throws MolFileException { try { boolean result = false; // System.out.println("MolFunctions:matchExact_Lewis: starting"); LewisMolecule studentAnswer = new LewisMolecule(response); Molecule responseMolecule = studentAnswer.getLewisMolecule(); LewisMolecule authorAnswer = new LewisMolecule(answer); Molecule referenceMolecule = authorAnswer.getLewisMolecule(); int respAtomCt = responseMolecule.getAtomCount(); if (respAtomCt != referenceMolecule.getAtomCount()) return false; // atoms with unshared electrons should be converted to pseudoatoms // with name "Xn" where X is the atom symbol and n is the number // of electrons for (int i = 0; i < respAtomCt; i++) { int atomElecs = studentAnswer.unsharedElectronsOfAtom(i); if (atomElecs != 0) { MolAtom atom = responseMolecule.getAtom(i); String elem = atom.getSymbol(); atom.setAtno(MolAtom.PSEUDO); atom.setAliasstr(elem + String.valueOf(atomElecs)); } // response atom has unshared electrons atomElecs = authorAnswer.unsharedElectronsOfAtom(i); if (atomElecs != 0) { MolAtom atom = referenceMolecule.getAtom(i); String elem = atom.getSymbol(); atom.setAtno(MolAtom.PSEUDO); atom.setAliasstr(elem + String.valueOf(atomElecs)); } // response atom has unshared electrons } // for each atom i in response and reference structures MolSearch mySearch = new MolSearch(); mySearch.setSearchType(SearchConstants.EXACT); mySearch.setStereoSearch(false); mySearch.setExactBondMatching(true); // don't need multiple results // mySearch.setOrderSensitiveSearch(true); // full match desired mySearch.setSubgraphSearch(false); // mySearch.setExactIsotopeMatching(true); // deprecated Marvin 4.1 mySearch.setOption(SearchConstants.OPTION_CHARGE_MATCHING, SearchConstants.CHARGE_MATCHING_EXACT); mySearch.setOption(SearchConstants.OPTION_ISOTOPE_MATCHING, SearchConstants.ISOTOPE_MATCHING_EXACT); mySearch.setOption(SearchConstants.OPTION_RADICAL_MATCHING, SearchConstants.RADICAL_MATCHING_EXACT); mySearch.setOption(SearchConstants.OPTION_VAGUE_BOND, SearchConstants.VAGUE_BOND_OFF); mySearch.setTarget(responseMolecule); mySearch.setQuery(referenceMolecule); if (mySearch.isMatchCountInRelation(">=", 1)) { println("Found the reference structure in the response."); // invert the search to check for perfect match mySearch.setTarget(referenceMolecule); mySearch.setQuery(responseMolecule); if (mySearch.isMatchCountInRelation(">=", 1)) { println("Found the response in the reference structure."); return true; } else println("Did not find the response in the reference structure."); } else { println("Did not find the reference structure in the response."); } // if reference is substructure of response return false; } catch (Exception ex) { ex.printStackTrace(); throw new MolFileException("MolFunctions.matchExact_Lewis: " + ex.getMessage()); } } // matchExact_Lewis() |
response is:
Code: |
Lewis 103006111347 34 34 0 0 0 99 V2000 82.6207 204.2324 0.0000 C 0 0 0 0 0 0 149.8619 204.2324 0.0000 C 0 0 0 0 0 0 65.1379 147.3023 0.0000 C 0 0 0 0 0 0 122.9655 107.1161 0.0000 C 0 0 0 0 0 0 215.7586 139.4882 0.0000 C 0 0 0 0 0 0 242.6551 241.6969 0.0000 N 0 0 0 0 0 0 328.7239 222.9299 0.0000 C 0 0 0 0 0 0 348.8965 157.3488 0.0000 C 0 0 0 0 0 0 172.7241 64.6976 0.0000 O 0 0 0 0 0 0 258.7929 66.9302 0.0000 C 0 0 0 0 0 0 292.4136 108.2324 0.0000 C 0 0 0 0 0 0 331.4138 71.3952 0.0000 C 0 0 0 0 0 0 308.5516 34.5581 0.0000 C 0 0 0 0 0 0 229.2068 180.7906 0.0000 H 0 0 0 0 0 0 218.4483 114.9301 0.0000 H 0 0 0 0 0 0 36.8965 248.8836 0.0000 H 0 0 0 0 0 0 31.5172 197.5348 0.0000 H 0 0 0 0 0 0 10.0000 154.0000 0.0000 H 0 0 0 0 0 0 43.6207 121.6278 0.0000 H 0 0 0 0 0 0 79.9309 75.8603 0.0000 H 0 0 0 0 0 0 262.8276 27.8602 0.0000 H 0 0 0 0 0 0 354.2756 31.2092 0.0000 H 0 0 0 0 0 0 320.6550 10.0000 0.0000 H 0 0 0 0 0 0 362.3446 94.8371 0.0000 H 0 0 0 0 0 0 379.8275 73.6277 0.0000 H 0 0 0 0 0 0 379.8275 58.0000 0.0000 H 0 0 0 0 0 0 330.6889 116.4649 0.0000 H 0 0 0 0 0 0 293.7586 148.4185 0.0000 H 0 0 0 0 0 0 258.7929 133.9068 0.0000 H 0 0 0 0 0 0 387.8964 181.9069 0.0000 H 0 0 0 0 0 0 400.0000 158.4651 0.0000 H 0 0 0 0 0 0 395.9655 137.2557 0.0000 H 0 0 0 0 0 0 379.8275 250.0000 0.0000 H 0 0 0 0 0 0 295.1033 199.7673 0.0000 H 0 0 0 0 0 0 1 2 1 0 0 1 3 1 0 0 3 4 1 0 0 4 5 1 0 0 5 2 1 0 0 2 6 2 0 0 6 7 1 0 0 7 8 1 0 0 4 9 1 0 0 9 10 1 0 0 10 11 1 0 0 10 12 1 0 0 10 13 1 0 0 5 14 1 0 0 5 15 1 0 0 1 16 1 0 0 1 17 1 0 0 3 18 1 0 0 3 19 1 0 0 4 20 1 0 0 13 21 1 0 0 13 22 1 0 0 13 23 1 0 0 12 24 1 0 0 12 25 1 0 0 12 26 1 0 0 11 27 1 0 0 11 28 1 0 0 11 29 1 0 0 8 30 1 0 0 8 31 1 0 0 8 32 1 0 0 7 33 1 0 0 7 34 1 0 0 M LNE 2 6 2 9 2 A 1 C A 2 C A 3 C A 4 C A 5 C A 7 C A 8 C A 10 C A 11 C A 12 C A 13 C M END |
(LNE stands for lone electrons -- it gives the number of unshared electrons attached to particular atoms. In the example above, there are 2 atoms with unshared electrons; atom 6 has two, and atom 9 has two.)
answer is:
Code: |
Lewis 122006155725 34 34 0 0 0 99 V2000 98.9046 81.9310 0.0000 C 0 0 0 0 0 0 177.1428 83.8740 0.0000 C 0 0 0 0 0 0 207.3820 128.5627 0.0000 C 0 0 0 0 0 0 60.2787 139.2493 0.0000 C 0 0 0 0 0 0 145.8883 177.1375 0.0000 C 0 0 0 0 0 0 224.5650 62.5010 0.0000 O 0 0 0 0 0 0 288.5711 51.8146 0.0000 C 0 0 0 0 0 0 304.8778 90.6745 0.0000 C 0 0 0 0 0 0 347.3399 50.8431 0.0000 C 0 0 0 0 0 0 292.6477 28.4986 0.0000 C 0 0 0 0 0 0 145.2684 215.2590 0.0000 N 0 0 0 0 0 0 221.9857 223.7693 0.0000 C 0 0 0 0 0 0 274.9822 215.2590 0.0000 C 0 0 0 0 0 0 29.2409 168.3941 0.0000 H 0 0 0 0 0 0 10.0000 142.7464 0.0000 H 0 0 0 0 0 0 56.3606 82.9024 0.0000 H 0 0 0 0 0 0 99.0000 36.7768 0.0000 H 0 0 0 0 0 0 235.1990 100.3893 0.0000 H 0 0 0 0 0 0 317.0000 136.9421 0.0000 H 0 0 0 0 0 0 344.2856 111.7580 0.0000 H 0 0 0 0 0 0 359.2331 97.4750 0.0000 H 0 0 0 0 0 0 385.5199 81.9310 0.0000 H 0 0 0 0 0 0 400.0000 60.5579 0.0000 H 0 0 0 0 0 0 394.5643 44.4270 0.0000 H 0 0 0 0 0 0 246.0000 10.0000 0.0000 H 0 0 0 0 0 0 348.3620 29.7600 0.0000 H 0 0 0 0 0 0 339.4374 11.9834 0.0000 H 0 0 0 0 0 0 237.0000 165.7024 0.0000 H 0 0 0 0 0 0 257.3168 126.6197 0.0000 H 0 0 0 0 0 0 221.9857 250.0000 0.0000 H 0 0 0 0 0 0 313.0000 248.1650 0.0000 H 0 0 0 0 0 0 323.9021 210.1685 0.0000 H 0 0 0 0 0 0 285.8533 181.9949 0.0000 H 0 0 0 0 0 0 219.2682 197.5390 0.0000 H 0 0 0 0 0 0 1 2 1 0 0 2 3 1 0 0 1 4 1 0 0 4 5 1 0 0 5 3 1 0 0 2 6 1 0 0 6 7 1 0 0 7 8 1 0 0 7 9 1 0 0 7 10 1 0 0 5 11 2 0 0 11 12 1 0 0 12 13 1 0 0 4 14 1 0 0 4 15 1 0 0 1 16 1 0 0 1 17 1 0 0 2 18 1 0 0 8 19 1 0 0 8 20 1 0 0 8 21 1 0 0 9 22 1 0 0 9 23 1 0 0 9 24 1 0 0 10 25 1 0 0 10 26 1 0 0 10 27 1 0 0 3 28 1 0 0 3 29 1 0 0 12 30 1 0 0 13 31 1 0 0 13 32 1 0 0 13 33 1 0 0 12 34 1 0 0 M LNE 2 6 4 11 2 A 1 C A 2 C A 3 C A 4 C A 5 C A 7 C A 8 C A 9 C A 10 C A 12 C A 13 C M END |
It appears to me that because of the tert-butyl group in the two compounds, JChem has to run through an awfully large number of permutations to realize that the two structures can't be matched, causing us to experience a dramatic slowdown at high loads. Do you have any ideas how we can do this more efficiently?