﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava-&lt;h2&gt;&lt;font color="green"&gt;生命科学领域的专业信息解决方案！&lt;/font&gt;&lt;/h2&gt;-随笔分类-CDK</title><link>http://www.blogjava.net/rain1102/category/42240.html</link><description>&lt;br/&gt;&lt;font color="green" style="font-family: 华文行楷;font-size:16px;"&gt;化学结构搜索，化学信息学，生物信息学，实验室信息学等
。&lt;/font&gt;&lt;br/&gt;&lt;font color="#3C1435"&gt;以高科技的生物、化学信息技术实现生命科学领域中专业数据的计算和管理、提高研发能力、增强在科研和成本效率方面的国际竞争力，为生物、化学、医药和学术机构提供一流的解决方案和技术咨询。&lt;/font&gt;&lt;br/&gt;
&lt;br/&gt;&lt;font color="green" style="font-family: 华文行楷;font-size:16px;"&gt;子曰：危邦不入，乱邦不居。天下有道则见，无道则隐。&lt;/font&gt;&lt;font color="#3C1435"&gt;&lt;/font&gt;&lt;br/&gt;
</description><language>zh-cn</language><lastBuildDate>Thu, 30 Jun 2011 03:15:00 GMT</lastBuildDate><pubDate>Thu, 30 Jun 2011 03:15:00 GMT</pubDate><ttl>60</ttl><item><title>Storing a BitSet in MySQL for similarity search</title><link>http://www.blogjava.net/rain1102/archive/2011/06/29/353331.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Wed, 29 Jun 2011 02:20:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2011/06/29/353331.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/353331.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2011/06/29/353331.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/353331.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/353331.html</trackback:ping><description><![CDATA[Similarity search over molecular structures works by comparing fingerprints. A fingerprint is binary data, which CDK stores in a BitSet. Regenerating the BitSet for every comparison would be far too slow, so the fingerprints should be stored in the database and read back directly at comparison time. MySQL has no column type for a BitSet, however. A search online turned up a workable approach: convert the BitSet to a Blob for storage and convert it back when reading. The implementation follows:<br /><br /><div><div>/*</div><div>&nbsp;* Copyright (c) 2010-2020 Founder Ltd. All Rights Reserved.</div><div>&nbsp;*</div><div>&nbsp;* This software is the confidential and proprietary information of</div><div>&nbsp;* Founder. 
You shall not disclose such Confidential Information</div><div>&nbsp;* and shall use it only in accordance with the terms of the agreements</div><div>&nbsp;* you entered into with Founder.</div><div>&nbsp;*</div><div>&nbsp;*/</div><div>package com.founder.mysql;</div><div></div><div>import java.sql.Blob;</div><div>import java.sql.Connection;</div><div>import java.sql.SQLException;</div><div>import java.util.BitSet;</div><div></div><div>public class MySQLUtil {</div><div></div><div><span style="white-space:pre">	</span>public static Blob bitsetToBlob(BitSet myBitSet, Connection con) throws SQLException {</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;byte[] byteArray = toByteArray(myBitSet);</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;Blob blob = con.createBlob();</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;blob.setBytes(1, byteArray);</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;return blob;</div><div><span style="white-space:pre">	</span>}</div><div></div><div><span style="white-space:pre">	</span>private static byte[] toByteArray(BitSet bits) {</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;byte[] bytes = new byte[bits.length()/8+1];</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;for (int i=0; i&lt;bits.length(); i++) {</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp; &nbsp; &nbsp;if (bits.get(i)) {</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;bytes[bytes.length-i/8-1] |= 1&lt;&lt;(i%8);</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp; &nbsp; &nbsp;}</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;}</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;return bytes;</div><div><span style="white-space:pre">	</span>}</div><div><span style="white-space:pre">	</span></div><div><span style="white-space:pre">	</span>public static BitSet blobToBitSet(Blob blob) throws SQLException 
{</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;byte[] bytes = blob.getBytes(1, (int)blob.length());</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;BitSet bitSet = fromByteArray(bytes);</div><div></div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;return bitSet;</div><div><span style="white-space:pre">	</span>}</div><div></div><div><span style="white-space:pre">	</span>private static BitSet fromByteArray(byte[] bytes) {</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;BitSet bits = new BitSet(1024); &nbsp;</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;for (int i=0; i&lt;bytes.length*8; i++) {</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp; &nbsp; &nbsp;if ((bytes[bytes.length-i/8-1]&amp;(1&lt;&lt;(i%8))) &gt; 0) {</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;bits.set(i);</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp; &nbsp; &nbsp;}</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;}</div><div><span style="white-space:pre">	</span> &nbsp; &nbsp;return bits;</div><div><span style="white-space:pre">	</span>}</div><div>}</div></div><div><br />With the code above, a fingerprint can be computed once and then stored in MySQL.<br />For a similarity search, only the stored values need to be read back and compared.<br /><div>float coefficient = Tanimoto.calculate(query, MySQLUtil.blobToBitSet(results.getBlob("bits")));<br />In my test, a search over 187,586 structures took about 12 seconds, which is adequate for typical needs.</div></div><img src ="http://www.blogjava.net/rain1102/aggbug/353331.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2011-06-29 10:20 <a href="http://www.blogjava.net/rain1102/archive/2011/06/29/353331.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Improved SMILES Substructure 
Searching: speeding up substructure search</title><link>http://www.blogjava.net/rain1102/archive/2011/06/27/353100.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Mon, 27 Jun 2011 13:50:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2011/06/27/353100.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/353100.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2011/06/27/353100.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/353100.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/353100.html</trackback:ping><description><![CDATA[<div>The Daylight site has an article on speeding up substructure search: <a href="http://www.daylight.com/meetings/emug00/Sayle/substruct.html">http://www.daylight.com/meetings/emug00/Sayle/substruct.html</a><img border="0" alt="" src="http://www.blogjava.net/images/blogjava_net/rain1102/merlin.gif" width="860" height="212" /><br />The gist is to screen by fingerprint first, which quickly eliminates part of the data and is most effective for complex query structures. A second filter compares atom counts, in particular counts of uncommon atoms: if the query contains three &#8220;N&#8221; atoms, any matching structure must contain at least three, which works especially well for queries that include rare elements. A third filter uses molecular properties such as aromaticity. Only the structures that survive these filters go through full matching, so queries with complex structures or rare elements become much faster.<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The article closes with benchmark data showing a typical speedup of roughly three times:<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="widows: 2; text-transform: none; text-indent: 0px; border-collapse: separate; font: medium Simsun; white-space: normal; orphans: 2; letter-spacing: normal; color: rgb(0,0,0); word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px" class="Apple-style-span"><span class="Apple-style-span"> 
<table border="2" cellspacing="1" cellpadding="1" align="center">
<tbody>
<tr>
<td>Name</td>
<td>SMILES</td>
<td>Correct</td>
<td>FP</td>
<td>Triage</td>
<td>Before</td>
<td>After</td>
<td>Latest</td></tr>
<tr>
<td>Propane</td>
<td>CCC</td>
<td>65337</td>
<td>66352</td>
<td>42411</td>
<td>42.59</td>
<td>17.99</td>
<td>14.34</td></tr>
<tr>
<td>Selenium</td>
<td>[Se]</td>
<td>246</td>
<td>995</td>
<td>225</td>
<td>0.80</td>
<td>0.83</td>
<td>0.52</td></tr>
<tr>
<td>Benzene</td>
<td>c1ccccc1</td>
<td>79426</td>
<td>79486</td>
<td>50893</td>
<td>72.69</td>
<td>27.56</td>
<td>20.29</td></tr>
<tr>
<td>Methane</td>
<td>C</td>
<td>118519</td>
<td>118524</td>
<td>118511</td>
<td>61.29</td>
<td>5.47</td>
<td>4.25</td></tr>
<tr>
<td>Amido</td>
<td>NC=O</td>
<td>25695</td>
<td>26975</td>
<td>14702</td>
<td>18.89</td>
<td>9.84</td>
<td>8.16</td></tr>
<tr>
<td>Methylbenzene</td>
<td>Cc1ccccc1</td>
<td>54529</td>
<td>56869</td>
<td>20490</td>
<td>54.76</td>
<td>35.58</td>
<td>25.90</td></tr>
<tr>
<td>Carboxy</td>
<td>OC=O</td>
<td>33009</td>
<td>34369</td>
<td>17809</td>
<td>23.86</td>
<td>12.48</td>
<td>10.24</td></tr>
<tr>
<td>Chlorine</td>
<td>Cl</td>
<td>19424</td>
<td>23318</td>
<td>19424</td>
<td>11.23</td>
<td>1.38</td>
<td>1.12</td></tr>
<tr>
<td>Cyclopropane</td>
<td>C1CC1</td>
<td>863</td>
<td>4358</td>
<td>484</td>
<td>8.24</td>
<td>7.78</td>
<td>5.02</td></tr>
<tr>
<td>Biphenyl</td>
<td>c1ccccc1c2ccccc2</td>
<td>2967</td>
<td>5142</td>
<td>146</td>
<td>21.94</td>
<td>21.65</td>
<td>11.44</td></tr>
<tr>
<td>Dopamine</td>
<td>NCCc1ccc(O)c(O)c1</td>
<td>829</td>
<td>913</td>
<td>23</td>
<td>1.85</td>
<td>2.09</td>
<td>1.47</td></tr>
<tr>
<td>Sulfisoxazole</td>
<td></td>
<td>7</td>
<td>8</td>
<td>3</td>
<td>0.50</td>
<td>0.88</td>
<td>0.51</td></tr>
<tr>
<td>BetaCarotene</td>
<td></td>
<td>2</td>
<td>16</td>
<td>1</td>
<td>0.48</td>
<td>0.68</td>
<td>0.58</td></tr>
<tr>
<td>Nitrofurantoin</td>
<td></td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0.42</td>
<td>0.58</td>
<td>0.52</td></tr></tbody></table></span></span><br /></div><img src ="http://www.blogjava.net/rain1102/aggbug/353100.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2011-06-27 21:50 <a href="http://www.blogjava.net/rain1102/archive/2011/06/27/353100.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Molecular descriptor calculator completed in chemtoolkits</title><link>http://www.blogjava.net/rain1102/archive/2011/04/12/348175.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Tue, 12 Apr 2011 14:54:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2011/04/12/348175.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/348175.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2011/04/12/348175.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/348175.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/348175.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp; Summary: Following Rajarshi Guha's CDKDescUI code, CDK's molecular descriptor calculator has been integrated into chemtoolkits, covering 44 descriptors.&nbsp;&nbsp;<a href='http://www.blogjava.net/rain1102/archive/2011/04/12/348175.html'>Read more</a><img src ="http://www.blogjava.net/rain1102/aggbug/348175.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2011-04-12 22:54 <a href="http://www.blogjava.net/rain1102/archive/2011/04/12/348175.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Clustering compound structures with rcdk</title><link>http://www.blogjava.net/rain1102/archive/2011/04/11/348097.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Mon, 
11 Apr 2011 13:41:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2011/04/11/348097.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/348097.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2011/04/11/348097.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/348097.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/348097.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp; Summary: rcdk integrates the CDK toolkit into R, so that chemical property data generated by CDK can feed deeper statistical analysis. This post shows how to cluster multiple compound structures in rcdk.&nbsp;&nbsp;<a href='http://www.blogjava.net/rain1102/archive/2011/04/11/348097.html'>Read more</a><img src ="http://www.blogjava.net/rain1102/aggbug/348097.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2011-04-11 21:41 <a href="http://www.blogjava.net/rain1102/archive/2011/04/11/348097.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>chemtoolkits (CTK): partial features and interface</title><link>http://www.blogjava.net/rain1102/archive/2011/04/09/347957.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Sat, 09 Apr 2011 09:16:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2011/04/09/347957.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/347957.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2011/04/09/347957.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/347957.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/347957.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp; Summary: Part of the Chemtoolkits (CTK) feature set and interface is now complete. The site's main supporting features are in place, such as news, articles, comments, and other informational content. On the cheminformatics side, only the Lipinski rule-of-five calculation and the popular OSIRIS Property Explorer 
(LogP, solubility, and drug-likeness prediction) tool have been integrated so far.&nbsp;&nbsp;<a href='http://www.blogjava.net/rain1102/archive/2011/04/09/347957.html'>Read more</a><img src ="http://www.blogjava.net/rain1102/aggbug/347957.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2011-04-09 17:16 <a href="http://www.blogjava.net/rain1102/archive/2011/04/09/347957.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Computing a fingerprint from SMILES in CDK</title><link>http://www.blogjava.net/rain1102/archive/2009/10/26/299848.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Mon, 26 Oct 2009 14:24:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2009/10/26/299848.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/299848.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2009/10/26/299848.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/299848.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/299848.html</trackback:ping><description><![CDATA[<p>package com.founder.cdk;</p>
<p>import java.util.BitSet;</p>
<p>import org.openscience.cdk.DefaultChemObjectBuilder;<br />
import org.openscience.cdk.exception.CDKException;<br />
import org.openscience.cdk.exception.InvalidSmilesException;<br />
import org.openscience.cdk.fingerprint.ExtendedFingerprinter;<br />
import org.openscience.cdk.smiles.SmilesParser;</p>
<p>public class FingerprinterTest {</p>
<p>&nbsp;/**<br />
&nbsp; * @param args<br />
&nbsp; * @throws CDKException <br />
&nbsp; * @throws InvalidSmilesException <br />
&nbsp; */<br />
&nbsp;public static void main(String[] args) throws InvalidSmilesException, CDKException {<br />
&nbsp;&nbsp;ExtendedFingerprinter fingerprinter = new ExtendedFingerprinter();<br />
&nbsp;&nbsp;SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());<br />
&nbsp;&nbsp;BitSet bt = fingerprinter.getFingerprint(sp.parseSmiles("c2ccc1ccccc1c2"));<br />
&nbsp;}</p>
<p>}<br />
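Once fingerprints are available as BitSet values, two structures can be compared with the Tanimoto coefficient, the number of bits set in both fingerprints divided by the number of bits set in either. The following is a minimal sketch that uses only java.util.BitSet; the class name TanimotoSketch and the choice to return 0 for two empty fingerprints are assumptions made for illustration, and this is not CDK's own implementation.

```java
import java.util.BitSet;

final class TanimotoSketch {

    // Tanimoto coefficient: |A and B| / |A or B|.
    // Assumption: two empty fingerprints are scored 0.0 rather than raising an error.
    static double tanimoto(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone();
        both.and(b);                       // bits set in both fingerprints
        BitSet either = (BitSet) a.clone();
        either.or(b);                      // bits set in at least one fingerprint
        int union = either.cardinality();
        return union == 0 ? 0.0 : (double) both.cardinality() / union;
    }

    public static void main(String[] args) {
        BitSet query = new BitSet(1024);
        query.set(1);
        query.set(5);
        query.set(9);

        BitSet stored = new BitSet(1024);
        stored.set(5);
        stored.set(9);
        stored.set(20);

        // 2 shared bits out of 4 distinct bits
        System.out.println(tanimoto(query, stored)); // prints 0.5
    }
}
```

Cloning the inputs keeps the caller's fingerprints unchanged, which matters when the same query fingerprint is compared against many stored ones.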
</p><img src ="http://www.blogjava.net/rain1102/aggbug/299848.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2009-10-26 22:24 <a href="http://www.blogjava.net/rain1102/archive/2009/10/26/299848.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Generating molecular structure images with CDK</title><link>http://www.blogjava.net/rain1102/archive/2009/10/22/299271.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Thu, 22 Oct 2009 00:51:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2009/10/22/299271.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/299271.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2009/10/22/299271.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/299271.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/299271.html</trackback:ping><description><![CDATA[<p>import java.awt.Dimension;<br />
import java.awt.Graphics2D;<br />
import java.awt.geom.Rectangle2D;<br />
import java.awt.image.BufferedImage;<br />
import java.io.OutputStream;<br />
import java.io.StringReader;<br />
import java.util.Iterator;</p>
<p>import javax.servlet.http.HttpServletResponse;<br />
import javax.vecmath.Point2d;</p>
<p>import org.apache.log4j.Logger;<br />
import org.openscience.cdk.Molecule;<br />
import org.openscience.cdk.interfaces.IAtom;<br />
import org.openscience.cdk.interfaces.IMolecule;<br />
import org.openscience.cdk.io.MDLReader;<br />
import org.openscience.cdk.layout.StructureDiagramGenerator;<br />
import org.openscience.cdk.renderer.Renderer2DModel;<br />
import org.openscience.cdk.renderer.SimpleRenderer2D;</p>
<p>public class ImageTypeExporterUtil {<br />
&nbsp;private static final Logger logger = Logger.getLogger(ImageTypeExporterUtil.class);<br />
&nbsp;<br />
&nbsp;/**<br />
&nbsp; * Renders a molecule structure as a PNG image and writes it to the HTTP response.<br />
&nbsp; * <br />
&nbsp; * @param structure the molecule structure as an MDL string<br />
&nbsp; * @param length width and height of the square image<br />
&nbsp; * @param response HttpServletResponse the image is written to<br />
&nbsp; * @throws Exception if reading the structure or writing the image fails<br />
&nbsp; */<br />
&nbsp;public static void showAsImage(String structure, Integer length, HttpServletResponse response) throws Exception {<br />
&nbsp;&nbsp;logger.debug("ImageTypeExporterUtil.showAsImage..");<br />
&nbsp;&nbsp;<br />
&nbsp;&nbsp;StringReader mdl = new StringReader(structure);<br />
&nbsp;&nbsp;MDLReader cdkMDL = new MDLReader(mdl);<br />
&nbsp;&nbsp;Molecule mol = new Molecule();<br />
&nbsp;&nbsp;cdkMDL.read(mol);<br />
&nbsp;&nbsp;// null coordinates<br />
&nbsp;&nbsp;Iterator&lt;IAtom&gt; itatoms = mol.atoms();<br />
&nbsp;&nbsp;while (itatoms.hasNext()) {<br />
&nbsp;&nbsp;&nbsp;IAtom atom = itatoms.next();<br />
&nbsp;&nbsp;&nbsp;atom.setPoint2d(null);<br />
&nbsp;&nbsp;&nbsp;atom.setPoint3d(null);<br />
&nbsp;&nbsp;}<br />
&nbsp;&nbsp;// generate 2D coordinates<br />
&nbsp;&nbsp;StructureDiagramGenerator sdg = new StructureDiagramGenerator();<br />
&nbsp;&nbsp;sdg.setMolecule(mol);<br />
&nbsp;&nbsp;try {<br />
&nbsp;&nbsp;&nbsp;sdg.generateCoordinates();<br />
&nbsp;&nbsp;} catch (Exception ex) {<br />
&nbsp;&nbsp;&nbsp;ex.printStackTrace();<br />
&nbsp;&nbsp;}<br />
&nbsp;&nbsp;IMolecule layedOutMol = sdg.getMolecule();<br />
&nbsp;&nbsp;// scale molecule<br />
&nbsp;&nbsp;final double UNDEF_POS = 100000;<br />
&nbsp;&nbsp;double minX = UNDEF_POS, minY = UNDEF_POS, maxX = UNDEF_POS, maxY = UNDEF_POS;<br />
&nbsp;&nbsp;itatoms = layedOutMol.atoms();<br />
&nbsp;&nbsp;while (itatoms.hasNext()) {<br />
&nbsp;&nbsp;&nbsp;IAtom atom = itatoms.next();<br />
&nbsp;&nbsp;&nbsp;Point2d point2d = atom.getPoint2d();<br />
&nbsp;&nbsp;&nbsp;if (minX == UNDEF_POS || minX &gt; point2d.x)<br />
&nbsp;&nbsp;&nbsp;&nbsp;minX = point2d.x;<br />
&nbsp;&nbsp;&nbsp;if (minY == UNDEF_POS || minY &gt; point2d.y)<br />
&nbsp;&nbsp;&nbsp;&nbsp;minY = point2d.y;<br />
&nbsp;&nbsp;&nbsp;if (maxX == UNDEF_POS || maxX &lt; point2d.x)<br />
&nbsp;&nbsp;&nbsp;&nbsp;maxX = point2d.x;<br />
&nbsp;&nbsp;&nbsp;if (maxY == UNDEF_POS || maxY &lt; point2d.y)<br />
&nbsp;&nbsp;&nbsp;&nbsp;maxY = point2d.y;<br />
&nbsp;&nbsp;}<br />
&nbsp;&nbsp;double scaleX = length / (maxX - minX + 1);<br />
&nbsp;&nbsp;double scaleY = length / (maxY - minY + 1);<br />
&nbsp;&nbsp;double scale = scaleX &gt; scaleY ? scaleY : scaleX;<br />
&nbsp;&nbsp;double centreX = scale * (maxX + minX) / 2.;<br />
&nbsp;&nbsp;double centreY = scale * (maxY + minY) / 2.;<br />
&nbsp;&nbsp;double offsetX = length / 2. - centreX;<br />
&nbsp;&nbsp;double offsetY = length / 2. - centreY;<br />
&nbsp;&nbsp;itatoms = layedOutMol.atoms();<br />
&nbsp;&nbsp;while (itatoms.hasNext()) {<br />
&nbsp;&nbsp;&nbsp;IAtom atom = itatoms.next();<br />
&nbsp;&nbsp;&nbsp;Point2d a = atom.getPoint2d();<br />
&nbsp;&nbsp;&nbsp;Point2d b = new Point2d();<br />
&nbsp;&nbsp;&nbsp;b.x = a.x * scale + offsetX;<br />
&nbsp;&nbsp;&nbsp;b.y = a.y * scale + offsetY;<br />
&nbsp;&nbsp;&nbsp;atom.setPoint2d(b);<br />
&nbsp;&nbsp;}<br />
&nbsp;&nbsp;// set rendering properties<br />
&nbsp;&nbsp;Renderer2DModel r2dm = new Renderer2DModel();<br />
&nbsp;&nbsp;r2dm.setDrawNumbers(false);<br />
&nbsp;&nbsp;r2dm.setUseAntiAliasing(true);<br />
&nbsp;&nbsp;r2dm.setColorAtomsByType(true);<br />
&nbsp;&nbsp;r2dm.setShowAtomTypeNames(false);<br />
&nbsp;&nbsp;r2dm.setShowAromaticity(true);<br />
&nbsp;&nbsp;r2dm.setShowImplicitHydrogens(false);<br />
&nbsp;&nbsp;r2dm.setShowReactionBoxes(false);<br />
&nbsp;&nbsp;r2dm.setKekuleStructure(false);<br />
&nbsp;&nbsp;Dimension dim = new Dimension();<br />
&nbsp;&nbsp;dim.setSize(length, length);<br />
&nbsp;&nbsp;r2dm.setBackgroundDimension(dim);<br />
&nbsp;&nbsp;r2dm.setBackColor(java.awt.Color.WHITE);<br />
&nbsp;&nbsp;// render the image<br />
&nbsp;&nbsp;SimpleRenderer2D renderer = new SimpleRenderer2D();<br />
&nbsp;&nbsp;renderer.setRenderer2DModel(r2dm);<br />
&nbsp;&nbsp;BufferedImage bufferedImage = new BufferedImage(length, length,<br />
&nbsp;&nbsp;&nbsp;&nbsp;BufferedImage.TYPE_INT_RGB);<br />
&nbsp;&nbsp;Graphics2D graphics = bufferedImage.createGraphics();<br />
&nbsp;&nbsp;graphics.setPaint(java.awt.Color.WHITE);<br />
&nbsp;&nbsp;Rectangle2D.Float rectangle = new Rectangle2D.Float(0, 0, length, length);<br />
&nbsp;&nbsp;graphics.fill(rectangle);<br />
&nbsp;&nbsp;renderer.paintMolecule(layedOutMol, graphics);<br />
&nbsp;&nbsp;// write the image to response<br />
&nbsp;&nbsp;response.setContentType("image/png");<br />
&nbsp;&nbsp;OutputStream out = response.getOutputStream();<br />
&nbsp;&nbsp;try {<br />
&nbsp;&nbsp;&nbsp;javax.imageio.ImageIO.write(bufferedImage, "png", out);<br />
&nbsp;&nbsp;} finally {<br />
&nbsp;&nbsp;&nbsp;out.close();<br />
&nbsp;&nbsp;}<br />
&nbsp;}<br />
}<br />
</p><img src ="http://www.blogjava.net/rain1102/aggbug/299271.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2009-10-22 08:51 <a href="http://www.blogjava.net/rain1102/archive/2009/10/22/299271.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Substructure search with CDK</title><link>http://www.blogjava.net/rain1102/archive/2009/10/20/298919.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Tue, 20 Oct 2009 00:33:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2009/10/20/298919.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/298919.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2009/10/20/298919.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/298919.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/298919.html</trackback:ping><description><![CDATA[CDK supports substructure search from a SMILES-style query string through the class org.openscience.cdk.smiles.smarts.SMARTSQueryTool.<br />
<p>package com.founder.cdk;</p>
<p>import java.io.File;<br />
import java.io.FileNotFoundException;<br />
import java.io.FileReader;<br />
import java.util.ArrayList;<br />
import java.util.List;</p>
<p>import org.openscience.cdk.ChemFile;<br />
import org.openscience.cdk.ChemObject;<br />
import org.openscience.cdk.exception.CDKException;<br />
import org.openscience.cdk.interfaces.IAtomContainer;<br />
import org.openscience.cdk.io.MDLV2000Reader;<br />
import org.openscience.cdk.smiles.smarts.SMARTSQueryTool;<br />
import org.openscience.cdk.tools.manipulator.ChemFileManipulator;</p>
<p>public class SMARTSQueryToolTest {</p>
<p>&nbsp;static SMARTSQueryTool sqt;static {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; try {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sqt = new <span style="color: #008000">SMARTSQueryTool</span>("c2ccc1ccccc1c2");<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } catch (CDKException e) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; throw new ExceptionInInitializerError(e); // fail fast instead of leaving sqt null<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br />
&nbsp;&nbsp;&nbsp; }</p>
<p>&nbsp;/**<br />
&nbsp; * @param args<br />
&nbsp; */<br />
&nbsp;public static void main(String[] args) {<br />
&nbsp;&nbsp;String filename = "H:\\molecules.sdf";<br />
&nbsp;&nbsp;try {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MDLV2000Reader reader = new MDLV2000Reader(new FileReader(new File(filename)));<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ChemFile chemFile = (ChemFile) reader.read((ChemObject) new ChemFile());<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; List&lt;IAtomContainer&gt; containersList = ChemFileManipulator.getAllAtomContainers(chemFile);<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; List&lt;IAtomContainer&gt; substructureList = new ArrayList&lt;IAtomContainer&gt;();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span style="color: #008000">sqt.setSmarts("c1ccc3c(c1)ccc4c2ccccc2ccc34");&nbsp;</span> // reset the query pattern to match against<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;boolean matched = false;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for (IAtomContainer molecule : containersList) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span style="color: #008000">matched = sqt.matches(molecule);</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (matched){<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;substructureList.add(molecule);<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } <br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; System.out.println(substructureList.size());<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for (IAtomContainer molecule : substructureList) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(molecule.getProperty("ID"));<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } catch (CDKException e) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; e.printStackTrace();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } catch (FileNotFoundException e) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;e.printStackTrace();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}</p>
<p>&nbsp;}</p>
<p>}<br />
<br />
In testing, the matches method proved slow: a single structure typically takes about 200 ms to 1000 ms.<br />
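One common way to cut down the number of slow matches calls is a fingerprint prescreen: a target can contain the query as a substructure only if every bit set in the query fingerprint is also set in the target fingerprint. The following is a minimal sketch of that subset test using only java.util.BitSet; the class name FingerprintScreenSketch and the method name mayContain are hypothetical, and the sketch assumes both fingerprints come from the same path-based fingerprinter.

```java
import java.util.BitSet;

final class FingerprintScreenSketch {

    // Returns true when every bit set in queryFp is also set in targetFp.
    // A false result rules the target out; a true result still needs the full match.
    static boolean mayContain(BitSet targetFp, BitSet queryFp) {
        BitSet missing = (BitSet) queryFp.clone();
        missing.andNot(targetFp);   // bits the query has but the target lacks
        return missing.isEmpty();
    }

    public static void main(String[] args) {
        BitSet query = new BitSet();
        query.set(3);
        query.set(7);

        BitSet candidate = new BitSet();
        candidate.set(3);
        candidate.set(7);
        candidate.set(11);

        BitSet rejected = new BitSet();
        rejected.set(3);

        System.out.println(mayContain(candidate, query)); // true: run the full match
        System.out.println(mayContain(rejected, query));  // false: skip the slow match
    }
}
```

The screen can only produce false positives, never false negatives, so structures that pass it still go through sqt.matches; the saving comes from the structures that are rejected cheaply.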
</p><img src ="http://www.blogjava.net/rain1102/aggbug/298919.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2009-10-20 08:33 <a href="http://www.blogjava.net/rain1102/archive/2009/10/20/298919.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Parsing SDF files with CDK</title><link>http://www.blogjava.net/rain1102/archive/2009/10/19/298802.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Mon, 19 Oct 2009 01:45:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2009/10/19/298802.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/298802.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2009/10/19/298802.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/298802.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/298802.html</trackback:ping><description><![CDATA[<p>package com.founder.cdk;</p>
<p>import java.io.File;<br />
import java.io.FileNotFoundException;<br />
import java.io.FileReader;<br />
import java.util.List;</p>
<p>import org.openscience.cdk.ChemFile;<br />
import org.openscience.cdk.ChemObject;<br />
import org.openscience.cdk.Molecule;<br />
import org.openscience.cdk.exception.CDKException;<br />
import org.openscience.cdk.interfaces.IAtomContainer;<br />
import org.openscience.cdk.io.MDLReader;<br />
import org.openscience.cdk.io.MDLV2000Reader;<br />
import org.openscience.cdk.tools.manipulator.ChemFileManipulator;</p>
<p>public class ReadSDFTest {</p>
<p>&nbsp;/**<br />
&nbsp; * @param args<br />
&nbsp; * @throws CDKException <br />
&nbsp; * @throws FileNotFoundException <br />
&nbsp; */<br />
&nbsp;public static void main(String[] args) throws CDKException, FileNotFoundException {<br />
&nbsp;&nbsp;String filename = "H:\\molecules.sdf";<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br />
<span style="color: #008000">//&nbsp;&nbsp;InputStream ins = ReadSDFTest.class.getClassLoader().getResourceAsStream(filename);<br />
//&nbsp;&nbsp;MDLReader reader = new MDLReader(ins);</span></p>
<p>&nbsp;&nbsp; //alternatively, you can specify a file directly<br />
&nbsp;&nbsp; <span style="color: #008000">MDLV2000Reader reader = new MDLV2000Reader(new FileReader(new File(filename)));</span></p>
<p>&nbsp;&nbsp;<span style="color: #008000">ChemFile chemFile = (ChemFile)reader.read((ChemObject)new ChemFile());<br />
&nbsp;&nbsp;<br />
&nbsp;&nbsp;List&lt;IAtomContainer&gt; containersList = ChemFileManipulator.getAllAtomContainers(chemFile);<br />
</span>&nbsp;&nbsp;<br />
&nbsp;&nbsp;Molecule molecule = null;<br />
&nbsp;&nbsp;for (IAtomContainer mol : containersList) {<br />
&nbsp;&nbsp;&nbsp;molecule = (Molecule) mol;<br />
&nbsp;&nbsp;&nbsp;System.out.println(molecule.getProperties());<br />
&nbsp;&nbsp;&nbsp;System.out.println(molecule.getProperty("CD_MOLWEIGHT"));<br />
//&nbsp;&nbsp;&nbsp;Fingerprinter fp = new Fingerprinter();<br />
//&nbsp;&nbsp;&nbsp;BitSet bt = fp.getFingerprint(molecule);<br />
//&nbsp;&nbsp;&nbsp;System.out.println(bt);<br />
&nbsp;&nbsp;}<br />
&nbsp;}</p>
<p>}<br />
</p><img src ="http://www.blogjava.net/rain1102/aggbug/298802.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2009-10-19 09:45 <a href="http://www.blogjava.net/rain1102/archive/2009/10/19/298802.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>使用CDK进行相似度搜索</title><link>http://www.blogjava.net/rain1102/archive/2009/10/19/298801.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Mon, 19 Oct 2009 01:37:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2009/10/19/298801.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/298801.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2009/10/19/298801.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/298801.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/298801.html</trackback:ping><description><![CDATA[<p>package com.founder.cdk;</p>
<p>import java.io.StringReader;<br />
import java.sql.Connection;<br />
import java.sql.ResultSet;<br />
import java.sql.SQLException;<br />
import java.util.ArrayList;<br />
import java.util.BitSet;<br />
import java.util.List;</p>
<p>import org.openscience.cdk.Molecule;<br />
import org.openscience.cdk.exception.CDKException;<br />
import org.openscience.cdk.fingerprint.Fingerprinter;<br />
import org.openscience.cdk.io.MDLReader;<br />
import org.openscience.cdk.similarity.Tanimoto;</p>
<p>public class CDKTest {</p>
<p>&nbsp;/**<br />
&nbsp; * @param args<br />
&nbsp; */<br />
&nbsp;public static void main(String[] args) {<br />
&nbsp;&nbsp;<br />
&nbsp;&nbsp;// MySQL<br />
&nbsp;&nbsp;long t1 = System.currentTimeMillis();<br />
&nbsp;&nbsp;try {<br />
&nbsp;&nbsp;&nbsp;Class.forName("com.mysql.jdbc.Driver").newInstance();<br />
&nbsp;&nbsp;&nbsp;Connection con = java.sql.DriverManager<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.getConnection(<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"jdbc:mysql://localhost/coocoo?useUnicode=true&amp;characterEncoding=utf-8&amp;zeroDateTimeBehavior=convertToNull",<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"root", "root");<br />
&nbsp;&nbsp;&nbsp;</p>
<p>&nbsp;&nbsp;&nbsp;ResultSet results = null;<br />
&nbsp;&nbsp;&nbsp;String querySQL = "select id, structure from structure ";<br />
&nbsp;&nbsp;&nbsp;<br />
&nbsp;&nbsp;&nbsp;results = con.createStatement().executeQuery(querySQL);<br />
&nbsp;<br />
&nbsp;&nbsp;&nbsp;// dump out the results</p>
<p>&nbsp;&nbsp;&nbsp;List&lt;Molecule&gt; list = new ArrayList&lt;Molecule&gt;();<br />
&nbsp;&nbsp;&nbsp;Fingerprinter fp = new Fingerprinter();<br />
&nbsp;&nbsp;&nbsp;BitSet bt = null;<br />
&nbsp;&nbsp;&nbsp;while (results.next()) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;Long id = results.getLong("id");<br />
&nbsp;&nbsp;&nbsp;&nbsp;<br />
&nbsp;&nbsp;&nbsp;&nbsp;// build a molecule object from the structure data<br />
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #008000">StringReader mdl = new StringReader(results.getString("structure"));<br />
&nbsp;&nbsp;&nbsp;&nbsp;MDLReader cdkMDL = new MDLReader(mdl);<br />
&nbsp;&nbsp;&nbsp;&nbsp;Molecule molecule = new Molecule();<br />
&nbsp;&nbsp;&nbsp;&nbsp;cdkMDL.read(molecule);</span><br />
&nbsp;&nbsp;&nbsp;&nbsp;if (id == 1220) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;bt = fp.getFingerprint(molecule);<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;list.add(molecule);<br />
&nbsp;&nbsp;&nbsp;&nbsp;<br />
&nbsp;&nbsp;&nbsp;}&nbsp;<br />
&nbsp;&nbsp;&nbsp;System.out.println("size:=" + list.size());<br />
&nbsp;&nbsp;&nbsp;<br />
&nbsp;&nbsp;&nbsp;List&lt;Molecule&gt; resultList = new ArrayList&lt;Molecule&gt;();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; long t2 = System.currentTimeMillis();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; System.out.println("Thread: collection data in " + (t2 - t1) + " ms.");<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for (Molecule molecule : list) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; try {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span style="color: #008000">float coefficient = Tanimoto.calculate(fp.getFingerprint(molecule), bt);&nbsp; // compute the similarity<br />
</span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (coefficient &gt; 0.9) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;resultList.add(molecule);<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br />
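&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } catch (CDKException e) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // skip molecules whose fingerprints cannot be compared<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br />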
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; long t3 = System.currentTimeMillis();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; System.out.println(resultList.size());<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; System.out.println("Thread: Search in " + (t3 - t2) + " ms.");<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br />
&nbsp;&nbsp;&nbsp;con.close();<br />
&nbsp;&nbsp;} catch (InstantiationException e) {<br />
&nbsp;&nbsp;&nbsp;e.printStackTrace();<br />
&nbsp;&nbsp;} catch (IllegalAccessException e) {<br />
&nbsp;&nbsp;&nbsp;e.printStackTrace();<br />
&nbsp;&nbsp;} catch (ClassNotFoundException e) {<br />
&nbsp;&nbsp;&nbsp;e.printStackTrace();<br />
&nbsp;&nbsp;} catch (SQLException e) {<br />
&nbsp;&nbsp;&nbsp;e.printStackTrace();<br />
&nbsp;&nbsp;} catch (CDKException e) {<br />
&nbsp;&nbsp;&nbsp;e.printStackTrace();<br />
&nbsp;&nbsp;} <br />
&nbsp;&nbsp;long t4 = System.currentTimeMillis();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; System.out.println("Thread: all in " + (t4 - t1) + " ms.");<br />
&nbsp;}</p>
<p>}<br />
</p><img src ="http://www.blogjava.net/rain1102/aggbug/298801.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2009-10-19 09:37 <a href="http://www.blogjava.net/rain1102/archive/2009/10/19/298801.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>Faster Fingerprint Search with Java &amp; CDK</title><link>http://www.blogjava.net/rain1102/archive/2009/10/18/298745.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Sun, 18 Oct 2009 06:09:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2009/10/18/298745.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/298745.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2009/10/18/298745.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/298745.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/298745.html</trackback:ping><description><![CDATA[Original article: http://chemhack.com/cn/2008/11/faster-fingerprint-search-with-java-cdk/<br />
<p><a href="http://depth-first.com/" jquery1255846148312="3">Rich Apodaca</a>&nbsp;wrote a great series of posts named <em>Fast Substructure Search Using Open Source Tools</em>, providing details on substructure search with MySQL. However, MySQL's weak functions for operating on binary data limit the implementation of similar-structure search, which typically depends on calculating the Tanimoto coefficient. We are going to use Java &amp; CDK to add this feature.</p>
<p>The default output of a CDK fingerprint is a <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/util/BitSet.html" jquery1255846148312="4">java.util.BitSet</a>, which implements the&nbsp;<a title="interface in java.io" href="http://java.sun.com/j2se/1.5.0/docs/api/java/io/Serializable.html" jquery1255846148312="5">Serializable</a>&nbsp;interface, so it is a good data format for storing fingerprints. Java itself provides several collection classes in the package java.util, such as <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/util/ArrayList.html" jquery1255846148312="6">ArrayList</a>, <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/util/LinkedList.html" jquery1255846148312="7">LinkedList</a> and <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/util/Vector.html" jquery1255846148312="8">Vector</a>. To provide web access to the search engine, the thread-unsafe ArrayList and LinkedList have to be ruled out. How about Vector? Once all the fingerprint data has been prepared, the only collection operation a similarity search needs is iteration. No add, no delete. So a lightweight array is enough.</p>
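<p>Because BitSet is Serializable, a fingerprint can be turned into a byte array (for example, to put in a BLOB column) and restored later. The following is only a minimal sketch using standard Java serialization; the class name BitSetBytes and its two helper methods are made up for illustration and are not part of CDK.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.BitSet;

// Hypothetical helper, not part of CDK: round-trips a BitSet through a byte array.
public class BitSetBytes {

    /** Serializes the BitSet with standard Java serialization. */
    public static byte[] toBytes(BitSet bits) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(bits);
        }
        // The stream is closed (and flushed) before the bytes are read back.
        return buffer.toByteArray();
    }

    /** Restores a BitSet previously written by toBytes. */
    public static BitSet fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (BitSet) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        BitSet fingerprint = new BitSet(1024);
        fingerprint.set(3);
        fingerprint.set(512);

        BitSet restored = fromBytes(toBytes(fingerprint));
        System.out.println(restored.equals(fingerprint)); // prints "true"
    }
}
```

<p>The byte array can be written with an ordinary JDBC PreparedStatement.setBytes call; whether Java serialization or a more compact custom encoding is the better choice depends on the application.</p>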
<p>Most of the molecule information is stored in a MySQL database, so we map each fingerprint to the corresponding row in the data table. Here is the MolDFData class; a long field stores the primary key of the corresponding row.</p>
<pre lang="java"><span style="color: #008000">public class MolDFData implements Serializable {<br />
&nbsp;&nbsp;&nbsp;&nbsp;private long id;<br />
&nbsp;&nbsp;&nbsp;private BitSet fingerprint;<br />
&nbsp;&nbsp;&nbsp;&nbsp;public MolDFData(long id, BitSet fingerprint) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.id = id;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.fingerprint = fingerprint;<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;public long getId() {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return id;<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;public void setId(long id) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.id = id;<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;public BitSet getFingerprint() {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return fingerprint;<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;public void setFingerprint(BitSet fingerprint) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.fingerprint = fingerprint;<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
}</span></pre>
<p>This is how we store our fingerprints.</p>
<pre lang="java">private MolDFData[] arrayData;</pre>
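<p>A plain array works for concurrent web requests as long as it is filled once and never modified afterwards. The sketch below illustrates that idea under this assumption; SharedFingerprintStore and countWithBit are hypothetical names chosen for the example, and the store holds bare BitSet objects rather than the class above to stay self-contained.</p>

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: an array filled once, then only read by many threads.
public class SharedFingerprintStore {

    // Final field assigned in the constructor; the array is never modified later.
    private final BitSet[] fingerprints;

    public SharedFingerprintStore(List<BitSet> loaded) {
        fingerprints = new BitSet[loaded.size()];
        for (int i = 0; i < fingerprints.length; i++) {
            // Defensive copy so later changes by the caller cannot leak in.
            fingerprints[i] = (BitSet) loaded.get(i).clone();
        }
    }

    /** Read-only scan: counts the fingerprints that have the given bit set. */
    public int countWithBit(int bit) {
        int count = 0;
        for (BitSet fp : fingerprints) {
            if (fp.get(bit)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        List<BitSet> loaded = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            BitSet fp = new BitSet(1024);
            fp.set(i % 10);
            loaded.add(fp);
        }
        final SharedFingerprintStore store = new SharedFingerprintStore(loaded);

        // Four threads scan the same array at the same time without locking.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Integer>> answers = new ArrayList<>();
        for (int t = 0; t < 4; t++) {
            answers.add(pool.submit(() -> store.countWithBit(3)));
        }
        for (Future<Integer> answer : answers) {
            System.out.println(answer.get()); // each task reports 100
        }
        pool.shutdown();
    }
}
```

<p>If fingerprints ever need to be added or removed while searches are running, this no-locking approach is no longer enough and some form of synchronization or copy-on-write replacement of the whole array would be needed.</p>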
<p>The similarity search itself is no big deal. Just calculate the&nbsp;Tanimoto&nbsp;coefficient; if it is greater than the minimum similarity you set, add the entry to the result.</p>
<pre lang="java"><span style="color: #008000">public List&lt;SearchResultData&gt; searchTanimoto(BitSet bt, float minSimilarity) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;List&lt;SearchResultData&gt; resultList = new LinkedList&lt;SearchResultData&gt;();<br />
&nbsp;&nbsp;&nbsp;&nbsp;for (int i = 0; i &lt; arrayData.length; i++) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;MolDFData aListData = arrayData[i];<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;float coefficient = Tanimoto.calculate(aListData.getFingerprint(), bt);<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if (coefficient &gt; minSimilarity) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;resultList.add(new SearchResultData(aListData.getId(), coefficient));<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} catch (CDKException e) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// skip fingerprints whose length differs from the query<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;// sort once, after the scan (SearchResultData must be Comparable)<br />
&nbsp;&nbsp;&nbsp;&nbsp;Collections.sort(resultList);<br />
&nbsp;&nbsp;&nbsp;&nbsp;return resultList;<br />
}</span></pre>
<p>Pretty&nbsp;ugly&nbsp;code? Maybe. But it really works, at an acceptable speed.</p>
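<p>For readers without CDK at hand, the linear scan can be sketched with java.util.BitSet alone. This is an illustrative sketch, not the original author's code: the class TanimotoSearchSketch, the Hit holder and the bits helper are hypothetical names, and the coefficient is computed as common bits / (bits in a + bits in b - common bits). Returning 0 for two empty fingerprints is an assumption of this sketch.</p>

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;

// Hypothetical, CDK-free sketch of a linear Tanimoto similarity scan.
public class TanimotoSearchSketch {

    /** Hypothetical result holder: row id plus similarity, ordered best-first. */
    public static final class Hit implements Comparable<Hit> {
        public final long id;
        public final float similarity;

        public Hit(long id, float similarity) {
            this.id = id;
            this.similarity = similarity;
        }

        @Override
        public int compareTo(Hit other) {
            // Descending order: higher similarity sorts first.
            return Float.compare(other.similarity, similarity);
        }
    }

    /** Tanimoto coefficient: common / (|a| + |b| - common). */
    public static float tanimoto(BitSet a, BitSet b) {
        BitSet common = (BitSet) a.clone();
        common.and(b);
        float c = common.cardinality();
        float union = a.cardinality() + b.cardinality() - c;
        // Assumption of this sketch: two empty fingerprints score 0.
        return union == 0 ? 0f : c / union;
    }

    /** Scans every fingerprint and keeps those above the threshold. */
    public static List<Hit> search(long[] ids, BitSet[] fingerprints,
                                   BitSet query, float minSimilarity) {
        List<Hit> hits = new ArrayList<Hit>();
        for (int i = 0; i < fingerprints.length; i++) {
            float coefficient = tanimoto(fingerprints[i], query);
            if (coefficient > minSimilarity) {
                hits.add(new Hit(ids[i], coefficient));
            }
        }
        Collections.sort(hits); // sort once, after the scan
        return hits;
    }

    /** Small helper to build a BitSet from bit positions. */
    public static BitSet bits(int... positions) {
        BitSet set = new BitSet(16);
        for (int p : positions) {
            set.set(p);
        }
        return set;
    }

    public static void main(String[] args) {
        long[] ids = {10L, 11L, 12L};
        BitSet[] db = {bits(1, 2, 3, 4), bits(1, 2, 3), bits(7, 8)};
        BitSet query = bits(1, 2, 3, 4);
        for (Hit hit : search(ids, db, query, 0.5f)) {
            System.out.println(hit.id + " " + hit.similarity);
        }
        // prints "10 1.0" then "11 0.75"; id 12 shares no bits and is dropped
    }
}
```

<p>In the example, the second fingerprint shares 3 bits with the query and the two together set 4 distinct bits, so its coefficient is 3 / (3 + 4 - 3) = 0.75.</p>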
<p>Tests were done using the code below on a MacBook (Intel Core Duo 1.83 GHz, 2 GB RAM).</p>
<pre lang="java">long t3 = System.currentTimeMillis();
List&lt;SearchResultData&gt; listResult = se.searchTanimoto(bs, 0.8f);
long t4 = System.currentTimeMillis();
System.out.println("Thread: Search done in " + (t4 - t3) + " ms.");</pre>
<p>In my database of&nbsp;87364&nbsp;commercial&nbsp;compounds, it takes 335 ms.</p><img src ="http://www.blogjava.net/rain1102/aggbug/298745.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2009-10-18 14:09 <a href="http://www.blogjava.net/rain1102/archive/2009/10/18/298745.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>CDK中的相似度搜索</title><link>http://www.blogjava.net/rain1102/archive/2009/10/18/298744.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Sun, 18 Oct 2009 05:36:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2009/10/18/298744.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/298744.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2009/10/18/298744.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/298744.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/298744.html</trackback:ping><description><![CDATA[<p>/*&nbsp; $RCSfile$<br />
&nbsp;*&nbsp; $Author$<br />
&nbsp;*&nbsp; $Date$<br />
&nbsp;*&nbsp; $Revision$<br />
&nbsp;*<br />
&nbsp;*&nbsp; Copyright (C) 1997-2007&nbsp; The Chemistry Development Kit (CDK) project<br />
&nbsp;*<br />
&nbsp;*&nbsp; Contact: cdk-devel@lists.sourceforge.net<br />
&nbsp;*<br />
&nbsp;*&nbsp; This program is free software; you can redistribute it and/or<br />
&nbsp;*&nbsp; modify it under the terms of the GNU Lesser General Public License<br />
&nbsp;*&nbsp; as published by the Free Software Foundation; either version 2.1<br />
&nbsp;*&nbsp; of the License, or (at your option) any later version.<br />
&nbsp;*&nbsp; All we ask is that proper credit is given for our work, which includes<br />
&nbsp;*&nbsp; - but is not limited to - adding the above copyright notice to the beginning<br />
&nbsp;*&nbsp; of your source code files, and to any copyright notice that you may distribute<br />
&nbsp;*&nbsp; with programs based on this work.<br />
&nbsp;*<br />
&nbsp;*&nbsp; This program is distributed in the hope that it will be useful,<br />
&nbsp;*&nbsp; but WITHOUT ANY WARRANTY; without even the implied warranty of<br />
&nbsp;*&nbsp; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.&nbsp; See the<br />
&nbsp;*&nbsp; GNU Lesser General Public License for more details.<br />
&nbsp;*<br />
&nbsp;*&nbsp; You should have received a copy of the GNU Lesser General Public License<br />
&nbsp;*&nbsp; along with this program; if not, write to the Free Software<br />
&nbsp;*&nbsp; Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.<br />
&nbsp;*<br />
&nbsp;*/<br />
package org.openscience.cdk.similarity;</p>
<p><br />
import org.openscience.cdk.annotations.TestClass;<br />
import org.openscience.cdk.annotations.TestMethod;<br />
import org.openscience.cdk.exception.CDKException;</p>
<p>import java.util.BitSet;</p>
<p>/**<br />
&nbsp;*&nbsp; Calculates the Tanimoto coefficient for a given pair of two <br />
&nbsp;*&nbsp; fingerprint bitsets or real valued feature vectors.<br />
&nbsp;*<br />
&nbsp;*&nbsp; The Tanimoto coefficient is one way to <br />
&nbsp;*&nbsp; quantitatively measure the "distance" or similarity of <br />
&nbsp;*&nbsp; two chemical structures. <br />
&nbsp;*<br />
&nbsp;*&nbsp; &lt;p&gt;You can use the FingerPrinter class to retrieve two fingerprint bitsets.<br />
&nbsp;*&nbsp; We assume that you have two structures stored in cdk.Molecule objects.<br />
&nbsp;*&nbsp; A tanimoto coefficient can then be calculated like:<br />
&nbsp;*&nbsp; &lt;pre&gt;<br />
&nbsp;*&nbsp;&nbsp; BitSet fingerprint1 = Fingerprinter.getFingerprint(molecule1);<br />
&nbsp;*&nbsp;&nbsp; BitSet fingerprint2 = Fingerprinter.getFingerprint(molecule2);<br />
&nbsp;*&nbsp;&nbsp; float tanimoto_coefficient = Tanimoto.calculate(fingerprint1, fingerprint2);<br />
&nbsp;*&nbsp; &lt;/pre&gt;<br />
&nbsp;*<br />
&nbsp;*&nbsp; &lt;p&gt;The FingerPrinter assumes that hydrogens are explicitely given, if this <br />
&nbsp;*&nbsp; is desired! <br />
&nbsp;*&nbsp; &lt;p&gt;Note that the continuous Tanimoto coefficient does not lead to a metric space<br />
&nbsp;*<br />
&nbsp;*@author&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; steinbeck<br />
&nbsp;* @cdk.githash<br />
&nbsp;*@cdk.created&nbsp;&nbsp;&nbsp; 2005-10-19<br />
&nbsp;*@cdk.keyword&nbsp;&nbsp;&nbsp; jaccard<br />
&nbsp;*@cdk.keyword&nbsp;&nbsp;&nbsp; similarity, tanimoto<br />
&nbsp;* @cdk.module fingerprint<br />
&nbsp;*/<br />
@TestClass("org.openscience.cdk.similarity.TanimotoTest")<br />
public class Tanimoto <br />
{</p>
<p>&nbsp;&nbsp;&nbsp; /**<br />
&nbsp;&nbsp;&nbsp;&nbsp; * Evaluates Tanimoto coefficient for two bit sets.<br />
&nbsp;&nbsp;&nbsp;&nbsp; *<br />
&nbsp;&nbsp;&nbsp;&nbsp; * @param bitset1 A bitset (such as a fingerprint) for the first molecule<br />
&nbsp;&nbsp;&nbsp;&nbsp; * @param bitset2 A bitset (such as a fingerprint) for the second molecule<br />
&nbsp;&nbsp;&nbsp;&nbsp; * @return The Tanimoto coefficient<br />
&nbsp;&nbsp;&nbsp;&nbsp; * @throws org.openscience.cdk.exception.CDKException&nbsp; if bitsets are not of the same length<br />
&nbsp;&nbsp;&nbsp;&nbsp; */<br />
&nbsp;&nbsp;&nbsp; @TestMethod("testTanimoto1,testTanimoto2")<br />
&nbsp;&nbsp;&nbsp; public static float <span style="color: red">calculate</span>(BitSet bitset1, BitSet bitset2) throws CDKException<br />
&nbsp;&nbsp;&nbsp; {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; float _bitset1_cardinality = bitset1.cardinality();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; float _bitset2_cardinality = bitset2.cardinality();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (bitset1.size() != bitset2.size()) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; throw new CDKException("Bisets must have the same bit length");<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; BitSet one_and_two = (BitSet)bitset1.clone();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; one_and_two.and(bitset2);<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; float _common_bit_count = one_and_two.cardinality();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return _common_bit_count/(_bitset1_cardinality + _bitset2_cardinality - _common_bit_count);<br />
&nbsp;&nbsp;&nbsp; }<br />
&nbsp;&nbsp;&nbsp; <br />
&nbsp;&nbsp;&nbsp; /**<br />
&nbsp;&nbsp;&nbsp;&nbsp; * Evaluates the continuous Tanimoto coefficient for two real valued vectors.<br />
&nbsp;&nbsp;&nbsp;&nbsp; *<br />
&nbsp;&nbsp;&nbsp;&nbsp; * @param features1 The first feature vector<br />
&nbsp;&nbsp;&nbsp;&nbsp; * @param features2 The second feature vector<br />
&nbsp;&nbsp;&nbsp;&nbsp; * @return The continuous Tanimoto coefficient<br />
&nbsp;&nbsp;&nbsp;&nbsp; * @throws org.openscience.cdk.exception.CDKException&nbsp; if the features are not of the same length<br />
&nbsp;&nbsp;&nbsp;&nbsp; */<br />
&nbsp;&nbsp;&nbsp; @TestMethod("testTanimoto3")<br />
&nbsp;&nbsp;&nbsp; public static float <span style="color: red">calculate</span>(double[] features1, double[] features2) throws CDKException {</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (features1.length != features2.length) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; throw new CDKException("Features vectors must be of the same length");<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int n = features1.length;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; double ab = 0.0;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; double a2 = 0.0;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; double b2 = 0.0;</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for (int i = 0; i &lt; n; i++) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ab += features1[i] * features2[i];<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; a2 += features1[i]*features1[i];<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; b2 += features2[i]*features2[i];<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return (float)ab/(float)(a2+b2-ab);<br />
&nbsp;&nbsp;&nbsp; }<br />
}<br />
<br />
The source shows that the <span style="color: red">calculate</span>(BitSet bitset1, BitSet bitset2) method computes similarity by comparing the bits of the two molecules' fingerprints: a BitSet and operation yields the number of bits set in both fingerprints, which is then divided by the number of bits set in either one (the two cardinalities added together, minus the common count) to give the similarity value.</p><img src ="http://www.blogjava.net/rain1102/aggbug/298744.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2009-10-18 13:36 <a href="http://www.blogjava.net/rain1102/archive/2009/10/18/298744.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>JChemPaint (画2D化学结构的Java程序)</title><link>http://www.blogjava.net/rain1102/archive/2009/10/17/298709.html</link><dc:creator>周锐</dc:creator><author>周锐</author><pubDate>Sat, 17 Oct 2009 13:53:00 GMT</pubDate><guid>http://www.blogjava.net/rain1102/archive/2009/10/17/298709.html</guid><wfw:comment>http://www.blogjava.net/rain1102/comments/298709.html</wfw:comment><comments>http://www.blogjava.net/rain1102/archive/2009/10/17/298709.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/rain1102/comments/commentRss/298709.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/rain1102/services/trackbacks/298709.html</trackback:ping><description><![CDATA[<p>JChemPaint (or JCP for short here) is the editor and viewer included in <a title="Main Page" href="http://sourceforge.net/apps/mediawiki/cdk/index.php?title=Main_Page" cmimpressionsent="1">CDK</a> for 2D chemical structures. It is implemented in several forms: a Java application and two varieties of Java applet. </p>
<p>JChemPaint was started by <a class="external text" title="http://www.ebi.ac.uk/steinbeck" href="http://www.ebi.ac.uk/steinbeck" rel="nofollow" cmimpressionsent="1">Christoph Steinbeck</a> in the late 1990's to be the complementary structure editor to <a class="external text" title="http://www.jmol.org/" href="http://www.jmol.org/" rel="nofollow" cmimpressionsent="1">Jmol</a>. It was then co-developed by <a class="external text" title="http://chem-bla-ics.blogspot.com" href="http://chem-bla-ics.blogspot.com/" rel="nofollow" cmimpressionsent="1">Egon Willighagen</a> and others. Jmol again is a visualisation and analysis tool for 3D molecular structures, started by <a class="external text" title="http://www.nd.edu/~gezelter/" href="http://www.nd.edu/~gezelter/" rel="nofollow" cmimpressionsent="1">Dan Gezelter at Notre Dame University</a>, initiator of the <a class="external text" title="http://www.openscience.org/" href="http://www.openscience.org/" rel="nofollow" cmimpressionsent="1">Open Science Project</a> and, like JChemPaint, developed by an international team of opensource programmers. </p>
<p>JChemPaint differs from other 2D editors in several respects: </p>
<ul>
    <li>JChemPaint is <a class="external text" title="http://en.wikipedia.org/wiki/Open_source" href="http://en.wikipedia.org/wiki/Open_source" rel="nofollow" cmimpressionsent="1">open source</a> and <a class="external text" title="http://www.gnu.org/free-sw.html" href="http://www.gnu.org/free-sw.html" rel="nofollow" cmimpressionsent="1">free software</a>. We believe that scientific software, especially when its development was publicly funded, should be free. As the GNU people put it: &#171;`Free software&#180; is a matter of liberty, not price. To understand the concept, you should think of `free speech&#180;, not `free beer&#180;&#187;. Everyone can participate in the development of the program. Everyone can download and change the source code, provided that they make the changes publicly available again, according to the <a class="external text" title="http://www.gnu.org/licenses/lgpl.html" href="http://www.gnu.org/licenses/lgpl.html" rel="nofollow" cmimpressionsent="1">GNU Lesser General Public License, LGPL</a>. This ensures that the community can take advantage of any bugfix or enhancement made to the system. It also ensures that a scientist, who needs a standard piece of software like a structure editor as a helper application in his/her new program, does not have to reinvent the wheel over and over again because all the structure editors that have been written before are now proprietary software. If there is a free structure editor, he/she can focus on the real science.
    <li>JChemPaint is in constant development and <strong>you</strong> can help (see below).
    <li>Since JChemPaint is written in Java, it runs on any computing platform and operating system for which a Java Virtual Machine (of version &gt;= 1.3 up to JCP 2.4 and version &gt;= 1.5 for JCP &gt; 2.4) has been implemented (like Linux, Windows, Solaris, AIX and others).
    <li>JChemPaint is available free of charge.
    <li>JChemPaint is translated into several languages: Dutch, French, German, Polish, Portuguese and Spanish. </li>
</ul>
Embedding the editor applet in a JSP page:<br />
<br />
&lt;applet code="org.openscience.jchempaint.applet.JChemPaintEditorApplet" <br />
archive="jchempaint-applet-core.jar" name="Editor"<br />
width="550" height="400"&gt;<br />
&lt;/applet&gt;<br />
<img height="399" alt="" src="http://www.blogjava.net/images/blogjava_net/rain1102/34503/r_jchempaint.JPG" width="548" border="0" /><br />
Embedding the viewer applet in a JSP page:<br />
&lt;applet code="org.openscience.jchempaint.applet.JChemPaintViewerApplet" <br />
archive="jchempaint-applet-core.jar" <br />
width="550" height="400"&gt;<br />
&lt;/applet&gt;<br />
<br />
<span class="mw-headline">Applet methods:<br />
<h3><span class="mw-headline">Reading from the applet </span></h3>
<ul>
    <li>getMolFile()
    <li>getSmiles()
    <li>getSmilesChiral()
    <li>getParameter()
    <li>getParameterInfo()
    <li>getAppletInfo()
    <li>getLocale()
    <li>getImage()
    <li>getTheJcpp() </li>
</ul>
<h3><span class="mw-headline">Writing to the applet </span></h3>
<ul>
    <li>setMolFile()
    <li>setMolFileWithReplace()
    <li>addMolFileWithReplace()
    <li>loadModelFromUrl()
    <li>loadModelFromSmiles()
    <li>clear()
    <li>selectAtom()
    <li>init()
    <li>start()
    <li>stop()
    <li>initPanelAndModel()
    <li>setTheJcpp()
    <li>setTheModel() </li>
</ul>
</span><img src ="http://www.blogjava.net/rain1102/aggbug/298709.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/rain1102/" target="_blank">周锐</a> 2009-10-17 21:53 <a href="http://www.blogjava.net/rain1102/archive/2009/10/17/298709.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item></channel></rss>