﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava-自已的天空-随笔分类-JAVA基础篇</title><link>http://www.blogjava.net/bluelily22/category/3966.html</link><description /><language>zh-cn</language><lastBuildDate>Wed, 28 Feb 2007 03:24:22 GMT</lastBuildDate><pubDate>Wed, 28 Feb 2007 03:24:22 GMT</pubDate><ttl>60</ttl><item><title>读取hibernate配制文件修改连接的ip地址</title><link>http://www.blogjava.net/bluelily22/archive/2006/03/13/35016.html</link><dc:creator>丁丁</dc:creator><author>丁丁</author><pubDate>Mon, 13 Mar 2006 05:59:00 GMT</pubDate><guid>http://www.blogjava.net/bluelily22/archive/2006/03/13/35016.html</guid><wfw:comment>http://www.blogjava.net/bluelily22/comments/35016.html</wfw:comment><comments>http://www.blogjava.net/bluelily22/archive/2006/03/13/35016.html#Feedback</comments><slash:comments>2</slash:comments><wfw:commentRss>http://www.blogjava.net/bluelily22/comments/commentRss/35016.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/bluelily22/services/trackbacks/35016.html</trackback:ping><description><![CDATA[<P>把这个类放到和hibernate.cfg.xml一个目录下，编译执行，注意把需要的包(dom4j)引进去</P>
<P>That is essentially all there is to manipulating XML with dom4j; read through the code and you will see it is quite simple.</P>
<P>import java.io.File;<BR>import java.io.FileOutputStream;<BR>import java.util.Iterator;<BR>import java.util.List;</P>
<P>import org.dom4j.Document;<BR>import org.dom4j.Element;<BR>import org.dom4j.Node;<BR>import org.dom4j.io.SAXReader;<BR>import org.dom4j.io.XMLWriter;</P>
<P>public class HiberCFG {</P>
<P>&nbsp;/**<BR>&nbsp; * Reads hibernate.cfg.xml, replaces the host in hibernate.connection.url, and writes the file back.<BR>&nbsp; */<BR>&nbsp;public void readXML(){<BR>&nbsp;&nbsp;try{<BR>&nbsp;&nbsp;&nbsp;String fname="hibernate.cfg.xml";<BR>&nbsp;&nbsp;&nbsp;SAXReader reader=new SAXReader();<BR>&nbsp;&nbsp;&nbsp;Document document=reader.read(new File(fname));<BR>&nbsp;&nbsp;&nbsp;Element root=document.getRootElement();<BR>&nbsp;&nbsp;&nbsp;List list=root.selectNodes("/hibernate-configuration/session-factory/property");<BR>&nbsp;&nbsp;&nbsp;for(Iterator it=list.iterator();it.hasNext();){<BR>&nbsp;&nbsp;&nbsp;&nbsp;Node node=(Node)it.next();<BR>&nbsp;&nbsp;&nbsp;&nbsp;if(node.valueOf("@name").equals("hibernate.connection.url")){<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//original url<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;String url=node.getText();<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(url);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//the part before the IP address (up to and including "//")<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;String a1=url.substring(0,url.indexOf("//")+2);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(a1);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//the part after the IP address (assumes the url has an explicit :port after the host)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;String a2=url.substring(url.indexOf(":",(url.indexOf("//")+2)),url.length());<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(a2);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;String newIP="192.168.0.1";<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//the modified url<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;String newUrl=a1+newIP+a2;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(newUrl);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//put the new url into the node<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;node.setText(newUrl);<BR>&nbsp;&nbsp;&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;<BR>&nbsp;&nbsp;&nbsp;//save the file<BR>&nbsp;&nbsp;&nbsp;String indent="&nbsp; ";//indentation string<BR>&nbsp;&nbsp;&nbsp;boolean newLines=true;//whether to emit line breaks (one element per line)<BR>&nbsp;&nbsp;&nbsp;XMLWriter writer=new XMLWriter(new FileOutputStream(fname),new org.dom4j.io.OutputFormat(indent,newLines,"utf-8"));<BR>&nbsp;&nbsp;&nbsp;writer.write(document);<BR>&nbsp;&nbsp;&nbsp;writer.flush();<BR>&nbsp;&nbsp;&nbsp;writer.close();<BR>&nbsp;&nbsp;&nbsp;System.out.println("Success");<BR>&nbsp;&nbsp;}<BR>&nbsp;&nbsp;catch(Exception ex){<BR>&nbsp;&nbsp;&nbsp;System.out.println("Failure");<BR>&nbsp;&nbsp;&nbsp;ex.printStackTrace();<BR>&nbsp;&nbsp;}<BR>&nbsp;&nbsp;<BR>&nbsp;}<BR>&nbsp;public static void main(String[] args) {<BR>&nbsp;&nbsp;HiberCFG h=new HiberCFG();<BR>&nbsp;&nbsp;h.readXML();</P>
<P>&nbsp;}</P>
<P>}<BR></P><img src ="http://www.blogjava.net/bluelily22/aggbug/35016.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/bluelily22/" target="_blank">丁丁</a> 2006-03-13 13:59 <a href="http://www.blogjava.net/bluelily22/archive/2006/03/13/35016.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>编写Java程序最容易犯的21种错误 </title><link>http://www.blogjava.net/bluelily22/archive/2005/12/29/25827.html</link><dc:creator>丁丁</dc:creator><author>丁丁</author><pubDate>Thu, 29 Dec 2005 01:20:00 GMT</pubDate><guid>http://www.blogjava.net/bluelily22/archive/2005/12/29/25827.html</guid><wfw:comment>http://www.blogjava.net/bluelily22/comments/25827.html</wfw:comment><comments>http://www.blogjava.net/bluelily22/archive/2005/12/29/25827.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/bluelily22/comments/commentRss/25827.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/bluelily22/services/trackbacks/25827.html</trackback:ping><description><![CDATA[<P>1.Duplicated Code</P>
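<P>　　Duplicated code is probably the most common code smell, and it is one of the main targets of refactoring. It usually comes from a copy-and-paste style of programming. Its opposite, OAOO (Once And Only Once), is an important hallmark of a good system.</P>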
<P>　　2.Long method</P>
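<P>　　Long methods are a harmful legacy of traditional structured programming. A method should express one self-contained intent; do not pack several intents into a single method.</P>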
<P>　　3.Large Class</P>
<P>　　A large class is one to which you have handed too many responsibilities. The rule here is One Class, One Responsibility.</P>
<P>　　4.Divergent Change</P>
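<P>　　The contents of a single class change at different rates. Some state changes every hour, some only once in several months or a year; some state changes for one kind of reason, other state for a different kind. Object-oriented abstraction is about separating what is relatively stable from what changes, and separating one axis of change from another. That separation lets the stable parts be reused, and lets each axis of change be reused on its own. When divergent changes coexist in one class, reuse becomes very difficult.</P>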
<P>　　5.Shotgun Surgery</P>
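<P>　　This is the exact opposite of the previous smell: a change in one place in the system forces related changes in many other places. State and behavior that change at the same rate and for the same reasons should usually live in the same class.</P>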
<P>　　6.Feature Envy</P>
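<P>　　The purpose of an object is to encapsulate state together with the behavior closely tied to that state. If a method of one class frequently calls get methods on another class to perform a calculation, consider moving that behavior into the class that owns most of the state involved.</P>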
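<P>　　A minimal sketch of that move, assuming hypothetical OrderLine and Invoice classes that do not appear in the original article: the price calculation sits in the class that holds the quantity and unit price, so the invoice never reaches for them through getters.</P>

```java
// Hypothetical classes for illustration only; the names are assumptions.
final class OrderLine {
    private final int quantity;
    private final long unitPriceCents;

    OrderLine(int quantity, long unitPriceCents) {
        this.quantity = quantity;
        this.unitPriceCents = unitPriceCents;
    }

    // The calculation lives next to the state it needs.
    long totalCents() {
        return quantity * unitPriceCents;
    }
}

final class Invoice {
    private final OrderLine[] lines;

    Invoice(OrderLine... lines) {
        this.lines = lines.clone();
    }

    // Invoice asks each line for its total instead of pulling
    // quantity and price out of the line and multiplying them itself.
    long totalCents() {
        long sum = 0;
        for (OrderLine line : lines) {
            sum += line.totalCents();
        }
        return sum;
    }
}
```

<P>　　Had Invoice computed getQuantity() * getUnitPriceCents() itself, every change to how a line is priced would have touched Invoice as well.</P>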
<P>　　7.Data Clumps</P>
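<P>　　Some data items hang around together like children at play: they appear together as fields in many classes and together in the parameter lists of many methods. Such data should probably become an object of its own.</P>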
<P>　　8.Primitive Obsession</P>
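<P>　　Newcomers to object orientation tend to use several primitive values to represent one concept. For a range they use two numbers; for Money they use a floating-point number. Because the concepts present in the problem are not expressed as objects, the code becomes hard to understand and the problem becomes much harder to solve. A good habit is to extend the primitive types the language provides with small objects that represent ranges, amounts of money, conversion rates, postal codes, and so on.</P>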
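<P>　　As an illustration, here is a small value object for a range. The class name IntRange and its methods are assumptions made for this sketch, not something the original article defines.</P>

```java
// Hypothetical value object: replaces a loose pair of ints (low, high).
final class IntRange {
    private final int low;
    private final int high;

    IntRange(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        this.low = low;
        this.high = high;
    }

    // True when value lies within the range, bounds included.
    boolean contains(int value) {
        return value >= low && value <= high;
    }

    // True when the two ranges share at least one value.
    boolean overlaps(IntRange other) {
        return low <= other.high && other.low <= high;
    }
}
```

<P>　　The invariant (low never exceeds high) is checked once in the constructor, so callers no longer repeat that check wherever two bare numbers are passed around.</P>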
<P>　　9.Switch Statement</P>
<P>　　Switch statements on constants are the great enemy of OO; replace them with subclasses, State, or Strategy.</P>
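<P>　　One way this can look, sketched with hypothetical Shape classes (the names are assumptions for illustration): rather than switching on a type constant to pick a formula, each subclass supplies its own implementation.</P>

```java
// Hypothetical example: this replaces code of the form
//   switch (shapeType) { case CIRCLE: ...; case RECTANGLE: ...; }
interface Shape {
    double area();
}

final class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    public double area() {
        return Math.PI * radius * radius;
    }
}

final class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    public double area() {
        return width * height;
    }
}

final class Shapes {
    // The caller never inspects a type code; dispatch picks the formula.
    static double totalArea(Shape... shapes) {
        double sum = 0;
        for (Shape shape : shapes) {
            sum += shape.area();
        }
        return sum;
    }
}
```

<P>　　Adding a new shape then means adding a class, not editing every switch that mentions the constant.</P>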
<P>　　10. Parallel Inheritance Hierarchies</P>
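<P>　　Parallel inheritance hierarchies are a special case of Shotgun Surgery: whenever you change a class in one hierarchy, you must also change the parallel subclass in the other hierarchy.</P>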
<P>　　11. Lazy Class</P>
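<P>　　A class that does not do much work. Maintaining a class carries extra cost; if a class has too little responsibility, it should be eliminated.</P>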
<P>　　12. Speculative Generality</P>
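<P>　　A class implements features and generality that are never used. Usually the only user of such a class or method is a test case. Do not hesitate; delete it.</P>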
<P>　　13. Temporary Field</P>
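<P>　　An attribute of an object may be meaningful only in certain circumstances, which makes the code hard to understand. Create a dedicated object to hold such orphan attributes, and move the behavior that relates only to them into that class. The most common case is a particular algorithm that needs variables useful only to that algorithm.</P>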
<P>　　14. Message Chain</P>
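<P>　　A message chain occurs when a client asks one object for another object, then asks that object for yet another, then asks that one for another, and so on. In this case you need to hide the delegation.</P>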
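<P>　　A small sketch of hiding the delegation, using hypothetical Employee, Department, and Manager classes (assumed names, not from the original article): the client calls one method on the object it already holds instead of walking the chain.</P>

```java
// Hypothetical classes for illustration only.
final class Manager {
    private final String name;

    Manager(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

final class Department {
    private final Manager manager;

    Department(Manager manager) {
        this.manager = manager;
    }

    Manager getManager() {
        return manager;
    }
}

final class Employee {
    private final Department department;

    Employee(Department department) {
        this.department = department;
    }

    // Delegating method: clients call employee.managerName() rather than
    // chaining through the department and the manager themselves.
    String managerName() {
        return department.getManager().getName();
    }
}
```

<P>　　The client now depends only on Employee, so a change in how departments track their managers stays inside Employee.</P>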
<P>　　15. Middle Man</P>
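<P>　　Encapsulation is one of the basic traits of objects, and it is often achieved through delegation. But this can be taken too far: if you find that more than half of a class's interface does nothing but delegate, you may need to remove the middle man.</P>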
<P>　　16. Inappropriate Intimacy</P>
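<P>　　Some classes are too intimate with one another and spend too much time delving into each other's private parts. We may not want to be prudish about people, but we should hold our classes to strict puritan rules.</P>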
<P>　　17. Alternative Classes with Different Interfaces</P>
<P>　　Methods that do the same thing have different signatures. Keep moving them up the class hierarchy until their protocols agree.</P>
<P>　　18. Incomplete Library Class</P>
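<P>　　Building a good class library is very hard, and a great deal of our programming rests on class libraries. Such broad and varied goals place harsh demands on library builders, who are not omniscient. Sometimes we find that a library class cannot do what we need, and modifying the library class directly is very difficult. That is when various refactoring techniques are needed.</P>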
<P>　　19. Data Class</P>
<P>　　An object comprises state and behavior. If a class has only state and no behavior, something has surely gone wrong somewhere.</P>
<P>　　20. Refused Bequest</P>
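<P>　　The superclass hands down a lot of behavior and state, but the subclass uses only a small part of it. This usually means there is a problem in your class hierarchy.</P>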
<P>　　21. Comments</P>
<P>　　If you often feel the need to write a lot of comments, your code is hard to understand. If that feeling comes too often, you need to refactor.</P><img src ="http://www.blogjava.net/bluelily22/aggbug/25827.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/bluelily22/" target="_blank">丁丁</a> 2005-12-29 09:20 <a href="http://www.blogjava.net/bluelily22/archive/2005/12/29/25827.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Uploading an Image with JSP and Generating a Thumbnail</title><link>http://www.blogjava.net/bluelily22/archive/2005/11/03/18028.html</link><dc:creator>丁丁</dc:creator><author>丁丁</author><pubDate>Thu, 03 Nov 2005 15:02:00 GMT</pubDate><guid>http://www.blogjava.net/bluelily22/archive/2005/11/03/18028.html</guid><wfw:comment>http://www.blogjava.net/bluelily22/comments/18028.html</wfw:comment><comments>http://www.blogjava.net/bluelily22/archive/2005/11/03/18028.html#Feedback</comments><slash:comments>3</slash:comments><wfw:commentRss>http://www.blogjava.net/bluelily22/comments/commentRss/18028.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/bluelily22/services/trackbacks/18028.html</trackback:ping><description><![CDATA[<P>This example uses the jspsmart component for the upload; the component can be downloaded for free from <A href="http://www.jspsmart.com/">www.jspsmart.com</A><BR>After downloading and unpacking it, copy the jar into the \WEB-INF\lib directory and restart the server; jspsmart is then ready to use.</P>
<P><STRONG>1、uploadimage.jsp</STRONG></P>
<P>&lt;%@ page contentType="text/html;charset=gb2312" language="java" import="java.io.*,java.awt.Image,java.awt.image.*,com.sun.image.codec.jpeg.*,<BR>java.sql.*,com.jspsmart.upload.*,java.util.*,cn.oof.database.*,cn.oof.house.*"%&gt;<BR>&lt;%<BR>SmartUpload mySmartUpload =new SmartUpload();<BR>long file_size_max=4000000;<BR>String fileName2="",ext="",testvar="";<BR>String url="uploadfile/images/";&nbsp; //this directory must already exist under the web application root<BR>//initialize<BR>mySmartUpload.initialize(pageContext);<BR>//allow only these file types<BR>try {<BR>&nbsp;mySmartUpload.setAllowedFilesList("jpg,gif");<BR>//perform the upload<BR>&nbsp;mySmartUpload.upload();<BR>} catch (Exception e){<BR>%&gt;<BR>&nbsp; &lt;SCRIPT language=javascript&gt;<BR>&nbsp; alert("Only .jpg and .gif image files may be uploaded");<BR>&nbsp; window.location='upfile.jsp';<BR>&nbsp; &lt;/script&gt;<BR>&lt;%<BR>}<BR>try{</P>
<P>&nbsp;&nbsp;&nbsp; com.jspsmart.upload.File myFile = mySmartUpload.getFiles().getFile(0);<BR>&nbsp;&nbsp;&nbsp; if (myFile.isMissing()){%&gt;<BR>&nbsp;&nbsp; &lt;SCRIPT language=javascript&gt;<BR>&nbsp;&nbsp; alert("Please select a file to upload first");<BR>&nbsp;&nbsp; window.location='upfile.jsp';<BR>&nbsp;&nbsp; &lt;/script&gt;<BR>&nbsp;&nbsp;&nbsp; &lt;%}<BR>&nbsp;&nbsp;&nbsp; else{<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; //String myFileName=myFile.getFileName(); //name of the uploaded file<BR>&nbsp;&nbsp; ext= myFile.getFileExt();&nbsp; //file extension<BR>&nbsp;&nbsp; int file_size=myFile.getSize();&nbsp; //file size<BR>&nbsp;&nbsp; String saveurl="";<BR>&nbsp;&nbsp; if(file_size&lt;file_size_max){<BR>&nbsp;&nbsp;&nbsp; //rename the file to the upload time in milliseconds<BR>&nbsp;&nbsp;&nbsp; Calendar calendar = Calendar.getInstance();<BR>&nbsp;&nbsp;&nbsp; String filename = String.valueOf(calendar.getTimeInMillis());<BR>&nbsp;&nbsp;&nbsp; saveurl=request.getRealPath("/")+url;<BR>&nbsp;&nbsp;&nbsp; saveurl+=filename+"."+ext;&nbsp; //path the upload is saved to<BR>&nbsp;&nbsp;&nbsp; myFile.saveAs(saveurl,mySmartUpload.SAVE_PHYSICAL);<BR>&nbsp;&nbsp;&nbsp; //out.print(filename);<BR>//-----------------------upload finished, now generate the thumbnail-------------------------<BR>&nbsp;&nbsp;&nbsp; java.io.File file = new java.io.File(saveurl);&nbsp; //open the file that was just uploaded<BR>&nbsp;&nbsp;&nbsp; String newurl=request.getRealPath("/")+url+filename+"_min."+ext;&nbsp; //path the thumbnail is saved to<BR>&nbsp;&nbsp;&nbsp; Image src = javax.imageio.ImageIO.read(file);&nbsp; //build the Image object<BR>&nbsp;&nbsp;&nbsp; float tagsize=200;<BR>&nbsp;&nbsp;&nbsp; int old_w=src.getWidth(null);&nbsp; //width of the source image<BR>&nbsp;&nbsp;&nbsp; int old_h=src.getHeight(null);&nbsp; //height of the source image<BR>&nbsp;&nbsp;&nbsp; int new_w=0;<BR>&nbsp;&nbsp;&nbsp; int new_h=0;<BR>&nbsp;&nbsp;&nbsp; int tempsize;<BR>&nbsp;&nbsp;&nbsp; float tempdouble;<BR>&nbsp;&nbsp;&nbsp; if(old_w&gt;old_h){<BR>&nbsp;&nbsp;&nbsp;&nbsp; tempdouble=old_w/tagsize;<BR>&nbsp;&nbsp;&nbsp; }else{<BR>&nbsp;&nbsp;&nbsp;&nbsp; tempdouble=old_h/tagsize;<BR>&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;&nbsp; new_w=Math.round(old_w/tempdouble);<BR>&nbsp;&nbsp;&nbsp; new_h=Math.round(old_h/tempdouble);//width and height of the thumbnail<BR>&nbsp;&nbsp;&nbsp; BufferedImage tag = new BufferedImage(new_w,new_h,BufferedImage.TYPE_INT_RGB);<BR>&nbsp;&nbsp;&nbsp; tag.getGraphics().drawImage(src,0,0,new_w,new_h,null);&nbsp; //draw the scaled-down image<BR>&nbsp;&nbsp;&nbsp; FileOutputStream newimage=new FileOutputStream(newurl);&nbsp; //open the output file stream<BR>&nbsp;&nbsp;&nbsp; JPEGImageEncoder encoder = JPEGCodec.createJPEGEncoder(newimage);<BR>&nbsp;&nbsp;&nbsp; encoder.encode(tag);&nbsp; //encode as JPEG<BR>&nbsp;&nbsp;&nbsp; newimage.close();</P>
<P>&nbsp;&nbsp; }<BR>&nbsp;&nbsp; else{<BR>&nbsp;&nbsp;&nbsp; out.print("&lt;SCRIPT language='javascript'&gt;");<BR>&nbsp;&nbsp;&nbsp; out.print("alert('The uploaded file must not exceed "+(file_size_max/1000)+"K');");<BR>&nbsp;&nbsp;&nbsp; out.print("window.location='upfile.jsp';");<BR>&nbsp;&nbsp;&nbsp; out.print("&lt;/SCRIPT&gt;");<BR>&nbsp;&nbsp; }<BR>&nbsp; }<BR>}catch (Exception e){</P>
<P>e.printStackTrace(); //log the failure instead of silently discarding it</P>
<P>}<BR>%&gt; </P>
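<P>The thumbnail step above relies on com.sun.image.codec.jpeg, a JDK-internal package that is not available in recent JDK releases. The following is a hedged sketch of the same scaling logic using only javax.imageio.ImageIO; the class name Thumbnails and its method names are assumptions made for illustration, not part of the original page or of jspsmart.</P>

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper; mirrors the JSP's "longer side becomes tagsize" rule.
final class Thumbnails {
    // Returns {width, height} scaled so the longer side equals maxSide.
    static int[] scaledSize(int width, int height, int maxSide) {
        float ratio = Math.max(width, height) / (float) maxSide;
        int w = Math.max(1, Math.round(width / ratio));
        int h = Math.max(1, Math.round(height / ratio));
        return new int[] { w, h };
    }

    // Draws src into a new RGB image of the scaled size.
    static BufferedImage scale(BufferedImage src, int maxSide) {
        int[] size = scaledSize(src.getWidth(), src.getHeight(), maxSide);
        BufferedImage out = new BufferedImage(size[0], size[1], BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            g.drawImage(src, 0, 0, size[0], size[1], null);
        } finally {
            g.dispose();
        }
        return out;
    }

    // Reads source, scales it, and writes the result to target as JPEG.
    static void writeJpeg(File source, File target, int maxSide) throws IOException {
        BufferedImage src = ImageIO.read(source);
        if (src == null) {
            throw new IOException("Not a readable image: " + source);
        }
        if (!ImageIO.write(scale(src, maxSide), "jpg", target)) {
            throw new IOException("No JPEG writer available for: " + target);
        }
    }
}
```

<P>In the JSP, the lines from the BufferedImage construction through newimage.close() could then plausibly be replaced by one call such as Thumbnails.writeJpeg(file, new java.io.File(newurl), 200), assuming the helper class is on the web application's classpath.</P>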
<P><STRONG>2、upload.htm</STRONG><BR>&lt;html&gt;<BR>&lt;head&gt;<BR>&lt;title&gt;Select an image to upload&lt;/title&gt;<BR>&lt;/head&gt; <BR>&lt;body&gt;<BR>&lt;table border="0" align="center" cellpadding="0" cellspacing="0"&gt;<BR>&nbsp; &lt;tr&gt;<BR>&nbsp;&nbsp;&nbsp; &lt;td height="45" align="center" valign="middle"&gt;&lt;form action="uploadimage.jsp" method="post" enctype="multipart/form-data" name="form1"&gt;<BR>Select an image to upload<BR>&nbsp;&nbsp;&nbsp; &lt;input type="file" name="file"&gt;<BR>&lt;input type="submit" name="Submit" value="Upload"&gt;<BR>&nbsp;&nbsp;&nbsp; &lt;/form&gt;&lt;/td&gt;<BR>&nbsp; &lt;/tr&gt;<BR>&lt;/table&gt;<BR>&lt;/body&gt;<BR>&lt;/html&gt;</P><img src ="http://www.blogjava.net/bluelily22/aggbug/18028.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/bluelily22/" target="_blank">丁丁</a> 2005-11-03 23:02 <a href="http://www.blogjava.net/bluelily22/archive/2005/11/03/18028.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Detecting a File's Character Encoding</title><link>http://www.blogjava.net/bluelily22/archive/2005/10/21/16330.html</link><dc:creator>丁丁</dc:creator><author>丁丁</author><pubDate>Fri, 21 Oct 2005 11:50:00 GMT</pubDate><guid>http://www.blogjava.net/bluelily22/archive/2005/10/21/16330.html</guid><wfw:comment>http://www.blogjava.net/bluelily22/comments/16330.html</wfw:comment><comments>http://www.blogjava.net/bluelily22/archive/2005/10/21/16330.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/bluelily22/comments/commentRss/16330.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/bluelily22/services/trackbacks/16330.html</trackback:ping><description><![CDATA[<P>import java.lang.*;<BR>import java.util.*;<BR>import java.io.*;<BR>import java.net.*;</P>
<P>public class SinoDetect {</P>
<P>&nbsp;&nbsp;&nbsp; static final int GB2312 = 0;<BR>&nbsp;&nbsp;&nbsp; static final int GBK = 1;<BR>&nbsp;&nbsp;&nbsp; static final int HZ = 2;<BR>&nbsp;&nbsp;&nbsp; static final int BIG5 = 3;<BR>&nbsp;&nbsp;&nbsp; static final int EUC_TW = 4;<BR>&nbsp;&nbsp;&nbsp; static final int ISO_2022_CN = 5;<BR>&nbsp;&nbsp;&nbsp; static final int UTF8 = 6;<BR>&nbsp;&nbsp;&nbsp; static final int UNICODE = 7;<BR>&nbsp;&nbsp;&nbsp; static final int ASCII = 8;<BR>&nbsp;&nbsp;&nbsp; static final int OTHER = 9;</P>
<P>&nbsp;&nbsp;&nbsp; static final int TOTAL_ENCODINGS = 10;</P>
<P><BR>&nbsp;&nbsp;&nbsp; // Frequency tables to hold the GB, Big5, and EUC-TW character<BR>&nbsp;&nbsp;&nbsp; // frequencies<BR>&nbsp;&nbsp;&nbsp; int&nbsp;GBFreq[][];<BR>&nbsp;&nbsp;&nbsp; int GBKFreq[][];<BR>&nbsp;&nbsp;&nbsp; int Big5Freq[][];<BR>&nbsp;&nbsp;&nbsp; int EUC_TWFreq[][];<BR>&nbsp;&nbsp;&nbsp; //int UnicodeFreq[94][128];</P>
<P>&nbsp;&nbsp;&nbsp; public static String[] nicename;<BR>&nbsp;&nbsp;&nbsp; public static String[] codings;</P>
<P><BR>&nbsp;&nbsp;&nbsp; public SinoDetect() {<BR>&nbsp;// Initialize the Frequency Table for GB, Big5, EUC-TW<BR>&nbsp;GBFreq = new int[94][94];<BR>&nbsp;GBKFreq = new int[126][191];<BR>&nbsp;Big5Freq = new int[94][158];<BR>&nbsp;EUC_TWFreq = new int[94][94];</P>
<P>&nbsp;codings = new String[TOTAL_ENCODINGS];<BR>&nbsp;codings[GB2312] = "GB2312";<BR>&nbsp;codings[GBK] = "GBK";<BR>&nbsp;codings[HZ] = "HZ";<BR>&nbsp;codings[BIG5] = "BIG5";<BR>&nbsp;codings[EUC_TW] = "CNS11643";<BR>&nbsp;codings[ISO_2022_CN] = "ISO2022CN";<BR>&nbsp;codings[UTF8] = "UTF8";<BR>&nbsp;codings[UNICODE] = "Unicode";<BR>&nbsp;codings[ASCII] = "ASCII";<BR>&nbsp;codings[OTHER] = "OTHER";</P>
<P>&nbsp;nicename = new String[TOTAL_ENCODINGS];<BR>&nbsp;nicename[GB2312] = "GB2312";<BR>&nbsp;nicename[GBK] = "GBK";<BR>&nbsp;nicename[HZ] = "HZ";<BR>&nbsp;nicename[BIG5] = "Big5";<BR>&nbsp;nicename[EUC_TW] = "CNS 11643";<BR>&nbsp;nicename[ISO_2022_CN] = "ISO 2022-CN";<BR>&nbsp;nicename[UTF8] = "UTF-8";<BR>&nbsp;nicename[UNICODE] = "Unicode";<BR>&nbsp;nicename[ASCII] = "ASCII";<BR>&nbsp;nicename[OTHER] = "OTHER";</P>
<P>&nbsp;initialize_frequencies();<BR>&nbsp;&nbsp;&nbsp; }</P>
<P><BR>&nbsp; public static void main(String argc[])<BR>&nbsp; {<BR>&nbsp;&nbsp; SinoDetect sinodetector;<BR>&nbsp;&nbsp; int result = OTHER;</P>
<P>&nbsp;&nbsp; argc = new String[1];<BR>&nbsp;&nbsp; //argc[0] = "c:\\chinesedata\\codeconvert\\voaunit.txt";<BR>&nbsp;&nbsp;&nbsp; argc[0] = "中文";<BR>&nbsp;&nbsp; sinodetector = new SinoDetect();<BR>&nbsp;&nbsp; if (argc[0].startsWith("http://") == true)<BR>&nbsp;&nbsp; {<BR>&nbsp;&nbsp;&nbsp;&nbsp; try {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; result = sinodetector.detectEncoding(new URL(argc[0]));<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;&nbsp;&nbsp; catch (Exception e) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; System.err.println("Bad URL " + e.toString());<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp; &nbsp;} else {<BR>&nbsp;&nbsp;&nbsp;&nbsp; //result = sinodetector.detectEncoding(new File(argc[0]));<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; result = sinodetector.detectEncoding(argc[0].getBytes());<BR>&nbsp;&nbsp; }<BR>&nbsp;&nbsp; System.out.println(nicename[result]);<BR>&nbsp; }</P>
<P><BR>&nbsp;&nbsp;&nbsp; /** Function&nbsp; :&nbsp; detectEncoding<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Arguments :&nbsp; URL<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns&nbsp;&nbsp; :&nbsp; One of the encoding constants defined in this class<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (GB2312, GBK, HZ, BIG5, EUC_TW, ISO_2022_CN, UTF8, UNICODE, ASCII, or OTHER)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Description: This function looks at the URL contents<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; and assigns it a probability score for each encoding type.<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The encoding type with the highest probability is returned.<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; public int detectEncoding(URL testurl) {<BR>&nbsp;byte[] rawtext = new byte[10000];<BR>&nbsp;int bytesread = 0, byteoffset = 0;<BR>&nbsp;int guess = OTHER;<BR>&nbsp;InputStream chinesestream;</P>
<P>&nbsp;try {<BR>&nbsp;&nbsp;&nbsp;&nbsp; chinesestream = testurl.openStream();</P>
<P>&nbsp;&nbsp;&nbsp;&nbsp; while ((bytesread = chinesestream.read(rawtext, byteoffset, rawtext.length - byteoffset)) &gt; 0) {<BR>&nbsp;&nbsp;byteoffset += bytesread;<BR>&nbsp;&nbsp;&nbsp;&nbsp; };<BR>&nbsp;&nbsp;&nbsp;&nbsp; chinesestream.close();<BR>&nbsp;&nbsp;&nbsp;&nbsp; guess = detectEncoding(rawtext);</P>
<P><BR>&nbsp;}<BR>&nbsp;catch (Exception e) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; System.err.println("Error loading or using URL " + e.toString());<BR>&nbsp;&nbsp;&nbsp;&nbsp; guess = OTHER;<BR>&nbsp;}</P>
<P>&nbsp;return guess;<BR>&nbsp;&nbsp;&nbsp; }</P>
<P>&nbsp;&nbsp;&nbsp; /** Function&nbsp; :&nbsp; detectEncoding<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Arguments :&nbsp; File<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns&nbsp;&nbsp; :&nbsp; One of the encoding constants defined in this class<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (GB2312, GBK, HZ, BIG5, EUC_TW, ISO_2022_CN, UTF8, UNICODE, ASCII, or OTHER)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Description: This function looks at the file<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; and assigns it a probability score for each encoding type.<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The encoding type with the highest probability is returned.<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; public int detectEncoding(File testfile) {<BR>&nbsp;FileInputStream chinesefile;<BR>&nbsp;byte[] rawtext;</P>
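<P>&nbsp;rawtext = new byte[(int)testfile.length()];<BR>&nbsp;try {<BR>&nbsp;&nbsp;&nbsp;&nbsp; chinesefile = new FileInputStream(testfile);<BR>&nbsp;&nbsp;&nbsp;&nbsp; try {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int offset = 0, count; // read() may return fewer bytes than requested, so loop<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; while (offset &lt; rawtext.length &amp;&amp; (count = chinesefile.read(rawtext, offset, rawtext.length - offset)) &gt; 0) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; offset += count;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;&nbsp;&nbsp; } finally {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; chinesefile.close(); // the original never closed the stream<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}<BR>&nbsp;catch (Exception e) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; System.err.println("Error: " + e);<BR>&nbsp;}</P>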
<P>&nbsp;return detectEncoding(rawtext);<BR>&nbsp;&nbsp;&nbsp; }</P>
<P>&nbsp;</P>
<P>&nbsp;&nbsp;&nbsp; /** Function&nbsp; :&nbsp; detectEncoding<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Arguments :&nbsp; byte array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns&nbsp;&nbsp; :&nbsp; One of the encoding constants defined in this class<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (GB2312, GBK, HZ, BIG5, EUC_TW, ISO_2022_CN, UTF8, UNICODE, ASCII, or OTHER)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Description: This function looks at the byte array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; and assigns it a probability score for each encoding type.<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The encoding type with the highest probability is returned.<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; public int detectEncoding(byte[] rawtext) {<BR>&nbsp;int[] scores;<BR>&nbsp;int index, maxscore = 0;<BR>&nbsp;int encoding_guess = OTHER;</P>
<P>&nbsp;scores = new int[TOTAL_ENCODINGS];</P>
<P>&nbsp;// Assign Scores<BR>&nbsp;scores[GB2312]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = gb2312_probability(rawtext);<BR>&nbsp;scores[GBK]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = gbk_probability(rawtext);<BR>&nbsp;scores[HZ]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = hz_probability(rawtext);<BR>&nbsp;scores[BIG5]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = big5_probability(rawtext);<BR>&nbsp;scores[EUC_TW]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = euc_tw_probability(rawtext);<BR>&nbsp;scores[ISO_2022_CN] = iso_2022_cn_probability(rawtext);<BR>&nbsp;scores[UTF8]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = utf8_probability(rawtext);<BR>&nbsp;scores[UNICODE]&nbsp;&nbsp;&nbsp;&nbsp; = utf16_probability(rawtext);<BR>&nbsp;scores[ASCII]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = ascii_probability(rawtext);<BR>&nbsp;scores[OTHER]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = 0;</P>
<P>&nbsp;// Tabulate Scores<BR>&nbsp;for (index = 0; index &lt; TOTAL_ENCODINGS; index++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; if (scores[index] &gt; maxscore) {<BR>&nbsp;&nbsp;encoding_guess = index;<BR>&nbsp;&nbsp;maxscore = scores[index];<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}</P>
<P>&nbsp;// Return OTHER if nothing scored above 50<BR>&nbsp;if (maxscore &lt;= 50) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; encoding_guess = OTHER;<BR>&nbsp;}</P>
<P>&nbsp;return encoding_guess;<BR>&nbsp;&nbsp;&nbsp; }</P>
<P>&nbsp;</P>
<P><BR>&nbsp;&nbsp;&nbsp; /* Function:&nbsp; gb2312_probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Argument:&nbsp; pointer to byte array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns :&nbsp; number from 0 to 100 representing probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; text in array uses GB-2312 encoding<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; int gb2312_probability(byte[] rawtext) {<BR>&nbsp;int i, rawtextlen = 0;</P>
<P>&nbsp;int dbchars = 1, gbchars = 1;<BR>&nbsp;long gbfreq = 0, totalfreq = 1;<BR>&nbsp;float rangeval = 0, freqval = 0;<BR>&nbsp;int row, column;</P>
<P>&nbsp;// Stage 1:&nbsp; Check to see if characters fit into acceptable ranges</P>
<P>&nbsp;rawtextlen = rawtext.length;<BR>&nbsp;for (i = 0; i &lt; rawtextlen-1; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; //System.err.println(rawtext[i]);<BR>&nbsp;&nbsp;&nbsp;&nbsp; if (rawtext[i] &gt;= 0) {<BR>&nbsp;&nbsp;//asciichars++;<BR>&nbsp;&nbsp;&nbsp;&nbsp; } else {<BR>&nbsp;&nbsp;dbchars++;<BR>&nbsp;&nbsp;if ((byte)0xA1 &lt;= rawtext[i] &amp;&amp; rawtext[i] &lt;= (byte)0xF7 &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (byte)0xA1 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= (byte)0xFE)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {<BR>&nbsp;&nbsp;&nbsp;gbchars++;<BR>&nbsp;&nbsp;&nbsp;totalfreq += 500;<BR>&nbsp;&nbsp;&nbsp;row = rawtext[i] + 256 - 0xA1;<BR>&nbsp;&nbsp;&nbsp;column = rawtext[i+1] + 256 - 0xA1;<BR>&nbsp;&nbsp;&nbsp;if (GBFreq[row][column] != 0) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; gbfreq += GBFreq[row][column];<BR>&nbsp;&nbsp;&nbsp;} else if (15 &lt;= row &amp;&amp; row &lt; 55) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; gbfreq += 200;<BR>&nbsp;&nbsp;&nbsp;}</P>
<P>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;i++;<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}<BR>&nbsp;rangeval = 50 * ((float)gbchars/(float)dbchars);<BR>&nbsp;freqval = 50 * ((float)gbfreq/(float)totalfreq);</P>
<P>&nbsp;return (int)(rangeval + freqval);<BR>&nbsp;&nbsp;&nbsp; }</P>
<P><BR>&nbsp;&nbsp;&nbsp; /* Function:&nbsp; gbk_probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Argument:&nbsp; byte array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns :&nbsp; number from 0 to 100 representing probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; text in array uses GBK encoding<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; int gbk_probability(byte[] rawtext) {<BR>&nbsp;int i, rawtextlen = 0;</P>
<P>&nbsp;int dbchars = 1, gbchars = 1;<BR>&nbsp;long gbfreq = 0, totalfreq = 1;<BR>&nbsp;float rangeval = 0, freqval = 0;<BR>&nbsp;int row, column;</P>
<P>&nbsp;// Stage 1:&nbsp; Check to see if characters fit into acceptable ranges<BR>&nbsp;rawtextlen = rawtext.length;<BR>&nbsp;for (i = 0; i &lt; rawtextlen-1; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; //System.err.println(rawtext[i]);<BR>&nbsp;&nbsp;&nbsp;&nbsp; if (rawtext[i] &gt;= 0) {<BR>&nbsp;&nbsp;//asciichars++;<BR>&nbsp;&nbsp;&nbsp;&nbsp; } else {<BR>&nbsp;&nbsp;dbchars++;<BR>&nbsp;&nbsp;if ((byte)0xA1 &lt;= rawtext[i] &amp;&amp; rawtext[i] &lt;= (byte)0xF7 &amp;&amp;&nbsp;&nbsp; // Original GB range<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (byte)0xA1 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= (byte)0xFE)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {<BR>&nbsp;&nbsp;&nbsp;gbchars++;<BR>&nbsp;&nbsp;&nbsp;totalfreq += 500;<BR>&nbsp;&nbsp;&nbsp;row = rawtext[i] + 256 - 0xA1;<BR>&nbsp;&nbsp;&nbsp;column = rawtext[i+1] + 256 - 0xA1;</P>
<P>&nbsp;&nbsp;&nbsp;//System.out.println("original row " + row + " column " + column);<BR>&nbsp;&nbsp;&nbsp;if (GBFreq[row][column] != 0) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; gbfreq += GBFreq[row][column];<BR>&nbsp;&nbsp;&nbsp;} else if (15 &lt;= row &amp;&amp; row &lt; 55) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; gbfreq += 200;<BR>&nbsp;&nbsp;&nbsp;}</P>
<P>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;else if ((byte)0x81 &lt;= rawtext[i] &amp;&amp; rawtext[i] &lt;= (byte)0xFE &amp;&amp;&nbsp;&nbsp; // Extended GB range<BR>&nbsp;&nbsp;&nbsp; (((byte)0x80 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= (byte)0xFE) ||<BR>&nbsp;&nbsp;&nbsp;&nbsp; ((byte)0x40 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= (byte)0x7E)))<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {<BR>&nbsp;&nbsp;&nbsp;gbchars++;<BR>&nbsp;&nbsp;&nbsp;totalfreq += 500;<BR>&nbsp;&nbsp;&nbsp;row = rawtext[i] + 256 - 0x81;<BR>&nbsp;&nbsp;&nbsp;if (0x40 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= 0x7E) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; column = rawtext[i+1] - 0x40;<BR>&nbsp;&nbsp;&nbsp;} else {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; column = rawtext[i+1] + 256 - 0x80;<BR>&nbsp;&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;//System.out.println("extended row " + row + " column " + column + " rawtext[i] " + rawtext[i]);<BR>&nbsp;&nbsp;&nbsp;if (GBKFreq[row][column] != 0) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; gbfreq += GBKFreq[row][column];<BR>&nbsp;&nbsp;&nbsp;}<BR>&nbsp;&nbsp;}<BR>&nbsp;&nbsp;i++;<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}<BR>&nbsp;rangeval = 50 * ((float)gbchars/(float)dbchars);<BR>&nbsp;freqval = 50 * ((float)gbfreq/(float)totalfreq);</P>
<P>&nbsp;// For regular GB files, this would give the same score, so I handicap it slightly<BR>&nbsp;return (int)(rangeval + freqval) - 1;<BR>&nbsp;&nbsp;&nbsp; }</P>
<P>&nbsp;</P>
<P>&nbsp;&nbsp;&nbsp; /* Function:&nbsp; hz_probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Argument:&nbsp; byte array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns :&nbsp; number from 0 to 100 representing probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; text in array uses HZ encoding<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; int hz_probability(byte[] rawtext) {<BR>&nbsp;int i, rawtextlen;<BR>&nbsp;int hzchars = 0, dbchars = 1;<BR>&nbsp;long hzfreq = 0, totalfreq = 1;<BR>&nbsp;float rangeval = 0, freqval = 0;<BR>&nbsp;int hzstart = 0, hzend = 0;<BR>&nbsp;int row, column;</P>
<P>&nbsp;rawtextlen = rawtext.length;</P>
<P>&nbsp;for (i = 0; i &lt; rawtextlen - 1; i++) { // stop one byte early: rawtext[i+1] is read below<BR>&nbsp;&nbsp;&nbsp;&nbsp; if (rawtext[i] == '~') {<BR>&nbsp;&nbsp;if (rawtext[i+1] == '{') {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; hzstart++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i+=2;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; while (i &lt; rawtextlen - 1) {<BR>&nbsp;&nbsp;&nbsp;if (rawtext[i] == 0x0A || rawtext[i] == 0x0D) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; break;<BR>&nbsp;&nbsp;&nbsp;} else if (rawtext[i] == '~' &amp;&amp; rawtext[i+1] == '}') {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; hzend++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; break;<BR>&nbsp;&nbsp;&nbsp;} else if ((0x21 &lt;= rawtext[i] &amp;&amp; rawtext[i] &lt;= 0x77) &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (0x21 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= 0x77)) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; hzchars+=2;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; row = rawtext[i] - 0x21;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; column = rawtext[i+1] - 0x21;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; totalfreq += 500;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (GBFreq[row][column] != 0) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;hzfreq += GBFreq[row][column];<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } else if (15 &lt;= row &amp;&amp; row &lt; 55) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;hzfreq += 200;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;&nbsp;} else if (((byte)0xA1 &lt;= rawtext[i] &amp;&amp; rawtext[i] &lt;= (byte)0xF7) &amp;&amp; // bytes are signed, so cast the bounds as the other methods do<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ((byte)0xA1 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= (byte)0xF7)) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; hzchars+=2;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; row = rawtext[i] + 256 - 0xA1;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; column = rawtext[i+1] + 256 - 0xA1;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; totalfreq += 500;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (GBFreq[row][column] != 0) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;hzfreq += GBFreq[row][column];<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } else if (15 &lt;= row &amp;&amp; row &lt; 55) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;hzfreq += 200;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;dbchars+=2;<BR>&nbsp;&nbsp;&nbsp;i+=2;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;} else if (rawtext[i+1] == '}') {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; hzend++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i++;<BR>&nbsp;&nbsp;} else if (rawtext[i+1] == '~') {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i++;<BR>&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;&nbsp; }</P>
<P>&nbsp;}</P>
<P>&nbsp;if (hzstart &gt; 4) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; rangeval = 50;<BR>&nbsp;} else if (hzstart &gt; 1) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; rangeval = 41;<BR>&nbsp;} else if (hzstart &gt; 0) { // Only 39 in case the sequence happened to occur<BR>&nbsp;&nbsp;&nbsp;&nbsp; rangeval = 39; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // in otherwise non-Hz text<BR>&nbsp;} else {<BR>&nbsp;&nbsp;&nbsp;&nbsp; rangeval = 0;<BR>&nbsp;}<BR>&nbsp;freqval = 50 * ((float)hzfreq/(float)totalfreq);</P>
<P>&nbsp;return (int)(rangeval + freqval);<BR>&nbsp;&nbsp;&nbsp; }</P>
<P>&nbsp;</P>
<P><BR>&nbsp;&nbsp;&nbsp; /** Function:&nbsp; big5_probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Argument:&nbsp; byte array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns :&nbsp; number from 0 to 100 representing probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; text in array uses Big5 encoding<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; int big5_probability(byte[] rawtext) {<BR>&nbsp;int score = 0;<BR>&nbsp;int i, rawtextlen = 0;<BR>&nbsp;int dbchars = 1, bfchars = 1;<BR>&nbsp;float rangeval = 0, freqval = 0;<BR>&nbsp;long bffreq = 0, totalfreq = 1;<BR>&nbsp;int row, column;</P>
<P>&nbsp;// Check to see if characters fit into acceptable ranges</P>
<P>&nbsp;rawtextlen = rawtext.length;<BR>&nbsp;for (i = 0; i &lt; rawtextlen-1; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; if (rawtext[i] &gt;= 0) {<BR>&nbsp;&nbsp;//asciichars++;<BR>&nbsp;&nbsp;&nbsp;&nbsp; } else {<BR>&nbsp;&nbsp;dbchars++;<BR>&nbsp;&nbsp;if ((byte)0xA1 &lt;= rawtext[i] &amp;&amp; rawtext[i] &lt;= (byte)0xF9 &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (((byte)0x40 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= (byte)0x7E) ||<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ((byte)0xA1 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= (byte)0xFE)))<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {<BR>&nbsp;&nbsp;&nbsp;bfchars++;<BR>&nbsp;&nbsp;&nbsp;totalfreq += 500;<BR>&nbsp;&nbsp;&nbsp;row = rawtext[i] + 256 - 0xA1;<BR>&nbsp;&nbsp;&nbsp;if (0x40 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= 0x7E) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; column = rawtext[i+1] - 0x40;<BR>&nbsp;&nbsp;&nbsp;} else {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; column = rawtext[i+1] + 256 - 0x61;<BR>&nbsp;&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;if (Big5Freq[row][column] != 0) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; bffreq += Big5Freq[row][column];<BR>&nbsp;&nbsp;&nbsp;} else if (3 &lt;= row &amp;&amp; row &lt;= 37) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; bffreq += 200;<BR>&nbsp;&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;i++;<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}<BR>&nbsp;rangeval = 50 * ((float)bfchars/(float)dbchars);<BR>&nbsp;freqval = 50 * ((float)bffreq/(float)totalfreq);</P>
<P>&nbsp;return (int)(rangeval + freqval);<BR>&nbsp;&nbsp;&nbsp; }</P>
<P>&nbsp;</P>
<P>&nbsp;&nbsp;&nbsp; /* Function:&nbsp; euc_tw_probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Argument:&nbsp; byte array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns :&nbsp; number from 0 to 100 representing probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; text in array uses EUC-TW (CNS 11643) encoding<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; int euc_tw_probability(byte[] rawtext) {<BR>&nbsp;int i, rawtextlen = 0;<BR>&nbsp;int dbchars = 1, cnschars = 1;<BR>&nbsp;long cnsfreq = 0, totalfreq = 1;<BR>&nbsp;float rangeval = 0, freqval = 0;<BR>&nbsp;int row, column;</P>
<P>&nbsp;// Check to see if characters fit into acceptable ranges<BR>&nbsp;// and have expected frequency of use</P>
<P>&nbsp;rawtextlen = rawtext.length;<BR>&nbsp;for (i = 0; i &lt; rawtextlen-1; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; if (rawtext[i] &gt;= 0) { // in ASCII range<BR>&nbsp;&nbsp;//asciichars++;<BR>&nbsp;&nbsp;&nbsp;&nbsp; } else {&nbsp; // high bit set<BR>&nbsp;&nbsp;dbchars++;<BR>&nbsp;&nbsp;if (i + 3 &lt; rawtextlen &amp;&amp; (byte)0x8E == rawtext[i] &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (byte)0xA1 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= (byte)0xB0 &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (byte)0xA1 &lt;= rawtext[i+2] &amp;&amp; rawtext[i+2] &lt;= (byte)0xFE &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (byte)0xA1 &lt;= rawtext[i+3] &amp;&amp; rawtext[i+3] &lt;= (byte)0xFE) { // Planes 1 - 16</P>
<P>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cnschars++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; //System.out.println("plane 2 or above CNS char");<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // These are all less frequent chars so just ignore freq<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i+=3;<BR>&nbsp;&nbsp;} else if ((byte)0xA1 &lt;= rawtext[i] &amp;&amp; rawtext[i] &lt;= (byte)0xFE &amp;&amp; // Plane 1<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (byte)0xA1 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= (byte)0xFE)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {<BR>&nbsp;&nbsp;&nbsp;cnschars++;<BR>&nbsp;&nbsp;&nbsp;totalfreq += 500;<BR>&nbsp;&nbsp;&nbsp;row = rawtext[i] + 256 - 0xA1;<BR>&nbsp;&nbsp;&nbsp;column = rawtext[i+1] + 256 - 0xA1;<BR>&nbsp;&nbsp;&nbsp;if (EUC_TWFreq[row][column] != 0) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cnsfreq += EUC_TWFreq[row][column];<BR>&nbsp;&nbsp;&nbsp;} else if (35 &lt;= row &amp;&amp; row &lt;= 92) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cnsfreq += 150;<BR>&nbsp;&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;i++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}</P>
<P>&nbsp;rangeval = 50 * ((float)cnschars/(float)dbchars);<BR>&nbsp;freqval = 50 * ((float)cnsfreq/(float)totalfreq);</P>
<P>&nbsp;return (int)(rangeval + freqval);<BR>&nbsp;&nbsp;&nbsp; }</P>
<P>&nbsp;</P>
<P>&nbsp;&nbsp;&nbsp; /* Function:&nbsp; iso_2022_cn_probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Argument:&nbsp; byte array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns :&nbsp; number from 0 to 100 representing probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; text in array uses ISO 2022-CN encoding<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; WORKS FOR BASIC CASES, BUT STILL NEEDS MORE WORK<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; int iso_2022_cn_probability(byte[] rawtext) {<BR>&nbsp;int i, rawtextlen = 0;<BR>&nbsp;int dbchars = 1, isochars = 1;<BR>&nbsp;long isofreq = 0, totalfreq = 1;<BR>&nbsp;float rangeval = 0, freqval = 0;<BR>&nbsp;int row, column;</P>
<P>&nbsp;// Check to see if characters fit into acceptable ranges<BR>&nbsp;// and have expected frequency of use</P>
<P>&nbsp;rawtextlen = rawtext.length;<BR>&nbsp;for (i = 0; i &lt; rawtextlen-1; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; if (rawtext[i] == (byte)0x1B &amp;&amp; i+3 &lt; rawtextlen) { // Escape char ESC<BR>&nbsp;&nbsp;if (rawtext[i+1] == (byte)0x24 &amp;&amp; rawtext[i+2] == (byte)0x29 &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; rawtext[i+3] == (byte)0x41) {&nbsp; // GB Escape&nbsp; $ ) A<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i += 4;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // Bounds check added: without it, input with no closing ESC runs past the end of the array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; while (i &lt; rawtextlen - 1 &amp;&amp; rawtext[i] != (byte)0x1B) {<BR>&nbsp;&nbsp;&nbsp;dbchars++;<BR>&nbsp;&nbsp;&nbsp;if ((0x21 &lt;= rawtext[i] &amp;&amp; rawtext[i] &lt;= 0x77) &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (0x21 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= 0x77)) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; isochars++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; row = rawtext[i] - 0x21;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; column = rawtext[i+1] - 0x21;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; totalfreq += 500;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (GBFreq[row][column] != 0) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;isofreq += GBFreq[row][column];<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } else if (15 &lt;= row &amp;&amp; row &lt; 55) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;isofreq += 200;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i++;<BR>&nbsp;&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;i++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;} else if (i+3 &lt; rawtextlen &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; rawtext[i+1] == (byte)0x24 &amp;&amp; rawtext[i+2] == (byte)0x29 &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; rawtext[i+3] == (byte)0x47) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // CNS Escape $ ) G<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i+=4;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; while (i &lt; rawtextlen - 1 &amp;&amp; rawtext[i] != (byte)0x1B) { // same bounds check as above<BR>&nbsp;&nbsp;&nbsp;dbchars++;<BR>&nbsp;&nbsp;&nbsp;if ((byte)0x21 &lt;= rawtext[i] &amp;&amp; rawtext[i] &lt;= (byte)0x7E &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (byte)0x21 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= (byte)0x7E)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {<BR>&nbsp;&nbsp;&nbsp;&nbsp;isochars++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;totalfreq += 500;<BR>&nbsp;&nbsp;&nbsp;&nbsp;row = rawtext[i] - 0x21;<BR>&nbsp;&nbsp;&nbsp;&nbsp;column = rawtext[i+1] - 0x21;<BR>&nbsp;&nbsp;&nbsp;&nbsp;if (EUC_TWFreq[row][column] != 0) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; isofreq += EUC_TWFreq[row][column];<BR>&nbsp;&nbsp;&nbsp;&nbsp;} else if (35 &lt;= row &amp;&amp; row &lt;= 92) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; isofreq += 150;<BR>&nbsp;&nbsp;&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;&nbsp;i++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;&nbsp;i++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;&nbsp;}<BR>&nbsp;&nbsp;// Length test comes first so rawtext[i] is never read past the end<BR>&nbsp;&nbsp;if (i+2 &lt; rawtextlen &amp;&amp; rawtext[i] == (byte)0x1B &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; rawtext[i+1] == (byte)0x28 &amp;&amp; rawtext[i+2] == (byte)0x42) { // ASCII:&nbsp; ESC ( B<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i+=2;<BR>&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}<BR>&nbsp;rangeval = 50 * ((float)isochars/(float)dbchars);<BR>&nbsp;freqval = 50 * ((float)isofreq/(float)totalfreq);</P>
<P>&nbsp;//System.out.println("isochars dbchars isofreq totalfreq " + isochars + " " + dbchars + " " + isofreq + " " + totalfreq + " " + rangeval + " " + freqval);</P>
<P>&nbsp;return (int)(rangeval + freqval);<BR>&nbsp;//return 0;<BR>&nbsp;&nbsp;&nbsp; }</P>
<P>&nbsp;</P>
<P>&nbsp;&nbsp;&nbsp; /* Function:&nbsp; utf8_probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Argument:&nbsp; byte array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns :&nbsp; number from 0 to 100 representing probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; text in array uses UTF-8 encoding of Unicode<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; int utf8_probability(byte[] rawtext) {<BR>&nbsp;int score = 0;<BR>&nbsp;int i, rawtextlen = 0;<BR>&nbsp;int goodbytes = 0, asciibytes = 0;</P>
<P>&nbsp;// Maybe also use UTF8 Byte Order Mark:&nbsp; EF BB BF</P>
<P>&nbsp;// Check to see if characters fit into acceptable ranges<BR>&nbsp;rawtextlen = rawtext.length;<BR>&nbsp;for (i = 0; i &lt; rawtextlen; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; if ((rawtext[i] &amp; (byte)0x7F) == rawtext[i]) {&nbsp; // One byte<BR>&nbsp;&nbsp;asciibytes++;<BR>&nbsp;&nbsp;// Ignore ASCII, can throw off count<BR>&nbsp;&nbsp;&nbsp;&nbsp; } else if (-64 &lt;= rawtext[i] &amp;&amp; rawtext[i] &lt;= -33 &amp;&amp; // Two bytes<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i+1 &lt; rawtextlen &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -128 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= -65) {<BR>&nbsp;&nbsp;goodbytes += 2;<BR>&nbsp;&nbsp;i++;<BR>&nbsp;&nbsp;&nbsp;&nbsp; } else if (-32 &lt;= rawtext[i] &amp;&amp; rawtext[i] &lt;= -17 &amp;&amp; // Three bytes<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i+2 &lt; rawtextlen &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -128 &lt;= rawtext[i+1] &amp;&amp; rawtext[i+1] &lt;= -65 &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -128 &lt;= rawtext[i+2] &amp;&amp; rawtext[i+2] &lt;= -65) {<BR>&nbsp;&nbsp;goodbytes += 3;<BR>&nbsp;&nbsp;i+=2;<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}</P>
<P>&nbsp;if (asciibytes == rawtextlen) { return 0; }</P>
<P>&nbsp;score = (int)(100 * ((float)goodbytes/(float)(rawtextlen-asciibytes)));</P>
<P>&nbsp;// Scores of 98 or below are reduced to zero to prevent coincidental matches,<BR>&nbsp;// except that input with more than 30 good bytes is accepted above 95,<BR>&nbsp;// which allows for a few malformed sequences<BR>&nbsp;if (score &gt; 98) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; return score;<BR>&nbsp;} else if (score &gt; 95 &amp;&amp; goodbytes &gt; 30) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; return score;<BR>&nbsp;} else {<BR>&nbsp;&nbsp;&nbsp;&nbsp; return 0;<BR>&nbsp;}</P>
<P>&nbsp;&nbsp;&nbsp; }</P>
<P><BR>&nbsp;&nbsp;&nbsp; /* Function:&nbsp; utf16_probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Argument:&nbsp; byte array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns :&nbsp; number from 0 to 100 representing probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; text in array uses UTF-16 encoding of Unicode, guess based on BOM<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // NOT VERY GENERAL, NEEDS MUCH MORE WORK<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; int utf16_probability(byte[] rawtext) {<BR>&nbsp;//int score = 0;<BR>&nbsp;//int i, rawtextlen = 0;<BR>&nbsp;//int goodbytes = 0, asciibytes = 0;</P>
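<P>&nbsp;// A BOM needs two bytes; shorter input cannot be identified this way<BR>&nbsp;if (rawtext.length &gt; 1 &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp; (((byte)0xFE == rawtext[0] &amp;&amp; (byte)0xFF == rawtext[1]) ||&nbsp; // Big-endian<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ((byte)0xFF == rawtext[0] &amp;&amp; (byte)0xFE == rawtext[1]))) {&nbsp; // Little-endian<BR>&nbsp;&nbsp;&nbsp;&nbsp; return 100;<BR>&nbsp;}</P>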
<P>&nbsp;return 0;</P>
<P>&nbsp;/*&nbsp;// Check to see if characters fit into acceptable ranges<BR>&nbsp;rawtextlen = rawtext.length;<BR>&nbsp;for (i = 0; i &lt; rawtextlen; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; if ((rawtext[i] &amp; (byte)0x7F) == rawtext[i]) {&nbsp; // One byte<BR>&nbsp;&nbsp;goodbytes += 1;<BR>&nbsp;&nbsp;asciibytes++;<BR>&nbsp;&nbsp;&nbsp;&nbsp; } else if ((rawtext[i] &amp; (byte)0xDF) == rawtext[i]) { // Two bytes<BR>&nbsp;&nbsp;if (i+1 &lt; rawtextlen &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (rawtext[i+1] &amp; (byte)0xBF) == rawtext[i+1]) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; goodbytes += 2;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i++;<BR>&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;&nbsp; } else if ((rawtext[i] &amp; (byte)0xEF) == rawtext[i]) { // Three bytes<BR>&nbsp;&nbsp;if (i+2 &lt; rawtextlen &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (rawtext[i+1] &amp; (byte)0xBF) == rawtext[i+1] &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (rawtext[i+2] &amp; (byte)0xBF) == rawtext[i+2]) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; goodbytes += 3;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i+=2;<BR>&nbsp;&nbsp;}<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}</P>
<P>&nbsp;score = (int)(100 * ((float)goodbytes/(float)rawtext.length));</P>
<P>&nbsp;// An all ASCII file is also a good UTF8 file, but I'd rather it<BR>&nbsp;// get identified as ASCII.&nbsp; Can delete following 3 lines otherwise<BR>&nbsp;if (goodbytes == asciibytes) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; score = 0;<BR>&nbsp;}</P>
<P>&nbsp;// If not above 90, reduce to zero to prevent coincidental matches<BR>&nbsp;if (score &gt; 90) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; return score;<BR>&nbsp;} else {<BR>&nbsp;&nbsp;&nbsp;&nbsp; return 0;<BR>&nbsp;&nbsp;&nbsp;&nbsp; } */</P>
<P>&nbsp;&nbsp;&nbsp; }</P>
<P>&nbsp;</P>
<P>&nbsp;&nbsp;&nbsp; /* Function:&nbsp; ascii_probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Argument:&nbsp; byte array<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Returns :&nbsp; number from 0 to 100 representing probability<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; text in array uses all ASCII<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Description:&nbsp; Sees if array has any characters not in<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ASCII range, if so, score is reduced<BR>&nbsp;&nbsp;&nbsp; */</P>
<P>&nbsp;&nbsp;&nbsp; int ascii_probability(byte[] rawtext) {<BR>&nbsp;int score = 70;<BR>&nbsp;int i, rawtextlen;</P>
<P>&nbsp;rawtextlen = rawtext.length;</P>
<P>&nbsp;for (i = 0; i &lt; rawtextlen; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; if (rawtext[i] &lt; 0) {<BR>&nbsp;&nbsp;score = score - 5;<BR>&nbsp;&nbsp;&nbsp;&nbsp; } else if (rawtext[i] == (byte)0x1B) { // ESC (used by ISO 2022)<BR>&nbsp;&nbsp;score = score - 5;<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}</P>
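<P>&nbsp;return Math.max(score, 0); // clamp: many non-ASCII bytes would otherwise push the score below the documented 0 to 100 range<BR>&nbsp;&nbsp;&nbsp; }</P>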
<P>&nbsp;</P>
<P>&nbsp;&nbsp;&nbsp; void initialize_frequencies() {<BR>&nbsp;int i, j;</P>
<P>&nbsp;for (i = 0; i &lt; 93; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; for (j = 0; j &lt; 93; j++) {<BR>&nbsp;&nbsp;GBFreq[i][j] = 0;<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}</P>
<P>&nbsp;for (i = 0; i &lt; 126; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; for (j = 0; j &lt; 191; j++) {<BR>&nbsp;&nbsp;GBKFreq[i][j] = 0;<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}</P>
<P>&nbsp;for (i = 0; i &lt; 93; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; for (j = 0; j &lt; 157; j++) {<BR>&nbsp;&nbsp;Big5Freq[i][j] = 0;<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}</P>
<P>&nbsp;for (i = 0; i &lt; 93; i++) {<BR>&nbsp;&nbsp;&nbsp;&nbsp; for (j = 0; j &lt; 93; j++) {<BR>&nbsp;&nbsp;EUC_TWFreq[i][j] = 0;<BR>&nbsp;&nbsp;&nbsp;&nbsp; }<BR>&nbsp;}</P>
<P>&nbsp;GBFreq[20][35] = 599;&nbsp;GBFreq[49][26] = 598;<BR>&nbsp;GBFreq[41][38] = 597;&nbsp;GBFreq[17][26] = 596;<BR>&nbsp;GBFreq[32][42] = 595;&nbsp;GBFreq[39][42] = 594;<BR>&nbsp;GBFreq[45][49] = 593;&nbsp;GBFreq[51][57] = 592;<BR>&nbsp;GBFreq[50][47] = 591;&nbsp;GBFreq[42][90] = 590;<BR>&nbsp;GBFreq[52][65] = 589;&nbsp;GBFreq[53][47] = 588;<BR>&nbsp;GBFreq[19][82] = 587;&nbsp;GBFreq[31][19] = 586;<BR>&nbsp;GBFreq[40][46] = 585;&nbsp;GBFreq[24][89] = 584;<BR>&nbsp;GBFreq[23][85] = 583;&nbsp;GBFreq[20][28] = 582;<BR>&nbsp;GBFreq[42][20] = 581;&nbsp;GBFreq[34][38] = 580;<BR>&nbsp;GBFreq[45][9] = 579;&nbsp;GBFreq[54][50] = 578;<BR>&nbsp;GBFreq[25][44] = 577;&nbsp;GBFreq[35][66] = 576;<BR>&nbsp;GBFreq[20][55] = 575;&nbsp;GBFreq[18][85] = 574;<BR>&nbsp;GBFreq[20][31] = 573;&nbsp;GBFreq[49][17] = 572;<BR>&nbsp;GBFreq[41][16] = 571;&nbsp;GBFreq[35][73] = 570;<BR>&nbsp;GBFreq[20][34] = 569;&nbsp;GBFreq[29][44] = 568;<BR>&nbsp;GBFreq[35][38] = 567;&nbsp;GBFreq[49][9] = 566;<BR>&nbsp;GBFreq[46][33] = 565;&nbsp;GBFreq[49][51] = 564;<BR>&nbsp;GBFreq[40][89] = 563;&nbsp;GBFreq[26][64] = 562;<BR>&nbsp;GBFreq[54][51] = 561;&nbsp;GBFreq[54][36] = 560;<BR>&nbsp;GBFreq[39][4] = 559;&nbsp;GBFreq[53][13] = 558;<BR>&nbsp;GBFreq[24][92] = 557;&nbsp;GBFreq[27][49] = 556;<BR>&nbsp;GBFreq[48][6] = 555;&nbsp;GBFreq[21][51] = 554;<BR>&nbsp;GBFreq[30][40] = 553;&nbsp;GBFreq[42][92] = 552;<BR>&nbsp;GBFreq[31][78] = 551;&nbsp;GBFreq[25][82] = 550;<BR>&nbsp;GBFreq[47][0] = 549;&nbsp;GBFreq[34][19] = 548;<BR>&nbsp;GBFreq[47][35] = 547;&nbsp;GBFreq[21][63] = 546;<BR>&nbsp;GBFreq[43][75] = 545;&nbsp;GBFreq[21][87] = 544;<BR>&nbsp;GBFreq[35][59] = 543;&nbsp;GBFreq[25][34] = 542;<BR>&nbsp;GBFreq[21][27] = 541;&nbsp;GBFreq[39][26] = 540;<BR>&nbsp;GBFreq[34][26] = 539;&nbsp;GBFreq[39][52] = 538;<BR>&nbsp;GBFreq[50][57] = 537;&nbsp;GBFreq[37][79] = 536;<BR>&nbsp;GBFreq[26][24] = 535;&nbsp;GBFreq[22][1] = 534;<BR>&nbsp;GBFreq[18][40] = 533;&nbsp;GBFreq[41][33] = 532;<BR>&nbsp;GBFreq[53][26] = 
531;&nbsp;GBFreq[54][86] = 530;<BR>&nbsp;GBFreq[20][16] = 529;&nbsp;GBFreq[46][74] = 528;<BR>&nbsp;GBFreq[30][19] = 527;&nbsp;GBFreq[45][35] = 526;<BR>&nbsp;GBFreq[45][61] = 525;&nbsp;GBFreq[30][9] = 524;<BR>&nbsp;GBFreq[41][53] = 523;&nbsp;GBFreq[41][13] = 522;<BR>&nbsp;GBFreq[50][34] = 521;&nbsp;GBFreq[53][86] = 520;<BR>&nbsp;GBFreq[47][47] = 519;&nbsp;GBFreq[22][28] = 518;<BR>&nbsp;GBFreq[50][53] = 517;&nbsp;GBFreq[39][70] = 516;<BR>&nbsp;GBFreq[38][15] = 515;&nbsp;GBFreq[42][88] = 514;<BR>&nbsp;GBFreq[16][29] = 513;&nbsp;GBFreq[27][90] = 512;<BR>&nbsp;GBFreq[29][12] = 511;&nbsp;GBFreq[44][22] = 510;<BR>&nbsp;GBFreq[34][69] = 509;&nbsp;GBFreq[24][10] = 508;<BR>&nbsp;GBFreq[44][11] = 507;&nbsp;GBFreq[39][92] = 506;<BR>&nbsp;GBFreq[49][48] = 505;&nbsp;GBFreq[31][46] = 504;<BR>&nbsp;GBFreq[19][50] = 503;&nbsp;GBFreq[21][14] = 502;<BR>&nbsp;GBFreq[32][28] = 501;&nbsp;GBFreq[18][3] = 500;<BR>&nbsp;GBFreq[53][9] = 499;&nbsp;GBFreq[34][80] = 498;<BR>&nbsp;GBFreq[48][88] = 497;&nbsp;GBFreq[46][53] = 496;<BR>&nbsp;GBFreq[22][53] = 495;&nbsp;GBFreq[28][10] = 494;<BR>&nbsp;GBFreq[44][65] = 493;&nbsp;GBFreq[20][10] = 492;<BR>&nbsp;GBFreq[40][76] = 491;&nbsp;GBFreq[47][8] = 490;<BR>&nbsp;GBFreq[50][74] = 489;&nbsp;GBFreq[23][62] = 488;<BR>&nbsp;GBFreq[49][65] = 487;&nbsp;GBFreq[28][87] = 486;<BR>&nbsp;GBFreq[15][48] = 485;&nbsp;GBFreq[22][7] = 484;<BR>&nbsp;GBFreq[19][42] = 483;&nbsp;GBFreq[41][20] = 482;<BR>&nbsp;GBFreq[26][55] = 481;&nbsp;GBFreq[21][93] = 480;<BR>&nbsp;GBFreq[31][76] = 479;&nbsp;GBFreq[34][31] = 478;<BR>&nbsp;GBFreq[20][66] = 477;&nbsp;GBFreq[51][33] = 476;<BR>&nbsp;GBFreq[34][86] = 475;&nbsp;GBFreq[37][67] = 474;<BR>&nbsp;GBFreq[53][53] = 473;&nbsp;GBFreq[40][88] = 472;<BR>&nbsp;GBFreq[39][10] = 471;&nbsp;GBFreq[24][3] = 470;<BR>&nbsp;GBFreq[27][25] = 469;&nbsp;GBFreq[26][15] = 468;<BR>&nbsp;GBFreq[21][88] = 467;&nbsp;GBFreq[52][62] = 466;<BR>&nbsp;GBFreq[46][81] = 465;&nbsp;GBFreq[38][72] = 464;<BR>&nbsp;GBFreq[17][30] = 463;&nbsp;GBFreq[52][92] = 
462;<BR>&nbsp;GBFreq[34][90] = 461;&nbsp;GBFreq[21][7] = 460;<BR>&nbsp;GBFreq[36][13] = 459;&nbsp;GBFreq[45][41] = 458;<BR>&nbsp;GBFreq[32][5] = 457;&nbsp;GBFreq[26][89] = 456;<BR>&nbsp;GBFreq[23][87] = 455;&nbsp;GBFreq[20][39] = 454;<BR>&nbsp;GBFreq[27][23] = 453;&nbsp;GBFreq[25][59] = 452;<BR>&nbsp;GBFreq[49][20] = 451;&nbsp;GBFreq[54][77] = 450;<BR>&nbsp;GBFreq[27][67] = 449;&nbsp;GBFreq[47][33] = 448;<BR>&nbsp;GBFreq[41][17] = 447;&nbsp;GBFreq[19][81] = 446;<BR>&nbsp;GBFreq[16][66] = 445;&nbsp;GBFreq[45][26] = 444;<BR>&nbsp;GBFreq[49][81] = 443;&nbsp;GBFreq[53][55] = 442;<BR>&nbsp;GBFreq[16][26] = 441;&nbsp;GBFreq[54][62] = 440;<BR>&nbsp;GBFreq[20][70] = 439;&nbsp;GBFreq[42][35] = 438;<BR>&nbsp;GBFreq[20][57] = 437;&nbsp;GBFreq[34][36] = 436;<BR>&nbsp;GBFreq[46][63] = 435;&nbsp;GBFreq[19][45] = 434;<BR>&nbsp;GBFreq[21][10] = 433;&nbsp;GBFreq[52][93] = 432;<BR>&nbsp;GBFreq[25][2] = 431;&nbsp;GBFreq[30][57] = 430;<BR>&nbsp;GBFreq[41][24] = 429;&nbsp;GBFreq[28][43] = 428;<BR>&nbsp;GBFreq[45][86] = 427;&nbsp;GBFreq[51][56] = 426;<BR>&nbsp;GBFreq[37][28] = 425;&nbsp;GBFreq[52][69] = 424;<BR>&nbsp;GBFreq[43][92] = 423;&nbsp;GBFreq[41][31] = 422;<BR>&nbsp;GBFreq[37][87] = 421;&nbsp;GBFreq[47][36] = 420;<BR>&nbsp;GBFreq[16][16] = 419;&nbsp;GBFreq[40][56] = 418;<BR>&nbsp;GBFreq[24][55] = 417;&nbsp;GBFreq[17][1] = 416;<BR>&nbsp;GBFreq[35][57] = 415;&nbsp;GBFreq[27][50] = 414;<BR>&nbsp;GBFreq[26][14] = 413;&nbsp;GBFreq[50][40] = 412;<BR>&nbsp;GBFreq[39][19] = 411;&nbsp;GBFreq[19][89] = 410;<BR>GBFreq[29][91] = 409;&nbsp;GBFreq[17][89] = 408;<BR>GBFreq[39][74] = 407;&nbsp;GBFreq[46][39] = 406;<BR>GBFreq[40][28] = 405;&nbsp;GBFreq[45][68] = 404;<BR>GBFreq[43][10] = 403;&nbsp;GBFreq[42][13] = 402;<BR>GBFreq[44][81] = 401;&nbsp;GBFreq[41][47] = 400;<BR>GBFreq[48][58] = 399;&nbsp;GBFreq[43][68] = 398;<BR>GBFreq[16][79] = 397;&nbsp;GBFreq[19][5] = 396;<BR>GBFreq[54][59] = 395;&nbsp;GBFreq[17][36] = 394;<BR>GBFreq[18][0] = 393;&nbsp;GBFreq[41][5] = 392;<BR>GBFreq[41][72] = 
391;&nbsp;GBFreq[16][39] = 390;<BR>GBFreq[54][0] = 389;&nbsp;GBFreq[51][16] = 388;<BR>GBFreq[29][36] = 387;&nbsp;GBFreq[47][5] = 386;<BR>GBFreq[47][51] = 385;&nbsp;GBFreq[44][7] = 384;<BR>GBFreq[35][30] = 383;&nbsp;GBFreq[26][9] = 382;<BR>GBFreq[16][7] = 381;&nbsp;GBFreq[32][1] = 380;<BR>GBFreq[33][76] = 379;&nbsp;GBFreq[34][91] = 378;<BR>GBFreq[52][36] = 377;&nbsp;GBFreq[26][77] = 376;<BR>GBFreq[35][48] = 375;&nbsp;GBFreq[40][80] = 374;<BR>GBFreq[41][92] = 373;&nbsp;GBFreq[27][93] = 372;<BR>GBFreq[15][17] = 371;&nbsp;GBFreq[16][76] = 370;<BR>GBFreq[51][12] = 369;&nbsp;GBFreq[18][20] = 368;<BR>GBFreq[15][54] = 367;&nbsp;GBFreq[50][5] = 366;<BR>GBFreq[33][22] = 365;&nbsp;GBFreq[37][57] = 364;<BR>GBFreq[28][47] = 363;&nbsp;GBFreq[42][31] = 362;<BR>GBFreq[18][2] = 361;&nbsp;GBFreq[43][64] = 360;<BR>GBFreq[23][47] = 359;&nbsp;GBFreq[28][79] = 358;<BR>GBFreq[25][45] = 357;&nbsp;GBFreq[23][91] = 356;<BR>GBFreq[22][19] = 355;&nbsp;GBFreq[25][46] = 354;<BR>GBFreq[22][36] = 353;&nbsp;GBFreq[54][85] = 352;<BR>GBFreq[46][20] = 351;&nbsp;GBFreq[27][37] = 350;<BR>GBFreq[26][81] = 349;&nbsp;GBFreq[42][29] = 348;<BR>GBFreq[31][90] = 347;&nbsp;GBFreq[41][59] = 346;<BR>GBFreq[24][65] = 345;&nbsp;GBFreq[44][84] = 344;<BR>GBFreq[24][90] = 343;&nbsp;GBFreq[38][54] = 342;<BR>GBFreq[28][70] = 341;&nbsp;GBFreq[27][15] = 340;<BR>GBFreq[28][80] = 339;&nbsp;GBFreq[29][8] = 338;<BR>GBFreq[45][80] = 337;&nbsp;GBFreq[53][37] = 336;<BR>GBFreq[28][65] = 335;&nbsp;GBFreq[23][86] = 334;<BR>GBFreq[39][45] = 333;&nbsp;GBFreq[53][32] = 332;<BR>GBFreq[38][68] = 331;&nbsp;GBFreq[45][78] = 330;<BR>GBFreq[43][7] = 329;&nbsp;GBFreq[46][82] = 328;<BR>GBFreq[27][38] = 327;&nbsp;GBFreq[16][62] = 326;<BR>GBFreq[24][17] = 325;&nbsp;GBFreq[22][70] = 324;<BR>GBFreq[52][28] = 323;&nbsp;GBFreq[23][40] = 322;<BR>GBFreq[28][50] = 321;&nbsp;GBFreq[42][91] = 320;<BR>GBFreq[47][76] = 319;&nbsp;GBFreq[15][42] = 318;<BR>GBFreq[43][55] = 317;&nbsp;GBFreq[29][84] = 316;<BR>GBFreq[44][90] = 315;&nbsp;GBFreq[53][16] = 
314;<BR>GBFreq[22][93] = 313;&nbsp;GBFreq[34][10] = 312;<BR>GBFreq[32][53] = 311;&nbsp;GBFreq[43][65] = 310;<BR>GBFreq[28][7] = 309;&nbsp;GBFreq[35][46] = 308;<BR>GBFreq[21][39] = 307;&nbsp;GBFreq[44][18] = 306;<BR>GBFreq[40][10] = 305;&nbsp;GBFreq[54][53] = 304;<BR>GBFreq[38][74] = 303;&nbsp;GBFreq[28][26] = 302;<BR>GBFreq[15][13] = 301;&nbsp;GBFreq[39][34] = 300;<BR>GBFreq[39][46] = 299;&nbsp;GBFreq[42][66] = 298;<BR>GBFreq[33][58] = 297;&nbsp;GBFreq[15][56] = 296;<BR>GBFreq[18][51] = 295;&nbsp;GBFreq[49][68] = 294;<BR>GBFreq[30][37] = 293;&nbsp;GBFreq[51][84] = 292;<BR>GBFreq[51][9] = 291;&nbsp;GBFreq[40][70] = 290;<BR>GBFreq[41][84] = 289;&nbsp;GBFreq[28][64] = 288;<BR>GBFreq[32][88] = 287;&nbsp;GBFreq[24][5] = 286;<BR>GBFreq[53][23] = 285;&nbsp;GBFreq[42][27] = 284;<BR>GBFreq[22][38] = 283;&nbsp;GBFreq[32][86] = 282;<BR>GBFreq[34][30] = 281;&nbsp;GBFreq[38][63] = 280;<BR>GBFreq[24][59] = 279;&nbsp;GBFreq[22][81] = 278;<BR>GBFreq[32][11] = 277;&nbsp;GBFreq[51][21] = 276;<BR>GBFreq[54][41] = 275;&nbsp;GBFreq[21][50] = 274;<BR>GBFreq[23][89] = 273;&nbsp;GBFreq[19][87] = 272;<BR>GBFreq[26][7] = 271;&nbsp;GBFreq[30][75] = 270;<BR>GBFreq[43][84] = 269;&nbsp;GBFreq[51][25] = 268;<BR>GBFreq[16][67] = 267;&nbsp;GBFreq[32][9] = 266;<BR>GBFreq[48][51] = 265;&nbsp;GBFreq[39][7] = 264;<BR>GBFreq[44][88] = 263;&nbsp;GBFreq[52][24] = 262;<BR>GBFreq[23][34] = 261;&nbsp;GBFreq[32][75] = 260;<BR>GBFreq[19][10] = 259;&nbsp;GBFreq[28][91] = 258;<BR>GBFreq[32][83] = 257;&nbsp;GBFreq[25][75] = 256;<BR>GBFreq[53][45] = 255;&nbsp;GBFreq[29][85] = 254;<BR>GBFreq[53][59] = 253;&nbsp;GBFreq[16][2] = 252;<BR>GBFreq[19][78] = 251;&nbsp;GBFreq[15][75] = 250;<BR>GBFreq[51][42] = 249;&nbsp;GBFreq[45][67] = 248;<BR>GBFreq[15][74] = 247;&nbsp;GBFreq[25][81] = 246;<BR>GBFreq[37][62] = 245;&nbsp;GBFreq[16][55] = 244;<BR>GBFreq[18][38] = 243;&nbsp;GBFreq[23][23] = 242;<BR></P><img src ="http://www.blogjava.net/bluelily22/aggbug/16330.html" width = "1" height = "1" /><br><br><div align=right><a 
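style="text-decoration:none;" href="http://www.blogjava.net/bluelily22/" target="_blank">丁丁</a> 2005-10-21 19:50 <a href="http://www.blogjava.net/bluelily22/archive/2005/10/21/16330.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>The Complete Guide to Uploading and Downloading with jspSmartUpload</title><link>http://www.blogjava.net/bluelily22/archive/2005/10/21/16323.html</link><dc:creator>丁丁</dc:creator><author>丁丁</author><pubDate>Fri, 21 Oct 2005 10:09:00 GMT</pubDate><guid>http://www.blogjava.net/bluelily22/archive/2005/10/21/16323.html</guid><wfw:comment>http://www.blogjava.net/bluelily22/comments/16323.html</wfw:comment><comments>http://www.blogjava.net/bluelily22/archive/2005/10/21/16323.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/bluelily22/comments/commentRss/16323.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/bluelily22/services/trackbacks/16323.html</trackback:ping><description><![CDATA[<DIV align=left>Excerpted from: <A href="http://www.j2eesp.com" target=_blank><SPAN class=px12>http://www.j2eesp.com</SPAN></A> <BR><BR><BR>Part 1: Installation<BR><BR>jspSmartUpload is a free, full-featured file upload and download component developed by www.jspsmart.com, suited to being embedded in JSP files that perform uploads and downloads. Its main features are:<BR><BR>1. Easy to use. Three to five lines of Java code in a JSP file are enough to handle a file upload or download.<BR><BR>2. Full control over the upload. The objects and methods the component provides expose all the information about every uploaded file (file name, size, type, extension, file data, and so on) for easy access.<BR><BR>3. Uploads can be restricted by size, type, and similar criteria, so files that do not meet the requirements are filtered out.<BR><BR>4. Flexible downloads. Two lines of code turn the Web server into a file server. Whether a file sits under the Web server's directory or in any other directory, jspSmartUpload can serve it for download.<BR><BR>5. Files can be uploaded into a database, and data in a database can be downloaded. This feature targets MySQL and is not general-purpose, so this article gives no example of it.<BR><BR>The jspSmartUpload component can be downloaded freely from www.jspsmart.com; the archive is named jspSmartUpload.zip. After downloading, extract it with WinZip or WinRAR into Tomcat's webapps directory (this article uses the Tomcat server as its example). Then rename the subdirectory Web-inf under webapps/jspsmartupload to the all-uppercase WEB-INF. The jspSmartUpload classes can only be used after this change, because Tomcat is case-sensitive about file names and requires a Web application's classes to live in a directory named WEB-INF, in uppercase. Finally, restart Tomcat, and the jspSmartUpload component can be used in JSP files.<BR><BR>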
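Editor's sketch: feature 3 above (restricting uploads by size and type) comes down to a check like the one below. This is not jspSmartUpload code. The class UploadPolicy and its methods extensionOf and accepts are hypothetical names invented for illustration, and the component's own API for setting such limits may look quite different.<BR><BR>

```java
import java.util.Locale;

// Hypothetical helper, NOT part of jspSmartUpload: it only models the idea of
// rejecting uploads by extension and by size.
final class UploadPolicy {
    private final String[] allowedExtensions; // lower-case, without the dot
    private final long maxBytes;              // largest accepted size in bytes

    UploadPolicy(long maxBytes, String... allowedExtensions) {
        this.maxBytes = maxBytes;
        this.allowedExtensions = new String[allowedExtensions.length];
        int next = 0;
        for (String ext : allowedExtensions) {
            this.allowedExtensions[next++] = ext.toLowerCase(Locale.ROOT);
        }
    }

    // Text after the last dot, lower-cased; "" when the name has no dot.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot == -1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // True when the size is within the limit and the extension is on the list.
    boolean accepts(String fileName, long sizeInBytes) {
        if (sizeInBytes > maxBytes) {
            return false;
        }
        String ext = extensionOf(fileName);
        for (String allowed : allowedExtensions) {
            if (allowed.equals(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UploadPolicy policy = new UploadPolicy(1024 * 1024, "zip", "txt");
        System.out.println(policy.accepts("sample.ZIP", 2048));          // true
        System.out.println(policy.accepts("sample.exe", 2048));          // false
        System.out.println(policy.accepts("big.zip", 5L * 1024 * 1024)); // false
    }
}
```

Extensions are lower-cased on both sides, so the match is case-insensitive; a real deployment would apply such a check before saving anything to disk.<BR><BR>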
Note: installed this way, only programs under the webapps/jspsmartupload directory can use the jspSmartUpload component. To make it available to every Web application on the Tomcat server, do the following:<BR><BR>1. Open a command prompt and change to Tomcat's webapps/jspsmartupload/WEB-INF directory.<BR><BR>2. Run the JAR packaging command: jar cvf jspSmartUpload.jar com<BR><BR>(Alternatively, open Explorer, go to the same directory, use WinZip to compress everything under the com directory into jspSmartUpload.zip, and then rename jspSmartUpload.zip to jspSmartUpload.jar.)<BR><BR>3. Copy jspSmartUpload.jar to Tomcat's shared/lib directory.<BR><BR>Part 2: Class Reference<BR><BR>(1) The File class<BR><BR>This class wraps all the information about one uploaded file. Through it you can get the uploaded file's name, size, extension, data, and so on.<BR><BR>The File class mainly provides the following methods:<BR><BR>1. saveAs<BR><BR>Purpose: saves the file under a new name.<BR><BR>Prototype:<BR><BR>public void saveAs(java.lang.String destFilePathName)<BR><BR>or<BR><BR>public void saveAs(java.lang.String destFilePathName, int optionSaveAs)<BR><BR>Here destFilePathName is the name to save the file under and optionSaveAs is the save option, which takes one of three values: SAVEAS_PHYSICAL, SAVEAS_VIRTUAL, or SAVEAS_AUTO. SAVEAS_PHYSICAL resolves the path against the operating system's root directory. SAVEAS_VIRTUAL resolves it against the Web application's root directory. SAVEAS_AUTO lets the component decide: if the target directory exists under the Web application's root it chooses SAVEAS_VIRTUAL, otherwise it chooses SAVEAS_PHYSICAL.<BR><BR>For example, if the Web server is installed on drive C, saveAs("/upload/sample.zip",SAVEAS_PHYSICAL) actually saves to c:\upload\sample.zip. If the Web application's root directory is webapps/jspsmartupload, saveAs("/upload/sample.zip",SAVEAS_VIRTUAL) actually saves to webapps/jspsmartupload/upload/sample.zip. saveAs("/upload/sample.zip",SAVEAS_AUTO) behaves like saveAs("/upload/sample.zip",SAVEAS_VIRTUAL) when an upload directory exists under the Web application's root, and like saveAs("/upload/sample.zip",SAVEAS_PHYSICAL) otherwise.<BR><BR>Recommendation: for Web development, prefer SAVEAS_VIRTUAL, which keeps the application portable.<BR><BR>2. isMissing<BR><BR>Purpose: tells whether the user selected a file, that is, whether the corresponding form field has a value. It returns false when a file was selected and true when none was.<BR><BR>Prototype: public boolean isMissing()<BR><BR>3. getFieldName<BR><BR>Purpose: returns the name of the HTML form field that corresponds to this uploaded file.<BR><BR>Prototype: public String getFieldName()<BR><BR>4. getFileName<BR><BR>Purpose: returns the file name (without directory information).<BR><BR>Prototype: public String getFileName()<BR><BR>5. getFilePathName<BR><BR>Purpose: returns the full file name (including the directory).<BR><BR>Prototype: public String getFilePathName()<BR><BR>6. getFileExt<BR><BR>Purpose: returns the file extension (suffix).<BR><BR>Prototype: public String getFileExt()<BR><BR>7. getSize<BR><BR>Purpose: returns the file length in bytes.<BR><BR>Prototype: public int getSize()<BR><BR>8. getBinaryData<BR><BR>Purpose: returns the single byte at the given offset in the file data, for tasks such as inspecting the file.<BR><BR>Prototype: public byte getBinaryData(int index), where index is the offset, between 0 and getSize()-1.<BR><BR>(2) The Files class<BR><BR>This class represents the collection of all uploaded files. Through it you can get the number of uploaded files, their total size, and so on. It has the following methods:<BR><BR>1. getCount<BR><BR>Purpose: returns the number of uploaded files.<BR><BR>Prototype: public int getCount()<BR><BR>2. getFile<BR><BR>Purpose: returns the File object at the given index (this is com.jspsmart.upload.File, not java.io.File; take care to distinguish them).<BR><BR>Prototype: public File getFile(int index), where index is between 0 and getCount()-1.<BR><BR>3. getSize<BR><BR>Purpose: returns the total length of the uploaded files; useful for limiting the amount of data uploaded at one time.<BR><BR>Prototype: public long getSize()<BR><BR>4. getCollection<BR><BR>Purpose: returns all uploaded file objects as a Collection, so that other code can reference them and browse the upload information.<BR><BR>Prototype: public Collection getCollection()<BR><BR>5. getEnumeration<BR><BR>Purpose: returns all uploaded file objects as an Enumeration, so that other code can browse the upload information.<BR><BR>Prototype: public Enumeration getEnumeration()<BR><BR>(3) The Request class<BR><BR>
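This class plays the same role as the JSP built-in request object. It exists because, for a file-upload form, the form field values cannot be obtained through the request object; they must be read through the Request object that the jspSmartUpload component provides. The class offers the following methods:<BR><BR>
1. getParameter<BR><BR>
Purpose: returns the value of the named parameter, or null if the parameter does not exist.<BR><BR>
Prototype: public String getParameter(String name), where name is the parameter name.<BR><BR>
2. getParameterValues<BR><BR>
Purpose: returns the values of a parameter that can have more than one value, as an array of strings, or null if the parameter does not exist.<BR><BR>
Prototype: public String[] getParameterValues(String name), where name is the parameter name.<BR><BR>
3. getParameterNames<BR><BR>
Purpose: returns the names of all parameters in the Request object, as an Enumeration, so that all parameters can be traversed.<BR><BR>
Prototype: public Enumeration getParameterNames()<BR><BR>
(4) The SmartUpload class<BR><BR>
This class does the actual uploading and downloading.<BR><BR>
A. Method shared by upload and download:<BR><BR>
There is only one: initialize.<BR><BR>
Purpose: performs the initialization for uploading and downloading. It must be called first.<BR><BR>
Prototype: there are several; the one mainly used is:<BR><BR>
public final void initialize(javax.servlet.jsp.PageContext pageContext)<BR><BR>
where pageContext is the JSP page's built-in object (the page context).<BR><BR>
B. Methods used for uploading files:<BR><BR>
1. upload<BR><BR>
Purpose: uploads the file data. For an upload, call initialize first and this method second.<BR><BR>
Prototype: public void upload()<BR><BR>
2. save<BR><BR>
Purpose: saves all uploaded files to the given directory and returns the number of files saved.<BR><BR>
Prototype: public int save(String destPathName)<BR><BR>
and public int save(String destPathName,int option)<BR><BR>
where destPathName is the directory to save into and option is the save option, one of SAVE_PHYSICAL, SAVE_VIRTUAL, or SAVE_AUTO (analogous to the options of the File class's saveAs method). SAVE_PHYSICAL tells the component to resolve the directory from the operating system's root directory, SAVE_VIRTUAL tells it to resolve the directory from the web application's root directory, and SAVE_AUTO lets the component choose automatically.<BR><BR>
Note: save(destPathName) is equivalent to save(destPathName,SAVE_AUTO).<BR><BR>
3. getSize<BR><BR>
Purpose: returns the total length of the uploaded file data.<BR><BR>
Prototype: public int getSize()<BR><BR>
4. getFiles<BR><BR>
Purpose: returns all uploaded files as a Files object, whose methods give the number of uploaded files and similar information.<BR><BR>
Prototype: public Files getFiles()<BR><BR>
5. getRequest<BR><BR>
Purpose: returns the Request object, from which the upload form's parameter values can be read.<BR><BR>
Prototype: public Request getRequest()<BR><BR>
6. setAllowed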
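FilesList<BR><BR>
Purpose: restricts uploads to files with the given extensions. If a file with a disallowed name is encountered during the upload, the component throws an exception.<BR><BR>
Prototype: public void setAllowedFilesList(String allowedFilesList)<BR><BR>
where allowedFilesList is the list of allowed file extensions, separated by commas. To allow files that have no extension, use two consecutive commas. For example, setAllowedFilesList("doc,txt,,") allows files with the doc and txt extensions as well as files with no extension.<BR><BR>
7. setDeniedFilesList<BR><BR>
Purpose: blocks the upload of files with the given extensions. If a file has a denied extension, the component throws an exception during the upload.<BR><BR>
Prototype: public void setDeniedFilesList(String deniedFilesList)<BR><BR>
where deniedFilesList is the list of denied file extensions, separated by commas. To deny files that have no extension, use two consecutive commas. For example, setDeniedFilesList("exe,bat,,") denies files with the exe and bat extensions as well as files with no extension.<BR><BR>
8. setMaxFileSize<BR><BR>
Purpose: sets the maximum size allowed for each uploaded file.<BR><BR>
Prototype: public void setMaxFileSize(long maxFileSize)<BR><BR>
where maxFileSize is the maximum size allowed for each file; a file that exceeds it is not uploaded.<BR><BR>
9. setTotalMaxFileSize<BR><BR>
Purpose: sets the total size allowed for all uploaded files, to limit the amount of data uploaded at one time.<BR><BR>
Prototype: public void setTotalMaxFileSize(long totalMaxFileSize)<BR><BR>
where totalMaxFileSize is the total size allowed for the uploaded files.<BR><BR>
C. Methods used for downloading files:<BR><BR>
1. setContentDisposition<BR><BR>
Purpose: appends data to the CONTENT-DISPOSITION field of the MIME header. The jspSmartUpload component fills in this field automatically when it returns download information; use this method if you need to add extra information.<BR><BR>
Prototype: public void setContentDisposition(String contentDisposition)<BR><BR>
where contentDisposition is the data to add. If contentDisposition is null, the component adds "attachment;" automatically to mark the downloaded file as an attachment, so that IE prompts to save the file instead of opening it automatically (IE normally decides what to do from the downloaded file's extension: a doc file is opened in Word, a pdf file in Acrobat, and so on).<BR><BR>
2. downloadFile<BR><BR>
Purpose: downloads a file.<BR><BR>
Prototype: three prototypes are available. The first is the most common; the other two are for special cases (such as changing the content type or the save-as file name).<BR><BR>
① public void downloadFile(String sourceFilePathName)<BR><BR>
where sourceFilePathName is the file to download (full name with directory).<BR><BR>
② public void downloadFile(String sourceFilePathName,String contentType)<BR><BR>
where sourceFilePathName is the file to download (full name with directory) and contentType is the content type (MIME type information that the browser can recognize).<BR><BR>
③&nbs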
p;public void downloadFile(String sourceFilePathName,String contentType,String destFileName)<BR><BR>
where sourceFilePathName is the file to download (full name with directory), contentType is the content type (MIME type information that the browser can recognize), and destFileName is the default file name offered when the download is saved.<BR><BR>
III. File Upload<BR><BR>
(1) Form requirements<BR><BR>
A FORM used to upload files must meet two requirements:<BR><BR>
1. METHOD must be POST, that is, METHOD="POST".<BR><BR>
2. It must carry the attribute ENCTYPE="multipart/form-data".<BR><BR>
Here is an example of a FORM for uploading a file:<BR><BR><BR><BR>
&lt;FORM METHOD="POST" ENCTYPE="multipart/form-data" <BR>ACTION="/jspSmartUpload/upload.jsp"&gt;<BR>
&lt;INPUT TYPE="FILE" NAME="MYFILE"&gt;<BR>
&lt;INPUT TYPE="SUBMIT"&gt;<BR>
&lt;/FORM&gt;<BR>&nbsp;<BR><BR><BR>
(2) Upload example<BR><BR>
1. The upload page upload.html<BR><BR>
This page provides a form in which the user chooses the files to upload and then clicks the upload button to start the upload.<BR><BR>
The page source is as follows:<BR><BR>
&lt;!--<BR>
File: upload.html<BR>
Author: 纵横软件制作中心 雨亦奇 (zhsoft88@sohu.com)<BR>
--&gt;<BR>
&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"&gt;<BR>
&lt;html&gt;<BR>
&lt;head&gt;<BR>
&lt;title&gt;File upload&lt;/title&gt;<BR>
&lt;meta http-equiv="Content-Type" content="text/html; charset=gb2312"&gt;<BR>
&lt;/head&gt;<BR><BR>
&lt;body&gt;<BR>
&lt;p&gt;&nbsp;&lt;/p&gt;<BR>
&lt;p align="center"&gt;Select files to upload&lt;/p&gt;<BR>
&lt;FORM METHOD="POST" ACTION="jsp/do_upload.jsp"<BR>
ENCTYPE="multipart/form-data"&gt;<BR>
&lt;input type="hidden" name="TEST" value="good"&gt;<BR>
&nbsp;&nbsp;&lt;table width="75%" border="1" align="center"&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;tr&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;td&gt;&lt;div align="center"&gt;1.<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;input type="FILE" name="FILE1" size="30"&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/div&gt;&lt;/td&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;/tr&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;tr&gt;&nbsp;<BR>&nbs
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;td&gt;&lt;div align="center"&gt;2.<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;input type="FILE" name="FILE2" size="30"&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/div&gt;&lt;/td&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;/tr&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;tr&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;td&gt;&lt;div align="center"&gt;3.<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;input type="FILE" name="FILE3" size="30"&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/div&gt;&lt;/td&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;/tr&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;tr&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;td&gt;&lt;div align="center"&gt;4.<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;input type="FILE" name="FILE4" size="30"&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/div&gt;&lt;/td&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;/tr&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;tr&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;td&gt;&lt;div align="center"&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;input type="submit" name="Submit" value="Upload it!"&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/div&gt;&lt;/td&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;/tr&gt;<BR>
&nbsp;&nbsp;&lt;/table&gt;<BR>
&lt;/FORM&gt;<BR>
&lt;/body&gt;<BR>
&lt;/html&gt;<BR>&nbsp;<BR><BR><BR>
2. The upload handler page do_upload.jsp<BR><BR>
This page performs the file upload. The comments in the page source explain in detail how the upload methods are used, so they are not repeated here.<BR><BR>
The page source is as follows:<BR><BR>
&lt;%--<BR>
File: do_upload.jsp<BR>
Author: 纵横软件制作中心 雨亦奇 (zhsoft88@sohu.com)<BR>
--%&gt;<BR>
&lt;%@ page contentType="text/html; charset=gb2312" language="java" <BR>
import="java.util.*,com.jspsmart.upload.*" errorPage="" %&gt;<BR>
&lt;html&gt;<BR>
&lt;head&gt;<BR>
&lt;title&gt;File upload handler page&lt;/title&gt;<BR>
&lt;meta&nbsp
;http-equiv="Content-Type" content="text/html; charset=gb2312"&gt;<BR>
&lt;/head&gt;<BR><BR>
&lt;body&gt;<BR>
&lt;%<BR>
// Create a SmartUpload object<BR>
SmartUpload su = new SmartUpload();<BR>
// Initialize the upload<BR>
su.initialize(pageContext);<BR>
// Set upload restrictions<BR>
// 1. Limit the maximum size of each uploaded file.<BR>
// su.setMaxFileSize(10000);<BR>
// 2. Limit the total size of the uploaded data.<BR>
// su.setTotalMaxFileSize(20000);<BR>
// 3. Set the files allowed for upload (by extension): only doc and txt files.<BR>
// su.setAllowedFilesList("doc,txt");<BR>
// 4. Set the files denied for upload (by extension): files with the exe, bat,<BR>
// jsp, htm, or html extension and files with no extension.<BR>
// su.setDeniedFilesList("exe,bat,jsp,htm,html,,");<BR>
// Upload the files<BR>
su.upload();<BR>
// Save all uploaded files to the given directory<BR>
int count = su.save("/upload");<BR>
out.println(count + " file(s) uploaded successfully!&lt;br&gt;");<BR><BR>
// Read a form parameter value through the Request object<BR>
out.println("TEST="+su.getRequest().getParameter("TEST")<BR>
+"&lt;BR&gt;&lt;BR&gt;");<BR><BR>
// Extract each uploaded file's information in turn; the file can also be saved here.<BR>
for (int i=0;i&lt;su.getFiles().getCount();i++)<BR>
{<BR>
com.jspsmart.upload.File file = su.getFiles().getFile(i);<BR><BR>
// Skip form fields for which no file was selected<BR>
if (file.isMissing()) continue;<BR><BR>
// Show the current file's information<BR>
out.println("&lt;TABLE BORDER=1&gt;");<BR>
out.println("&lt;TR&gt;&lt;TD&gt;Form field name (FieldName)&lt;/TD&gt;&lt;TD&gt;"<BR>
+ file.getFieldName() + "&lt;/TD&gt;&lt;/TR&gt;");<BR>
out.println("&lt;TR&gt;&lt;TD&gt;File size (Size)&lt;/TD&gt;&lt;TD&gt;" + <BR>
file.getSize() + "&lt;/TD&gt;&lt;/TR&gt;");<BR>
out.println("&lt;TR&gt;&lt;TD&gt;File name (FileName)&lt;/TD&gt;&lt;TD&gt;" <BR>
+ file.getFileName() + "&lt;/TD&gt;&lt;/TR&gt;");<BR>
out.println("&lt;TR&gt;&lt;TD&gt;File extension (FileExt)&lt;/TD&gt;&lt;TD&gt;" <BR>
+ file.getFileExt() + "&lt;/TD&gt;&lt;/TR&gt;");<BR>
out.println("&lt;TR&gt;&lt;TD&gt;Full file name (FilePathName)&lt;/TD&gt;&lt;TD&gt;"<BR>
+ file.getFilePathName() + "&lt;/TD&gt;&lt;/TR&gt;");<BR>
out.println("&lt;/TABLE&gt;&lt;BR&gt;");<BR><BR>
// Save the file under another name<BR>
// file.saveAs("/upload/" + file.getFileName());<BR>
// Save relative to the web application's root directory<BR>
// file.saveAs("/upload/" + file.getFileName(), com.jspsmart.upload.File.SAVEAS_VIRTUAL);<BR>
// Save relative to the operating system's root directory<BR>
// file.saveAs("c:\\temp\\" + file.getFileName(), com.jspsmart.upload.File.SAVEAS_PHYSICAL);<BR><BR>
}<BR>
%&gt;<BR>
&lt;/body&gt;<BR>
&lt;/html&gt;<BR>&nbsp;<BR><BR><BR>
IV. File Download<BR><BR>
1. The download link page download.html<BR><BR>
The page source is as follows:<BR><BR>
&lt;!--<BR>
File: download.html<BR>
Author: 纵横软件制作中心 雨亦奇 (zhsoft88@sohu.com)<BR>
--&gt;<BR>
&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"&gt;<BR>
&lt;html&gt;<BR>
&lt;head&gt;<BR>
&lt;title&gt;Download&lt;/title&gt;<BR>
&lt;meta http-equiv="Content-Type" content="text/html; charset=gb2312"&gt;<BR>
&lt;/head&gt;<BR>
&lt;body&gt;<BR>
&lt;a href="jsp/do_download.jsp"&gt;Click to download&lt;/a&gt;<BR>
&lt;/body&gt;<BR>
&lt;/html&gt;<BR>&nbsp;<BR><BR><BR>
2. The download handler page do_download.jsp<BR><BR>
do_download.jsp shows how to download a file with the jspSmartUpload component. As the source below shows, downloading is very simple.<BR><BR>
The source is as follows:<BR><BR>
&lt;%@ page contentType="text/html;charset=gb2312" <BR>
import="com.jspsmart.upload.*" %&gt;&lt;%<BR>
// Create a SmartUpload object<BR>
SmartUpload su = new SmartUpload();<BR>
// Initialize<BR>
su.initialize(pageContext);<BR>
// Set contentDisposition to null so that the browser does not open the file<BR>
// automatically and clicking the link always downloads it. Without this, the<BR>
// browser opens a downloaded doc file in Word and a pdf file in Acrobat.<BR>
su.setContentDisposition(null);<BR>
// Download the file<BR>
su.downloadFile("/upload/如何赚取我的第一桶金.doc");<BR>
%&gt;<BR>&nbsp;<BR><BR><BR>
Note: in a page that performs a download, do not put any HTML, spaces, carriage returns, or line feeds outside the Java scriptlets (that is, outside &lt;% ... %&gt;); otherwise the download will not work correctly. To see this, add a line break between %&gt; and &lt;% in the source above and download again: it will fail, because the extra characters alter the data stream returned to the browser and cause a parse error.<BR><BR>
3. Downloading files with Chinese names<BR><BR>
jspSmartUpload can download files, but its support for Chinese is lacking. If the name of the downloaded file contains Chinese characters, the file name that the browser suggests in the save dialog shows up as garbage. The example above is such a case. (Many download components share this problem, few people have solved it, and I could not find any material on it.)<BR><BR>
To add support for downloading files with Chinese names to the jspSmartUpload component, I studied the component and found that once the save-as file name returned to the browser is UTF-8 encoded, the browser displays the Chinese name correctly. So I upgraded the component's SmartUpload class and added a toUtf8String method. The modified part of the source is as follows:<BR><BR>
public void downloadFile(String s, String s1, String s2, int i)<BR>
throws ServletException, IOException, SmartUploadException<BR>
&nbsp;&nbsp;&nbsp;&nbsp;{<BR>
// s: source file, s1: content type, s2: save-as file name, i: buffer size<BR>
if(s == null)<BR>
&nbsp;&nbsp;&nbsp;&nbsp;throw new IllegalArgumentException("File '" + s +<BR>
&nbsp;&nbsp;&nbsp;&nbsp;"' not found (1040).");<BR>
if(s.equals(""))<BR>
&nbsp;&nbsp;&nbsp;&nbsp;throw new IllegalArgumentException("File '" + s +<BR>
&nbsp;&nbsp;&nbsp;&nbsp;"' not found (1040).");<BR>
if(!isVirtual(s) &amp;&amp; m_denyPhysicalPath)<BR>
&nbsp;&nbsp;&nbsp;&nbsp;throw new SecurityException("Physical path is denied (1035).");<BR>
// Resolve a virtual path against the web application's root<BR>
if(isVirtual(s))<BR>
&nbsp;&nbsp;&nbsp;&nbsp;s = m_application.getRealPath(s);<BR>
java.io.File file = new java.io.File(s);<BR>
FileInputStream fileinputstream = new FileInputStream(file);<BR>
long l = file.length();<BR>
boolean flag = false;<BR>
int k = 0;<BR>
byte abyte0[] = new byte[i];<BR>
// Fall back to a generic download content type when none is given<BR>
if(s1 == null)<BR>
&nbsp;&nbsp;&nbsp;&nbsp;m_response.setContentType("application/x-msdownload");<BR>
else<BR>
if(s1.length() == 0)<BR>
&nbsp;&nbsp;&nbsp;&nbsp;m_response.setContentType("application/x-msdownload");<BR>
else<BR>
&nbsp;&nbsp;&nbsp;&nbsp;m_response.setContentType(s1);<BR>
m_response.setContentLength((int)l);<BR>
m_contentDisposition = m_contentDisposition != null ?<BR>
m_contentDisposition : "attachment;";<BR>
// The file name is passed through toUtf8String before it goes into the header<BR>
if(s2 == null)<BR>
&nbsp;&nbsp;&nbsp;&nbsp;m_response.setHeader("Content-Disposition", <BR>
&nbsp;&nbsp;&nbsp;&nbsp;m_contentDisposition +&n
bsp;" filename=" + <BR>
&nbsp;&nbsp;&nbsp;&nbsp;toUtf8String(getFileName(s)));<BR>
else<BR>
if(s2.length() == 0)<BR>
&nbsp;&nbsp;&nbsp;&nbsp;m_response.setHeader("Content-Disposition", <BR>
&nbsp;&nbsp;&nbsp;&nbsp;m_contentDisposition);<BR>
else<BR>
&nbsp;&nbsp;&nbsp;&nbsp;m_response.setHeader("Content-Disposition", <BR>
&nbsp;&nbsp;&nbsp;&nbsp;m_contentDisposition + " filename=" + toUtf8String(s2));<BR>
// Copy the file to the response in blocks of i bytes<BR>
while((long)k &lt; l)<BR>
{<BR>
&nbsp;&nbsp;&nbsp;&nbsp;int j = fileinputstream.read(abyte0, 0, i);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;k += j;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;m_response.getOutputStream().write(abyte0, 0, j);<BR>
}<BR>
fileinputstream.close();<BR>
&nbsp;&nbsp;&nbsp;&nbsp;}<BR><BR>
&nbsp;&nbsp;&nbsp;&nbsp;/**<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;* Converts the Chinese characters in a file name to a UTF-8 encoded string so that the save-as file name is displayed correctly when downloading.<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;* 纵横软件制作中心 雨亦奇 2003.08.01<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;* @param s the original file name<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;* @return the re-encoded file name<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*/<BR>
&nbsp;&nbsp;&nbsp;&nbsp;public static String toUtf8String(String s) {<BR>
StringBuffer sb = new StringBuffer();<BR>
for (int i=0;i&lt;s.length();i++) {<BR>
&nbsp;&nbsp;&nbsp;&nbsp;char c = s.charAt(i);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;// Characters in the range 0 to 255 are copied unchanged<BR>
&nbsp;&nbsp;&nbsp;&nbsp;if (c &gt;= 0 &amp;&amp; c &lt;= 255) {<BR>
sb.append(c);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;} else {<BR>
byte[] b;<BR>
try {<BR>
&nbsp;&nbsp;&nbsp;&nbsp;b = Character.toString(c).getBytes("utf-8");<BR>
} catch (Exception ex) {<BR>
&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(ex);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;b = new byte[0];<BR>
}<BR>
// Write each UTF-8 byte as %XX<BR>
for (int j = 0; j &lt; b.length; j++) {<BR>
&nbsp;&nbsp;&nbsp;&nbsp;int k = b[j];<BR>&nbsp;
&nbsp;&nbsp;&nbsp;if (k &lt; 0) k += 256;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;sb.append("%" + Integer.toHexString(k).<BR>
&nbsp;&nbsp;&nbsp;&nbsp;toUpperCase());<BR>
}<BR>
&nbsp;&nbsp;&nbsp;&nbsp;}<BR>
}<BR>
return sb.toString();<BR>
&nbsp;&nbsp;&nbsp;&nbsp;}<BR>&nbsp;<BR><BR><BR>
Note the toUtf8String calls in the source (set in bold in the original article). The original jspSmartUpload component passed the returned file name through unprocessed; the modified version converts it to a UTF-8 based encoding. The conversion leaves English text untouched and turns each Chinese character into %XX form: toUtf8String uses the encoding conversion that Java provides to get the character's UTF-8 bytes and then writes each byte as %XX.<BR><BR>
Compile the source, package it as jspSmartUpload.jar, copy it to Tomcat's shared/lib directory (where all web applications can share it), and restart the Tomcat server; files with Chinese names then download normally. In addition, toUtf8String can be used to convert hyperlinks that contain Chinese so that they stay valid, because some web servers do not support Chinese in links.<BR><BR>
Summary: jspSmartUpload is an upload and download component often used when developing B/S applications with JSP, and it is simple and convenient to use. With the added support for downloading files with Chinese names it becomes more useful still, and should win over more developers.</DIV><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/bluelily22/" target="_blank">丁丁</a> 2005-10-21 18:09</div>]]></description></item></channel></rss>