Change Dir

先知cd——热爱生活是一切艺术的开始

统计

留言簿(18)

积分与排名

“牛”们的博客

各个公司技术

我的链接

淘宝技术

阅读排行榜

评论排行榜

总结一下几天前的LCS算法编写

没有过多的技术含量,只是拿来分享一下,希望不要被手懒的各位学生朋友直接拿去做作业。大家有兴趣可以和我探讨,最近看来有时间可以做些研究工作。

不说了,参考文章去百度搜一下吧,最长公共子串。

DP解:

   1: /**
   2:      * 动态规划求最长公共子串
   3:      * 
   4:      * @param s
   5:      * @param t
   6:      * @return
   7:      */
   8:     public String getLCSByDP(String s, String t) {
   9:         int len_s = s.length();
  10:         int len_t = t.length();
  11:         String[][] num = new String[len_s][len_t];
  12:         char ch1 = '\0';
  13:         char ch2 = '\0';
  14:         int len = 0;
  15:         String res = "";
  16:         for (int i = 0; i < len_s; i++) {
  17:             for (int j = 0; j < len_t; j++) {
  18:                 ch1 = s.charAt(i);
  19:                 ch2 = t.charAt(j);
  20:                 if (ch1 != ch2) {
  21:                     num[i][j] = "";
  22:  
  23:                 } else {
  24:                     if (i == 0 || j == 0) {
  25:                         num[i][j] = ch1 + "";
  26:                     } else {
  27:                         num[i][j] = num[i - 1][j - 1] + ch1;
  28:                     }
  29:                     if (num[i][j].length() > len) {
  30:                         len = num[i][j].length();
  31:                         res = num[i][j];
  32:                     }
  33:                 }
  34:             }
  35:         }
  36:         return res;
  37:     }

 

GST解:

   1: public static final String tail1 = "$1";
   2:     public static final String tail2 = "$2";
   3:  
   4: /**
   5:      * 求最长公共前缀LCP
   6:      * 
   7:      * @param word1
   8:      * @param word2
   9:      * @return
  10:      */
  11:     public String getLCP(String word1, String word2) {
  12:         String res = "";
  13:         int len1 = word1.length();
  14:         int len2 = word2.length();
  15:         for (int i = 0; i < Math.min(len1, len2); i++) {
  16:             if (word1.charAt(i) == word2.charAt(i)) {
  17:                 res += word1.charAt(i);
  18:             } else
  19:                 break;
  20:         }
  21:         return res;
  22:     }
  23:  
  24:     /**
  25:      * 后缀数组方法求最长公共子串
  26:      * 
  27:      * @param word1
  28:      * @param word2
  29:      * @return
  30:      */
  31:     public String getLCSByGST(String word1, String word2) {
  32:         String res = "";
  33:         
  34:         int len1 = word1.length();
  35:         int len2 = word2.length();
  36:         String gst[] = new String[len1 + len2];
  37:         for (int i = 0; i < len1; i++) {
  38:             gst[i] = mergeSuffix(word1.substring(i), word2);
  39:         }
  40:         for (int i = 0; i < len2; i++) {
  41:             gst[i + len1] = word2.substring(i) + tail2;
  42:         }
  43:         Arrays.sort(gst);
  44:         for (int i = 0; i < gst.length - 1; i++) {
  45:             String str = getLCP(gst[i], gst[i + 1]);
  46:             if (str.length() > res.length()) {
  47:                 res = str;
  48:             }
  49:         }
  50:         if (res.endsWith("$"))
  51:             res = res.substring(0, res.length() - 1);
  52:         return res;
  53:     }
  54:  
  55:     private String mergeSuffix(String word1, String word2) {
  56:         StringBuffer sb = new StringBuffer();
  57:         sb.append(word1);
  58:         sb.append(tail1);
  59:         sb.append(word2);
  60:         sb.append(tail2);
  61:         return sb.toString();
  62:     }

我的应用是需要做字符串相似度的计算,所以需要求出来最长公共子串。在没有任何语义背景的解释的情况下,一个自己设计的相似度模型如下:

clip_image002

posted on 2011-06-03 10:18 changedi 阅读(2475) 评论(0)  编辑  收藏 所属分类: 算法


只有注册用户登录后才能发表评论。


网站导航: