Blog archive for November 5, 2009 - 甜菜侯爵

The World's Ten Worst Website Designs (Chinese Translation)

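Summary: Have you ever opened a website so badly designed that you regretted clicking on it at all? I have been "lucky" enough to come across a whole pile of these jaw-dropping sites. The ones below are the worst of the worst.
If you are a web designer, get moving and email them to offer your services.
If your site is unlucky enough to be on this list, don't take it too hard, but you really should consider a redesign.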

posted @ 2009-11-21 03:04 甜菜侯爵 | Views (2255) | Comments (3)

The World's Ten Worst Web Page Designs

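Link below:

Parking this here for now. I don't have time at the moment; once I do, I'll translate the original article and post it.
It's quite an entertaining read. http://www.blogstorm.co.uk/blog/top-10-worst-websites/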

posted @ 2009-11-17 01:18 甜菜侯爵 | Views (284) | Comments (0)

Removing tags from an HTML page with a regular expression

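This one is fairly simple. The regex is "<[^>]*>", which reads as "a string that starts with <, continues with any number of characters other than >, and ends with >".
The goal is to obtain so-called plain text, which makes the next processing step easier.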

Code below:

/**
 * Remove all "<>" tags in the text
 * @param tagText
 * @return the clean text without tags
 */
public String removeTags( String tagText )
{
    return tagText.replaceAll("<[^>]*>", "");
}
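
To see what this regex does and does not handle, here is a small self-contained sketch. The class name TagStripper, the precompiled pattern, and the null check are my own assumptions for illustration, not part of the original code. It applies the same "<[^>]*>" regex, compiled once so that repeated calls do not recompile it.

```java
import java.util.regex.Pattern;

/** Hypothetical helper class; the name TagStripper is not from the original post. */
final class TagStripper {
    // Same regex as in the post, compiled once and reused across calls.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    private TagStripper() {}

    /** Removes every "<...>" run; returns "" for null input (an added assumption). */
    static String removeTags(String tagText) {
        if (tagText == null) {
            return "";
        }
        return TAG.matcher(tagText).replaceAll("");
    }

    public static void main(String[] args) {
        // Ordinary markup: only the text between the tags remains.
        System.out.println(removeTags("<p>Hello, <b>world</b>!</p>")); // Hello, world!
        // A script body is text, not a tag, so it survives the stripping.
        System.out.println(removeTags("<script>var a = 1;</script>ok")); // var a = 1;ok
    }
}
```

As the second call shows, the regex only removes the tags themselves: script and style bodies, HTML entities such as &amp;, and anything after a ">" inside an attribute value are left in the output. For pages where that matters, an HTML parser is probably the safer choice.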

posted @ 2009-11-06 22:19 甜菜侯爵 | Views (220) | Comments (0)

Extracting links from a web page with a regular expression

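I suspect the efficiency could be improved further.
But I'm really not that familiar with regular expressions, so this will have to do for now.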

Code below:

import java.net.URL;
import java.util.LinkedList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** The regex for a whole "a" element; \b keeps it from matching tags such as "abbr" */
private static final String A_REGEX = "<a\\b.*?</a>";
/** The regex capturing the double-quoted value of an "href" attribute */
private static final String HREF_REGEX = "href=\"(.*?)\"";
/** The pattern for a link with the tag "a"; DOTALL lets a link span line breaks */
private static final Pattern A_PATTERN =
        Pattern.compile(A_REGEX, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
/** The pattern for a url in the attribute "href" */
private static final Pattern HREF_PATTERN =
        Pattern.compile(HREF_REGEX, Pattern.CASE_INSENSITIVE);

/**
 * Get the url addresses of all links found in the content of a page
 * @param url the url of the page whose links are extracted
 * @param content the content of the given url
 * @return a list with the url address of the links
 */
public List<String> getLinkList( URL url, String content )
{
    List<String> linkList = new LinkedList<String>();
    final Matcher a_matcher = A_PATTERN.matcher(content);
    while (a_matcher.find())
    {
        //get url address
        final Matcher myurl = HREF_PATTERN.matcher(a_matcher.group());
        while (myurl.find())
        {
            //group(1) is the attribute value, without href= and the quotes
            String urlAddress = myurl.group(1);
            if( urlAddress.startsWith("http") )
            {
                //already an absolute address
                linkList.add(urlAddress);
            }
            else if( urlAddress.startsWith("/") || urlAddress.startsWith("\\") )
            {
                //root-relative: prepend scheme and host, not the path of the current page
                linkList.add(url.getProtocol() + "://" + url.getAuthority() + urlAddress);
            }
            else
            {
                String fullUrl = url.toString();
                //the length of the url without the current page
                int lastSlash = fullUrl.lastIndexOf("/") + 1;
                linkList.add(fullUrl.substring(0,lastSlash) + urlAddress);
            }
        }
    }
    return linkList;
}
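
On the efficiency point, one possible direction (a sketch, not a measured improvement) is to match the opening "a" tag and capture its href in a single pass, and to let java.net.URI.resolve handle relative addresses instead of the manual string handling. This is a different technique from the code above. The class name LinkExtractor, the pattern name A_HREF, and the choice of URI over URL are my own assumptions, and only double-quoted href values are handled.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; LinkExtractor and A_HREF are illustrative names. */
final class LinkExtractor {
    // One pass: match an opening <a ...> tag and capture the double-quoted href value.
    // [^>]*? keeps the match inside the tag; \s before href avoids attributes like data-href.
    private static final Pattern A_HREF = Pattern.compile(
            "<a\\b[^>]*?\\shref\\s*=\\s*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    private LinkExtractor() {}

    /** Returns the href of every matched link, resolved against the page address. */
    static List<String> getLinkList(URI base, String content) {
        List<String> links = new ArrayList<>();
        Matcher m = A_HREF.matcher(content);
        while (m.find()) {
            String href = m.group(1).trim();
            if (href.isEmpty()) {
                continue;
            }
            try {
                // resolve() returns absolute references unchanged and
                // resolves relative ones against the base address.
                links.add(base.resolve(href).toString());
            } catch (IllegalArgumentException e) {
                // The href is not a valid URI reference (for example it contains spaces); skip it.
            }
        }
        return links;
    }

    public static void main(String[] args) {
        URI page = URI.create("http://example.com/dir/page.html");
        String html = "<a href=\"/root.html\">root</a> <a href=\"sibling.html\">sibling</a>";
        // Expected: [http://example.com/root.html, http://example.com/dir/sibling.html]
        System.out.println(getLinkList(page, html));
    }
}
```

Whether this is actually faster would need measuring. The main gain is probably correctness, since resolve also copes with addresses such as "../up.html" that the three-branch version does not.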

posted @ 2009-11-05 03:00 甜菜侯爵 | Views (470) | Comments (0)
