Ginew.Z 的博客

一切，为了让生活更简单、更自然

语源科技BlogJava :: 首页 :: 联系 :: 聚合

:: 管理

21 Posts :: 0 Stories :: 14 Comments :: 0 Trackbacks

2006年4月11日 #

href=# vs href=void(0)

href="#" vs. href="javascript:void(0)"

开发的时候有时需要用link(<a>)来触发一些javascript事件，所以常常可以看到如下的代码：

<a href="javascript:void(0)" onclick="doSomething();returnfalse;">Link</a>

这是一个曾经被多次讨论过的问题，长期以来，我也一直是这样写的。读了 a href=”javascript:void(0);” — avoid the void 之后，我认同了作者的意见。下面的写法确实更合理：

<a href="#" onclick="doSomething();returnfalse;">Link</a>

或者

<script type="javascript">
function doSomething() {
  //doSomething
  returnfalse;
}
</script>
<a href="#" onclick="return doSomething();">Link</a>

以往大家不使用"#"的问题是，这将导致点击链接时页面跳回页面顶部，但通过 return false 语句将使得浏览器忽略链接的默认动作，从而避免了此问题。

youngpup 更有意思，他在How to Create Pop-Up Windows 中言辞激烈的倡导大家永远永远永远不要使用 javascript: 伪协议：

Never, ever, ever use the javascript: pseudo-protocol for anything, ever ever ever ever again. Please. Pretty please.

他的解决方案是：

<a 
  href="http://google.com/" 
  onclick="window.open(this.href, 'popupwindow', 
  'width=400,height=300,scrollbars,resizable'); 
  returnfalse;">

这样的好处就是可以保存到书签或者收藏夹，可以左键单击，也可以右键使用！

posted @ 2006-11-17 12:15 无风之雨阅读(1127) | 评论 (2) | 编辑收藏

resin使用security-manager的问题

     我们打算为用户架设单独的虚拟主机服务器，可以让企业自主上传jsp、htm、php等程序。其中resin用来做jsp的容器。
     由于是用户自主通过FTP上传程序，我们必须控制这些程序可以执行的权限，不能让用户随便浏览硬盘上的文件，但又要能让resin可以正常运行。比如：/data/user_a目录中的程序，只能在/data/user_a目录及其子目录中读写，如果想要访问其他目录，就没有权限。
     通过研究resin的文档以及JAVA的机制，我认为要实现以上构想，可以通过使用java权限管理器来构建一个resin的沙箱来对java的具体操作进行授权。
参考文档：http://www.caucho.com/resin-3.0/security/securitymanager.xtp ，http://www.jscud.com/srun/news/viewhtml/3_2005_10/148.htm

     当我认为胜利在望的时候，发现resin好像不支持grant codeBase "file:xxxx 。

grant codeBase "file:/data/ftpdata/user01.test.com/-" {
permission java.io.FilePermission "/data/ftpdata/user01.test.com/-", "read,write,delete";
};
     上面的语句，语法上没有问题，但就是不起作用。那个codebase目录下的文件，对本目录没有任何权限。

        resin的官方论坛里面，有人在2001年，针对resin1.2.5就提出了和我一摸一样的疑问（http://www.caucho.com/support/resin-interest/0105/0106.html），作者发现问题是由于resin的classloader是非安全的，因此改了resin原文件后解决了问题（http://www.caucho.com/support/resin-interest/0105/0112.html），但是我看resin3的源代码，里面已经基于java.security.SecureClassLoader，因此应该不是这个原因了。
     以下是我的resin.policy文件：

grant codeBase "file:${java.home}/lib/-" {
permission java.security.AllPermission;
};

grant codeBase "file:${java.home}/jre/lib/-" {
permission java.security.AllPermission;
};

grant codeBase "file:${resin.home}/lib/-" {
permission java.security.AllPermission;
};

grant {
permission java.util.PropertyPermission "*", "read";
permission java.io.SerializablePermission "enableSubstitution";
permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
permission java.lang.RuntimePermission "accessClassInPackage.*";
permission java.lang.RuntimePermission "getClassLoader";
permission java.lang.RuntimePermission "accessDeclaredMembers";
permission java.lang.RuntimePermission "modifyThreadGroup";
permission java.lang.RuntimePermission "setContextClassLoader";
permission java.lang.RuntimePermission "setIO";
permission java.lang.RuntimePermission "stopThread";
permission java.lang.RuntimePermission "createClassLoader";
permission java.lang.RuntimePermission "getProtectionDomain";
permission java.lang.RuntimePermission "defineClassInPackage";
permission java.security.SecurityPermission "putProviderProperty.SunJCE";
permission java.security.SecurityPermission "insertProvider.SunJCE";
permission java.util.logging.LoggingPermission "control";
permission java.lang.RuntimePermission "getAttribute";
permission java.util.PropertyPermission "jaxp.debug", "read";
permission ognl.OgnlInvokePermission "invoke.*";
permission java.net.SocketPermission "localhost:3306","connect";
permission java.io.FilePermission "${resin.home}/-", "read";
permission java.io.FilePermission "${java.home}/-", "read";
permission java.io.FilePermission "/tmp/-","read,write,delete";
permission java.io.FilePermission "/tmp","read,write,delete";
permission java.io.FilePermission ".","read";
permission java.io.FilePermission "/home/apps/java/jdk/lib/tools.jar","read";
permission java.io.FilePermission "/bin/sh", "read,execute";
};

//以下语句没有任何作用，/data/ftpdata/user01.test.com/下的jsp对这个目录没有读的权限
grant codeBase "file:/data/ftpdata/user01.test.com/-" {
permission java.io.FilePermission "/data/ftpdata/user01.test.com/-", "read,write,delete";
};

posted @ 2006-06-09 11:00 无风之雨阅读(710) | 评论 (2) | 编辑收藏

GPL协议简述

要了解GPL，一般地，您没有必要耐心阅读原版的GPL协议，因为 GPL 无非交待了几个原则：

确保软件自始至终都以开放源代码形式发布，保护开发成果不被窃取用作商业发售。任何一套软件，只要其中使用了受 GPL 协议保护的第三方软件的源程序，并向非开发人员发布时，软件本身也就自动成为受 GPL 保护并且约束的实体。也就是说，此时它必须开放源代码。
GPL 大致就是一个左侧版权（Copyleft，或译为“反版权”、“版权属左”、“版权所无”、“版责”等）的体现。你可以去掉所有原作的版权信息，只要你保持开源，并且随源代码、二进制版附上 GPL 的许可证就行，让后人可以很明确地得知此软件的授权信息。GPL 精髓就是，只要使软件在完整开源的情况下，尽可能使使用者得到自由发挥的空间，使软件得到更快更好的发展。
无论软件以何种形式发布，都必须同时附上源代码。例如在 Web 上提供下载，就必须在二进制版本（如果有的话）下载的同一个页面，清楚地提供源代码下载的链接。如果以光盘形式发布，就必须同时附上源文件的光盘。
开发或维护遵循 GPL 协议开发的软件的公司或个人，可以对使用者收取一定的服务费用。但还是一句老话——必须无偿提供软件的完整源代码，不得将源代码与服务做捆绑或任何变相捆绑销售。

posted @ 2006-05-16 16:50 无风之雨阅读(704) | 评论 (0) | 编辑收藏

最想要的车

posted @ 2006-05-14 11:28 无风之雨阅读(302) | 评论 (0) | 编辑收藏

关于IE无法打开INTERNET站点的错

今天新页面上线，很多同事报告说页面打开到一半，经常跳出无法打开Internet站点的错误，然后页面会跳转到DNS错误的页面。

        这个问题我以前遇到过，一直没有详细的去深究原因，只是以为是服务器关闭连接太快的原因。今天发现这个问题出的很频繁，服务器方面没有改什么，只是上传了新的页面程序而已，应该不会和服务器有关。在对页面进行分析，并搜索了一下网上，发现原来是js在document还没完全load完的时候就试图改变其值导致。

        因此对js做如下改变：

原js：

     window.settimeout("go()",500);
     function go(){
    .......
     }

改成：

var go_i=window.setInterval("go()",500);
function go(){
   if(document.readyState=="complete"){
      window.clearInterval(go2_i);
    }
    else return;
    ........
}
目的就是让他一定要在document完成后才执行那个操作

posted @ 2006-04-19 18:14 无风之雨阅读(6951) | 评论 (6) | 编辑收藏

document.domain与域名的问题

以前如果要使iframe里面的脚本能访问parent的内容，但iframe和parent的二级域名相同，那一般都会在两者都写上document.domain="xxx.com" 以放宽访问权限。

今天发现，如果iframe和parent在同一个三级域名下，比如都是aa.bb.com，那设了document.domain反而会造成访问拒绝。

查了下MSDN，有如下解释：

Remarks

The property initially returns the host name of the server from which the page is served. The property can be assigned the domain suffix to allow sharing of pages across frames. For example, a page in one frame from home.microsoft.com and a page from www.microsoft.com initially would not be able to communicate with each other. However, by setting the domain property of both pages to the suffix "microsoft.com", you ensure that both pages are considered secure and access is available between the pages.

When you set the domain property, use the domain name determined by the server rather than by the client browser.

All the pages on different hosts must have the domain property explicitly set to the same value to communicate successfully with each other. For example, the value of the domain property of a page on the host microsoft.com would be "microsoft.com" by default. It might seem logical that if you set the domain property of a page on another host named msdn.microsoft.com to "microsoft.com," that the two pages could communicate with each other. However, this is not the case unless you have also explicitly set the domain property of the page on microsoft.com to "microsoft.com".

Furthermore, this property cannot be used to allow cross-frame communication among frames with different domain suffixes. For example, a page in one frame from www.microsoft.com and a page in another frame from www.msn.com would not be able to communicate with each other even if the domain property of both pages was set to the suffix "microsoft.com".

Security Alert Using this property incorrectly can compromise the security of your Web site. Set the domain property only if you must allow cross-domain scripting. Use a value determined on the server. Setting this property to a value determined on the client (like through the location object) could expose your site to attack from another site through Domain Name System (DNS) manipulation. For more information, see Security Considerations: Dynamic HTML.

For more information on domain security, see About Cross-Frame Scripting and Security.

posted @ 2006-04-13 11:54 无风之雨阅读(9386) | 评论 (3) | 编辑收藏

域名和COOKIE的问题

今天发现，在IE里面，当一个域名包含_的时候，IE不会给这个网站发送COOKIE，真变态。同事调试了半天，才发现还有这个问题

posted @ 2006-04-11 17:54 无风之雨阅读(284) | 评论 (0) | 编辑收藏

我认为最好的MYSQL备份方式

    要备份MYSQL，很多人用mysqldump，其实这种方式，导出的文件是最大的，导入的时间是最久的。命令是方便的，但真正发生错误的时候，恢复效率很低。
    我主张，另外找一台比较空闲的机器，来做数据库的备份。这台机器作以下用途：

   它是主数据库带的slave数据库群里面的一台，每天凌晨定时启动同步数据，等追上bin-log并全部执行后，停止同步，并用select * into outfile将数据全部导出成文件，并且在每周的某一天，清除掉主数据库上已经同步好的bin-log，以确保硬盘空间不被log占满。

   为此，我写了3个脚本，分别执行1、启动mysql，追log，然后停止slave；2、导出全部数据库全部文件到文件；3、删除主数据库的log

---------------------------------------------------------------------------------------
#!/bin/bash
#readMasterMysql.sh
CHECK_MYSQL=0
/home/mysql/bin/mysqld_safe &
until [ "$CHECK_MYSQL" = "1" ]
do
sleep 10
CHECK_MYSQL=`/home/mysql/bin/mysql -uroot -e"show slave status"|awk '{if($14==$21)print "1"}'|tail -n1`
done
/home/mysql/bin/mysql -uroot -e"slave stop"
/home/script/backupMysql.sh
/home/mysql/bin/mysqladmin shutdown
WEEK=`date "+%w"`
if [ $WEEK = "5" ]
then
    /home/script/purgeLog.sh
fi

------------------------------------------------------------------------------
#!/bin/bash
#purgeLog.sh
LOG_FILE=/home/mysql/data/master.info
DB_SERVER=`sed -n '4p' $LOG_FILE`
DB_USER=`sed -n '5p' $LOG_FILE`
DB_PASS=`sed -n '6p' $LOG_FILE`
DB_LOGFILE=`sed -n '2p' $LOG_FILE`
/home/mysql/bin/mysql -h$DB_SERVER -u$DB_USER -p"$DB_PASS" -e"purge master logs to '$DB_LOGFILE'"

------------------------------------------------------------------------------
#!/bin/bash
#backupMysql.sh
database=$1
table=$2
MYSQL_CLIENT="/home/mysql/bin/mysql -uroot --default-character-set=gbk"
MYSQL_DUMP="/home/mysql/bin/mysqldump -d -uroot --default-character-set=gbk"
OUTPUT_PATH=/date/backup
for databases in `$MYSQL_CLIENT -e "show databases"|grep -v Database`
do
if [ "$#" = "0" -o "$database" = "$databases" ] ; then
        mkdir -p -m777 $OUTPUT_PATH/$databases/
        $MYSQL_DUMP $databases > $OUTPUT_PATH/$databases/$databases.sql
        for tables in `$MYSQL_CLIENT -e "show tables" $databases|grep -v Tables_in_`
        do
        if [ "$#" = "0" -o "$#" = "1" -o "$table" = "$tables" ] ; then
                mv -f $OUTPUT_PATH/$databases/$tables $OUTPUT_PATH/$databases/$tables.old
                $MYSQL_CLIENT -e "select * into outfile '$OUTPUT_PATH/$databases/$tables' from $tables" $databases
        fi
        done
fi
done

posted @ 2006-04-11 12:47 无风之雨阅读(683) | 评论 (0) | 编辑收藏

如何让referer和user-agent同时为空时记录日志

一般情况下，Referer和User-Agent同时为空的时候，可以认为是其他网站在批量采集本站数据，我打算deny掉这种请求。不过apache文档里面没有提到有两个环境变量的与操作。最后chinaunix上有大侠回答了我的问题：

SetEnv log_flag=1
SetEnvIf Referer !"^$" log_flag=0
SetEnvIf user-agent !"^$" log_flag=0
...

看字面上，就是如果用两个非的或来代替与
这样，只要两个条件有一个不满足，就log_flag就会变掉，只要它变掉了，就说明不符合我屏蔽的规则。
高手就是高手，为什么非要苛求一定要有“与”呢，两个“非”的“或”，不一样达到要求？

学习了。

posted @ 2006-04-11 12:25 无风之雨阅读(862) | 评论 (0) | 编辑收藏

oracle到mysql的迁移，函数部分

to_date(?, 'YYYY-MM-DD HH24:MI:SS')"
STR_TO_DATE('2003-15-10 00:00:00','%Y-%m-%d %H:%i:%s'); //格式不对，会返回NULL

to_char(create_time,'yyyy-MM-dd')
DATE_FORMAT(create_time,'%Y-%m-%d')

sysdate
now()或者CURRENT_TIMESTAMP //'1997-12-15 23:50:26',建表的时候，timestamp类型可以指定default CURRENT_TIMESTAMP

sysdate - 7 //7天前
now()-INTERVAL 7 DAY

select * from (select .... where rownum<end) where rownum>start
limit [start,] length

substr(productInfor,1,20)
SUBSTRING('Quadratically',5,6) //SUBSTRING(str,pos,len)

instr(str,substr,pos,index)
instr(str,substr) 或者 locate(substr,str,pos)
// 没有相对应的语法，但一般情况，这个是和substr结合起来用的。
//如果是str="2005-10-01"取中间的10这样的需要，oracle是substr(str,instr(str,'-',1,1)+1,instr(str,'-',1,2)-instr(str,'-',1,1)-1)
那在mysql里面，可以试试这样SUBSTRING_INDEX(SUBSTRING_INDEX(str,'-',2),'-',-1),意思就是取第二个-之前的str后（2005-10），再取倒数第一个-之后的内容

oracle的nvl（ss,dd）函数在mysql中怎么实现？
答：ifnull(ss,dd)

posted @ 2006-04-11 12:16 无风之雨阅读(874) | 评论 (0) | 编辑收藏

Mysql到oracle程序迁移的注意事项

Mysql到oracle程序迁移的注意事项(摘抄)
有很多应用项目, 刚起步的时候用MYSQL数据库基本上能实现各种功能需求，随着应用用户的增多，
数据量的增加，MYSQL渐渐地出现不堪重负的情况：连接很慢甚至宕机，于是就有把数据从MYSQL迁到
ORACLE的需求，应用程序也要相应做一些修改。本人总结出以下几点注意事项，希望对大家有所帮助。

1．自动增长的数据类型处理
MYSQL有自动增长的数据类型，插入记录时不用操作此字段，会自动获得数据值。
ORACLE没有自动增长的数据类型，需要建立一个自动增长的序列号，插入记录时要把序列号的下一个
值赋于此字段。

CREATE SEQUENCE 序列号的名称 (最好是表名+序列号标记) INCREMENT BY 1 START WITH 1
MAXVALUE 99999 CYCLE NOCACHE;
其中最大的值按字段的长度来定, 如果定义的自动增长的序列号 NUMBER(6) , 最大值为999999
INSERT 语句插入这个字段值为: 序列号的名称.NEXTVAL

2. 单引号的处理
MYSQL里可以用双引号包起字符串，ORACLE里只可以用单引号包起字符串。在插入和修改字符串
前必须做单引号的替换：把所有出现的一个单引号替换成两个单引号。

3. 翻页的SQL语句的处理
MYSQL处理翻页的SQL语句比较简单，用LIMIT 开始位置, 记录个数；PHP里还可以用SEEK定位到结果
集的位置。
ORACLE处理翻页的SQL语句就比较繁琐了。每个结果集只有一个ROWNUM字段标明它的位置, 并且只能
用ROWNUM<100, 不能用ROWNUM>80。
以下是经过分析后较好的两种ORACLE翻页SQL语句( ID是唯一关键字的字段名 )：
语句一：
SELECT ID, [FIELD_NAME,...] FROM TABLE_NAME WHERE ID IN ( SELECT ID FROM (SELECT
ROWNUM AS NUMROW, ID FROM TABLE_NAME WHERE 条件1 ORDER BY 条件2) WHERE NUMROW > 80 AND
NUMROW < 100 ) ORDER BY 条件3;

语句二：
SELECT * FROM (( SELECT ROWNUM AS NUMROW, c.* from (select [FIELD_NAME,...] FROM
TABLE_NAME WHERE 条件1 ORDER BY 条件2) c) WHERE NUMROW > 80 AND NUMROW < 100 ) ORDER BY 条件3;

4．长字符串的处理
长字符串的处理ORACLE也有它特殊的地方。INSERT和UPDATE时最大可操作的字符串长度小于等于
4000个单字节, 如果要插入更长的字符串, 请考虑字段用CLOB类型，方法借用ORACLE里自带的DBMS_LOB程序
包。插入修改记录前一定要做进行非空和长度判断，不能为空的字段值和超出长度字段值都应该提出警告,
返回上次操作。

5. 日期字段的处理
MYSQL日期字段分DATE和TIME两种，ORACLE日期字段只有DATE，包含年月日时分秒信息，用当前数据库
的系统时间为SYSDATE, 精确到秒，或者用字符串转换成日期型函数TO_DATE(‘2001-08-01’,’YYYY-MM-DD’)
年-月-日 24小时:分钟:秒的格式YYYY-MM-DD HH24:MI:SS TO_DATE()还有很多种日期格式, 可以参看
ORACLE DOC.
日期型字段转换成字符串函数TO_CHAR(‘2001-08-01’,’YYYY-MM-DD HH24:MI:SS’)

日期字段的数学运算公式有很大的不同。
MYSQL找到离当前时间7天用
DATE_FIELD_NAME > SUBDATE（（NOW（），INTERVAL 7 DAY）
ORACLE找到离当前时间7天用
DATE_FIELD_NAME >SYSDATE - 7;

6. 空字符的处理
MYSQL的非空字段也有空的内容，ORACLE里定义了非空字段就不容许有空的内容。
按MYSQL的NOT NULL来定义ORACLE表结构, 导数据的时候会产生错误。因此导数据时要对空字符进行判
断，如果为NULL或空字符，需要把它改成一个空格的字符串。

7. 字符串的模糊比较
MYSQL里用字段名 like '%字符串%'
ORACLE里也可以用字段名 like '%字符串%' 但这种方法不能使用索引, 速度不快
用字符串比较函数 instr(字段名,'字符串')>0 会得到更精确的查找结果

8. 程序和函数里，操作数据库的工作完成后请注意结果集和指针的释放。

有兴趣可以看MYSQL管理员指南

posted @ 2006-04-11 12:15 无风之雨阅读(294) | 评论 (0) | 编辑收藏

MYSQL特殊函数

1、用mysql内置函数转换ip地址和数字
利用两个内置函数
inet_aton:将ip地址转换成数字型
inet_ntoa:将数字型转换成ip地址

2、用Mysql内置函数来转化unix时间(秒值)和字符串时间
from_unixtime():1144728462 -> "2006-04-11 12:07:42"
unix_timestamp():"2006-04-11 12:07:42" -> 1144728462

posted @ 2006-04-11 12:13 无风之雨阅读(278) | 评论 (0) | 编辑收藏

MYSQL5.1的表分区功能

Improving Database Performance with Partitioning

A few years ago, I wrote an article entitled "The Foundation of Excellent Performance" (still available at http://www.tdan.com/i016fe03.htm) where I argued against the notion that SQL code was the number one contributor to performance in a database-driven system. Instead, I stated in the article that I firmly believed how good physical database design was far and away the leading component of superior database performance. In addition, I showed that Oracle's own research illustrated how poor design was the main culprit behind database downtime (planned or unplanned). In the years since then, I've not changed my stance and still think that any DBA who wants a high-performance database has got to invest in intelligent and savvy physical design to produce the kind of response times that make end users smile instead of scream.

One of the reasons I'm very excited about the release of MySQL 5.1 is that it contains a potent new weapon for designing supercharged databases that any MySQL DBA should quickly learn how to use and exploit. By smartly using the new 5.1 partitioning feature, a DBA can oftentimes dramatically improve the performance of most any VLDB or data warehouse they happen to be in charge of.
What is Partitioning?

Partitioning is a physical database design technique that many data modelers and DBAs are quite familiar with. Although partitioning can be used to accomplish a number of various objectives, the main goal is to reduce the amount of data read for particular SQL operations so that overall response time is reduced.

There are two major forms of partitioning:

1. Horizontal Partitioning - this form of partitioning segments table rows so that distinct groups of physical row-based datasets are formed that can be addressed individually (one partition) or collectively (one-to-all partitions). All columns defined to a table are found in each set of partitions so no actual table attributes are missing. An example of horizontal partitioning might be a table that contains ten years worth of historical invoice data being partitioned into ten distinct partitions, where each partition contains a single year's worth of data.
2. Vertical Partitioning - this partitioning scheme is traditionally used to reduce the width of a target table by splitting a table vertically so that only certain columns are included in a particular dataset, with each partition including all rows. An example of vertical partitioning might be a table that contains a number of very wide text or BLOB columns that aren't addressed often being broken into two tables that has the most referenced columns in one table and the seldom-referenced text or BLOB data in another.

Before database vendors began building partitioning (mainly horizontal) into their engines, DBAs and data modelers had to physically design separate table structures to hold the desired partitions, which either held redundant data (separate tables with data that were based off a live parent table) or were linked together to form one logical parent object (usually via a view). This practice has since been made obsolete for the most part for horizontal partitioning, although it is sometimes still done for vertical partitioning.
Partitioning in MySQL 5.1

One of the great new features in MySQL 5.1 is support for horizontal partitioning. The really good news about MySQL and the new 5.1 partitioning feature is all the major forms of partitioning are supported:

1. Range - this partitioning mode allows a DBA to specify various ranges for which data is assigned. For example, a DBA may create a partitioned table that is segmented by three partitions that contain data for the 1980's, 1990's, and everything beyond and including the year 2000.
2. Hash - this partitioning mode allows a DBA to separate data based on a computed hash key that is defined on one or more table columns, with the end goal being an equal distribution of values among partitions. For example, a DBA may create a partitioned table that has ten partitions that are based on the table's primary key.
3. Key - a special form of Hash where MySQL guarantees even distribution of data through a system-generated hash key.
4. List - this partitioning mode allows a DBA to segment data based on a pre-defined list of values that the DBA specifies. For example, a DBA may create a partitioned table that contains three partitions based on the years 2004, 2005, and 2006.
5. Composite - this final partitioning mode allows a DBA to perform sub-partitioning where a table is initially partitioned by, for example range partitioning, but then each partition is segmented even further by another method (for example, hash).

There are a number of benefits that come with partitioning, but the two main advantages are:

Increased performance - during scan operations, the MySQL optimizer knows what partitions contain the data that will satisfy a particular query and will access only those necessary partitions during query execution. For example, a million row table may be broken up into ten different partitions in range style so that each partition contains 100,000 rows. If a query is issued that only needs data from one of the partitions, and a table scan operation is necessary, only 100,000 rows will be accessed instead of a million. Obviously, it is much quicker for MySQL to sample 100,000 rows than one million so the query will complete much sooner. The same benefit is derived should index access be possible as local partitioned indexes are created for partitioned tables. Finally, it is possible to stripe a partitioned table across different physical drives by specifying different file system/directory paths for specific partitions. This allows physical I/O contention to be reduced when multiple partitions are accessed at the same time.
Simplified data management - partitioning allows a DBA to have more control over how data is managed inside of the database. By intelligently creating partitions, a DBA can simplify how certain data operations are performed. For example, a DBA can drop specific partitions in a partitioned table while the remaining partitions remain intact (as opposed to crafting a fragmentation-producing mass delete operation for the whole table). Further, partitions are automatically maintained by MySQL so the DBA doesn't have to manually separate and maintain a horizontal partitioning scheme for a table. For example, a DBA can create a history table that holds data for customers that are partitioned across various year ranges, and have those partitioned automatically enforced by the database server with no DBA intervention being necessary.

From a design-for-performance standpoint, we're mainly interested in point one above. By smartly using partitioning and matching the design to properly coded queries, a dramatic performance impact can be realized. Let's take a quick test drive of partitioning in MySQL 5.1 to see this in action. Note that all tests below were done on a Dell Optiplex box with a Pentium 4 3.00GHz processor, 1GB of RAM, running Fedora Core 4 and MySQL 5.1.6 alpha.
Partitioning in Action

To see the positive benefit partitioning can have on a database, let's create identical MyISAM tables that contain date sensitive information, but let's partition one and leave the other a standard heap table. For our partitioned table, we'll partition based on range and use a function that segments the data based on year:

mysql> CREATE TABLE part_tab
-> ( c1 int default NULL,
-> c2 varchar(30) default NULL,
-> c3 date default NULL
->
-> ) engine=myisam
-> PARTITION BY RANGE (year(c3)) (PARTITION p0 VALUES LESS THAN (1995),
-> PARTITION p1 VALUES LESS THAN (1996) , PARTITION p2 VALUES LESS THAN (1997) ,
-> PARTITION p3 VALUES LESS THAN (1998) , PARTITION p4 VALUES LESS THAN (1999) ,
-> PARTITION p5 VALUES LESS THAN (2000) , PARTITION p6 VALUES LESS THAN (2001) ,
-> PARTITION p7 VALUES LESS THAN (2002) , PARTITION p8 VALUES LESS THAN (2003) ,
-> PARTITION p9 VALUES LESS THAN (2004) , PARTITION p10 VALUES LESS THAN (2010),
-> PARTITION p11 VALUES LESS THAN MAXVALUE );
Query OK, 0 rows affected (0.00 sec)

Notice that we designed partitions for a particular year and finished with one catch-all partition to get anything that doesn't fall into any of the specific date partitions. Now let's create a mirror MyISAM table that's not partitioned:

mysql> create table no_part_tab
-> (c1 int(11) default NULL,
-> c2 varchar(30) default NULL,
-> c3 date default NULL) engine=myisam;
Query OK, 0 rows affected (0.02 sec)

Now let's create a procedure (thanks to Peter Gulutzan for the code) that will fill our partitioned table with 8 million rows that distributes data fairly evenly across the various partitions. Once filled, we'll then insert the same data into our non-partitioned MyISAM clone table:

mysql> delimiter //
mysql> CREATE PROCEDURE load_part_tab()
-> begin
-> declare v int default 0;
-> while v < 8000000
-> do
-> insert into part_tab
-> values (v,'testing partitions',adddate('1995-01-01',(rand(v)*36520) mod 3652));
-> set v = v + 1;
-> end while;
-> end
-> //
Query OK, 0 rows affected (0.00 sec)
mysql> delimiter ;
mysql> call load_part_tab();
Query OK, 1 row affected (8 min 17.75 sec)
mysql> insert into no_part_tab select * from part_tab;
Query OK, 8000000 rows affected (51.59 sec)
Records: 8000000 Duplicates: 0 Warnings: 0

With our tables now ready, let's issue a simple date range query on both tables - the non-partitioned table first and then the partitioned table - followed by EXPLAIN's, and see what MySQL does:

mysql> select count(*) from no_part_tab where
-> c3 > date '1995-01-01' and c3 < date '1995-12-31';
+----------+
| count(*) |
+----------+
| 795181 |
+----------+
1 row in set (38.30 sec)

mysql> select count(*) from part_tab where
-> c3 > date '1995-01-01' and c3 < date '1995-12-31';
+----------+
| count(*) |
+----------+
| 795181 |
+----------+
1 row in set (3.88 sec)

mysql> explain select count(*) from no_part_tab where
-> c3 > date '1995-01-01' and c3 < date '1995-12-31'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: no_part_tab
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 8000000
Extra: Using where
1 row in set (0.00 sec)

mysql> explain partitions select count(*) from part_tab where
-> c3 > date '1995-01-01' and c3 < date '1995-12-31'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: part_tab
partitions: p1
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 798458
Extra: Using where
1 row in set (0.00 sec)

The power of proper partition and query design can easily be seen as the partitioned table access delivers a whopping 90% response time reduction over the non-partitioned table. The EXPLAIN plans showcase why this is (notice the new EXPLAIN syntax for partitioned objects) as only the first partition in the partitioned table is accessed with all others being skipped.

As a MySQL DBA, it's easy to get excited about the potential benefits that partitioning can provide, but you always want to make sure that the tool you use for database design matches the requirements and scenario of your particular application. Partitioning is best suited for VLDB's that contain a lot of query activity that targets specific portions/ranges of one or more database tables. Of course, other situations lend themselves to partitioning as well (e.g. data archiving, etc.)
A Quick Side Note on Vertical Partitioning

Although MySQL 5.1 automates horizontal partitioning, don't lose sight of vertical partitioning schemes when designing your databases. Although you have to do vertical partitioning manually, you can benefit from the practice in certain circumstances. For example, let's say you didn't normally need to reference or use the VARCHAR column defined in our previously shown partitioned table. Would the elimination of this column help query speed? Let's find out:

mysql> desc part_tab;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| c1 | int(11) | YES | | NULL | |
| c2 | varchar(30) | YES | | NULL | |
| c3 | date | YES | | NULL | |
+-------+-------------+------+-----+---------+-------+
3 rows in set (0.03 sec)

mysql> alter table part_tab drop column c2;
Query OK, 8000000 rows affected (42.20 sec)
Records: 8000000 Duplicates: 0 Warnings: 0

mysql> desc part_tab;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| c1 | int(11) | YES | | NULL | |
| c3 | date | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)

mysql> select count(*) from part_tab where
-> c3 > date '1995-01-01' and c3 < date '1995-12-31';
+----------+
| count(*) |
+----------+
| 795181 |
+----------+
1 row in set (0.34 sec)

By removing the VARCHAR column from the design, you actually get another 90+% reduction in query response time. Beyond partitioning, this speaks to the effect wide tables can have on queries and why you should always ensure that all columns defined to a table are actually needed.
Wrap Up

A short article like this can't possibly cover all the benefits and mechanics of MySQL 5.1's partitioning, but a few notes of interest include:

* All storage engines support partitioning (MyISAM, Archive, InnoDB, etc.)
* Indexing support for partitioned tables include local indexes, which mirror each partition in a one-to-one scheme. In other words, if a partitioned table has ten partitions, then a local index for that table would also contain ten partitions.
* Metadata regarding partitioned tables can be found in the INFORMATION_SCHEMA database, with a new PARTITIONS table being available.
* All SHOW commands support the return of partitioned table and index metadata.
* Maintenance functions and a number of other operations can be performed on partitions (rather than acting on a full table), including:
o ADD PARTITION
o DROP PARTITION
o COALESCE PARTITION
o REORGANIZE PARTITION
o ANALYZE PARTITION
o CHECK PARTITION
o OPTIMIZE PARTITION
o REBUILD PARTITION
o REPAIR PARTITION

From a performance standpoint, the main take-away is that MySQL 5.1 partitioning is a powerful new tool that can be used in many physical database designs to dramatically improve performance and ease DBA management burdens. For more information on MySQL partitioning, you can visit out the online reference manual at http://dev.mysql.com/doc/refman/5.1/en/partitioning.html and visit the MySQL forums as there is a forum section devoted to partitioning, which can be referenced at http://forums.mysql.com/list.php?106.

Download a copy of MySQL 5.1 (which is now is beta) today and give partitioning a try. I think you will be pleased with all the new possibilities partitioning provides when it comes to creating a top-notch physical database design, which is the number one contributor to overall database performance.

posted @ 2006-04-11 11:47 无风之雨阅读(799) | 评论 (0) | 编辑收藏

Ginew.Z 的博客

常用链接

留言簿(6)

我参与的团队

随笔分类

随笔档案

搜索

最新评论

阅读排行榜

评论排行榜

href="#" vs. href="javascript:void(0)"