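Pattern matching (like grep)

Syntax:   string  operator  /pattern/modifiers

Operators:
 =~         true if the string matches the pattern
 !~         true if the string does not match the pattern

Modifiers:
 i    ignore case
 x    ignore whitespace in the pattern
 g    continue matching from where the previous match ended (like "find next")
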
   Example: scan the file gogo for lines containing "want"

 #!/usr/bin/perl
 $file="/home/macg/perltest/gogo";
 &gotest($file);

 sub gotest{
     my(@tmp)=@_;

     open (MYFILE, $tmp[0]) || die ("Could not open file");
     my($line,$newline);
     while ($line=<MYFILE>) {
         if($newline=($line=~/want/)) {          # does the line contain "want"?
             print "found\n";
             print "\$line is:$line";
             print "\$newline is:$newline \n";
         } else {
             print "not found\n";
             print "\$line is:$line";
             print "\$newline is:$newline \n";
         }
     }
     close(MYFILE);
 }

 [macg@localhost perltest]$ ./tip.pl
 not found
 $line is:I glad to be Los angle
 $newline is:
 found
 $line is:I want to be Los angle
 $newline is:1

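The same scan can be sketched with a lexical filehandle and the three-argument form of open. To keep the sketch self-contained it reads from an in-memory string instead of the real gogo file; the string content and the @hits array are assumptions made for illustration.

```perl
use strict;
use warnings;

# Stand-in for the file gogo: an in-memory "file" (assumed content).
my $content = "I glad to be Los angle\nI want to be Los angle\n";
open(my $fh, '<', \$content) or die "Could not open: $!";

my @hits;
while (my $line = <$fh>) {
    # Keep only the lines that contain "want", like grep would.
    push @hits, $line if $line =~ /want/;
}
close($fh);

print @hits;
```

For a real file, replace \$content with the file name.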
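    By default the pattern delimiter is the forward slash /, but a different delimiter can be chosen by writing m in front of the pattern, e.g.:
m!/u/jqpublic/perl/prog1!    is equivalent to    /\/u\/jqpublic\/perl\/prog1/
Once another delimiter is used, / is no longer special inside the pattern and does not need to be written as \/.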
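A small sketch comparing the default delimiter with two alternatives. The path is the one from the text; the variable names are made up for the example.

```perl
use strict;
use warnings;

my $path = "/u/jqpublic/perl/prog1";

# Default delimiter: every / inside the pattern has to be escaped.
my $with_slashes = ($path =~ /^\/u\/jqpublic\/perl\/prog1$/) ? 1 : 0;

# Delimiters chosen with m: the / characters need no escaping.
my $with_bang   = ($path =~ m!^/u/jqpublic/perl/prog1$!) ? 1 : 0;
my $with_braces = ($path =~ m{^/u/jqpublic/perl/prog1$}) ? 1 : 0;

print "slashes=$with_slashes bang=$with_bang braces=$with_braces\n";
```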
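  Pattern elements
\d or \d+    a digit [0-9] (\d+ means one or more digits)
\D or \D+    any character that is not a digit (\D+ means one or more)
/[\da-z]/    the same as /[0-9a-z]/
^    start of string: /^def/ matches only strings that begin with def
$    end of string: /def$/ matches only strings that end with def
/\\/         \ is the escape character; \\ matches a literal backslash
/\*+/        one or more literal * characters
             Punctuation that has a special meaning in a pattern (such as . * + ? ( ) [ ] ^ $ | \) must be escaped with \ to match literally
[]           matches any one character from the set
.            matches any single character (except newline)
* + ?        quantifiers: they repeat the item that precedes them
   A quantifier such as * or + cannot be the first character of a pattern, because there is nothing for it to repeat; "any character" has to be written explicitly, e.g. [0-9a-zA-Z]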
 $line=~

 syntax error at ./address.pl line 6, near "out @int_hwaddress"
 Quantifier follows nothing in regex; marked by <-- HERE in m at ./address.pl line 41

Changed to:
 $line=~/[0-9a-zA-Z]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+/
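The "Quantifier follows nothing" error can be reproduced safely by compiling a pattern at run time inside eval. The two pattern strings below are made up for illustration and are not the ones from address.pl.

```perl
use strict;
use warnings;

# Hypothetical pattern strings; the first one starts with a quantifier.
my @candidates = ('+:[0-9a-fA-F]+', '[0-9a-fA-F]+:[0-9a-fA-F]+');

my @status;
for my $src (@candidates) {
    # qr// compiles the pattern; a bad pattern dies, and eval catches it.
    my $re = eval { qr/$src/ };
    if (defined $re) {
        push @status, "ok";
        print "compiled: $src\n";
    } else {
        push @status, "error";
        print "rejected: $src\n  $@";
    }
}
```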
    A space in a pattern is written as a literal space " "
 if(($input=~/^ping$/i)||($input=~/^ping $/i))

 macg>ping
 command:[ping xxx.xxx.xxx.xxx]
 macg>ping                      (with one trailing space)
 command:[ping xxx.xxx.xxx.xxx]
 IEI-nTracker>ping               (with two trailing spaces)
 Use of uninitialized value in concatenation (.) or string at /nettracker/ntshell/ntshell

Writing " +" (one or more spaces) accepts any number of trailing spaces:
 if(($input=~/^ping$/i)||($input=~/^ping +$/i))

 IEI-nTracker>ping     (with several trailing spaces)
 command:[ping xxx.xxx.xxx.xxx]

 if($_[0]=~/^ping +[0-9a-zA-Z\.]+$/i)

 This matches "ping xxxxx" as well as "ping        xxxx".

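A minimal sketch of the same idea as a small function. The name classify and its return values are hypothetical; the patterns follow the ones above, with " *" added so that trailing spaces after the host are also accepted.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a line typed at the shell prompt.
sub classify {
    my ($input) = @_;
    return "usage"  if $input =~ /^ping *$/i;                  # "ping" plus any trailing spaces
    return "target" if $input =~ /^ping +[0-9a-zA-Z.]+ *$/i;   # "ping", spaces, then a host
    return "unknown";
}

for my $input ("ping", "ping  ", "PING 10.0.0.1", "ping    host1", "pingx") {
    printf "%-16s => %s\n", "[$input]", classify($input);
}
```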
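    Does =~/want/ mean "equals" or "contains"?
It means "contains".
"Equals" is an exact match, which is written with anchors: /^want$/
    Format matching (matching the whole string rather than "contains") is typically used to validate input that must have a specific format, such as an IP address.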
 #!/usr/bin/perl
 $file="/home/macg/perltest/gogo";
 &gotest($file);

 sub gotest{
     my(@tmp)=@_;

     open (MYFILE, $tmp[0]) || die ("Could not open file");
     my($line,$newline);
     while ($line=<MYFILE>) {
         if ($line =~ /\d+\.\d+\.\d+\.\d+/) {
             print "$line";
             print "the ip add is good\n";
         } else {
             print "$line";
             print "the ip add is a error\n";
         }
     }
     close(MYFILE);
 }

 [macg@localhost perltest]$ cat gogo
 202.106.0.20
 10.0.0.as

 [macg@localhost perltest]$ ./tip.pl
 202.106.0.20
 the ip add is good
 10.0.0.as
 the ip add is a error

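The example above is still not right. The pattern is not anchored, so it is a "contains" match and also accepts a line that has extra text around the address, such as the third line here. Add ^ and $:
 [macg@localhost perltest]$ cat gogo
 202.106.0.20
 10.0.0.as
 10.0.0.1 as

Change the test to:
 if ($line =~ /^\d+\.\d+\.\d+\.\d+$/) {

 [macg@localhost perltest]$ ./tip.pl
 202.106.0.20
 the ip add is good
 10.0.0.as
 the ip add is a error
 10.0.0.1 as
 the ip add is a error
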
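The anchored pattern still accepts values such as 999.1.1.1, because \d+ only checks for digits. A hedged sketch of a stricter check follows; the helper name is_ipv4 and the 0..255 range test are additions for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $s is a dotted quad with every octet in 0..255.
sub is_ipv4 {
    my ($s) = @_;
    # In list context the match returns the four captured octets,
    # or an empty list when the format is wrong.
    my @octets = $s =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/
        or return 0;
    my $too_big = grep { $_ > 255 } @octets;   # count of out-of-range octets
    return $too_big ? 0 : 1;
}

for my $s ("202.106.0.20", "10.0.0.as", "10.0.0.1 as", "300.1.1.1") {
    printf "%-14s %s\n", $s, is_ipv4($s) ? "good" : "error";
}
```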
      /want/g versus /want/: with g the search position moves forward, like "find next"

 $line="inet addr:192.168.10.17  Bcast:192.168.10.255  Mask:255.255.255.0";
 $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
 print "$&\n";       # $& holds the text that matched
 $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
 print "$&\n";
 $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
 print "$&\n";

 [root@nm testpl]# ./tip.pl
 192.168.10.17
 192.168.10.17
 192.168.10.17

 All three searches give the same result, because each one starts again from the beginning of the string and finds only the first match.
 As soon as one match is found, the operator returns 1 and stops.

With g:
 $line="inet addr:192.168.10.17  Bcast:192.168.10.255  Mask:255.255.255.0";
 $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
 print "$&\n";
 $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
 print "$&\n";
 $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
 print "$&\n";

 [root@nm testpl]# ./tip.pl
 192.168.10.17
 192.168.10.255
 255.255.255.0

 With g the string behaves as if it had a file pointer: each search resumes where the previous match ended.

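In practice the "find next" behaviour is usually driven by a while loop, or avoided altogether by matching in list context, which returns every match at once. A short sketch using the same $line; the arrays @seen and @all are names chosen for the example.

```perl
use strict;
use warnings;

my $line = "inet addr:192.168.10.17  Bcast:192.168.10.255  Mask:255.255.255.0";

# Scalar context: each iteration resumes where the previous match ended.
my @seen;
while ($line =~ /(\d+\.\d+\.\d+\.\d+)/g) {
    push @seen, $1;
    print "found $1, next search starts at offset ", pos($line), "\n";
}

# List context: all matches are returned in one go.
my @all = $line =~ /(\d+\.\d+\.\d+\.\d+)/g;
print "all: @all\n";
```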
   When matches are combined with "or", their order matters. If one pattern is contained in another, put the more specific one first.

 if (($ret=($line=~/eth[0-5]/))||($ret=($line=~/eth[0-5]:[0-5]/))) {
     print "$&:";
     $found=1;
 } elsif ($found)
 {
     print $line;
     $found=0;
 }
 }

 [mac@nm testpl]$ ./address.pl
 eth0:          inet addr:10.4.3.117  Bcast:10.4.255.255  Mask:255.255.0.0
 eth0:          inet addr:192.168.10.117  Bcast:192.168.10.255  Mask:255.255.255.0
 eth0:          inet addr:192.168.1.142  Bcast:192.168.1.255  Mask:255.255.255.0
 eth1:          BROADCAST MULTICAST  MTU:1500  Metric:1

With the more specific pattern first:
 if (($ret=($line=~/eth[0-5]:[0-5]/))||($ret=($line=~/eth[0-5]/)))

 [mac@nm testpl]$ ./address.pl
 eth0:          inet addr:10.4.3.117  Bcast:10.4.255.255  Mask:255.255.0.0
 eth0:1:          inet addr:192.168.10.117  Bcast:192.168.10.255  Mask:255.255.255.0
 eth0:2:          inet addr:192.168.1.142  Bcast:192.168.1.255  Mask:255.255.255.0
 eth1:          BROADCAST MULTICAST  MTU:1500  Metric:1

If the specific pattern is tried first and succeeds, the general (partial) one is skipped. In the other order, /eth[0-5]/ already succeeds on "eth0:1", so the specific pattern is never tried and $& holds only "eth0".
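The ordering problem can also be sidestepped by putting the optional ":N" suffix into a single pattern. This is a sketch of an alternative, not the original script; the two input lines are made-up ifconfig-style samples.

```perl
use strict;
use warnings;

my @names;
for my $line ("eth0      Link encap:Ethernet", "eth0:1    Link encap:Ethernet") {
    # (?::[0-5])? is an optional, non-capturing ":N" suffix;
    # greedy matching takes the suffix whenever it is present.
    if ($line =~ /^(eth[0-5](?::[0-5])?)/) {
        push @names, $1;
        print "$1\n";
    }
}
```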
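    A match such as $line =~ /want/ only returns 1 (matched) or an empty string (not matched).
In other words, matching never modifies the string on the left of =~.
Substitution is completely different: it does modify the string on the left of =~.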
     my($line,$newline);
     while ($line=<MYFILE>) {
         if($newline=($line=~/want/)) {
             print "found\n";
             print "\$line is:$line";
             print "\$newline is:$newline \n";
         } else {
             print "not found\n";
             print "\$line is:$line";
             print "\$newline is:$newline \n";
         }
     }
     close(MYFILE);
 }

 [macg@localhost perltest]$ ./tip.pl
 not found
 $line is:I glad to be Los angle
 $newline is:
 found
 $line is:I want to be Los angle
 $newline is:1

    How do you get the text that matched?    Use $&

 while ($line=<MYFILE>) {
     if ($line =~ /want/) {
         print "$line";
         print "\$\& is $&\n";
         print "good\n";
     } else {
         print "$line";
         print " error\n";
     }
 }

 [macg@localhost perltest]$ ./tip.pl
 I want go to LA. and I also want to be NY.       ($line is unchanged)
 $& is want                         ($& is the text that matched)
 good
 But I glad to be D.C.
  error

   Note that if ($line=~/want/) has nothing to do with assignment. $line is simply the operand on the left of =~, so its value does not change and nothing is returned into it.
    The meaning of $&, $` and $'

 while ($line=<MYFILE>) {
     if ($line =~ /want/g) {
         print "good\n";
         print "$line";
         print  $& . "\n";
         print  $` . "\n";
         print  $' . "\n";
         print "good\n";
     } else {
         print " error\n";
         print "$line";
     }
 }

 [macg@localhost perltest]$ ./tip.pl
 good
 I want go to LA. and I also want to be NY.
 want                  $&      the text matched by the most recent successful match
 I                      $`       everything before the match
  go to LA. and I also want to be NY.    $'      everything after the match

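A compact sketch of the three variables on the same sample sentence. The names $before, $match and $after are only for the example.

```perl
use strict;
use warnings;

my $line = "I want go to LA. and I also want to be NY.";

my ($before, $match, $after);
if ($line =~ /want/) {
    # Copy the special variables right away; the next successful match overwrites them.
    ($before, $match, $after) = ($`, $&, $');
    print "before: [$before]\n";
    print "match:  [$match]\n";
    print "after:  [$after]\n";
}
```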
 
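   !~ is true when the string does not match the pattern      (not really needed, since if ( =~ ) ... else does the same job)
    Perl can split a pattern into sub-patterns and make each sub-match available as $1, $2, $3, $4
Steps:
1. Put ( ) around each sub-pattern you want to extract
2. Refer to the sub-matches as $1, $2, ...

 if ($line =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/) {
     print "good \n";
     print $line;
     print  $& . "\n";
     print $1,"\n";
     print $2,"\n";
     print $3,"\n";
     print $4,"\n";
 }

 [macg@localhost perltest]$ ./tip.pl
 good
 202.106.0.20
 202.106.0.20
 202
 106
 0
 20
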
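The captured groups can also be collected directly by evaluating the match in list context, which avoids reading $1..$4 afterwards. A minimal sketch; the variable names are chosen for the example.

```perl
use strict;
use warnings;

my $line = "202.106.0.20";

# In list context a successful match returns the captured groups.
my ($a1, $a2, $a3, $a4) = $line =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/;

if (defined $a1) {
    print "octets: $a1 $a2 $a3 $a4\n";
} else {
    print "no match\n";
}
```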
 
   Modifier i: ignore case

 if ($line =~ /want/i) {
     print "good \n";
     print $line;
     print  $& . "\n";
 }

 [macg@localhost perltest]$ ./tip.pl
 good
 I WANT TO go to
 WANT

 
    Modifier x: whitespace inside the pattern is ignored
/\d{2} ([\W]) \d{2} \1 \d{2}/x    is equivalent to    /\d{2}([\W])\d{2}\1\d{2}/
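Besides ignoring whitespace, x also allows # comments inside the pattern, which is where it tends to pay off. The sketch below spreads the same pattern over several lines; the ^ and $ anchors and the sample strings are additions for the example.

```perl
use strict;
use warnings;

my $re = qr/
    ^ \d{2}     # two digits
    ([\W])      # a separator (any non-word character), captured as group 1
    \d{2}       # two digits
    \1          # the same separator again
    \d{2} $     # two digits
/x;

for my $s ("12-31-99", "12-31:99") {
    print "$s: ", ($s =~ $re ? "match" : "no match"), "\n";
}
```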
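 Substitution
The operators and modifiers are basically the same as for matching.
 Syntax:   string  operator  s/pattern/replacement/modifiers

 Operators:
 =~          performs the substitution; true if something was replaced
 !~          also performs the substitution, but the returned truth value is negated
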
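   Basic substitution (the second part replaces the first)

 $line =~s/want/hate/i;
 print "good \n";
 print "\$line is :$line";
 print "\$\& is : $&", "\n";

 [macg@localhost perltest]$ ./tip.pl
 good
 $line is :I hate TO go to    (unlike matching, substitution modifies the string on the left of =~)
 $& is : WANT      ($& after a substitution is the same as after a match: the text that matched)
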
 
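    Modifier i: ignore case
$line =~s/want/hate/i;    replaces want, WANT or Want in $line with hate
    Deletion (substitute with nothing)
Deleting a single occurrence is rarely useful; in practice deletion is nearly always done globally (g).

 $line =~s/want//i;
 print "\$line is :$line";

 [macg@localhost perltest]$ ./tip.pl
 $line is :I  TO go to
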
  
 
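    Modifier g: global substitution, replacing every occurrence. By default a substitution replaces the first match and then stops.

 $line =~s/want/hate/ig;       # modifiers can be combined
 print "good \n";
 print "\$line is :$line";

 [macg@localhost perltest]$ cat gogo
 I WANT TO go to NY. And I also want to be DC.
 I glad to go to

 [macg@localhost perltest]$ ./tip.pl
 $line is :I hate TO go to NY. And I also hate to be DC.

How g differs between substitution and matching:
- In a match, g means "find next", so it is normally used together with while or a similar loop.
- A substitution needs no loop: a single statement replaces every occurrence. It does not "find, then find next"; it finds them all in one go.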
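A short sketch of the one-statement global substitution. s///g also returns the number of replacements it made, which is captured here in the hypothetical variable $count.

```perl
use strict;
use warnings;

my $line = "I WANT TO go to NY. And I also want to be DC.";

# One statement replaces every occurrence; the return value is the count.
my $count = ($line =~ s/want/hate/ig);

print "replaced $count occurrence(s)\n";
print "$line\n";
```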
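    Modifier e: the replacement part is treated as an expression and evaluated before it is substituted in
$string = "0abc1";
$string =~ s/[a-zA-Z]+/$& x 2/e;  # double the run of letters (non-digits) in the middle
# now $string = "0abcabc1"
$& is the text that matched.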
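Combined with g, the expression is evaluated once for every match. The sketch below repeats the example above and adds a second, made-up string whose numbers are doubled.

```perl
use strict;
use warnings;

# The example from the text, applied to a copy of the string.
(my $doubled = "0abc1") =~ s/[a-zA-Z]+/$& x 2/e;

# With g and e together, each number is replaced by twice its value.
(my $prices = "apple 3, pear 12") =~ s/(\d+)/$1 * 2/ge;

print "$doubled\n";
print "$prices\n";
```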
    
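    Transliteration
 Syntax:   string  operator  tr/search list/replacement list/modifiers
           string  operator  y/search list/replacement list/modifiers

 Operators: =~  !~
 Modifiers:
 d    delete characters that are found but have no replacement
 s    squeeze runs of the same translated character into one
 c    complement: the characters that are NOT in the search list (newline included) are the ones translated to the replacement list
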
   The most basic transliteration: lowercase to uppercase

 $line =~tr/a-z/A-Z/;
 print "\$line is :$line";

 [macg@localhost perltest]$ cat gogo
 I WANT TO go to NY. And I also want to be DC.

 [macg@localhost perltest]$ ./tip.pl
 good
 $line is :I WANT TO GO TO NY. AND I ALSO WANT TO BE DC.

Like substitution, transliteration modifies the string.
 
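    Deletion:     =~tr/characters to delete//d
    A global substitution with an empty replacement and a transliteration with d give the same result
 global substitution    $line =~s/\t//g;
 transliteration        $line =~tr/\t//d;

 $line =~tr/\t//;            # intended to delete every TAB, but the d modifier is missing
 print "\$line is :$line";

 [macg@localhost perltest]$ ./tip.pl
 $line is :I WANT TO              go to NY. And   I also want to          be DC.

The TABs are still there. Without d, an empty replacement list means "same as the search list", so tr/\t// changes nothing and only returns the number of TABs it found.

 $line =~tr/\t//d;
 print "\$line is :$line";

 [macg@localhost perltest]$ ./tip.pl
 good
 $line is :I WANT TO go to NY. And I also want to be DC.
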
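The difference between tr with and without d can be checked directly. In this sketch the sample string and all variable names are assumptions.

```perl
use strict;
use warnings;

my $line   = "I WANT TO\tgo to NY.\tAnd I also want";
my $before = $line;

my $counted   = ($line =~ tr/\t//);         # no d: only counts the TABs
my $unchanged = ($line eq $before) ? 1 : 0; # the string is still the same
my $deleted   = ($line =~ tr/\t//d);        # d: removes the TABs, returns how many

# The equivalent global substitution, applied to the saved copy.
(my $via_s = $before) =~ s/\t//g;

print "counted=$counted unchanged=$unchanged deleted=$deleted\n";
print "same result as s///g\n" if $line eq $via_s;
```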
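    Squeezing repeated characters:   =~ tr/a-zA-Z//s;      (rarely needed in practice)
 $line=~ tr/a-zA-Z//s;
 print "\$line is :$line";

 [macg@localhost perltest]$ cat gogo
 I WANTWANT TO go to NNYY. And I also wWant to be DC.
 [macg@localhost perltest]$ ./tip.pl
 good
 $line is :I WANTWANT TO go to NY. And I also wWant to be DC.

 Only runs of the same character are squeezed (NNYY becomes NY); WANTWANT and wWant are left alone.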
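One case where squeezing may be handy is collapsing runs of spaces. A tiny sketch with a made-up string:

```perl
use strict;
use warnings;

my $text = "I   WANT  TO    go";

# s squeezes each run of spaces into a single space.
(my $squeezed = $text) =~ tr/ //s;

print "[$squeezed]\n";
```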
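    With tr, !~ is of little use. To act on the characters that are not in the set, use the c modifier.
$text="1 abc 23 PID";
$text =~ tr/0-9//cd;      # c complements 0-9 (i.e. non-digits) and d deletes them, leaving "123"; tr takes a character list, so no [ ] is needed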
 
 
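A sample program that decodes a CGI form-field value:

 $value="%A4T%A4K%21";
 $value=~s/%([a-fA-F0-9][a-fA-F0-9])/pack("C",hex($1))/eg;

 s finds each % sequence and captures its two hex digits in $1;
 the e modifier then evaluates pack("C",hex($1)) to decode it.
 hex($1) converts the hexadecimal digits in $1 to a number, and pack("C", ...) turns that number into the character with that code.
 C stands for an unsigned char value.
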
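The same substitution wrapped in a small function, as a sketch. The name decode_percent is hypothetical, and only %XX escapes are handled, as in the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper: decode %XX escapes in a string.
sub decode_percent {
    my ($value) = @_;
    # Each %XX is replaced by the character whose code is the hex number XX.
    $value =~ s/%([a-fA-F0-9]{2})/pack("C", hex($1))/eg;
    return $value;
}

my $decoded = decode_percent("%48%65%6C%6Co%21");
print "$decoded\n";
```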
posted on 2012-03-10 15:44 xzc    Category: linux/unix