
doneykoo [Ouditian]

DKzone- Ouditian Technology

UltraEdit and Unix Regular Expressions

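UltraEdit allows regular expressions in many of the search and replace functions listed under the Search menu. Regular expressions let more complex search and replace tasks be performed in a single operation. (The Chinese-language interface labels this feature “正规表达式”.)

Two sets of syntax are available. The first table below shows the original UltraEdit syntax used in earlier versions of UltraEdit. The second table shows the optional "Unix"-style regular expressions, which can be enabled from the configuration section.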

 

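Symbol  Function

%     Matches the start of a line. The search string must be at the beginning of a line, but the selected result does not include any line terminator characters.

$     Matches the end of a line. The search string must be at the end of a line, but the selected result does not include any line terminator characters.

?     Matches any single character except newline.

*     Matches any number of occurrences of any character except newline.

+     Matches one or more of the preceding character. At least one occurrence must be found.

++    Matches the preceding character zero or more times.

^b    Matches a page break.

^p    Matches a newline (CR/LF) (paragraph) (DOS files).

^r    Matches a newline (CR only) (paragraph) (Mac files).

^n    Matches a newline (LF only) (paragraph) (UNIX files).

^t    Matches a tab character.

[]    Matches any single character, or range, in the brackets.

^{A^}^{B^} Matches expression A OR B.

^     Overrides the following regular expression character.

^(^)  Brackets or tags an expression for use in the replace command.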

 

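A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression.

The corresponding replacement expression is ^x, for x in the range 1-9. Example:

If ^(h*o^) ^(f*s^) matches "hello folks",

^2 ^1 would replace it with "folks hello".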

 

Note: ^ refers to the literal character '^', NOT Control key + value.

 

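Examples:

m?n matches "man", "men", "min" but not "moon".

t*t matches "test", "tonight" and "tea time" (the "tea t" portion) but not "tea

time" (newline between "tea " and "time").

Te+st matches "test", "teest", "teeeest" etc. but does not match "tst".

[aeiou] matches every lowercase vowel.

[,.?] matches a literal ",", "." or "?".

[0-9a-z] matches any digit or lowercase letter.

[~0-9] matches any character except a digit (~ means NOT the following).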

 

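You may search for an expression A or B as follows:

"^{John^}^{Tom^}"

This will search for an occurrence of John or Tom. There should be nothing between the two expressions.

You may combine A or B and C or D in the same search as follows:

"^{John^}^{Tom^} ^{Smith^}^{Jones^}"

This will search for John or Tom followed by Smith or Jones.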

 

 

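The table below shows the syntax for the "Unix"-style regular expressions.

Regular Expressions (Unix Syntax)

Symbol      Function

\           Indicates that the next character has a special meaning. "n" on its own matches the character "n"; "\n" matches a linefeed or newline character.

^           Matches/anchors the beginning of a line.

$           Matches/anchors the end of a line.

*           Matches the preceding character zero or more times.

+           Matches the preceding character one or more times.

.           Matches any single character except a newline character.

(expression) Brackets or tags an expression for use in the replace command. A regular expression may have up to 9 tagged expressions, as needed. The corresponding replacement expression is \x, for x in the range 1-9.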

 

 

Example:

If (h.*o) (f.*s) matches "hello folks",

\2 \1 would replace it with "folks hello".

 

 

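[xyz]       A character set. Matches any one of the characters between the brackets.

[^xyz]      A negative character set. Matches any character NOT between the brackets.

\d          Matches a digit character. Equivalent to [0-9].

\D          Matches a non-digit character. Equivalent to [^0-9].

\f          Matches a form-feed character.

\n          Matches a linefeed character.

\r          Matches a carriage return character.

\s          Matches any whitespace including space, tab, form feed, etc., but not newline.

\S          Matches any non-whitespace character, but not newline.

\t          Matches a tab character.

\v          Matches a vertical tab character.

\w          Matches any word character, including underscore.

\W          Matches any non-word character.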

 

Note: ^ refers to the literal character '^', NOT Control key + value.

 

 

Examples:

m.n       matches "man", "men", "min" but not "moon".

t.*t      matches "test", "tonight" and "tea time" (the "tea t" portion) but not "tea

time" (newline between "tea " and "time").

Te*st     matches "test", "teest", "teeeest" etc., and also "tst".

[aeiou]   matches every lowercase vowel.

[,.?]     matches a literal ",", "." or "?".

[0-9a-z]  matches any digit or lowercase letter.

[^0-9]    matches any character except a digit (^ means NOT the following).

 

 

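You may search for an expression A or B as follows:

"(John)|(Tom)"

This will search for an occurrence of John or Tom. There should be nothing between the two expressions.

You may combine A or B and C or D in the same search as follows:

"(John|Tom) (Smith|Jones)"

This will search for John or Tom followed by Smith or Jones.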

 

 

In addition:

\p        Matches CR/LF (same as \r\n), that is, a DOS line terminator.

 

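When Regular Expressions is not selected for the find/replace, and in the Replace field, the following special characters are also valid:

Symbol      Function

^^          Matches a "^" character.

^s          Is substituted with the selected (highlighted) text of the active file window.

^c          Is substituted with the contents of the clipboard.

^b          Matches a page break.

^p          Matches a newline (CR/LF) (paragraph) (DOS files).

^r          Matches a newline (CR only) (paragraph) (Mac files).

^n          Matches a newline (LF only) (paragraph) (UNIX files).

^t          Matches a tab character.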

Regular Expressions

 

UltraEdit allows for Regular Expressions in many of its search and

replace functions listed under the Search Menu.

 

Regular expressions allow more complex search and replace functions

to be performed in a single operation.

 

There are two possible sets of syntax that may be used.  The first

table below shows the original UltraEdit syntax used in earlier

versions of UltraEdit.  The second table shows the optional "Unix"

style regular expressions.  This may be enabled from the

Configuration Section.

 

Regular Expressions (UltraEdit Syntax):

 

Symbol

 Function

 

%

 Matches the start of line - Indicates the search string must be at

the beginning of a line but does not include any line terminator

characters in the resulting string selected.

 

$

 Matches the end of line - Indicates the search string must be at the

end of line but does not include any line terminator characters in

the resulting string selected.

 

?

 Matches any single character except newline.

 

*

 Matches any number of occurrences of any character except newline.

 

+

 Matches one or more of the preceding character/expression.  At least

one occurrence of the character must be found.  Does not match

repeated newlines.

 

++

 Matches the preceding character/expression zero or more times.  Does

not match repeated newlines.

 

^b

 Matches a page break.

 

^p

 Matches a newline (CR/LF) (paragraph) (DOS Files)

 

^r

 Matches a newline (CR Only) (paragraph) (MAC Files)

 

^n

 Matches a newline (LF Only) (paragraph) (UNIX Files)

 

^t

 Matches a tab character

 

[ ]

 Matches any single character or range in the brackets

 

^{A^}^{B^}

 Matches expression A OR B

 

^

 Overrides the following regular expression character

 

^(^)

 Brackets or tags an expression to use in the replace command.  A

regular expression may have up to 9 tagged expressions, numbered

according to their order in the regular expression.

 

The corresponding replacement expression is ^x, for x in the range

1-9.  Example: If ^(h*o^) ^(f*s^) matches "hello folks", ^2 ^1 would

replace it with "folks hello".

 

 

Note - ^ refers to the character '^' NOT Control Key + value.

 

Examples:

m?n matches "man", "men", "min" but not "moon".

 

t*t matches "test", "tonight" and "tea time" (the "tea t" portion)

but not "tea

time" (newline between "tea " and "time").

 

Te+st matches "test", "teest", "teeeest" etc. but does not match

"tst".

 

[aeiou] matches every lowercase vowel

[,.?] matches a literal ",", "." or "?".

[0-9a-z] matches any digit, or lowercase letter

[~0-9] matches any character except a digit (~ means NOT the

following)

 

You may search for an expression A or B as follows:

 

"^{John^}^{Tom^}"

 

This will search for an occurrence of John or Tom.  There should be

nothing between the two expressions.

 

You may combine A or B and C or D in the same search as follows:

 

"^{John^}^{Tom^} ^{Smith^}^{Jones^}"

 

This will search for John or Tom followed by Smith or Jones.
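
To make the relationship between the two syntaxes concrete, here is a minimal sketch in Python that rewrites a subset of the legacy UltraEdit symbols into a pattern for Python's `re` module. The function name `ue_to_python` and its lookup tables are hypothetical and are not part of UltraEdit. The sketch assumes that `*` behaves like a greedy `.*`, and that `%` will be used together with `re.MULTILINE` so that it anchors at each line start.

```python
import re

# Hypothetical helper, not part of UltraEdit: maps a subset of the legacy
# syntax from the table above onto Python's `re` syntax.
_CARET = {
    "b": r"\f",    # page break (form feed)
    "p": r"\r\n",  # DOS newline
    "r": r"\r",    # CR-only newline
    "n": r"\n",    # LF-only newline
    "t": r"\t",    # tab
    "(": "(",      # start of a tagged expression
    ")": ")",      # end of a tagged expression
    "{": "(?:",    # start of an "A or B" group
}
_SIMPLE = {"%": "^", "$": "$", "?": ".", "*": ".*", "+": "+"}


def ue_to_python(pattern: str) -> str:
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "^" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt == "}":
                # "^}^{" with nothing in between means "or"; else close the group.
                if pattern[i + 2:i + 4] == "^{":
                    out.append("|")
                    i += 4
                else:
                    out.append(")")
                    i += 2
            else:
                # Known caret sequence, otherwise "^" overrides the next character.
                out.append(_CARET.get(nxt, re.escape(nxt)))
                i += 2
        elif ch == "[" and "]" in pattern[i + 1:]:
            end = pattern.index("]", i + 1)
            body = pattern[i + 1:end]
            negated = body.startswith("~")  # "~" means NOT in the legacy syntax
            if negated:
                body = body[1:]
            body = body.replace("\\", "\\\\").replace("^", "\\^")
            out.append("[" + ("^" if negated else "") + body + "]")
            i = end + 1
        elif pattern.startswith("++", i):
            out.append("*")  # "++" is zero or more of the preceding character
            i += 2
        elif ch in _SIMPLE:
            out.append(_SIMPLE[ch])
            i += 1
        else:
            out.append(re.escape(ch))  # everything else is a literal
            i += 1
    return "".join(out)


print(ue_to_python("m?n"))  # m.n
print(re.sub(ue_to_python("^(h*o^) ^(f*s^)"), r"\2 \1", "hello folks"))  # folks hello
```

The converter only covers the symbols listed in the table. Anything UltraEdit-specific beyond that (for example how greedy `*` really is) is left as an assumption.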

 

The table below shows the syntax for the "Unix" style regular

expressions.

 

Regular Expressions (Unix Syntax):

 

Symbol

 Function

 

\

 Indicates the next character has a special meaning. "n" on its own

matches the character "n". "\n" matches a linefeed or newline

character.  See examples below (\d, \f, \n etc).

 

^

 Matches/anchors the beginning of line.

 

$

 Matches/anchors the end of line.

 

*

 Matches the preceding character zero or more times.

 

+

 Matches the preceding character one or more times. Does not match

repeated newlines.

 

.

 Matches any single character except a newline character. Does not

match repeated newlines.

 

(expression)

 Brackets or tags an expression to use in the replace command. A

regular expression may have up to 9 tagged expressions, numbered

according to their order in the regular expression.

 

The corresponding replacement expression is \x, for x in the range

1-9.  Example: If (h.*o) (f.*s) matches "hello folks", \2 \1 would

replace it with "folks hello".

 

[xyz]

 A character set. Matches any one of the characters between the brackets.

 

[^xyz]

 A negative character set. Matches any character NOT between the

brackets.

 

\d

 Matches a digit character. Equivalent to [0-9].

 

\D

 Matches a nondigit character. Equivalent to [^0-9].

 

\f

 Matches a form-feed character.

 

\n

 Matches a linefeed character.

 

\r

 Matches a carriage return character.

 

\s

 Matches any whitespace including space, tab, form-feed, etc., but not

newline.

 

\S

 Matches any non-whitespace character but not newline.

 

\t

 Matches a tab character.

 

\v

 Matches a vertical tab character.

 

\w

 Matches any word character including underscore.

 

\W

 Matches any nonword character.

 

\p

 Matches CR/LF (same as \r\n) to match a DOS line terminator
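
Most rows of the table above carry over to other regex engines. The sketch below uses Python's `re` module as a stand-in, which is an assumption, since UltraEdit's own engine may differ in details. Two differences are worth knowing: Python has no `\p`, so `\r\n` is written out, and Python's `\s` does match newlines, unlike the `\s` described in the table.

```python
import re

# Tagged expressions: \1..\9 refer to the parenthesised groups in order.
swapped = re.sub(r"(h.*o) (f.*s)", r"\2 \1", "hello folks")
print(swapped)  # folks hello

# Python has no \p; the table defines it as \r\n, so that is spelled out.
dos_text = "first line\r\nsecond line\r\n"
unix_text = re.sub(r"\r\n", "\n", dos_text)
print(repr(unix_text))  # 'first line\nsecond line\n'

# \d and \w as in the table (\w includes the underscore).
digits = re.findall(r"\d+", "rev 42, build 7")
words = re.findall(r"\w+", "snake_case name")
print(digits)  # ['42', '7']
print(words)   # ['snake_case', 'name']
```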

 

 

Note - ^ refers to the character '^' NOT Control Key + value.

 

Examples:

m.n matches "man", "men", "min" but not "moon".

 

Te+st matches "test", "teest", "teeeest" etc. BUT NOT "tst".

 

Te*st matches "test", "teest", "teeeest" etc. AND "tst".

 

[aeiou] matches every lowercase vowel

[,.?] matches a literal ",", "." or "?".

[0-9a-z] matches any digit, or lowercase letter

[^0-9] matches any character except a digit (^ means NOT the

following)
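
The examples above can be checked mechanically. The following sketch uses Python's `re` module and a hypothetical helper named `first_match`. It passes `re.IGNORECASE` because the examples pair `Te+st` with lowercase "test", which suggests case-insensitive matching; that reading is an assumption.

```python
import re


def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = re.search(pattern, text, re.IGNORECASE)
    return m.group(0) if m else None


print(first_match(r"m.n", "men"))        # men
print(first_match(r"m.n", "moon"))       # None
print(first_match(r"Te+st", "teeeest"))  # teeeest
print(first_match(r"Te+st", "tst"))      # None
print(first_match(r"Te*st", "tst"))      # tst
print(first_match(r"[,.?]", "why?"))     # ?
print(first_match(r"[^0-9]", "42a"))     # a
```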

 

You may search for an expression A or B as follows:

 

"(John|Tom)"

 

This will search for an occurrence of John or Tom.  There should be

nothing between the two expressions.

 

You may combine A or B and C or D in the same search as follows:

 

"(John|Tom) (Smith|Jones)"

 

 

This will search for John or Tom followed by Smith or Jones.
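
As a small illustration of the combined alternation, here is the same pattern run through Python's `re` module. Python is used only as a convenient stand-in for a Unix-style engine, and the sample sentence is made up.

```python
import re

full_name = re.compile(r"(John|Tom) (Smith|Jones)")

sample = "John Smith met Tom Jones and Tom Brown."
# findall returns one (first, last) tuple per match because of the two groups.
found = full_name.findall(sample)
print(found)  # [('John', 'Smith'), ('Tom', 'Jones')]
```

"Tom Brown" is skipped because "Brown" is not one of the second group's alternatives.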

 

When Regular Expressions is not selected for the find/replace, and in

the Replace field, the following special characters are also valid:

 

Symbol

 Function

 

^^

 Matches a "^" character

 

^s

 Is substituted with the selected (highlighted) text of the active

file window.

 

^c

 Is substituted with the contents of the clipboard.

 

^b

 Matches a page break

 

^p

 Matches a newline (CR/LF) (paragraph) (DOS Files)

 

^r

 Matches a newline (CR Only) (paragraph) (MAC Files)

 

^n

 Matches a newline (LF Only) (paragraph) (UNIX Files)

 

^t

 Matches a tab character

 

 

Note - ^ refers to the character '^' NOT Control Key + value. 
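
To show how the caret sequences in the last table could be interpreted in a Replace field, here is a rough Python sketch. The function `expand_replace_field` is hypothetical. The selection and clipboard are passed in as plain strings because the sketch has no access to an editor. It assumes a page break is a form feed and that unknown sequences are left unchanged.

```python
def expand_replace_field(text, selection="", clipboard=""):
    """Expand ^^, ^s, ^c, ^b, ^p, ^r, ^n and ^t in a replace string."""
    table = {
        "^": "^",        # literal caret
        "s": selection,  # selected text of the active window (passed in)
        "c": clipboard,  # clipboard contents (passed in)
        "b": "\f",       # page break, assumed to be a form feed
        "p": "\r\n",     # DOS newline
        "r": "\r",       # CR-only newline
        "n": "\n",       # LF-only newline
        "t": "\t",       # tab
    }
    out = []
    i = 0
    while i < len(text):
        if text[i] == "^" and i + 1 < len(text) and text[i + 1] in table:
            out.append(table[text[i + 1]])
            i += 2
        else:
            out.append(text[i])  # ordinary character, or an unknown sequence
            i += 1
    return "".join(out)


print(repr(expand_replace_field("name^tvalue^p")))  # 'name\tvalue\r\n'
print(expand_replace_field("[^c]", clipboard="X"))  # [X]
```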

 

Pasted from <http://www.niwota.com/submsg/1966636/>

 

posted on 2009-10-13 18:36 DoNeY

