
    zhyiwww
    Recording the bits and pieces of the programming journey with a plain pen...

    A Tao of Regular Expressions

    Steve Mansour
    sman@scruznet.com
    Revised: June 5, 1999
    (copied by jm /at/ jmason.org from http://www.scruz.net/%7esman/regexp.htm, after the original disappeared! )


    C O N T E N T S

    What Are Regular Expressions
    Examples
        Simple
        Medium (Strange Incantations)
        Hard (Magical Hieroglyphics)
    Regular Expressions In Various Tools


    What Are Regular Expressions

    A regular expression is a formula for matching strings that follow some pattern. Many people are afraid to use them because they can look confusing and complicated. Unfortunately, nothing in this write-up can change that. However, I have found that with a bit of practice, it's pretty easy to write these complicated expressions. Plus, once you get the hang of them, you can reduce hours of laborious and error-prone text editing to minutes or seconds. Regular expressions are supported by many text editors, by class libraries such as Rogue Wave's Tools.h++, by scripting tools such as awk, grep, and sed, and increasingly by interactive development environments such as Microsoft's Visual C++.

    Regular expression usage is explained by example in the sections that follow. Most examples are presented as vi substitution commands or as grep file search commands, but they are representative, and the concepts apply equally to tools such as sed, awk, perl, and other programs that support regular expressions. Have a look at Regular Expressions In Various Tools for examples of regular expression usage in other tools. A short explanation of vi's substitution command and syntax is provided at the end of this document.

    Regular Expression Basics

    Regular expressions are made up of normal characters and metacharacters. Normal characters include upper and lower case letters and digits. The metacharacters have special meanings and are described in detail below.

    In the simplest case, a regular expression looks like a standard search string. For example, the regular expression "testing" contains no metacharacters. It will match "testing" and "123testing" but it will not match "Testing".
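    The "testing" example above can be checked with a minimal sketch. Python's re module is used here purely as an assumption for illustration (the article itself works with vi and grep), and the helper name contains_match is hypothetical.

```python
import re

# A pattern with no metacharacters behaves like a plain,
# case-sensitive substring search.
pattern = re.compile("testing")


def contains_match(text):
    """Return True if the pattern matches anywhere in text."""
    return pattern.search(text) is not None


for candidate in ["testing", "123testing", "Testing"]:
    print(candidate, contains_match(candidate))
```

    The search succeeds for "testing" and "123testing" because the pattern may match anywhere in the string, and fails for "Testing" because matching is case sensitive by default.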

    To really make good use of regular expressions it is critical to understand metacharacters. The table below lists metacharacters and a short explanation of their meaning.

    Metacharacter    Description

    .
        Matches any single character. For example, the regular expression r.t would match the strings rat, rut, and r t, but not root.
    $
        Matches the end of a line. For example, the regular expression weasel$ would match the end of the string "He's a weasel" but not the string "They are a bunch of weasels."
    ^
        Matches the beginning of a line. For example, the regular expression ^When in would match the beginning of the string "When in the course of human events" but would not match "What and When in the".
    *
        Matches zero or more occurrences of the character immediately preceding. For example, the regular expression .* means match any number of any characters.
    \
        This is the quoting character; use it to treat the following character as an ordinary character. For example, \$ is used to match the dollar sign character ($) rather than the end of a line. Similarly, the expression \. is used to match the period character rather than any single character.
    [ ]
    [c1-c2]
    [^c1-c2]
        Matches any one of the characters between the brackets. For example, the regular expression r[aou]t matches rat, rot, and rut, but not ret. Ranges of characters can be specified by using a hyphen. For example, the regular expression [0-9] means match any digit. Multiple ranges can be specified as well. The regular expression [A-Za-z] means match any upper or lower case letter. To match any character except those in the range (the complement range), use the caret as the first character after the opening bracket. For example, the expression [^269A-Z] will match any character except 2, 6, 9, and upper case letters.
    \< \>
        Matches the beginning (\<) or end (\>) of a word. For example, \<the matches "the" in the string "for the wise" but does not match "the" in "otherwise". NOTE: this metacharacter is not supported by all applications.
    \( \)
        Treats the expression between \( and \) as a group. Also saves the characters matched by the expression into temporary holding areas. Up to nine pattern matches can be saved in a single regular expression. They can be referenced as \1 through \9.
    |
        Ors two conditions together. For example, (him|her) matches the line "it belongs to him" and matches the line "it belongs to her" but does not match the line "it belongs to them." NOTE: this metacharacter is not supported by all applications.
    +
        Matches one or more occurrences of the character or regular expression immediately preceding. For example, the regular expression 9+ matches 9, 99, 999. NOTE: this metacharacter is not supported by all applications.
    ?
        Matches 0 or 1 occurrence of the character or regular expression immediately preceding. NOTE: this metacharacter is not supported by all applications.
    \{ i \}
    \{ i , j \}
        Matches a specific number of instances, or a number of instances within a range, of the preceding character. For example, the expression A[0-9]\{3\} will match "A" followed by exactly 3 digits. That is, it will match A123 but not the whole of A1234 (an unanchored search still finds the leading A123 inside A1234). The expression [0-9]\{4,6\} matches any sequence of 4, 5, or 6 digits. NOTE: this metacharacter is not supported by all applications.
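    A few of the table entries can be tried out in a short sketch. It assumes Python's re module, whose syntax differs from the escaped forms shown above: groups are written ( ) rather than \( \), intervals {i,j} rather than \{i,j\}, and \b stands in for both \< and \>. The variable names and the doubled-word example are illustrative only.

```python
import re

# Word boundary: \b plays the role of \< and \> in Python.
word_the = re.compile(r"\bthe")

# Group plus backreference: \1 repeats whatever the group matched,
# so this finds a word that is immediately repeated.
doubled_word = re.compile(r"\b([A-Za-z]+) \1\b")

# Alternation inside a group.
him_or_her = re.compile(r"(him|her)")

# Interval: "A" followed by exactly three digits.
a_three_digits = re.compile(r"A[0-9]{3}")

print(word_the.search("for the wise") is not None)
print(word_the.search("otherwise") is not None)
print(doubled_word.search("it is is here").group(1))
print(him_or_her.search("it belongs to them.") is not None)
# fullmatch requires the whole string to match; search does not.
print(a_three_digits.fullmatch("A1234") is not None)
print(a_three_digits.search("A1234").group())
```

    The last two lines show why anchoring matters: as a whole-string match A1234 is rejected, while an unanchored search still finds A123 inside it.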


    The simplest metacharacter is the dot. It matches any one character (excluding the newline character). Consider a file named test.txt consisting of the following lines:

      he is a rat
      he is in a rut
      the food is Rotten
      I like root beer
    We can use grep to test our regular expressions. Grep uses the regular expression we supply and tries to match it to every line of the file. It prints all lines where the regular expression matches at least one sequence of characters on a line. The command
      grep r.t test.txt
    searches for the regular expression r.t in each line of test.txt and prints the matching lines. The regular expression r.t matches an r followed by any character followed by a t. It will match rat and rut. It does not match the Rot in Rotten because regular expressions are case sensitive. To match both upper and lower case, the square brackets (character range metacharacters) can be used. The regular expression [Rr] matches either R or r. So, to match an upper or lower case r followed by any character followed by the character t, the regular expression [Rr].t will do the trick.
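    The grep run described above can be imitated with a minimal sketch. It assumes Python's re module in place of grep, keeps the four lines of test.txt in a list instead of a file, and uses a hypothetical helper named grep that returns every line where the pattern matches at least once.

```python
import re

# The contents of test.txt from the text, one entry per line.
TEST_LINES = [
    "he is a rat",
    "he is in a rut",
    "the food is Rotten",
    "I like root beer",
]


def grep(pattern, lines):
    """Return the lines in which pattern matches at least once."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


print(grep("r.t", TEST_LINES))     # lower case r only
print(grep("[Rr].t", TEST_LINES))  # either case of r
```

    The first call returns the rat and rut lines; the second also picks up the Rotten line. The root beer line matches neither, since r.t allows exactly one character between the r and the t.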

    To match characters at the beginning of a line, use the circumflex character (sometimes called a caret). For example, to find the lines in test.txt that begin with the word "he", you might first think to use the simple expression he. However, this would also match the "he" inside "the" in the third line. The regular expression ^he matches "he" only at the beginning of a line.
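    The difference between he and ^he can be seen in a small sketch, again assuming Python's re module rather than grep; the helper name matching_lines is hypothetical and the list holds the four lines of test.txt.

```python
import re

TEST_LINES = [
    "he is a rat",
    "he is in a rut",
    "the food is Rotten",
    "I like root beer",
]


def matching_lines(pattern, lines):
    """Return the lines in which pattern matches at least once."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


# Without the anchor, the "he" inside "the" also counts as a match.
unanchored = matching_lines("he", TEST_LINES)
# With the anchor, only lines that start with "he" are returned.
anchored = matching_lines("^he", TEST_LINES)

print(unanchored)
print(anchored)
```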

    Sometimes it is easier to indicate what should not be matched rather than all the cases that should be matched. When the circumflex is the first character between the square brackets, it means to match any character which is not in the range. For example, to match he when it is not preceded by t or s, the following regular expression can be used: [^st]he. Because [^st] must itself match a character, this expression does not match he at the very beginning of a line.
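    A brief sketch of the complemented range, assuming Python's re module; the sample strings and the helper first_match are made up for illustration. Note that the bracket expression consumes one character, so the matched text is three characters long.

```python
import re

# "he" preceded by any single character other than s or t.
not_after_s_or_t = re.compile(r"[^st]he")


def first_match(text):
    """Return the first matched substring, or None if there is none."""
    match = not_after_s_or_t.search(text)
    return match.group() if match else None


for sample in ["the cat", "she ran", "ache", "say hello", "he is a rat"]:
    print(repr(sample), "->", repr(first_match(sample)))
```

    The strings "the cat" and "she ran" are rejected as intended, "ache" and "say hello" match (with the c and the space as the leading character), and "he is a rat" does not match because nothing precedes the he.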

    Several character ranges can be specified between the square brackets. For example, the regular expression [A-Za-z] matches any letter in the alphabet, upper or lower case. The regular expression [A-Za-z][A-Za-z]* matches a letter followed by zero or more letters. We can use the + metacharacter to do the same thing. That is, the regular expression [A-Za-z]+ means the same thing as [A-Za-z][A-Za-z]*. Note that the + metacharacter is not supported by all programs that have regular expressions. See Regular Expressions Syntax Support for more details.
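    The equivalence of the two forms can be checked with a minimal sketch, assuming Python's re module (which supports +); the sample string is arbitrary.

```python
import re

plus_form = re.compile(r"[A-Za-z]+")
star_form = re.compile(r"[A-Za-z][A-Za-z]*")

sample = "abc 123 Xy z9"

# Both patterns pick out the same runs of letters.
print(plus_form.findall(sample))
print(star_form.findall(sample))
```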

    To specify the number of occurrences matched, use braces (they must be escaped with a backslash). As an example, to match all instances of 100 and 1000 but not 10 or 10000, use the following: 10\{2,3\}. This regular expression matches the digit 1 followed by either 2 or 3 0's. Note that, taken on its own, it also matches the first four characters of 10000, so surround it with delimiters such as \< and \> (where they are supported) if longer numbers must be excluded. A useful variation is to omit the second number. For example, the regular expression 0\{3,\} will match 3 or more successive 0's.
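    The interval examples can be sketched as follows, assuming Python's re module, where the braces are written without backslashes. The helper whole_string_matches is hypothetical; it uses fullmatch so that the entire string, not just part of it, has to fit the pattern.

```python
import re

# A 1 followed by two or three 0's (10\{2,3\} in the notation above).
two_or_three_zeros = re.compile(r"10{2,3}")
# Three or more successive 0's (0\{3,\} in the notation above).
three_or_more_zeros = re.compile(r"0{3,}")


def whole_string_matches(regex, text):
    """Return True only if regex matches the entire text."""
    return regex.fullmatch(text) is not None


for number in ["10", "100", "1000", "10000"]:
    print(number, whole_string_matches(two_or_three_zeros, number))

# An unanchored search still finds a match inside a longer number.
print(two_or_three_zeros.search("10000").group())
# The open-ended interval is greedy and takes every available 0.
print(three_or_more_zeros.search("1000000").group())
```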



    |----------------------------------------------------------------------------------------|
                               Copyright notice: all rights reserved @zhyiwww
                When quoting, please cite the source: http://m.tkk7.com/zhyiwww
    |----------------------------------------------------------------------------------------|
    posted on 2006-06-20 11:04 zhyiwww  Category: Data Structures and Algorithms

    FeedBack:
    # re: A Tao of Regular Expressions (reposted)
    2006-10-06 23:14 | refun
    The article was not posted in full.
      