Strip white space from beginning of line.

$ sed 's/^\s*//g' <file>

Strip extra white space from file.

$ sed 's/\s\s+/\s/g' <file>

Basic Knowledge

This will be updated shortly -- Last update was on 2007-11-21 12:06:00

The complete list of metacharacters are: . ^ $ * + ? { [ ] \ \| ( ).

http://www.amk.ca/python/howto/regex/ »

Convert text link to real html link with PHP

$item = preg_replace("/(?:(http:\/\/)|(www\.))(\S+\b\/?)([ [:punct:]]*)(\s|$)/i","<a href=\"http://$2$3\">$1$2$3</a>$4$5", $item);

Useful Regular Expressions

email: ^[a-z0-9_\-]+(\.[_a-z0-9\-]+)*@([_a-z0-9\-]+\.)+([a-z]{2}|aero|arpa|biz|com|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel)$
ip: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$
html link: <a[^>]*href="[^\s"]+"[^>]*>[^<]*<\/a>
url: ^((https?|ftp|news):\/\/)?([a-z]([a-z0-9\-]*\.)+([a-z]{2}|aero|arpa|biz|com|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel)|(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]))(\/[a-z0-9_\-\.~]+)*(\/([a-z0-9_\-\.]*)(\?[a-z0-9+_\-\.%=&]*)?)?(#[a-z][a-z0-9_]*)?$

Greedy and Lazy Repetition

The repetition operators or quantifiers are greedy. They will expand the match as far as they can, and only give back if they must to satisfy the remainder of the regex. The regex <.+> will match <EM>first</EM> in This is a <EM>first</EM> test.

Place a question mark after the quantifier to make it lazy. <.+?> will match <EM> in the above string.

A better solution is to follow my advice to use the dot sparingly. Use <[^<>]+> to quickly match an HTML tag without regard to attributes. The negated character class is more specific than the dot, which helps the regex engine find matches quickly.

Example with perl

# perl -p -i.wjs -e "s?<(em)\\?>?<htmltag>?g" file.html

http://www.regular-expressions.info/quickstart.html »

Page Comments (Click to edit)






[Click to add or edit comments])

Please prepend comments below including a date

Design by N.Design Studio, adapted by solidGone.org (version 1.0.0)
Have a nice day.