sed tutorial

intro


sed - Stream EDitor - works as a filter processing input line by line.
Sed reads one line at a time, chops off the terminating newline, and puts what is left into the pattern space (a buffer), where the sed script can process it (often using regular expressions). Then, if there is anything to print, sed appends a newline and prints the result (usually to stdout).
• www.engin.umich.edu/htbin/mangate?manpage=sed - sed

There are many tutorials and FAQs - search google for sed tutorial or sed faq

- www.faqs.org/faqs/editor-faq/sed/ - FAQ (also www.ptug.org/sed/sedfaq.html )
- www.dreamwvr.com/sed-info/sed-faq.html - FAQ
- http://spacsun.rice.edu/FAQ/sed.html - short 1-page intro - examples of usage
- http://spazioweb.inwind.it/seders/tutorials/ - list of links to sed tutorials
- http://spazioweb.inwind.it/seders/tutorials/sedtut_4.txt - original manual (1978, by Lee E. McMahon, with the classic "Kubla Khan" example)
- www.math.fu-berlin.de/~leitner/sed/tutorial.html - tutorial by Felix von Leitner

Print out and read the following:

- www.dbnet.ece.ntua.gr/~george/sed/sedtut_1.html - good tutorial by Carlos Duarte
- www.dbnet.ece.ntua.gr/~george/sed/1liners.html - one-liners (compiled by Eric Pement) (see also www.cornerstonemag.com/sed/sed1line.txt )

Books:
Sed & Awk, 2d edition, by Dale Dougherty & Arnold Robbins (O'Reilly, 1997)
Mastering Regular Expressions, by Jeffrey E. F. Friedl (O'Reilly, 1997)

Several more sites: - Yao-Jen Chang - Sven Guckes - Felix von Leitner - Yiorgos Adamopoulos - Eric Pement

sed examples #1

Simple commands = pattern + action. If no pattern is given, the action is applied to all lines; otherwise it is applied only to lines matching the pattern. Regular expressions are similar to those in Perl, although sed's default syntax requires backslashes before several operators (see the regular expressions section). Here is a typical example of usage:
Example:

>cat file
I have three dogs and two cats
>sed -e 's/dog/cat/g' -e 's/cat/elephant/g' file
I have three elephants and two elephants
>

The way you usually use sed is as follows:

>sed -e 'command1' -e 'command2' -e 'command3' file
>{shell command}|sed -e 'command1' -e 'command2'
>sed -f sedscript.sed file
>{shell command}|sed -f sedscript.sed

so sed can read from a file or STDIN, and the commands can be specified in a file or on the command line.
Note: trailing whitespace in the sed script file can cause scripts to fail. Use an editor that can show trailing spaces and lets you remove them (vim is a good choice).

Example:
To delete the first 10 lines of stdin and echo the rest to stdout:
sed -e '1,10d'

The -e tells sed to execute the next command line argument as a sed program.
1,10 - pattern
d - action (delete - general syntax is [address1[ , address2 ] ]d )

Note that since sed programs often contain regular expressions, they often contain characters that your shell interprets. Get used to putting all sed programs in single quotes so that your shell won't interpret them.

Example: show only lines which match the pattern /line/:
sed -n -e '/line/p' test.txt

The -n suppresses printing for all the lines
the p activates printing for matched lines
test.txt is an input file to which this sed command is applied

Example: To print only the first ten lines, delete all the lines starting with line 11:
sed -e '11,$d'

Note that $ is the last line. Because sed(1) processes the input line by line, it does not keep the whole input in memory. This makes sed(1) very useful for processing large files, but it has its drawbacks, too. For example, we can't use sed -e '$-10,$d', since sed doesn't know where $ is before it reaches the end of the file, so it doesn't know where $-10 is. This is a real limitation, but sed(1) still has a large number of applications.
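A common workaround for the missing $-10 address is to keep a sliding window of lines in the pattern space, so that the last lines are still buffered (and can be discarded) when the end of input arrives. The sketch below is one such idiom, similar to those found in sed one-liner collections; the window size of 10 and the generated 15-line input are assumptions made for illustration.

```shell
# Generate 15 numbered lines as sample input (hypothetical data).
input=$(i=1; while [ "$i" -le 15 ]; do echo "$i"; i=$((i + 1)); done)

# Delete the last 10 lines without knowing the line count in advance:
#   $d       on the last input line, discard the buffered window
#   N        append the next input line to the pattern space
#   2,10ba   keep looping until the window holds 11 lines
#   P;D      print and drop the oldest line, then restart the cycle
all_but_last_10=$(printf '%s\n' "$input" | sed -e :a -e '$d;N;2,10ba' -e 'P;D')

echo "$all_but_last_10"
```

With the 15-line sample this prints lines 1 through 5. The price is that sed now holds up to 11 lines in memory at once instead of one.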

Example: Another way to get only the first 10 lines is to use the -n option:
sed -n -e '1,10p'
If we want to delete only one line, the pattern can be '10,10' or simply '10'.

Example: More Than One Command (separated by new lines):
sed -e '1,4d
6,9d'
This would delete the lines 1 to 4 and 6 to 9.

Example: use the -e option more than once:
sed -e '1,4d' -e '6,9d'

Note: you can omit -e option if you have only one command in your program. But you should get used to the -e option, so you won't have to add it if you want to extend your program later on.

sed commands

General syntax for a command is:
[address1[,address2]] function [arguments]

if no address is given, a command is applied to all lines
if 1 address is given, then it is applied to all pattern spaces that match that address
if 2 addresses are given, then it is applied to all from addr1 to addr2 (including addr1 and addr2 themselves).

Note: Addresses may be expressed as line numbers or as patterns. With patterns, the command is applied to each group of lines from a match of address1 to the next match of address2. If there are several such groups in one file, all of them are affected.
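To see how a pair of pattern addresses selects several groups of lines, here is a small sketch. The BEGIN/END markers and the sample text are made up for the illustration.

```shell
# Hypothetical sample input with two BEGIN/END groups.
input='header
BEGIN
alpha
END
middle
BEGIN
beta
END
footer'

# Print only the lines from each /BEGIN/ through the next /END/, inclusive.
groups=$(printf '%s\n' "$input" | sed -n -e '/BEGIN/,/END/p')

echo "$groups"
```

Both groups are printed, while header, middle and footer are skipped.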

Command example:
1,2s/line/LINE/

Table of commands (the number in parentheses is the maximum number of addresses the command accepts):

(2) !cmd                       exclamation sign means "don't apply cmd to the specified addresses"
(0) #                          comment
(0) :label                     place a label
(1) =                          display line number
(2) D                          delete first part of the pattern space (up to the first newline)
(2) G                          append contents of hold area to the pattern space
(2) H                          append pattern space to the hold area
(2) N                          append next input line to the pattern space
(2) P                          print first part of the pattern space (up to the first newline)
(1) a                          append text
(2) b label                    branch to label
(2) c                          change lines
(2) d                          delete lines
(2) g                          get contents of hold area (replaces the pattern space)
(2) h                          hold pattern space (copy it into the hold area)
(1) i                          insert lines
(2) l                          list lines (showing non-printable characters)
(2) n                          next line
(2) p                          print
(1) q                          quit
(1) r file                     read the contents of file
(2) t label                    test substitutions and branch on successful substitution
(2) w file                     write to file
(2) x                          exchange hold area with pattern space
(2) {                          group commands
(2) s/RE/replacement/[flags]   substitute
(2) y/list1/list2/             translate characters in list1 into the corresponding characters in list2
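The hold-area commands are easier to follow with a small example. The sketch below reverses the order of lines by accumulating them in the hold area; the three-line input is just sample data.

```shell
# Reverse the order of lines using the hold area:
#   1!G  on every line except the first, append the hold area to the pattern space
#   h    copy the pattern space (current line plus all earlier lines) to the hold area
#   $p   on the last line, print the accumulated result
reversed=$(printf 'one\ntwo\nthree\n' | sed -n -e '1!G' -e 'h' -e '$p')

echo "$reversed"
```

Note that this keeps the whole input in memory, so it is only practical for inputs of moderate size.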

regular expressions

The sed regular expressions are essentially the same as the grep regular expressions. They are summarized below.
Note that you have to escape many operators with backslashes:
   curlies \{ \} , round brackets \( \) , vertical bars \| , plus \+ , question mark \?
The star * is the exception: it is a repetition operator when written without a backslash, and \* matches a literal star. The \+ , \? and \| forms are GNU extensions.

^ matches the beginning of the line

$ matches the end of the line

. dot matches any single character

...   * match zero or more occurrences of (char or something)

...   \+ match one or more occurrences of (char or something)

...   \? match 0 or 1 occurrence of (char or something)

[abcdef] Match any character enclosed in [] (in this instance, a b c d e or f). Ranges of characters such as [a-z] are permitted. See the grep documentation for more details about the syntax of lists.
To include `]' in the list, make it the first character; to include `-' in the list, make it the first or last.

[^abcdef] Match any character NOT enclosed in [] (in this instance, any character other than a b c d e or f)

(character)\{m,n\} Match m-n repetitions of (character)

(character)\{m,\} Match m or more repetitions of (character)

(character)\{,n\} Match n or less (possibly 0) repetitions of (character)

(character)\{n\} Match exactly n repetitions of (character)

\(expression\) Group operator. Also memorizes the match into numbered variables - use for backreferences as \1 \2 .. \9

expression1\|expression2 Matches expression1 or expression2. Works with GNU sed, but this feature might not work with other forms of sed.

\1 \2 ...\9 Backreference, matches the text of the i-th memorized \(..\)
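A few short commands may help tie the operators together. This is only a sketch with made-up input strings; it sticks to groups, backreferences and intervals, which do not depend on GNU extensions.

```shell
# Swap two words using groups \( \) and backreferences \1 \2.
swapped=$(echo 'hello world' | sed -e 's/\(hello\) \(world\)/\2 \1/')
echo "$swapped"      # world hello

# Keep only lines that consist of exactly three digits, using the interval \{3\}.
digits=$(printf '12\n123\n1234\n' | sed -n -e '/^[0-9]\{3\}$/p')
echo "$digits"       # 123

# A backreference inside the pattern itself: find a doubled letter.
doubled=$(printf 'book\ncat\n' | sed -n -e '/\([a-z]\)\1/p')
echo "$doubled"      # book
```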

sed examples

Example: delete all the lines that contain the word ``debug'' from the log file:
sed -e '/debug/d' < log
This works just like grep -v debug.

Example: we want only the lines that contain ``foo'', but without those that contain the word debug. The traditional way to handle this would be:
grep 'foo' < log | grep -v debug
Note that this spawns two grep processes. The sed equivalent would be:
sed -n -e '/debug/d' -e '/foo/p'
Here -n option inhibits printing, first pattern deletes all the lines with /debug/, and the second command forces printing of some of the remaining lines (which match /foo/).
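As a quick check that the two approaches agree, the following sketch runs both on the same input. The log contents are invented for the example.

```shell
# Hypothetical log contents.
log='foo started
foo debug trace
bar debug trace
bar finished
foo finished'

# Two grep processes versus a single sed process.
with_grep=$(printf '%s\n' "$log" | grep 'foo' | grep -v debug)
with_sed=$(printf '%s\n' "$log" | sed -n -e '/debug/d' -e '/foo/p')

echo "$with_sed"
```

Both variables end up holding the two lines "foo started" and "foo finished".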

Example: Calling sed program from a file:
sed -f program.sed
To set the -n option from within your sed program, use ``#n'' as the first line of your program file.
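Here is a minimal sketch of a program file that starts with #n. The temporary file, the sample input and the substitution are assumptions chosen for the illustration.

```shell
# Write a small sed program to a temporary file (the path is arbitrary).
script=$(mktemp)
cat > "$script" <<'EOF'
#n
# print only lines containing "foo", with the first "foo" upper-cased
/foo/{
s/foo/FOO/
p
}
EOF

from_file=$(printf 'foo one\nbar two\nfoo three\n' | sed -f "$script")
rm -f "$script"

echo "$from_file"
```

Because of the #n line, nothing is printed automatically, so only the two lines printed explicitly by p appear in the output.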

Example: Inserting Text with 'a' (append) or 'i' (insert) actions:
To insert a string just before line 10.

10i\
I am a string

To append a string after the last line:

$a\
I am a string

Example: Replacing the current line:

10c\
new contents for line 10
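From the command line, the same commands can be given inside one quoted program, with the text on the line after the backslash. The sketch below combines all three on a made-up three-line input.

```shell
input='first
second
third'

# c\ replaces line 1, i\ inserts before line 2,
# and a\ appends after the last line ($).
edited=$(printf '%s\n' "$input" | sed -e '1c\
FIRST (replaced)
2i\
-- inserted before line 2
$a\
-- appended after the last line')

echo "$edited"
```

The output has five lines: the replaced first line, the inserted line, "second", "third", and the appended line.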

Example: the 'l' command (as in 'list') causes sed to show all non-printable characters visibly and to wrap long lines using '\' at the end. The end of each line is marked with '$'. Backslashes in the text are escaped, tabs are shown as \t, and other non-printable characters are printed as escaped three-digit octal numbers.

sed -n -e 'l' <test.txt
a\tb\tc$
d\te\tf$

Example: use 'q' action to end processing. So, yet another way of printing the first 10 lines would have been:
sed -e '10q'

Example: substitutions using regular expressions: 's/pattern/replacement/[flags]' - this is the most often used sed command.
sed -e 's/foo/bar/'
which would just change the string ``foo'' to ``bar''.

The format for the substitute command is as follows:
[address1[ ,address2]]s/pattern/replacement/[flags]
The flags can be any of the following:

n replace nth instance of pattern with replacement

g replace all instances of pattern with replacement

p write pattern space to STDOUT if a successful substitution takes place

w file write the pattern space to file if a successful substitution takes place

Note: we can use different delimiters (for example one of these: @%,;:) instead of '/'.
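The flags and the alternative delimiters can be tried out quickly. The following is a sketch with invented input strings.

```shell
# A numeric flag replaces only the nth match on each line.
second_only=$(echo 'a-a-a-a' | sed -e 's/a/X/2')
echo "$second_only"     # a-X-a-a

# The g flag replaces every match.
every=$(echo 'a-a-a-a' | sed -e 's/a/X/g')
echo "$every"           # X-X-X-X

# With -n, the p flag prints only the lines where a substitution happened.
changed=$(printf 'cat\ndog\n' | sed -n -e 's/cat/lion/p')
echo "$changed"         # lion

# A different delimiter avoids escaping the slashes in a path.
path=$(echo '/usr/local/bin' | sed -e 's@/usr/local@/opt@')
echo "$path"            # /opt/bin
```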

Example:

>cat file
the black cat was chased by the brown dog
>sed -e 's/black/white/g' file
the white cat was chased by the brown dog

Example: do substitution only in lines which match some pattern. In this example, the substitution is only applied to lines matching the regular expression /often/.

>cat file
the black cat was chased by the brown dog.
the black cat was often chased by the brown dog.
>sed -e '/often/s/black/white/g' file
the black cat was chased by the brown dog.
the white cat was often chased by the brown dog.

Example:

>cat file
line 1 (one)
line 2 (two)
line 3 (three)
>sed -e '1,2s/line/LINE/' file
LINE 1 (one)
LINE 2 (two)
line 3 (three)
>sed -e '/^line.*one/s/line/LINE/' -e '/line/d' file
LINE 1 (one)

Example: Find First Word From a List in a File
This example uses backreferences ( \1, etc.) and command grouping with curly braces.

#!/bin/sh
# Alternation with \| needs GNU sed.
X='word1\|word2\|word3\|word4\|word5'
sed -e "
/$X/!d
/$X/{
        s/\($X\).*/\1/
        s/.*\($X\)/\1/
        q
        }" "$1"

Double quotes are used so that the shell expands $X, and the "$1" at the end is the argument given to the shell script (the file name).

Note: the * operator is greedy.

Pattern matching across several lines - use N command to append the next line to a pattern space (or better use Perl for this task).

Example:

/Microsoft[ \t]*$/{
                        N
                        }
/Microsoft[ \t\n]*Windows[ \t]*$/{
                        N
                        }
s/Microsoft[ \t\n]*Windows[ \t\n]*95/Linux/g
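A stripped-down version of the same idea can be run directly from the shell. This sketch handles only the simplest case (the phrase split exactly at a line end, with no trailing blanks), and the input text is made up.

```shell
# When a line ends with "Microsoft", pull in the next line with N,
# then match across the embedded newline (\n) in the pattern space.
joined=$(printf 'I use Microsoft\nWindows daily\n' |
  sed -e '/Microsoft$/{
N
s/Microsoft\nWindows/Linux/
}')

echo "$joined"   # I use Linux daily
```

Since the newline is part of the matched text, the two input lines come out as a single line.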

Example: remove html tags (they may span several lines and they can be nested)

:top
/<.*>/{
s/<[^<>]*>//g
t top
}
/</{
        N
        b top
        }

A fine point: why didn't we replace the third line of the script with
s/<[^>]*>//g
and remove the t command that follows? Consider this sample file:
<<hello>
hello>
The desired output is empty, since everything is enclosed in angle brackets. However, the output would look like this:
hello>
since the whole first line matches the expression <[^>]*> and is removed, leaving the second line without its opening bracket. So the point is that we have set up the script to repeatedly remove the contents of the innermost matching pair of delimiters.
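To try the tag-removal script, it can be saved to a file and run with -f. The sketch below does that with a temporary file; the HTML fragment, in which one tag is split across two lines, is a made-up test input.

```shell
# Save the tag-stripping program to a temporary file (the path is arbitrary).
script=$(mktemp)
cat > "$script" <<'EOF'
:top
/<.*>/{
s/<[^<>]*>//g
t top
}
/</{
N
b top
}
EOF

# Hypothetical input: the <a ...> tag spans two lines.
stripped=$(printf '<b>bold</b> and <a\nhref="x">link</a>\n' | sed -f "$script")
rm -f "$script"

echo "$stripped"   # bold and link
```

The newline inside the split tag is removed together with the tag, so the two input lines are joined into one.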