Sed – Useful Resources

The following resources contain additional information on SED. Use them to gain more in-depth knowledge of this topic.

Useful Links on SED
GNU Sed: Stream Editor Manual by GNU
Wikipedia: Wikipedia reference for Sed

Sed – Basic Syntax

This chapter introduces the basic commands that SED supports and their command-line syntax. SED can be invoked in the following two forms:

sed [-n] [-e] 'command(s)' files
sed [-n] -f scriptfile files

The first form lets you specify the commands inline, enclosed within single quotes. The latter lets you specify a script file that contains SED commands. Both forms can be used together, and each can be repeated. SED provides various command-line options to control its behavior.

Let us see how we can specify multiple SED commands. SED provides the delete command (d) to delete certain lines. Let us delete the 1st, 2nd, and 5th lines. For the time being, ignore the details of the delete command; we will discuss it later.

First, display the file contents using the cat command.

[jerry]$ cat books.txt

On executing the above code, you get the following result:

1) A Storm of Swords, George R. R. Martin, 1216
2) The Two Towers, J. R. R. Tolkien, 352
3) The Alchemist, Paulo Coelho, 197
4) The Fellowship of the Ring, J. R. R. Tolkien, 432
5) The Pilgrimage, Paulo Coelho, 288
6) A Game of Thrones, George R. R. Martin, 864

Now instruct SED to remove only certain lines. Here, to delete three lines, we specify three separate commands, each with the -e option.

[jerry]$ sed -e '1d' -e '2d' -e '5d' books.txt

On executing the above code, you get the following result:

3) The Alchemist, Paulo Coelho, 197
4) The Fellowship of the Ring, J. R. R. Tolkien, 432
6) A Game of Thrones, George R. R. Martin, 864

Additionally, we can write multiple SED commands in a text file and provide the text file as an argument to SED. SED applies each command to the pattern buffer in turn. The following example illustrates the second form of SED.

First, create a text file containing SED commands. For easy understanding, let us use the same SED commands.
[jerry]$ echo -e "1d\n2d\n5d" > commands.txt
[jerry]$ cat commands.txt

On executing the above code, you get the following result:

1d
2d
5d

Now instruct SED to read commands from the text file. Here, we achieve the same result as in the previous example.

[jerry]$ sed -f commands.txt books.txt

On executing the above code, you get the following result:

3) The Alchemist, Paulo Coelho, 197
4) The Fellowship of the Ring, J. R. R. Tolkien, 432
6) A Game of Thrones, George R. R. Martin, 864

Standard Options

SED supports the following standard options:

-n: Suppresses the default printing of the pattern buffer. For example, the following SED command does not show any output:

[jerry]$ sed -n '' quote.txt

-e <script>: The next argument is an editing command. Here, the angle brackets denote a mandatory parameter. By repeating this option, we can specify multiple commands. Let us print each line twice:

[jerry]$ sed -e '' -e 'p' quote.txt

On executing the above code, you get the following result:

There is only one thing that makes a dream impossible to achieve: the fear of failure.
There is only one thing that makes a dream impossible to achieve: the fear of failure.
- Paulo Coelho, The Alchemist
- Paulo Coelho, The Alchemist

-f <script-file>: The next argument is a file containing editing commands. The angle brackets denote a mandatory parameter. In the following example, we specify the print command through a file:

[jerry]$ echo "p" > commands
[jerry]$ sed -n -f commands quote.txt

On executing the above code, you get the following result:

There is only one thing that makes a dream impossible to achieve: the fear of failure.
- Paulo Coelho, The Alchemist

GNU Specific Options

Let us quickly go through the GNU-specific SED options. Note that these options may not be supported by other variants of SED. We will discuss them in more detail in later sections.

-n, --quiet, --silent: Same as the standard -n option.
-e script, --expression=script: Same as the standard -e option.
-f script-file, --file=script-file: Same as the standard -f option.
--follow-symlinks: If this option is provided, SED follows symbolic links while editing files in place.
-i[SUFFIX], --in-place[=SUFFIX]: Edits the file in place. If a suffix is provided, SED keeps a backup of the original file under that suffix; otherwise it overwrites the original file.
-l N, --line-length=N: Sets the line length for the l command to N characters.
--posix: Disables all GNU extensions.
-r, --regexp-extended: Uses extended regular expressions rather than basic regular expressions.
-u, --unbuffered: SED loads a minimal amount of data from the input files and flushes the output buffers more often. This is useful for editing the output of "tail -f" when you do not want to wait for the output.
-z, --null-data: By default, SED treats the newline character as the line separator. With this option, lines are separated by NUL characters instead.
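As a minimal sketch of the two invocation forms described in this chapter, the script below mixes -f and -e in a single call and uses -n with the print command. It runs in a scratch directory created with mktemp; that directory and the shell variable names (workdir, mixed, third) are assumptions made for illustration, while books.txt and commands.txt mirror the chapter's files.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Same six-line sample file as books.txt in this chapter.
cat > books.txt <<'EOF'
1) A Storm of Swords, George R. R. Martin, 1216
2) The Two Towers, J. R. R. Tolkien, 352
3) The Alchemist, Paulo Coelho, 197
4) The Fellowship of the Ring, J. R. R. Tolkien, 432
5) The Pilgrimage, Paulo Coelho, 288
6) A Game of Thrones, George R. R. Martin, 864
EOF

# Script file holding two delete commands, one per line.
printf '1d\n2d\n' > commands.txt

# Both forms together: commands.txt deletes lines 1 and 2, -e deletes line 5.
mixed=$(sed -f commands.txt -e '5d' books.txt)

# -n suppresses default printing, so only the line selected by p appears.
third=$(sed -n '3p' books.txt)

printf '%s\n' "$mixed"
printf '%s\n' "$third"
```

The first printf shows entries 3, 4, and 6, matching the chapter's delete examples; the second shows only entry 3.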

Sed – Home

This tutorial takes you through Stream EDitor (Sed), one of the most prominent text-processing utilities on GNU/Linux. Like many other GNU/Linux utilities, it is stream-oriented and uses a simple programming language. It can solve complex text-processing tasks with a few lines of code. This easy yet powerful utility makes GNU/Linux more interesting.

Audience
If you are a software developer, a system administrator, or a GNU/Linux enthusiast, this tutorial is for you.

Prerequisites
You must have a basic understanding of the GNU/Linux operating system and shell scripting.

Sed – Regular Expressions

It is regular expressions that make SED powerful and efficient. A number of complex tasks can be solved with regular expressions, and any command-line expert knows their power. Like many other GNU/Linux utilities, SED supports regular expressions, which are often referred to as regex. This chapter describes regular expressions in detail. The chapter is divided into three sections: Standard regular expressions, POSIX classes of regular expressions, and Meta characters.

Standard Regular Expressions

Start of Line (^)
In regular expression terminology, the caret (^) symbol matches the start of a line. The following example prints all the lines that start with the pattern "The". In this chapter, books.txt contains one "title, author" entry per line.

[jerry]$ sed -n '/^The/ p' books.txt

On executing the above code, you get the following result:

The Two Towers, J. R. R. Tolkien
The Alchemist, Paulo Coelho
The Fellowship of the Ring, J. R. R. Tolkien
The Pilgrimage, Paulo Coelho

End of Line ($)
The end of a line is represented by the dollar ($) symbol. The following example prints the lines that end with "Coelho".

[jerry]$ sed -n '/Coelho$/ p' books.txt

On executing the above code, you get the following result:

The Alchemist, Paulo Coelho
The Pilgrimage, Paulo Coelho

Single Character (.)
The dot (.) matches any single character except the end-of-line character. The following example prints all three-letter words that end with the character "t".

[jerry]$ echo -e "cat\nbat\nrat\nmat\nbatting\nrats\nmats" | sed -n '/^..t$/p'

On executing the above code, you get the following result:

cat
bat
rat
mat

Match Character Set ([])
In regular expression terminology, a character set is represented by square brackets ([]). It matches exactly one out of several characters. The following example matches the patterns "Call" and "Tall" but not "Ball".
[jerry]$ echo -e "Call\nTall\nBall" | sed -n '/[CT]all/ p'

On executing the above code, you get the following result:

Call
Tall

Exclusive Set ([^])
In an exclusive set, the caret negates the set of characters in the square brackets. The following example prints only "Ball".

[jerry]$ echo -e "Call\nTall\nBall" | sed -n '/[^CT]all/ p'

On executing the above code, you get the following result:

Ball

Character Range ([-])
When a character range is provided, the regular expression matches any character within the range specified in square brackets. The following example matches "Call" and "Tall" but not "Ball".

[jerry]$ echo -e "Call\nTall\nBall" | sed -n '/[C-Z]all/ p'

On executing the above code, you get the following result:

Call
Tall

Now let us modify the range to "A-P" and observe the result.

[jerry]$ echo -e "Call\nTall\nBall" | sed -n '/[A-P]all/ p'

On executing the above code, you get the following result:

Call
Ball

Zero or One Occurrence (\?)
In GNU SED's basic regular expressions, the escaped question mark (\?) matches zero or one occurrence of the preceding character. The following example matches "Behaviour" as well as "Behavior". Here, we made "u" an optional character by using "\?".

[jerry]$ echo -e "Behaviour\nBehavior" | sed -n '/Behaviou\?r/ p'

On executing the above code, you get the following result:

Behaviour
Behavior

One or More Occurrences (\+)
In GNU SED's basic regular expressions, the escaped plus symbol (\+) matches one or more occurrences of the preceding character. The following example matches one or more occurrences of "2".

[jerry]$ echo -e "111\n22\n123\n234\n456\n222" | sed -n '/2\+/ p'

On executing the above code, you get the following result:

22
123
234
222

Zero or More Occurrences (*)
The asterisk (*) matches zero or more occurrences of the preceding character. The following example matches "ca", "cat", "catt", and so on.

[jerry]$ echo -e "ca\ncat" | sed -n '/cat*/ p'

On executing the above code, you get the following result:

ca
cat

Exactly N Occurrences {n}
{n} matches exactly "n" occurrences of the preceding character.
The following example prints only three-digit numbers. Before that, you need to create the following file, which contains only numbers.

[jerry]$ cat numbers.txt

On executing the above code, you get the following result:

1
10
100
1000
10000
100000
1000000
10000000
100000000
1000000000

Let us write the SED expression.

[jerry]$ sed -n '/^[0-9]\{3\}$/ p' numbers.txt

On executing the above code, you get the following result:

100

Note that the pair of curly braces is escaped by the "\" character.

At Least N Occurrences {n,}
{n,} matches at least "n" occurrences of the preceding character. The following example prints all the numbers that have five or more digits.

[jerry]$ sed -n '/^[0-9]\{5,\}$/ p' numbers.txt

On executing the above code, you get the following result:

10000
100000
1000000
10000000
100000000
1000000000

M to N Occurrences {m,n}
{m,n} matches at least "m" and at most "n" occurrences of the preceding character. The following example prints all the numbers that have at least five but not more than eight digits.

[jerry]$ sed -n '/^[0-9]\{5,8\}$/ p' numbers.txt

On executing the above code, you get the following result:

10000
100000
1000000
10000000

Pipe (|)
In SED, the pipe character behaves like a logical OR operation. It matches items from either side of the pipe. The following example matches either "str1" or "str3".

[jerry]$ echo -e "str1\nstr2\nstr3\nstr4" | sed -n '/str\(1\|3\)/ p'

On executing the above code, you get the following result:

str1
str3

Note that the parentheses and the pipe (|) are each escaped by the "\" character.

Escaping Characters
Certain characters have a special meaning. For example, a newline is represented by "\n", a carriage return is represented by "\r", and so on. To use these characters in a regular ASCII context, we have to escape them using the backslash (\) character. This section illustrates the escaping of special characters.

Escaping "\"
The following example matches the pattern "\".
[jerry]$ echo 'str1\str2' | sed -n '/\\/ p'

On executing the above code, you get the following result:

str1\str2

Escaping "\n"
The following example matches the literal sequence "\n" (a backslash followed by the letter n) rather than an actual newline.

[jerry]$ echo 'str1\nstr2' | sed -n '/\\n/ p'

On executing the above code, you get the following result:

str1\nstr2

Escaping "\r"
The following example matches the literal sequence "\r" in the same way.

[jerry]$ echo 'str1\rstr2' | sed -n '/\\r/ p'
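The escaping rules for intervals and alternation can be compared side by side. The sketch below, which assumes GNU SED (for the -r option and for \| in basic syntax), runs the same match once with basic regular expressions and once with extended regular expressions. The variable names are illustrative only.

```shell
#!/bin/sh
# Sample input: a shortened version of numbers.txt from this chapter.
numbers=$(printf '1\n10\n100\n1000\n10000\n')

# Basic regular expressions (the default): interval braces are escaped.
bre=$(printf '%s\n' "$numbers" | sed -n '/^[0-9]\{3\}$/p')

# Extended regular expressions (GNU -r): the same braces are written bare.
ere=$(printf '%s\n' "$numbers" | sed -r -n '/^[0-9]{3}$/p')

# Alternation: \( \| \) in GNU basic syntax, ( | ) in extended syntax.
alt_bre=$(printf 'str1\nstr2\nstr3\n' | sed -n '/str\(1\|3\)/p')
alt_ere=$(printf 'str1\nstr2\nstr3\n' | sed -r -n '/str(1|3)/p')

printf '%s\n' "$bre" "$ere" "$alt_bre" "$alt_ere"
```

Both interval forms select only 100, and both alternation forms select str1 and str3. Extended syntax mainly saves backslashes when a pattern uses many of these operators.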

Sed – Overview

The acronym SED stands for Stream EDitor. It is a simple yet powerful utility that parses text and transforms it seamlessly. SED was developed during 1973–74 by Lee E. McMahon of Bell Labs. Today, it runs on all major operating systems.

McMahon wrote a general-purpose line-oriented editor, which eventually became SED. SED borrowed its syntax and many useful features from the ed editor. Since its beginning, it has supported regular expressions. SED accepts input from files, from pipes, and from the standard input stream.

The version of SED maintained by the Free Software Foundation (FSF) ships with GNU/Linux distributions and is often referred to as GNU SED. To a novice user, the syntax of SED may look cryptic. However, once you are familiar with it, you can solve many complex tasks with a few lines of SED script. This is the beauty of SED.

Typical Uses of SED
SED can be used in many different ways, such as:
Text substitution,
Selective printing of text files,
In-place editing of text files,
Non-interactive editing of text files, and many more.
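The typical uses listed above can be tried out with a few short commands. This is only a sketch: the file notes.txt, its contents, and the variable names are hypothetical, and the in-place step assumes an implementation that supports the -i option with an attached backup suffix, as GNU SED does.

```shell
#!/bin/sh
# Hypothetical sample file in a throwaway directory.
workdir=$(mktemp -d)
notes="$workdir/notes.txt"
printf 'sed reads text\nsed edits text\nawk is different\n' > "$notes"

# Text substitution on piped input: replace the first "text" on each line.
substituted=$(printf 'plain text\n' | sed 's/text/stream/')

# Selective printing: -n plus p prints only the lines that match.
selected=$(sed -n '/^sed/p' "$notes")

# In-place, non-interactive editing; the original is kept as notes.txt.bak.
sed -i.bak 's/awk/AWK/' "$notes"

printf '%s\n' "$substituted" "$selected"
cat "$notes"
```

After the last command, notes.txt contains the edited third line while notes.txt.bak still holds the original text.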

Sed – Workflow

In this chapter, we will explore exactly how SED works. To become an expert SED user, one needs to know its internals. SED follows a simple workflow: Read, Execute, and Display. The following diagram depicts the workflow.

[Diagram: SED workflow of Read, Execute, and Display]

Read: SED reads a line from the input stream (file, pipe, or stdin) and stores it in its internal buffer, called the pattern buffer.
Execute: All SED commands are applied sequentially to the pattern buffer. By default, SED commands are applied to all lines (globally) unless line addressing is specified.
Display: SED sends the (modified) contents to the output stream. After the data is sent, the pattern buffer is empty.

The above process repeats until the input is exhausted.

Points to Note
The pattern buffer is a private, in-memory, volatile storage area used by SED.
By default, all SED commands are applied to the pattern buffer, hence the input file remains unchanged. GNU SED provides a way to modify the input file in place. We will explore it in later sections.
There is another memory area called the hold buffer, which is also a private, in-memory, volatile storage area. Data can be stored in the hold buffer for later retrieval. At the end of each cycle, SED clears the pattern buffer, but the contents of the hold buffer persist between cycles. SED commands cannot be executed directly on the hold buffer, so SED allows data to be moved between the hold buffer and the pattern buffer.
Initially, both the pattern buffer and the hold buffer are empty.
If no input files are provided, SED accepts input from the standard input stream (stdin).
If no address range is provided, SED operates on every line by default.

Examples
Let us create a text file quote.txt containing a quote from the famous author Paulo Coelho.

[jerry]$ vi quote.txt

There is only one thing that makes a dream impossible to achieve: the fear of failure.
- Paulo Coelho, The Alchemist

To understand the workflow of SED, let us display the contents of the file quote.txt using SED. This example simulates the cat command.

[jerry]$ sed '' quote.txt

When the above code is executed, it produces the following result.

There is only one thing that makes a dream impossible to achieve: the fear of failure.
- Paulo Coelho, The Alchemist

In the above example, quote.txt is the input file name, and the pair of single quotes before it encloses the (empty) SED command. Let us demystify this operation.

First, SED reads a line from the input file quote.txt and stores it in its pattern buffer. Then it applies the SED commands to the pattern buffer. In our case, there are no SED commands, so no operation is performed on the pattern buffer. Finally, it prints the contents of the pattern buffer on the standard output and clears the buffer. Isn't it simple?

In the following example, SED accepts input from the standard input stream.

[jerry]$ sed ''

When the above code is executed, it produces the following result.

There is only one thing that makes a dream impossible to achieve: the fear of failure.
There is only one thing that makes a dream impossible to achieve: the fear of failure.

Here, the first line is entered through the keyboard and the second is the output generated by SED. To exit from the SED session, press Ctrl-D (^D).
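The points about the two buffers can be checked with a small experiment. In this sketch, the h command (copy the pattern buffer to the hold buffer) and the G command (append the hold buffer to the pattern buffer) are standard SED commands that this chapter has not introduced yet; the scratch directory and variable names are assumptions for illustration.

```shell
#!/bin/sh
# Recreate quote.txt in a throwaway directory.
workdir=$(mktemp -d)
quote="$workdir/quote.txt"
printf '%s\n' \
    'There is only one thing that makes a dream impossible to achieve: the fear of failure.' \
    '- Paulo Coelho, The Alchemist' > "$quote"
before=$(cat "$quote")

# Empty script: each line is read, nothing is executed, the buffer is printed.
plain=$(sed '' "$quote")

# Cycle 1 copies line 1 into the hold buffer (h). The pattern buffer is then
# cleared, but in cycle 2 the held line is still there to be appended (G).
held=$(sed -n '1h;2{G;p;}' "$quote")

# The input file itself is never modified by these commands.
after=$(cat "$quote")

printf '%s\n' "$held"
```

The printed result is the attribution line followed by the quote, which shows that data placed in the hold buffer during the first cycle survives into the second.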

Sed – Loops

Like other programming languages, SED provides looping and branching facilities to control the flow of execution. In this chapter, we explore how to use loops and branches in SED.

A loop in SED works like a goto statement. SED can jump to the line marked by a label and continue executing the remaining commands. In SED, we can define a label as follows:

:label
:start
:end
:up

In the above example, the name after the colon (:) is the label name.

To jump to a specific label, we use the b command followed by the label name. If the label name is omitted, SED jumps to the end of the script.

Let us write a simple SED script to understand loops and branches. In our books.txt file, there are several entries of book titles and their authors, stored on alternating lines. The following example combines a book title and its author name on one line, separated by a comma. Then it searches for the pattern "Paulo". If the pattern matches, it prints a hyphen (-) in front of the line; otherwise it jumps to the Print label, which prints the line.

[jerry]$ sed -n '
h;n;H;x
s/\n/, /
/Paulo/!b Print
s/^/- /
:Print
p' books.txt

On executing the above code, you get the following result:

A Storm of Swords, George R. R. Martin
The Two Towers, J. R. R. Tolkien
- The Alchemist, Paulo Coelho
The Fellowship of the Ring, J. R. R. Tolkien
- The Pilgrimage, Paulo Coelho
A Game of Thrones, George R. R. Martin

At first glance, the above script may look cryptic. Let us demystify it.

The first two commands, h;n;H;x and s/\n/, /, combine the book title and its author, separated by a comma (,).
The third command jumps to the label Print only when the pattern does not match; otherwise the substitution is performed by the fourth command.
:Print is just a label name and, as you already know, p is the print command.

To improve readability, each SED command is placed on a separate line.
However, one can choose to place all the commands on one line as follows. Note that ending a label name with a semicolon is a GNU extension; other SED implementations may require a newline after the label.

[jerry]$ sed -n 'h;n;H;x;s/\n/, /;/Paulo/!b Print; s/^/- /; :Print;p' books.txt

On executing the above code, you get the following result:

A Storm of Swords, George R. R. Martin
The Two Towers, J. R. R. Tolkien
- The Alchemist, Paulo Coelho
The Fellowship of the Ring, J. R. R. Tolkien
- The Pilgrimage, Paulo Coelho
A Game of Thrones, George R. R. Martin
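The script in this chapter only branches forward. The sketch below first reruns it on a shortened, hypothetical books.txt (two titles with their authors on alternating lines) and then shows a backward branch that forms a real loop. The loop relies on the N command and the $ (last line) address, standard SED features that this chapter does not cover, so treat it as an illustration under that assumption.

```shell
#!/bin/sh
# Shortened sample data: title and author on alternating lines.
workdir=$(mktemp -d)
books="$workdir/books.txt"
printf '%s\n' 'A Storm of Swords' 'George R. R. Martin' \
              'The Alchemist' 'Paulo Coelho' > "$books"

# Forward branch, as in the chapter: skip the hyphen unless "Paulo" matches.
joined=$(sed -n '
h;n;H;x
s/\n/, /
/Paulo/!b Print
s/^/- /
:Print
p' "$books")

# Backward branch: append the next line (N) and jump back to label "a"
# until the last line ($) is reached, then replace every embedded newline.
looped=$(printf 'one\ntwo\nthree\n' | sed ':a
N
$!b a
s/\n/, /g')

printf '%s\n' "$joined" "$looped"
```

The first result pairs each title with its author and marks the Paulo Coelho entry with a hyphen; the second collapses the three input lines into "one, two, three". Each label sits on its own line here, which avoids depending on the semicolon-after-label behavior.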

Sed – Environment

This chapter describes how to set up the SED environment on your GNU/Linux system.

Installation Using Package Manager

Generally, SED is available by default on most GNU/Linux distributions. Use the which command to check whether it is present on your system. If it is not, install SED on Debian-based GNU/Linux using the apt package manager as follows:

[jerry]$ sudo apt-get install sed

After installation, ensure that SED is accessible from the command line.

[jerry]$ sed --version

On executing the above code, you get the following result:

sed (GNU sed) 4.2.2
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later .
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Jay Fenlason, Tom Lord, Ken Pizzini, and Paolo Bonzini.
GNU sed home page: .
General help using GNU software: .
E-mail bug reports to: .
Be sure to include the word "sed" somewhere in the "Subject:" field.

Similarly, to install SED on RPM-based GNU/Linux, use the yum package manager as follows:

[root]# yum -y install sed

After installation, ensure that SED is accessible from the command line.

[root]# sed --version

On executing the above code, you get the following result:

GNU sed version 4.2.1
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE, to the extent permitted by law.
GNU sed home page: .
General help using GNU software: .
E-mail bug reports to: .
Be sure to include the word "sed" somewhere in the "Subject:" field.

Installation from Source Code

As GNU SED is a part of the GNU project, its source code is available for free download. We have already seen how to install SED using a package manager. Let us now understand how to install SED from its source code.
The following installation procedure applies to most GNU software, and to many other freely available programs as well. Here are the installation steps:

Download the source code from an authentic place. The command-line utility wget serves this purpose.

[jerry]$ wget ftp://ftp.gnu.org/gnu/sed/sed-4.2.2.tar.bz2

Decompress and extract the downloaded source code.

[jerry]$ tar xvf sed-4.2.2.tar.bz2

Change into the extracted directory and run configure.

[jerry]$ cd sed-4.2.2
[jerry]$ ./configure

Upon successful completion, configure generates a Makefile. To compile the source code, issue the make command.

[jerry]$ make

You can run the test suite to ensure the build is clean. This is an optional step.

[jerry]$ make check

Finally, install the SED utility. Make sure you have superuser privileges.

[jerry]$ sudo make install

That is it! You have successfully compiled and installed SED. Verify it by executing the sed command as follows:

[jerry]$ sed --version

On executing the above code, you get the following result:

sed (GNU sed) 4.2.2
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later .
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Jay Fenlason, Tom Lord, Ken Pizzini, and Paolo Bonzini.
GNU sed home page: .
General help using GNU software: .
E-mail bug reports to: .
Be sure to include the word "sed" somewhere in the "Subject:" field.
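Whichever installation route is used, the final check can be scripted. The sketch below uses the POSIX shell built-in command -v instead of which; this substitution and the variable name sed_path are choices made for the example, not part of the original steps.

```shell
#!/bin/sh
# Look up sed on the PATH; command -v prints its location if it is found.
if sed_path=$(command -v sed); then
    echo "sed found at: $sed_path"
    # --version is a GNU option; other implementations may reject it,
    # so errors are discarded and only the first line is shown.
    sed --version 2>/dev/null | head -n 1
else
    echo "sed is not installed" >&2
fi
```

On a system with GNU SED, the second line of output reports the installed version; on other implementations only the location is printed.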