jsed

NAME

jsed - an implementation of sed, the stream editor, in JavaScript

SYNOPSIS

Fill the script and input text areas, then click the run button to launch sed.

DESCRIPTION

sed copies the named files (standard input default) to the standard output, edited according to a script of commands.

OPTIONS

-n flag
Suppress the default output that normally takes place at the end of each cycle through the script. If the first two characters of the script are #n then it is equivalent to the -n option.
POSIX mode
Disallow non-POSIX extensions if checked.
jumpmax=number
Stop execution if more than number jumps are taken (commands {, b, t or D) when processing the same input line.

MODE OF OPERATION

sed maintains two data buffers: the pattern space and the hold space. Normally sed executes the following cycle on each line of input: an input line is copied into the pattern space (less its terminating newline); sed then tries to apply each command starting at the beginning of the script; finally (unless the -n option was given) the pattern space is written to the standard output (with a final newline).

The hold space is initially empty, and is kept untouched by the sed cycle. The hold space can be copied or appended to or from, or swapped with, the pattern space using the functions g, G, h, H and x.
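The cycle described above can be sketched in JavaScript as follows. This is only an illustrative model, not jsed's actual code: runSed, the state object and the command functions are hypothetical names introduced for this example, and commands are modelled as plain functions rather than parsed script text.

```javascript
// Minimal model of the sed cycle (hypothetical names, not jsed's real code).
// A command is a function that mutates the state; returning "delete"
// ends the cycle without the default output, like the d function.
function runSed(input, commands, quiet = false) {
  const lines = input.split("\n");
  const state = { pattern: "", hold: "", lineNumber: 0, isLast: false };
  const output = [];
  for (const line of lines) {
    state.lineNumber += 1;
    state.isLast = state.lineNumber === lines.length;
    state.pattern = line; // the input line, less its terminating newline
    let deleted = false;
    for (const command of commands) {
      if (command(state) === "delete") {
        deleted = true;
        break;
      }
    }
    // Default output at the end of the cycle, unless -n was requested.
    if (!deleted && !quiet) output.push(state.pattern);
  }
  return output.join("\n");
}

// Models of "1!G", "h" and "$!d".
const appendHoldExceptFirst = (s) => {
  if (s.lineNumber !== 1) s.pattern += "\n" + s.hold;
};
const copyToHold = (s) => {
  s.hold = s.pattern;
};
const deleteUnlessLast = (s) => (s.isLast ? undefined : "delete");

// Equivalent of the script "1!G;h;$!d", which reverses the order of lines.
const reversed = runSed("one\ntwo\nthree", [
  appendHoldExceptFirst,
  copyToHold,
  deleteUnlessLast,
]);
console.log(reversed);
```

The hold space survives from one cycle to the next, which is what lets the script accumulate the lines already seen.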

Command syntax

The script consists of commands, of the following form:
[ address [ ,address ] ] [!] function [ arguments ]

Commands can be preceded by white space and must be followed by a newline or (for most commands) a semicolon. Commands whose function is one of a\, b, c\, i\, r, t, w, : or # can only be followed by a newline, because the argument to these functions extends over the remainder of the script line, including any semicolon present on the line.

Addresses

Command functions are actually executed only if selected by the addresses (based on the input line number and the contents of the pattern space at the time the command is about to be executed).

A command with no addresses is always selected.
A command with one address selects every pattern space that matches that address.
A command with two addresses selects the inclusive range from the first pattern space that matches the first address through the next pattern space that matches the second. (If the second address is a number less than or equal to the line number first selected, only one line is selected.) Once the second address is matched sed starts looking for the first one again; thus, any number of these ranges will be matched.
The negation operator ! can prefix a command to apply it to every line not selected by the address(es).
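The two-address rule can be illustrated with a small JavaScript sketch. It is not taken from jsed: selectRange and matchesAddress are hypothetical helpers, and an address is modelled either as a line number or as a JavaScript RegExp, which is an assumption made for this example only.

```javascript
// An address is modelled as a line number or a JavaScript RegExp.
function matchesAddress(address, line, lineNumber) {
  return typeof address === "number"
    ? address === lineNumber
    : address.test(line);
}

// Return the line numbers selected by the range "first,second".
function selectRange(lines, first, second) {
  const selected = [];
  let inRange = false;
  lines.forEach((line, index) => {
    const lineNumber = index + 1;
    if (!inRange) {
      if (matchesAddress(first, line, lineNumber)) {
        selected.push(lineNumber);
        // A numeric second address at or before this line selects one line only.
        inRange = !(typeof second === "number" && second <= lineNumber);
      }
    } else {
      selected.push(lineNumber);
      // The range ends on the line matching the second address (inclusive),
      // after which the first address is looked for again.
      if (matchesAddress(second, line, lineNumber)) inRange = false;
    }
  });
  return selected;
}

const sample = ["a", "start", "b", "end", "c", "start", "d"];
console.log(selectRange(sample, /start/, /end/)); // lines 2-4, then 6-7
console.log(selectRange(sample, /start/, 1));     // lines 2 and 6 only
```

The second range in the sample never meets its end address, so it runs to the last line of input.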

The following address types are supported:
number
match only the specified line number. Line numbers start at 1 and run cumulatively across input files.
$
match only the last input line.
/regexp/
match when the pattern space matches the regular expression regexp.
\%regexp%
Same as above, but with % as the delimiter instead of / (the % may be replaced by any other single character).

Functions

In the following list of functions, the maximum number of addresses permitted for each function is indicated in parentheses.

An argument denoted text consists of one or more lines, with all but the last ending with \. To insert a \ in the text, put two backslashes: \\. Other backslashes in the text argument are ignored.

An argument denoted file or label may be preceded by white space, and must be last on the command line.
a\
text
(1) Append. Place text on output at the end of the cycle.
b label
(2) Branch to the : command bearing the label. If no label is given, branch to the end of the script.
c\
text
(2) Change. Delete the pattern space. With 0 or 1 address, or at the end of a 2-address range, place text on the output. Start the next cycle.
d
(2) Delete the pattern space. Start the next cycle.
D
(2) Delete the first line of the pattern space (all chars up to the first newline) or the entire pattern space if it does not contain any newline. Start the next cycle.
g
(2) Replace the contents of the pattern space with the contents of the hold space.
G
(2) Append to the pattern space a newline followed by the contents of the hold space.
h
(2) Copy the pattern space into the hold space.
H
(2) Append to the hold space a newline followed by the contents of the pattern space.
i\
text
(1) Insert. Write text on the standard output.
l
(2) List. Write the pattern space to standard output in a visually unambiguous form. Some non-printable characters are written as \\, \a, \b, \f, \l, \r, \t or \v (same convention as in the C language); non-printing characters not in that list are written as a backslash followed by a three-digit octal number representing the ASCII number. Long lines are folded by inserting a backslash followed by a newline. The end of the pattern space is indicated by a final $.
n
(2) Write the pattern space to standard output (unless option -n is given). Read the next line of input into the pattern space. If no line could be read (the last line was already reached), quit.
N
(2) Append to the pattern space a newline and the next line of input. The current line number changes. If no line could be read (the last line was already reached), quit.
p
(2) Print. Write the pattern space to standard output.
P
(2) Write the first line of the pattern space (all chars up to and including the first newline) to standard output.
q
(1) Quit. Branch to the end of the script. Do not start a new cycle.
r rfile
(1) Schedule appending the contents of file rfile before reading the next input line.
s/regexp/replacement/flags
(2) Substitute the replacement for instances of the regular expression regexp in the current pattern space. Any character except \ or newline may be used as delimiter instead of /. The following character sequences have special meaning in the replacement string:
&
insert the string matched by the entire regular expression
\i
insert the string matched by the ith \(...\) subexpression (where i is a digit from 1 to 9).
\c
insert a verbatim character c, where c is either the delimiter, or any of \, & or newline.

The flags are zero or more of the following:
g
Global. Substitute for all nonoverlapping instances of the string rather than just the first one.
p
Print the pattern space if a replacement was made.
w wfile
Write. If a replacement was made, append the pattern space to the file wfile, as with the w command.
N
(a positive decimal number) Substitute only the Nth instance of the string.
Flags N and g are incompatible.

t label
(2) Branch-if-test. Branch to the : command with the given label if any replacements have been made (using the s command) since the most recent read of an input line or execution of a t command. If no label is given, branch to the end of the script.
w wfile
(2) Write. Append the pattern space to file wfile. The wfile arguments of all w commands are opened in write mode at the beginning, even if the w commands are never executed.
x
(2) Exchange the contents of the pattern and hold spaces.
y/string1/string2/
(2) Translate. Replace in the pattern space each occurrence of a character in string1 with the corresponding character in string2. Any character except \ or newline may be used as delimiter instead of /. Both strings can contain \n (standing for a newline), and \d (representing a verbatim delimiter character d). The lengths of these strings must be equal.
: label
(0) This command does nothing but hold a label for b and t commands to branch to.
=
(1) Write to the standard output the current line number in decimal, followed by a newline.
{ commands... }
(2) Execute the following commands up to a matching } only when the current line matches the address or address range given. Command groups can nest. sed does not prevent branching in or out of such groups.
#comment
(0) Everything after a # character up to the next newline is ignored (with the exception that the comment #n at the very beginning of the script activates the -n option).
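
As an illustration of the s function, the replacement rules (&, \i and \c) and the g and N flags can be sketched in JavaScript. This is a simplified model under stated assumptions, not jsed's implementation: expandReplacement and substitute are hypothetical names, and the regexp is given directly in JavaScript syntax rather than as a sed basic regular expression.

```javascript
// Expand a sed replacement string for one match (an array as returned by
// RegExp.prototype.exec or String.prototype.matchAll).
function expandReplacement(replacement, match) {
  let out = "";
  for (let i = 0; i < replacement.length; i += 1) {
    const ch = replacement[i];
    if (ch === "&") {
      out += match[0]; // the string matched by the entire regexp
    } else if (ch === "\\" && i + 1 < replacement.length) {
      i += 1;
      const next = replacement[i];
      if (next >= "1" && next <= "9") {
        out += match[Number(next)] || ""; // the ith subexpression
      } else {
        out += next; // a verbatim character such as \& or \\
      }
    } else {
      out += ch;
    }
  }
  return out;
}

// flags.global models the g flag, flags.nth models the N flag.
// source is a JavaScript regexp source (an assumption of this sketch).
function substitute(patternSpace, source, replacement, flags = {}) {
  if (flags.global && flags.nth) {
    throw new Error("flags N and g are incompatible");
  }
  const target = flags.nth || 1;
  let count = 0;
  let last = 0;
  let result = "";
  for (const match of patternSpace.matchAll(new RegExp(source, "gs"))) {
    count += 1;
    if (flags.global || count === target) {
      result += patternSpace.slice(last, match.index);
      result += expandReplacement(replacement, match);
      last = match.index + match[0].length;
      if (!flags.global) break;
    }
  }
  return result + patternSpace.slice(last);
}

console.log(substitute("hello world", "o", "0"));                   // first instance
console.log(substitute("hello world", "o", "0", { global: true })); // all instances
console.log(substitute("hello world", "o", "0", { nth: 2 }));       // second instance
console.log(substitute("John Smith", "(\\w+) (\\w+)", "\\2, \\1 [&]"));
```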

Regular expressions

sed implements basic regular expressions, which consist of the elements described below. A concatenation of regexp elements matches the concatenation of what each element matches.
^
at the beginning of the regexp, matches the beginning of the pattern space. Matches a verbatim ^ character otherwise.
$
at the end of the regexp, matches the end of the pattern space. Matches a verbatim $ character otherwise.
.
matches any single character (including newline).
\n
matches a newline.
\i
matches a copy of the substring matched by the ith subexpression (i is a digit between 1 and 9), i.e. the subexpression starting at the ith opening \( from the left; the matching \) must be on the left of the \i backreference. If the subexpression matches a null string, then \i always matches; if the subexpression does not match, \i does not match either.
\c
matches a verbatim character c, where c is the current delimiter, or any of \, ^, $, ., [ or *.
regexp1\|regexp2
(extension) matches either regexp1 or regexp2.
[bracket-expression]
matches any character specified by the bracket-expression. This can include:
x
character x.
x-y
characters between x and y inclusively (in the ASCII order).
[:name:]
all characters of the specified character class, where name is one of alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit.
\n
(extension) the newline character.

As an example, [a-cx[:space:]] matches any of the following characters: a, b, c, x, tab, space.
To include a ] put it at the first character of the bracket-expression; to include a [ put it last.
[^bracket-expression]
matches any character not among those specified by the bracket-expression.
\(regexp-elements\)
group the regexp-elements between matching \( and \) as a subexpression. The subexpression matches if the concatenation of the regexp-elements matches.
Elements above can be repeated when followed by one of the following suffixes:
*
match 0 or more times.
\{n\}
match exactly n times.
\{n,\}
match n or more times.
\{n,m\}
match between n and m times inclusively.
\?
(extension) a synonym of \{0,1\}.
\+
(extension) a synonym of \{1,\}.
If sed encounters an empty regexp, the last previously used regexp is used instead.
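
Since the basic regular expression syntax above differs from JavaScript's mainly in which characters need a backslash, the correspondence can be sketched as a small translator. This is an assumption-laden illustration, not how jsed necessarily works: breToJs is a hypothetical function, POSIX character classes such as [:alpha:] are not handled, and only the "^ first, $ last, * first" literal rules described above are modelled.

```javascript
// Characters that are operators only when backslashed in a BRE,
// and only when NOT backslashed in JavaScript.
const BRE_OPERATORS = "(){}|+?";

function breToJs(bre) {
  let out = "";
  for (let i = 0; i < bre.length; i += 1) {
    const ch = bre[i];
    if (ch === "\\" && i + 1 < bre.length) {
      i += 1;
      const next = bre[i];
      // \( \) \{ \} \| \+ \? lose their backslash; \n, \1, \. keep it.
      out += BRE_OPERATORS.includes(next) ? next : "\\" + next;
    } else if (ch === "[") {
      // Copy a bracket expression through; a leading ] is a literal.
      let j = i + 1;
      let cls = "[";
      if (bre[j] === "^") { cls += "^"; j += 1; }
      if (bre[j] === "]") { cls += "\\]"; j += 1; }
      while (j < bre.length && bre[j] !== "]") { cls += bre[j]; j += 1; }
      out += cls + "]";
      i = j;
    } else if (
      BRE_OPERATORS.includes(ch) ||
      (ch === "^" && i !== 0) ||
      (ch === "$" && i !== bre.length - 1) ||
      (ch === "*" && i === 0)
    ) {
      out += "\\" + ch; // a verbatim character in a BRE
    } else {
      out += ch;
    }
  }
  // The s (dotAll) flag makes . match a newline, as in sed.
  return new RegExp(out, "s");
}

console.log(breToJs("^\\(ab\\)\\{2\\}$").source);
console.log(breToJs("a+b").test("a+b"));
```

In this sketch a + or ? without a backslash is treated as a verbatim character, which is the main trap when moving between the two syntaxes.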

PORTABILITY

This implementation is based closely on the POSIX specification. The following extensions are allowed only when not in POSIX mode.

Extensions

non-printable characters
In regexps, the text arguments of the a\, c\, i\ commands, and string arguments of s and y commands, some of the following sequences are recognised:
\a, \b, \f, \n, \r, \t, \v
stand for the corresponding verbatim characters.
\xXX
stands for the verbatim character whose hexadecimal ASCII number is 0xXX.

regexp extensions
\+, \?, \| and \n in bracket expressions are supported.
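
The escape sequences listed above could be decoded along the following lines. This is a sketch only: decodeEscapes is a hypothetical helper, and leaving any other backslash sequence untouched is a choice made for this example, not documented jsed behaviour.

```javascript
// Verbatim characters for the single-letter escapes.
const SIMPLE_ESCAPES = {
  a: "\x07", b: "\b", f: "\f", n: "\n", r: "\r", t: "\t", v: "\v",
};

function decodeEscapes(text) {
  // Consume every backslash together with what follows it, so that an
  // escaped backslash is never mistaken for the start of another escape.
  return text.replace(/\\(x[0-9A-Fa-f]{2}|.)/gs, (whole, body) => {
    if (body.length === 3) {
      // \xXX: the character whose hexadecimal number is 0xXX
      return String.fromCharCode(parseInt(body.slice(1), 16));
    }
    if (Object.prototype.hasOwnProperty.call(SIMPLE_ESCAPES, body)) {
      return SIMPLE_ESCAPES[body];
    }
    return whole; // any other sequence is left as it is
  });
}

console.log(JSON.stringify(decodeEscapes("a\\tb")));
console.log(decodeEscapes("\\x41"));
```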

Anchoring

Subexpression anchoring is not implemented. For example /\(^a\)/ is a synonym of /\(\^a\)/, not of /^\(a\)/.

Locale

Collation-related bracket expressions such as [:digit:], [=a=] and [.[.] are only recognised in the POSIX locale.

BUGS

This implementation relies internally on JavaScript regular expressions, which do not strictly implement the leftmost, longest matching rule mandated by POSIX. Instead, greedy matching is implemented, where each part of the regular expression is tried from left to right for the longest match. As an example:
echo "aaabaaa" | sed 's/a*\(a*\)b\1/<&>/'

outputs `<aaab>aaa' in this implementation (as in many historical implementations), instead of `<aaabaaa>' (mandated by POSIX).
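
The same behaviour can be observed directly in JavaScript. The snippet below is only a demonstration of the underlying regular expression engine, written in JavaScript regexp syntax; the variable name greedy is arbitrary.

```javascript
// The first a* greedily takes "aaa", so the group and the
// backreference \1 both match the empty string.
const greedy = "aaabaaa".replace(/a*(a*)b\1/, "<$&>");
console.log(greedy); // <aaab>aaa
```

A POSIX-conforming matcher would instead let the first a* match nothing, so that the group captures aaa and the overall match is the longest possible one.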

Extensions are not implemented in a consistent manner.

Commands r and w are fake, as well as the wfile argument to the s/// command.

AUTHORS

Some parts of jsed were re-implemented from csed (also known as cheap-sed), itself based on the original sed 1.3 written by Eric S. Raymond ages ago.

Support for the HTML navigator was taken from jslint by Douglas Crockford.

COPYRIGHT

Copyright (C) 2003-2005 Laurent Vogel

This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

REPORTING BUGS

Report bugs to Laurent Vogel <lvl2@club-internet.fr>.
