tr (translate, substitute, or map one set of characters to another)

Posted by rose1 on 2020-07-30 15:12

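On Unix-like operating systems, the tr command translates (substitutes or maps) one set of characters to another. This document is presented as describing the GNU/Linux version of tr; note that the synopsis and options shown below follow the BSD manual page, and GNU tr does not provide the -u option.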

Contents

1 tr operating environment

2 tr description

3 tr syntax

4 tr examples

tr operating environment

Linux

tr description

The tr utility copies the standard input to the standard output with substitution or deletion of selected characters.

tr syntax

tr [-Ccsu] string1 string2

In this form, the characters in the string string1 are translated into the characters in string2 where the first character in string1 is translated into the first character in string2 and so on. If string1 is longer than string2, the last character found in string2 is duplicated until string1 is exhausted.
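A minimal sketch of the positional mapping described above. The sample strings and the variable names `mapped` and `padded` are illustrative assumptions. The second command relies on the padding rule stated in the text, which is the behavior of BSD and GNU tr; other implementations may differ.

```shell
# Positional mapping: e -> i, l -> p.
mapped=$(printf 'hello\n' | tr 'el' 'ip')
echo "$mapped"

# string1 has five characters and string2 only two, so the final 'y'
# is reused for c, d and e.
padded=$(printf 'abcde\n' | tr 'abcde' 'xy')
echo "$padded"
```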

tr [-Ccu] -d string1

In this form, the characters in string1 are deleted from the input.
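As a small illustration of the deletion form, the following sketch strips every digit from a sample line. The input text and the variable name `digits_removed` are made up for the example.

```shell
# Delete every character in the range 0-9 from the input.
digits_removed=$(printf 'a1b2c3\n' | tr -d '0-9')
echo "$digits_removed"
```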

tr [-Ccu] -s string1

In this form, the characters in string1 are compressed as described for the -s option (see below).
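The squeeze form can be sketched like this: runs of spaces in an illustrative input line collapse to a single space. The variable name `squeezed` is an assumption used only here.

```shell
# Collapse each run of repeated spaces into one space.
squeezed=$(printf 'a   b    c\n' | tr -s ' ')
echo "$squeezed"
```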

tr [-Ccu] -ds string1 string2

In the fourth form, the characters in string1 are deleted from the input, and the characters in string2 are compressed as described for the -s option.
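A sketch of the combined form, using made-up input: the digits (string1) are deleted first, then the remaining runs of spaces (string2) are squeezed. The variable name `cleaned` is illustrative.

```shell
# Step 1: delete digits      -> "aa  bb"
# Step 2: squeeze the spaces -> "aa bb"
cleaned=$(printf 'aa11  bb22\n' | tr -ds '0-9' ' ')
echo "$cleaned"
```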

Options

-C Complement the set of characters in string1, that is "-C ab" includes every character except for 'a' and 'b'.
-c Same as -C but complement the set of values in string1.
-d Delete characters in string1 from the input.
-s Squeeze multiple occurrences of the characters listed in the last operand (either string1 or string2) in the input into a single instance of the character. This occurs after all deletion and translation is completed.
-u Guarantee that any output is unbuffered.
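The following sketch exercises two of the options above on invented sample input. It uses only -c, -d and -s, which are available in both BSD and GNU tr; -u is left out because GNU tr does not accept it. The variable names are illustrative.

```shell
# -c with -d: delete everything that is NOT a digit
# (the newline is outside the set too, so it is removed as well).
only_digits=$(printf 'tel: 555-0100\n' | tr -cd '0-9')
echo "$only_digits"

# -s with two strings: translate first (a, b, c all become x),
# then squeeze the resulting run of x characters into one.
translated_then_squeezed=$(printf 'aabbcc\n' | tr -s 'abc' 'xxx')
echo "$translated_then_squeezed"
```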

How Characters Are Specified

When specifying the characters to translate with tr, the following conventions are used to represent sets (or "classes") of characters.

Any character not described by one of the following conventions represents itself.

\octal A backslash followed by 1, 2 or 3 octal digits represents a character with that encoded value. To follow an octal sequence with a digit as a character, pad the octal sequence on the left with zeroes.
\character A backslash followed by certain special characters maps to special values:

\a The "alert" character, which issues a notification or alert to the terminal.
\b Backspace.
\f Form feed.
\n Newline.
\r Carriage return.
\t Tab.
\v Vertical tab.
A backslash followed by any other character maps to that character.
c-c Character range. For non-octal range endpoints represents the range of characters between the range endpoints, inclusive and in ascending order, as defined by the collation sequence. If either or both of the range endpoints are octal sequences, it represents the range of specific coded values between the range endpoints, inclusive.
[:class:] Represents all characters belonging to the defined character class. Class names are:
alnum Alphanumeric characters.
alpha Alphabetic characters.
blank White space characters.
cntrl Control characters.
digit Numeric characters.
graph Graphic characters.
ideogram Ideographic characters.
lower Lowercase alphabetic characters.
phonogram Phonographic characters.
print Printable characters.
punct Punctuation characters.
rune Valid characters.
space Space characters.
special Special characters.
upper Uppercase alphabetic characters.
xdigit Hexadecimal characters.
When "[:lower:]" appears in string1 and "[:upper:]" appears in the same relative position in string2, it represents the character pairs from the toupper mapping in the LC_CTYPE category of the current locale. When "[:upper:]" appears in string1 and "[:lower:]" appears in the same relative position in string2, it represents the character pairs from the tolower mapping in the LC_CTYPE category of the current locale.

With the exception of case conversion, characters in the classes are in unspecified order.

For specific information as to which ASCII characters are included in these classes, see ctype and related manual pages.

"[=equiv=]" Represents all characters belonging to the same equivalence class as equiv, ordered by their encoded values.
[#*n] Represents n repeated occurrences of the character represented by #. This expression is only valid when it occurs in string2. If n is omitted, or is zero, it is interpreted as large enough to extend the string2 sequence to the length of string1. If n has a leading zero, it is interpreted as an octal value; otherwise, it is interpreted as a decimal value.
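The conventions above can be tried out one at a time. This is a rough sketch with invented input and variable names; it sticks to forms that behave the same for plain ASCII text in common tr implementations and leaves out "[=equiv=]", whose handling of accented characters varies between versions.

```shell
# \character escape: turn a tab into a space.
tab_to_space=$(printf 'a\tb\n' | tr '\t' ' ')

# \octal escape: \040 is the space character in ASCII.
space_to_underscore=$(printf 'a b\n' | tr '\040' '_')

# c-c range: a, b, c map to 1, 2, 3 by position.
range_mapped=$(printf 'cab\n' | tr 'a-c' '1-3')

# [:class:]: delete all digits.
no_digits=$(printf 'abc123\n' | tr -d '[:digit:]')

# [#*n] in string2: three copies of x, matching the three characters of string1.
repeated=$(printf 'abc\n' | tr 'abc' '[x*3]')

printf '%s\n' "$tab_to_space" "$space_to_underscore" "$range_mapped" "$no_digits" "$repeated"
```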

Environment

The LANG, LC_ALL, LC_CTYPE and LC_COLLATE environment variables affect the execution of tr.
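Because these variables influence character classes and collation, one common way to get predictable results for plain ASCII data is to pin the locale for a single command. A minimal sketch, assuming the standard C locale is acceptable for the input; the variable name `upper` is illustrative.

```shell
# Set LC_ALL only for this one tr invocation.
upper=$(printf 'abc\n' | LC_ALL=C tr '[:lower:]' '[:upper:]')
echo "$upper"
```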

Exit Status

tr returns an exit status of 0 if it operated successfully, and a value greater than zero if an error occurred.
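The exit status can be checked from the shell through `$?`. The sketch below triggers an error by calling tr with no operands, which common implementations reject with a usage message; the exact non-zero value is not assumed. The variable names are illustrative.

```shell
# Successful run: the status of the pipeline is that of tr.
printf 'x\n' | tr -d 'x' >/dev/null
ok_status=$?

# Failing run: no operands given; capture the non-zero status.
err_status=0
tr </dev/null >/dev/null 2>&1 || err_status=$?

echo "success: $ok_status, error: $err_status"
```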

tr examples

tr -cs "[:alpha:]" "\n" < file1

Create a list of the words in file1, one per line, where a word is taken to be a maximal string of letters.
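To try the same command without a file, the input can be piped in. The sample sentence and the variable name `words` are invented for this sketch: every run of non-letters (spaces, the hyphen, the final newline) becomes a single newline.

```shell
# -c complements [:alpha:], so all non-letters are mapped to \n;
# -s then squeezes consecutive newlines into one.
words=$(printf 'the quick  brown-fox\n' | tr -cs '[:alpha:]' '\n')
echo "$words"
```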

tr "[:lower:]" "[:upper:]" < file1

Translate the contents of file1 to uppercase.

tr -cd "[:print:]" < file1

Remove all non-printable characters from file1.
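One detail worth seeing in practice: the newline is not a printable character either, so the command above also joins lines. The sketch below uses made-up input containing a control character (octal 001) and shows a variant that adds \n to string1 to keep line breaks. The variable names are illustrative.

```shell
# Control character and newlines are all removed; the two lines are joined.
joined=$(printf 'a\001b\nc\n' | tr -cd '[:print:]')
echo "$joined"

# Adding \n to the set protects line breaks from deletion.
kept=$(printf 'a\001b\nc\n' | tr -cd '[:print:]\n')
echo "$kept"
```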

tr "[=e=]" "e"

Remove all "diacritical" marks from accented versions of the letter e.
