cut (remove, or "cut out", sections of each line of a file): syntax and examples, Unix & Linux commands

On Unix-like operating systems, the cut command removes ("cuts out") sections from each line of a file. This document covers the GNU/Linux version of cut.

Syntax

cut OPTION... [FILE]...

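Options

-b, --bytes=LIST	Select only the bytes from each line as specified in LIST. LIST specifies a byte, a set of bytes, or a range of bytes; see Specifying LIST below.
-c, --characters=LIST	Select only the characters from each line as specified in LIST. LIST specifies a character, a set of characters, or a range of characters; see Specifying LIST below.
-d, --delimiter=DELIM	Use the character DELIM instead of a tab as the field delimiter.
-f, --fields=LIST	Select only these fields on each line; also print any line that contains no delimiter character, unless the -s option is specified. LIST specifies a field, a set of fields, or a range of fields; see Specifying LIST below.
-n	This option is ignored, but is included for compatibility reasons.
--complement	Complement the set of selected bytes, characters, or fields.
-s, --only-delimited	Do not print lines that contain no delimiter.
--output-delimiter=STRING	Use STRING as the output delimiter string. The default is to use the input delimiter.
--help	Display a help message and exit.
--version	Output version information and exit.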
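As a rough illustration of two of the options above, the following sketch shows --complement and -s acting on a tiny sample. The sample text and the variable names (sample, without_second, only_delimited) are made up for this example, and it assumes GNU cut, since --complement is a GNU option.

```shell
# Hypothetical sample: one tab-separated line and one line with no delimiter.
sample=$(printf 'one\ttwo\tthree\nno-delimiter-here\n')

# --complement inverts the selection: print every field except field 2.
# The line without a delimiter is still printed whole, because -s is not given.
without_second=$(printf '%s\n' "$sample" | cut -f 2 --complement)

# -s suppresses lines that contain no delimiter, so only "two" remains.
only_delimited=$(printf '%s\n' "$sample" | cut -s -f 2)

printf '%s\n' "$without_second"
printf '%s\n' "$only_delimited"
```

The first printf shows the first and third fields of the delimited line followed by the untouched second line; the second shows only the field taken from the delimited line.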

Usage notes

When invoking cut, use the -b, -c, or -f option, but only one of them.

If no FILE is specified, cut reads from standard input.
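A minimal sketch of the standard-input case: no FILE is given, so cut processes whatever arrives on the pipe. The input text and the variable name second_field are illustrative only.

```shell
# No FILE operand: cut reads the line that printf writes to the pipe.
second_field=$(printf 'one\ttwo\tthree\n' | cut -f 2)

printf '%s\n' "$second_field"
```

This is the usual way to apply cut to the output of another command rather than to a file on disk.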

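Specifying LIST

Each LIST is made up of an integer, a range of integers, or multiple integer ranges separated by commas. Selected input is written in the same order that it is read, and is written to output exactly once. A range consists of:

N	the Nth byte, character, or field, counted from 1.
N-	from the Nth byte, character, or field to the end of the line.
N-M	from the Nth to the Mth byte, character, or field (inclusive).
-M	from the first to the Mth byte, character, or field.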

For example, suppose you have a file named data.txt that contains the following text:

one	two	three	four	five
alpha	beta	gamma	delta	epsilon

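In this example, the words are separated by tab characters, not spaces. The tab character is the default delimiter of cut, so by default it treats anything delimited by a tab as a field.

To "cut" only the third field of each line, use the command:

cut -f 3 data.txt

...which outputs the following: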

three
gamma

If you instead want to "cut" only the second through fourth fields of each line, use the command:

cut -f 2-4 data.txt

...which outputs the following:

two	three	four
beta	gamma	delta

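To "cut" only the first through second and the fourth through fifth fields of each line (omitting the third field), use the command:

cut -f 1-2,4-5 data.txt

...which outputs the following: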

one	two	four	five
alpha	beta	delta	epsilon

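Or, suppose you want the third field and every field after it, omitting the first two fields. In this case, you can use the command:

cut -f 3- data.txt

...which outputs the following: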

three	four	five
gamma	delta	epsilon
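The LIST rule that selected input is written in the order it is read, and only once, can be checked with a short sketch. The variable names (line, reordered, overlapping) are illustrative, and the sample line mirrors the first line of data.txt.

```shell
# One tab-separated line, matching the first line of data.txt.
line=$(printf 'one\ttwo\tthree\tfour\tfive')

# Listing field 3 before field 1 does not reorder the output:
# fields are written in input order, so field 1 still comes first.
reordered=$(printf '%s\n' "$line" | cut -f 3,1)

# Overlapping ranges do not duplicate fields: fields 2 and 3 appear once.
overlapping=$(printf '%s\n' "$line" | cut -f 1-3,2-4)

printf '%s\n' "$reordered"
printf '%s\n' "$overlapping"
```

The practical consequence is that cut cannot rearrange or repeat fields; a tool such as awk is the usual choice when the output order must differ from the input order.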

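Specifying a range with LIST also applies to cutting characters (-c) or bytes (-b) from a line. For example, to output only the third through twelfth characters of every line of data.txt, use the command:

cut -c 3-12 data.txt

...which outputs the following: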

e	two	thre
pha	beta	g

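Remember that the "space" between each word is actually a single tab character, so both lines of output show ten characters: eight alphanumeric characters and two tab characters. In other words, cut omits the first two characters of each line and outputs characters three through twelve, counting each tab as one character; any characters after the twelfth are omitted.

Counting bytes instead of characters produces the same output in this case, because in an ASCII-encoded text file each character is represented by a single byte (8 bits) of data. So the command:

cut -b 3-12 data.txt

...produces exactly the same output for our file data.txt: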

e	two	thre
pha	beta	g
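The equivalence of -c and -b on ASCII input can be confirmed with a small sketch. It rebuilds the contents of data.txt in a variable instead of reading the file; the variable names (data, by_chars, by_bytes) are illustrative.

```shell
# Recreate the two lines of data.txt; all characters are single-byte ASCII.
data=$(printf 'one\ttwo\tthree\tfour\tfive\nalpha\tbeta\tgamma\tdelta\tepsilon')

# Select positions 3 through 12, first by character, then by byte.
by_chars=$(printf '%s\n' "$data" | cut -c 3-12)
by_bytes=$(printf '%s\n' "$data" | cut -b 3-12)

# For ASCII text the two selections are the same.
if [ "$by_chars" = "$by_bytes" ]; then
    echo "identical"
fi
```

This only demonstrates the ASCII case described above; it makes no claim about input that contains multi-byte characters.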

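Specifying a delimiter other than tab

The tab character is the default delimiter that cut uses to determine what constitutes a field. So, if your file's fields are already delimited by tabs, you don't need to specify a different delimiter.

You can, however, specify any character as the delimiter. For instance, the file /etc/passwd contains information about each user on the system, one user per line, and each field is delimited by a colon (":"). For example, the line in /etc/passwd for the root user may look like this: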

root:x:0:0:root:/root:/bin/bash

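The fields contain the following information, in this order, separated by colons:

Username
Password (shown as x if encrypted)
User ID number (UID)
Group ID number (GID)
Comment field (used by the finger command)
Home directory
Shell

The username is the first field on the line, so to display each username on the system, use the command:

cut -f 1 -d ':' /etc/passwd

...which outputs, for example: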

root
daemon
bin
sys
chope

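(A typical system has many more user accounts, including many that are specific to system services, but for this example we assume there are only five users.)

The third field of each line in /etc/passwd is the UID (user ID number), so to display each username and user ID number, use the command:

cut -f 1,3 -d ':' /etc/passwd

...which outputs the following, for example: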

root:0
daemon:1
bin:2
sys:3
chope:1000

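As you can see, the output is delimited by default with the same delimiter character specified for the input. In this case, that is the colon (":"). You can, however, specify a different delimiter for the output than for the input. So, to run the previous command with the output delimited by a space, use the command: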

cut -f 1,3 -d ':' --output-delimiter=' ' /etc/passwd

root 0
daemon 1
bin 2
sys 3
chope 1000

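But what if you want the output to be delimited by a tab? Specifying a tab character on the command line is a bit more complicated, because it is a non-printing character. To specify it on the command line, you must "protect" it from the shell. How this is done depends on which shell you use, but in the default Linux shell (bash) you can specify the tab character with $'\t'. So the command:

cut -f 1,3 -d ':' --output-delimiter=$'\t' /etc/passwd

...outputs the following, for example: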

root	0
daemon	1
bin	2
sys	3
chope	1000
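Since the $'\t' form depends on the shell, one possible alternative is to let printf produce the tab and store it in a variable. The sketch below assumes GNU cut (for --output-delimiter) and uses the sample root line shown earlier instead of reading /etc/passwd; the variable names (tab, passwd_line, name_and_uid) are illustrative.

```shell
# Let printf generate a literal tab; command substitution only strips
# trailing newlines, so the tab character is preserved.
tab=$(printf '\t')

# Sample line in /etc/passwd format, taken from the example above.
passwd_line='root:x:0:0:root:/root:/bin/bash'

# Select the username and UID, and join them with the tab stored in $tab.
name_and_uid=$(printf '%s\n' "$passwd_line" |
    cut -f 1,3 -d ':' --output-delimiter="$tab")

printf '%s\n' "$name_and_uid"
```

Quoting "$tab" matters here: unquoted, the shell would treat the tab as whitespace and drop it during word splitting.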
