file (report the type of a file)

Posted by 瑞兹 on 2021-01-18 00:43

On Unix-like operating systems, the file command reports the type of a file.

Contents

1 file operating environment

2 file description

3 file syntax

4 file examples

file operating environment

Unix and Linux

file description

The file command tests each argument in an attempt to classify it. There are three sets of tests, performed in this order: filesystem tests, magic tests, and language tests. The first test that succeeds causes the file type to be printed.

The type printed will usually contain one of the words text (the file contains only printing characters and a few common control characters and is probably safe to read on an ASCII terminal), executable (the file contains the result of compiling a program in a form understandable to a kernel), or data meaning anything else (usually binary or non-printable). Exceptions are well-known file formats (core files, tar archives) that are known to contain binary data.

The filesystem tests are based on examining the return from a stat system call. The program checks to see if the file is empty, or if it's some sort of special file. Any known file types appropriate to the system you are running on (sockets, symbolic links, or named pipes (FIFOs) on those systems that implement them) are recognized if they are defined in the system header files.
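
The filesystem stage can be pictured with a short Python sketch. This is not the code file itself uses; the function name filesystem_test and the returned labels are assumptions made for illustration, and only the idea (classify from stat results alone, without reading the contents) comes from the text above.

```python
import os
import stat
import tempfile


def filesystem_test(path):
    """Return a label if stat information alone identifies the file, else None."""
    st = os.lstat(path)  # lstat does not follow symbolic links
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISREG(mode) and st.st_size == 0:
        return "empty"
    return None  # ordinary non-empty file: later tests must decide


with tempfile.TemporaryDirectory() as tmp:
    empty_path = os.path.join(tmp, "empty.txt")
    open(empty_path, "w").close()
    print(filesystem_test(tmp))         # directory
    print(filesystem_test(empty_path))  # empty
```

A result of None here corresponds to the case where the filesystem tests do not succeed and the magic tests run next.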

The magic tests are used to check for files with data in particular fixed formats. The canonical example of this is a binary executable (compiled program) a.out file, whose format is defined in header files in the standard include directory. These files have a "magic number" stored in a particular place near the beginning of the file that tells the operating system that the file is a binary executable, and which of several types thereof. The concept of a "magic number" has been applied by extension to data files. Any file with some invariant identifier at a small fixed offset into the file can usually be described in this way. The information identifying these files is read from /etc/magic and the compiled magic file /usr/share/misc/magic.mgc, or the files in the directory /usr/share/misc/magic if the compiled file does not exist. Also, if $HOME/.magic.mgc or $HOME/.magic exists, it will be used in preference to the system magic files.
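
As a rough illustration of "an invariant identifier at a small fixed offset", the sketch below checks a handful of well-known signatures. The table MAGIC_TABLE and the function magic_test are hypothetical and hand-written for this example; real magic files are far larger and use their own syntax, which is not reproduced here.

```python
# (offset, signature bytes, description): a tiny illustrative table
MAGIC_TABLE = [
    (0, b"\x7fELF", "ELF"),
    (0, b"GIF87a", "GIF image data, version 87a"),
    (0, b"GIF89a", "GIF image data, version 89a"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (0, b"%PDF-", "PDF document"),
    (257, b"ustar", "tar archive"),
]


def magic_test(data):
    """Return the description of the first matching signature, else None."""
    for offset, signature, description in MAGIC_TABLE:
        if data[offset:offset + len(signature)] == signature:
            return description
    return None


print(magic_test(b"GIF89a" + bytes(16)))     # GIF image data, version 89a
print(magic_test(bytes(257) + b"ustar\x00")) # tar archive
print(magic_test(b"plain words"))            # None
```

The tar entry shows that the identifier does not have to sit at offset 0; it only has to sit at a fixed, known position.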

If a file does not match any of the entries in the magic file, it is examined to see if it seems to be a text file. ASCII, ISO-8859-x, non-ISO 8-bit extended-ASCII character sets (such as those used on Macintosh and IBM PC systems), UTF-8-encoded Unicode, UTF-16-encoded Unicode, and EBCDIC character sets can be distinguished by the different ranges and sequences of bytes that constitute printable text in each set. If a file passes any of these tests, its character set is reported. ASCII, ISO-8859-x, UTF-8, and extended-ASCII files are identified as "text" because they will be mostly readable on nearly any terminal; UTF-16 and EBCDIC are only "character data" because, while they contain text, it is text that will require translation before it can be read. In addition, file will attempt to determine other characteristics of text-type files. If the lines of a file are terminated by CR, CR LF, or NEL, instead of the Unix-standard LF, this will be reported. Files that contain embedded escape sequences or overstriking will also be identified.
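
The byte-range idea can be sketched in Python as follows. This is a simplified approximation, not the algorithm file implements: it covers only ASCII, UTF-8, ISO-8859 and "other 8-bit" data, ignores UTF-16, EBCDIC and NEL, and the names guess_charset, line_terminators and text_test are assumptions made for the example.

```python
# Control bytes tolerated in text: BEL, BS, TAB, LF, FF, CR, ESC
ALLOWED_CONTROLS = {0x07, 0x08, 0x09, 0x0A, 0x0C, 0x0D, 0x1B}


def guess_charset(data):
    """Return a character-set name, or None if the bytes do not look like text."""
    if not data:
        return None
    if any((b < 0x20 and b not in ALLOWED_CONTROLS) or b == 0x7F for b in data):
        return None
    if all(b < 0x80 for b in data):
        return "ASCII"
    try:
        data.decode("utf-8")
        return "UTF-8 Unicode"
    except UnicodeDecodeError:
        pass
    if not any(0x80 <= b <= 0x9F for b in data):
        return "ISO-8859"
    return "Non-ISO extended-ASCII"


def line_terminators(data):
    """List the non-LF line terminators present (plus LF if they are mixed)."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    kinds = []
    if crlf:
        kinds.append("CRLF")
    if cr:
        kinds.append("CR")
    if lf and kinds:
        kinds.append("LF")
    return kinds


def text_test(data):
    charset = guess_charset(data)
    if charset is None:
        return None
    description = charset + " text"
    kinds = line_terminators(data)
    if kinds:
        description += ", with " + ", ".join(kinds) + " line terminators"
    return description


print(text_test(b"User-agent: *\r\n"))  # ASCII text, with CRLF line terminators
print(text_test(b"caf\xc3\xa9\n"))      # UTF-8 Unicode text
print(text_test(b"caf\xe9\n"))          # ISO-8859 text
```

Plain LF endings produce no extra remark, which mirrors the statement above that only the non-standard terminators are reported.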

Once file has determined the character set used in a text-type file, it will attempt to determine in what language the file is written. The language tests look for particular strings that can appear anywhere in the first few blocks of a file. For example, the keyword .br indicates that the file is most likely a troff input file, just as the keyword struct indicates a C program. These tests are less reliable than the previous two groups, so they are performed last. The language test routines also test for some miscellany (such as tar archives).
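
A keyword search of this kind might look like the sketch below. The two hints are the ones named in the text; the table LANGUAGE_HINTS, the function language_test and the 8192-byte limit are assumptions chosen for illustration and do not reflect the real keyword list or block size.

```python
# (keyword, language) pairs; only the two examples from the text are listed
LANGUAGE_HINTS = [
    (b".br", "troff input"),
    (b"struct", "C program"),
]


def language_test(data, limit=8192):
    """Return a language guess based on keywords in the first `limit` bytes."""
    head = data[:limit]
    for keyword, language in LANGUAGE_HINTS:
        if keyword in head:
            return language
    return None


print(language_test(b"struct point { int x; int y; };"))  # C program
print(language_test(b"first line\n.br\nsecond line\n"))   # troff input
print(language_test(b"the structure of the report"))      # C program
```

The last call is a deliberate false positive: ordinary English containing "structure" is labeled a C program, which shows why tests of this kind are less reliable than the earlier groups.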

Any file that cannot be identified as having been written in any of the character sets listed above is said to be "data."

file syntax

file [-bchiklLNnprsvz0] [--apple] [--mime-encoding] [--mime-type] 
     [-e testname] [-F separator] [-f namefile] 
     [-m magicfiles] file ...
file -C [-m magicfiles]
file [--help]

Options

-b, --brief Do not prepend file names to output lines (brief mode).
-C, --compile Write a magic.mgc output file that contains a pre-parsed version of the magic file or directory.
-c, --checking-printout Cause a checking printout of the parsed form of the magic file. This option is usually used in conjunction with the -m flag to debug a new magic file before installing it.
-e, --exclude testname Exclude the test named in testname from the list of tests made to determine the file type. Valid test names are:
    apptype EMX application type (only on EMX).
    ascii Various types of text files (this test will try to guess the text encoding, irrespective of the setting of the ‘encoding’ option).
    encoding Different text encodings for soft magic tests.
    tokens Ignored for backward compatibility.
    cdf Prints details of Compound Document Files.
    compress Checks for, and looks inside, compressed files.
    elf Prints ELF file details.
    soft Consults magic files.
    tar Examines tar files.
-F, --separator separator Use the specified string as the separator between the file name and the file result returned. Defaults to ‘:’.
-f, --files-from namefile Read the names of the files to be examined from namefile (one per line) before the argument list. Either namefile or at least one file name argument must be present; to test the standard input, use ‘-’ as a file name argument. Please note that namefile is unwrapped and the enclosed file names are processed when this option is encountered and before any further options processing is done. This option allows one to process multiple lists of files with different command line arguments on the same file invocation. Thus if you want to set the delimiter, you need to do it before you specify the list of files, like: "-F @ -f namefile", instead of: "-f namefile -F @".
-h, --no-dereference Causes symlinks not to be followed (on systems that support symbolic links). This option is the default if the environment variable POSIXLY_CORRECT is not defined.
-i, --mime Causes the file command to output mime type strings rather than the more traditional human readable ones. Thus it may say ‘text/plain; charset=us-ascii’ rather than "ASCII text".
--mime-type, --mime-encoding Like -i, but print only the specified element(s).
-k, --keep-going Don't stop at the first match, keep going. Subsequent matches will have the string ‘\012- ’ prepended. (If you want a newline, see the -r option.)
-l, --list Print the list of magic patterns, sorted in the order used for matching, with information about the strength of each pattern.
-L, --dereference Causes symlinks to be followed, as the like-named option in ls (on systems that support symbolic links). This option is the default if the environment variable POSIXLY_CORRECT is defined.
-m, --magic-file magicfiles Specify an alternate list of files and directories containing magic. This option can be a single item, or a colon-separated list. If a compiled magic file is found alongside a file or directory, it will be used instead.
-N, --no-pad Don't pad file names so that they align in the output.
-n, --no-buffer Force stdout to be flushed after checking each file. This option is only useful if checking a list of files. It is intended to be used by programs that want filetype output from a pipe.
-p, --preserve-date On systems that support utime or utimes, attempt to preserve the access time of files analyzed, to pretend that file never read them.
-r, --raw Don't translate unprintable characters to \ooo. Normally file translates unprintable characters to their octal representation.
-s, --special-files Normally, file only attempts to read and determine the type of argument files which stat reports are ordinary files. This prevents problems, because reading special files may have peculiar consequences. Specifying the -s option causes file to also read argument files that are block or character special files. This option is useful for determining the filesystem types of the data in raw disk partitions, which are block special files. This option also causes file to disregard the file size as reported by stat since on some systems it reports a zero size for raw disk partitions.
-v, --version Print the version of the program and exit.
-z, --uncompress Try to look inside compressed files.
-0, --print0 Output a null character ‘\0’ after the end of the file name, which is helpful if, for instance, you'd like to cut the output. This option does not affect the separator, which is still printed.
--help Print a help message and exit.
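
Because -f is processed at the moment it is encountered, option order matters when a command line is assembled by a program. The helper below is a hypothetical sketch (the name build_file_command and its parameters are not part of file); it only builds the argument list, placing -F before -f as the -f description requires, and does not run anything.

```python
import shlex


def build_file_command(paths=(), brief=False, mime_type=False,
                       separator=None, names_from=None):
    """Build an argv list for file, keeping -F ahead of -f."""
    argv = ["file"]
    if brief:
        argv.append("--brief")
    if mime_type:
        argv.append("--mime-type")
    if separator is not None:
        argv += ["-F", separator]   # must come before -f to apply to its files
    if names_from is not None:
        argv += ["-f", names_from]
    argv += list(paths)
    return argv


argv = build_file_command(separator="@", names_from="namefile")
print(shlex.join(argv))  # file -F @ -f namefile
```

On a system where file is installed, a list built this way could be passed to subprocess.run; that step is left out so the sketch stays runnable anywhere.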

file examples

file *

Below is an example of what may appear when running file with a wildcard for all files:

shutdown.htm: HTML document text
si.htm: HTML document text
side0.gif: GIF image data, version 89a, 107 x 18
robots.txt: ASCII text, with CRLF line terminators
routehlp.htm: HTML document text
rss: setgid directory

file *.txt

Below is an example of what may appear when running file on every file whose name ends in .txt:

form.txt: news or mail text
friend.txt: news or mail text
ihave.txt: news or mail text
index.txt: ASCII Java program text, with very long lines, with CRLF line terminators
jargon.txt: news or mail text
news.txt: Non-ISO extended-ASCII C program text, with very long lines, with CRLF line terminators
newsdata.txt: Non-ISO extended-ASCII English text, with very long lines, with CRLF line terminators
qad.txt: news or mail text
refrence.txt: news or mail text
robots.txt: ASCII text, with CRLF line terminators
stopwords.txt: ASCII English text, with CRLF line terminators
yhelp.txt: news or mail text
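
Output in this "name: description" form is easy to post-process. The sketch below splits each line at the first separator; parse_file_output is a hypothetical helper, and it assumes the default ‘:’ separator and file names that do not themselves contain it.

```python
def parse_file_output(text, separator=":"):
    """Map each file name to its reported description."""
    results = {}
    for line in text.splitlines():
        name, found, description = line.partition(separator)
        if not found:
            continue  # not a "name: description" line
        results[name] = description.strip()  # strip alignment padding
    return results


sample = """side0.gif: GIF image data, version 89a, 107 x 18
robots.txt: ASCII text, with CRLF line terminators
rss: setgid directory
"""

parsed = parse_file_output(sample)
print(parsed["side0.gif"])   # GIF image data, version 89a, 107 x 18
print(parsed["robots.txt"])  # ASCII text, with CRLF line terminators
```

For names that may contain a colon, choosing a different separator with -F before parsing is one way to avoid ambiguous splits.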
