a2p (translates a given awk script into a Perl script)

Posted by 小猪老师 on 2020-07-05 08:18

The a2p utility takes an awk script specified on the command line and writes a comparable Perl script to standard output.

Contents

1 a2p supported systems

2 a2p syntax

3 Notes

a2p supported systems

Linux

a2p syntax

a2p [options] [filename]

Options

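-D<number>

Sets debugging flags.

-F<character>

Tells a2p that this awk script is always invoked with this -F switch.

-n<fieldlist>

Specifies the names of the input fields if the input does not have to be split into an array.
If you were translating an awk script that processes the password file, you might say:

a2p -7 -nlogin.password.uid.gid.gcos.shell.home

Any delimiter can be used to separate the field names.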

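-<number>

Causes a2p to assume that the input will always have that many fields.

-o

Tells a2p to use old awk behavior. The differences are:

  1. Old awk always has a line loop, even if there are no line actions, whereas new awk does not.

  2. In old awk, sprintf is extremely greedy about its arguments.

For example, given the statement:

print sprintf(some_args), extra_args;

old awk considers extra_args to be arguments to sprintf; new awk considers them arguments to print.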
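
The new-awk reading of that statement can be checked with any current awk. This is a small sketch, assuming a POSIX-style awk is installed as awk; the function name new_awk_print and the sample arguments are made up for illustration. Old awk behavior is not reproduced here.

```shell
# In new awk, "c" is a second argument to print, not a third argument to sprintf,
# so print joins the two values with the default output field separator (a space).
new_awk_print() {
  awk 'BEGIN { print sprintf("%s-%s", "a", "b"), "c" }'
}

new_awk_print
```

If sprintf had swallowed the extra argument, as the old behavior is described above, "c" would not appear as a separate output field.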

Notes

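a2p cannot translate as well as a human would, but it usually does pretty well. There are some areas where you may want to examine the generated Perl script and adjust it. Here are some of them, in no particular order.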

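  • There is an awk idiom of putting int() around a string expression to force numeric interpretation, even though the argument is always an integer anyway. This is generally unneeded in Perl, but a2p can't tell whether the argument will always be an integer, so it leaves it in. You may want to remove it.
  • Perl differentiates numeric comparison from string comparison. Awk has one operator for both and decides at run time which comparison to do. a2p does not try to do a complete job of awk emulation here. Instead it guesses which one you want. It is almost always right, but it can be fooled. All such guesses are marked with the comment "#???". You should go through and check them. You might want to run the script at least once with perl's -w switch, which will warn you if you use == where you should have used eq.
  • Perl does not attempt to emulate the awk behavior in which nonexistent array elements spring into existence by being referenced. If you are relying on this mechanism to create null entries for a subsequent for...in, they won't be there in Perl.
  • If a2p makes a split line that assigns to a list of variables that looks like (Fld1, Fld2, Fld3...), you may want to rerun a2p using the -n option described above. This lets you name the fields throughout the script. If it splits to an array instead, the script is probably referring to the number of fields somewhere.
  • The exit statement in awk doesn't necessarily exit; it goes to the END block if there is one. Awk scripts that do contortions within the END block to bypass the block under such circumstances can be simplified by removing the conditional in the END block and exiting directly from the Perl script.
  • Perl has two kinds of array: numerically indexed and associative. Perl associative arrays are called "hashes". Awk arrays are usually translated to hashes, but if you happen to know that the index is always going to be numeric, you could change the {...} to [...]. Iteration over a hash is done using the keys() function, but iteration over an array is not. You might need to modify any loop that iterates over such an array.
  • Awk starts by assuming OFMT has the value %.6g. Perl starts by assuming its equivalent, $#, has the value %.20g. You'll want to set $# explicitly if you rely on the default value of OFMT.
  • Near the top of the line loop will be the split operation that is implicit in the awk script. There are times when you can move it down past some conditionals that test the entire record, so that the split is not done as often.
  • For aesthetic reasons you may want to change index variables from being 1-based (awk style) to 0-based (Perl style). Be sure to change all operations the variable is involved in to match.
  • Immature comments such as "# perl is better than awk" are not modified and pass through as-is.
  • Awk scripts are often embedded in a shell script that pipes data into and out of awk. Often the shell script wrapper can be incorporated into the Perl script, since Perl can start up pipes into and out of itself, and can do other things that awk can't do by itself.
  • Scripts that refer to the special variables RSTART and RLENGTH can often be simplified by referring to the variables $`, $& and $', as long as they are within the scope of the pattern match that sets them.
  • The generated Perl script may have subroutines defined to deal with awk's semantics regarding getline and print. Since a2p usually picks correctness over efficiency, it is almost always possible to rewrite such code to be more efficient by discarding the semantic sugar.
  • For efficiency, you may want to remove the keyword from any return statement that is the last statement executed in a subroutine. a2p catches the most common case, but doesn't analyze embedded blocks for subtler cases.
  • ARGV[0] translates to $ARGV0, but ARGV[n] translates to $ARGV[$n-1]. A loop that tries to iterate over ARGV[0] won't find it.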
