Question:
On Linux, I have a directory with lots of files. Some of them have non-ASCII characters in their names, but they are all valid UTF-8. One program has a bug that prevents it from working with non-ASCII filenames, and I have to find out how many files are affected. I was going to do this with find, then use grep to pick out the names containing non-ASCII characters, and then use wc -l to get the count. It doesn't have to be grep; I can use any standard Unix regular expression tool, such as Perl, sed, AWK, etc.
However, is there a regular expression for 'any character that's not an ASCII character'?
Solution:
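The referenced answer text is not reproduced on this page, so what follows is only a sketch of one common approach, not necessarily the accepted answer. Where GNU grep is built with PCRE support, the direct form of "any character that's not ASCII" is grep -P '[^\x00-\x7F]'. The more portable variant below forces the C locale so that every byte is treated as one character, then matches any byte outside the printable ASCII range (space through tilde). The function name count_non_ascii_names is made up for illustration, and the sketch assumes that no filename contains a newline or an ASCII control character such as a tab, since those would be miscounted.

```shell
# Hypothetical helper: count regular files under directory $1 whose
# name contains at least one byte outside printable ASCII.
count_non_ascii_names() {
    # List files, strip the leading directory part so that only the
    # file name itself is tested, then count the matching lines.
    # LC_ALL=C makes sed and grep work byte by byte, so each byte of a
    # multibyte UTF-8 character falls outside the range ' ' to '~'.
    # Note: grep -c still prints 0 when nothing matches, but exits 1.
    find "$1" -type f | LC_ALL=C sed 's|.*/||' | LC_ALL=C grep -c '[^ -~]'
}
```

Calling count_non_ascii_names . prints the number of affected files under the current directory. Replacing grep -c with plain grep lists the matching names instead of counting them.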
Reference 1: https://en.stackoom.com/question/8uYE
Reference 2: https://stackoom.com/question/8uYE