2023-07-15

How to use git diff --name-only with non-ASCII file names

I have a pre-commit hook that runs

files=`git diff --cached --name-only --diff-filter=ACMR | grep -E "$extension_regex"`

and performs some formatting on those files before committing.

However, I have some files whose names contain non-ASCII letters, and I realized those files weren't being formatted.

After some debugging, I found that this was because git diff outputs those file names with escaped characters and surrounded by double quotes, for example:

"\341\203\236\341\203\220\341\203\240\341\203\220\341\203\233\341\203\224\341\203\242\341\203\240\341\203\224\341\203\221\341\203\230.ext"

I tried modifying the regex pattern to accept names surrounded by quotes, and even tried removing those quotes, but wherever I try to access the file, it can't be found. For example:

$ cat $file
cat: '"\341\203\236\341\203\220\341\203\240\341\203\220\341\203\233\341\203\224\341\203\242\341\203\240\341\203\224\341\203\221\341\203\230.ext"': No such file or directory

$ file="${file:1:${#file}-2}"

$ cat $file
cat: '\341\203\236\341\203\220\341\203\240\341\203\220\341\203\233\341\203\224\341\203\242\341\203\240\341\203\224\341\203\221\341\203\230.ext': No such file or directory

How do I handle files with non-ASCII characters in their names?
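
One direction that may be worth trying is a minimal sketch, not a confirmed fix for this hook. It assumes the hook runs under bash (needed for `read -d ''` and process substitution) and that the quoting comes from git's default path quoting. The `-z` option of git diff prints path names raw and NUL-terminated instead of quoted and escaped. The helper name `for_each_staged_file` is hypothetical, and the regex match is done in bash rather than with grep so the NUL-delimited list is never split on newlines.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bash and that -z output is acceptable for the hook.

# Hypothetical helper: run a command once for every staged file whose
# name matches an extended regex. Usage:
#   for_each_staged_file REGEX COMMAND [ARGS...]
for_each_staged_file() {
    local regex=$1
    shift
    local file
    # -z makes git emit unquoted path names, each terminated by a NUL byte,
    # so non-ASCII bytes arrive as-is instead of as "\341\203..." escapes.
    while IFS= read -r -d '' file; do
        if [[ $file =~ $regex ]]; then
            # Quote "$file" so spaces and unusual characters survive intact.
            "$@" "$file"
        fi
    done < <(git diff --cached --name-only --diff-filter=ACMR -z)
}

# Example call (commented out; "my_formatter" is a placeholder name):
# for_each_staged_file "$extension_regex" my_formatter
```

Setting `core.quotePath` to false (for example `git -c core.quotePath=false diff --cached --name-only`) should also stop git from escaping bytes above 0x80, but names containing quotes or newlines would still be quoted, so the `-z` form looks like the more robust of the two.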


