Can anyone cast some light on this? Why does the same pattern give two results? I’m guessing it’s got something to do with the _.* being interpreted differently, i.e., parameter expansion is at work. However, I need a way for the same pattern to be interpreted the same in both find and for.
Incidentally, I’m using -regextype egrep so the search pattern doesn’t need escaping. Likewise, I’m prepending .*/ to the pattern in find because a match is on the whole path.
I understand that escaping is sometimes necessary, but I don’t think that is the whole picture here (I am using egrep, not posix_egrep), and it doesn’t explain the variations.
My theory is that expanding ${~expression} treats the substrings ._. and ._ the same, and resets after finding the first match. I’ll have to do more experimenting, or approach the problem another way.
Essentially, I have a script that passes the expression to both find and for constructs, e.g., `script.sh --path --regex`, and the expression should work in both scenarios.
For the for command, the shell is applying pathname expansion to your pattern (which is not as powerful as regular expressions); at least, this is the area I would concentrate my investigation on.
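To make the difference concrete, here is a small sketch that tests one pattern string twice: once as a shell glob and once as an extended regex. The helper names glob_match and regex_match and the sample file names are made up for illustration. It is written for bash or POSIX sh; in zsh an unquoted parameter is not treated as a pattern by default, so the case line would need ${~2} instead of $2.

```shell
# Hypothetical helpers, for illustration only (bash / POSIX sh).

# Succeeds if name $1 matches $2 when $2 is read as a shell glob.
# In zsh, write ${~2} here, otherwise $2 is compared literally.
glob_match() {
  case $1 in
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

# Succeeds if name $1 matches $2 when $2 is read as an extended regex.
# The regex is anchored so it must cover the whole name, as find -regex does.
regex_match() {
  printf '%s\n' "$1" | grep -Eq -- "^($2)\$"
}

pattern='track_.*'
for name in track_01.flac track_.flac; do
  if glob_match "$name" "$pattern"; then g=yes; else g=no; fi
  if regex_match "$name" "$pattern"; then r=yes; else r=no; fi
  printf '%s: glob=%s regex=%s\n' "$name" "$g" "$r"
done
# track_01.flac: glob=no regex=yes
# track_.flac: glob=yes regex=yes
```

As a glob, the . in _.* is a literal dot and * means "any string", so only names containing "_." match. As a regex, .* means "any run of characters", so both names match.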
You’re absolutely right in suspecting that the discrepancy comes
from how the shell interprets the pattern in the for loop versus
how find interprets it.
Here’s a reusable Zsh function that ensures regex-based matching
using find, avoids glob expansion quirks, and handles paths with
spaces or special characters safely:
filter_files_by_regex() {
  local base_path="$1"
  local regex_pattern="$2"
  # Make sure the base path ends with a slash.
  [[ "$base_path" != */ ]] && base_path="${base_path}/"
  # -regex is tested against the whole path, hence the leading .*/
  find "$base_path" -type f -regextype egrep -regex ".*/$regex_pattern" | while IFS= read -r file_path; do
    printf '%s\n' "$file_path"
  done
}
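As a quick check of how -regex behaves, the sketch below builds a throwaway directory and runs the same kind of find call against it. The file names are invented for the demo, and it assumes GNU find, since -regextype is a GNU extension. It shows why the .*/ prefix matters: the regex has to match the whole path, not just the file name.

```shell
# Throwaway sandbox; the file names are made up for the demo.
demo_dir=$(mktemp -d)
touch "$demo_dir/01_intro.flac" "$demo_dir/02 with space.flac" "$demo_dir/notes.txt"

# Without a leading .*/ nothing matches: the path starts with the directory,
# and -regex must match the full path.
bare=$(find "$demo_dir" -type f -regextype egrep -regex '[0-9]+_[^/]*\.flac')

# With the prefix, the rest of the pattern lines up with the last path component.
# [^/]* is used instead of .* so the match cannot spill across a slash.
prefixed=$(find "$demo_dir" -type f -regextype egrep -regex '.*/[0-9]+_[^/]*\.flac')

printf 'bare: [%s]\n' "$bare"
printf 'prefixed: [%s]\n' "${prefixed##*/}"
rm -rf "$demo_dir"
# bare: []
# prefixed: [01_intro.flac]
```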
Thanks, @PerttiS. As you probably guessed, I’m using globbing to parse files and folders. find is used to test that at least one file is present before starting the loop.
This method was fine with simple expressions, e.g., *.FLAC, but adding functionality means I need to use regex. I’m not keen on rewriting my scripts, which is why I’d like to find a way that each gives the same results.
Funny thing is I already use a similar loop function elsewhere. I’ll give this a shot, and update you on my progress later.
It sounds like you’re transitioning from simple glob patterns to more complex regex-based file matching while wanting to preserve your current script structure—completely reasonable, especially if you have working code you don’t want to disrupt.
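To keep your for loop and still match regex patterns, you might try something like this:
shopt -s nullglob   # bash; the zsh equivalent is: setopt null_glob
re='\.FLAC$'        # keeping the regex in a variable lets bash and zsh read it the same way
for file in *; do
  if [[ "$file" =~ $re ]]; then
    echo "Matched: $file"
  fi
done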
Or, if using find and grep for more flexibility with regex:
find . -type f | grep -E '\.FLAC$'
This way, you’re bridging the gap without a full rewrite.
@PerttiS, I appreciate your help. I’ve adjusted my code with minimal changes, allowing simple matches, e.g., *.flac, or more complex regex using positional parameters from the command line. Globbing no more!
files=$(find "$base_path" -type f -regextype egrep -regex ".*$regex_pattern")
if [[ -n $files ]]; then
  printf '%s\n' "$files" | while IFS= read -r file_path; do
    # ${~regex_pattern} is expanded as a glob here, not as a regex
    folder="${file_path%/${~regex_pattern}}"
    filename=${file_path:t}
    echo "folder[$folder]"
    echo "file_path[$file_path]"
    echo "filename[$filename]"
  done
else
  echo "No matches"
fi
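For anyone who wants to avoid reusing the regex as a glob when splitting the path, one possible alternative is plain parameter expansion, sketched below. The helper name split_path and the sample paths are made up. It assumes each path contains at least one slash and does not end in one, in which case ${file_path%/*} and ${file_path##*/} give the same results as zsh's :h and :t modifiers.

```shell
# Hypothetical helper: print the folder and file name for one path.
# Assumes the path contains a slash and does not end with one.
split_path() {
  file_path=$1
  folder=${file_path%/*}      # strip the last /component
  filename=${file_path##*/}   # keep only the last component
  printf 'folder[%s]\nfilename[%s]\n' "$folder" "$filename"
}

# Sample paths are invented; printf emits one path per line, as find would.
printf '%s\n' 'music/album one/01_intro.flac' 'music/b/02.flac' |
while IFS= read -r file_path; do
  split_path "$file_path"
done
# folder[music/album one]
# filename[01_intro.flac]
# folder[music/b]
# filename[02.flac]
```

Because the split only looks at the last slash, it gives the same answer whatever regex was used to select the files.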
I’m sure I would have got there in the end, but not this evening. Thanks.