I think it may be a good idea to leverage fnmatch.filter()
instead of writing an inner loop on a bare fnmatch
.
Add type hints, including - potentially - a generic type hint for your data. Since you haven't given us enough information to justify DATA
being a global, don't leave it as a global.
Write unit tests.
Separate the function into the "hard part" (key filtration) and the "easy part" (dictionary key multi-lookup).
Suggested
import fnmatchfrom typing import Collection, Iterable, TypeVarValueT = TypeVar('ValueT')def glob_filter_overlapping( selectors: Iterable[str], keys: Collection[str],) -> Iterable[str]: for selector in selectors: yield from fnmatch.filter(names=keys, pat=selector.casefold())def glob_select_all( selectors: Iterable[str], data: dict[str, ValueT],) -> list[ValueT]: keys = glob_filter_overlapping(selectors=selectors, keys=data.keys()) return [data[key] for key in set(keys)]def test() -> None: data = {'Alpha-1': 1,'Beta-1': 2,'Beta-2': 3,'Delta-1': 4,'Delta-2': 5, } actual = glob_select_all(selectors=('*lta*', '*-2'), data=data) assert sorted(actual) == [3, 4, 5]if __name__ == '__main__': test()
Preprocessing
If you have
- a lot of
DATA
, and/or - you re-apply the same set of selectors for several different
DATA
inputs,
you may benefit from pre-compiling your multiple selectors into one regular expression. This will have different performance characteristics that I have not measured.
import fnmatchimport refrom typing import Iterable, TypeVarValueT = TypeVar('ValueT')def preprocess_globs( selectors: Iterable[str],) -> re.Pattern: pattern = '|'.join( fnmatch.translate(selector) for selector in selectors ) return re.compile(pattern=pattern, flags=re.I)def pattern_select_all( pattern: re.Pattern, data: dict[str, ValueT],) -> list[ValueT]: return [ value for key, value in data.items() if pattern.search(key) is not None ]def test() -> None: data = {'Alpha-1': 1,'Beta-1': 2,'Beta-2': 3,'Delta-1': 4,'Delta-2': 5, } pat = preprocess_globs(('*lta*', '*-2')) actual = pattern_select_all(pattern=pat, data=data) assert sorted(actual) == [3, 4, 5]if __name__ == '__main__': test()