Channel: Matcher/Searcher with Glob support - Code Review Stack Exchange

Answer by Reinderien for Matcher/Searcher with Glob support

I think it may be a good idea to leverage fnmatch.filter() instead of writing an inner loop on a bare fnmatch.
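
As a minimal sketch of the difference (the key list is made up for illustration), the two forms below return the same matches; fnmatch.filter() simply moves the loop into the library:

```python
import fnmatch

keys = ['Alpha-1', 'Beta-1', 'Beta-2']  # hypothetical sample keys

# Inner loop over a bare fnmatch call
looped = [key for key in keys if fnmatch.fnmatch(key, '*-1')]

# Same result, with the loop handled by fnmatch.filter()
filtered = fnmatch.filter(keys, '*-1')

assert looped == filtered == ['Alpha-1', 'Beta-1']
```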

Add type hints, including - potentially - a generic type hint for your data. Since you haven't given us enough information to justify DATA being a global, don't leave it as a global.

Write unit tests.

Separate the function into the "hard part" (key filtration) and the "easy part" (dictionary key multi-lookup).

Suggested

```python
import fnmatch
from typing import Collection, Iterable, TypeVar

ValueT = TypeVar('ValueT')


def glob_filter_overlapping(
    selectors: Iterable[str],
    keys: Collection[str],
) -> Iterable[str]:
    # May yield the same key more than once when selectors overlap.
    # Caution: only the selector is casefolded; fnmatch.filter() ignores
    # case only where os.path.normcase() does (e.g. on Windows).
    for selector in selectors:
        yield from fnmatch.filter(names=keys, pat=selector.casefold())


def glob_select_all(
    selectors: Iterable[str],
    data: dict[str, ValueT],
) -> list[ValueT]:
    keys = glob_filter_overlapping(selectors=selectors, keys=data.keys())
    # set() removes keys matched by more than one selector
    return [data[key] for key in set(keys)]


def test() -> None:
    data = {
        'Alpha-1': 1,
        'Beta-1': 2,
        'Beta-2': 3,
        'Delta-1': 4,
        'Delta-2': 5,
    }
    actual = glob_select_all(selectors=('*lta*', '*-2'), data=data)
    assert sorted(actual) == [3, 4, 5]


if __name__ == '__main__':
    test()
```

Preprocessing

If you

  • have a lot of DATA, and/or
  • re-apply the same set of selectors to several different DATA inputs,

you may benefit from pre-compiling your multiple selectors into one regular expression. This will have different performance characteristics that I have not measured.

```python
import fnmatch
import re
from typing import Iterable, TypeVar

ValueT = TypeVar('ValueT')


def preprocess_globs(
    selectors: Iterable[str],
) -> re.Pattern:
    # Translate each glob to a regex and join them as alternatives
    pattern = '|'.join(
        fnmatch.translate(selector)
        for selector in selectors
    )
    return re.compile(pattern=pattern, flags=re.I)


def pattern_select_all(
    pattern: re.Pattern,
    data: dict[str, ValueT],
) -> list[ValueT]:
    return [
        value
        for key, value in data.items()
        # match(), not search(): fnmatch.translate() anchors only the end
        if pattern.match(key) is not None
    ]


def test() -> None:
    data = {
        'Alpha-1': 1,
        'Beta-1': 2,
        'Beta-2': 3,
        'Delta-1': 4,
        'Delta-2': 5,
    }
    pat = preprocess_globs(('*lta*', '*-2'))
    actual = pattern_select_all(pattern=pat, data=data)
    assert sorted(actual) == [3, 4, 5]


if __name__ == '__main__':
    test()
```
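
One detail worth checking when combining translated globs: the regex that fnmatch.translate() produces is anchored at the end only, so the choice between match() and search() matters for selectors without a leading *. A small sketch, using two made-up selectors:

```python
import fnmatch
import re

# Combine two hypothetical selectors into one case-insensitive pattern
pattern = re.compile(
    '|'.join(fnmatch.translate(s) for s in ('beta*', '*-2')),
    flags=re.I,
)

# match() anchors at the start, so 'beta*' behaves like a glob
assert pattern.match('Beta-1') is not None
assert pattern.match('Alphabeta-1') is None

# search() may start anywhere, so 'beta*' also hits mid-string
assert pattern.search('Alphabeta-1') is not None
```

The test data in this answer only uses selectors that begin with *, so it would not reveal the difference.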
