Simplify File Matching in Python with fnmatch

Posted by Afsal on 04-Aug-2023

Hi Pythonistas!

Today we will learn a new module called fnmatch. fnmatch is a module in Python's standard library that matches filenames against a pattern written with Unix shell-style wildcards. It offers an intuitive way to filter files with simple patterns built from wildcards such as * (matches any sequence of characters) and ? (matches any single character). Let us learn with some examples.

Code

import fnmatch

# fnmatch.fnmatch(filename, pattern) returns True if filename matches pattern
print(fnmatch.fnmatch("notes.txt", "*.txt"))  # True

Patterns

*: Matches any sequence of characters, including none.

?: Matches any single character.

[seq]: Matches any single character in the given set seq.

[!seq]: Matches any single character not in the set seq.
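
To see the four wildcards side by side, here is a small sketch. It uses fnmatch.fnmatchcase, the case-sensitive variant from the same module, so the results should not depend on the operating system. The sample names are made up for illustration.

```python
import fnmatch

# Each tuple is (name, pattern); the names are made-up samples.
checks = [
    ("report.txt", "*.txt"),           # * matches "report"
    ("a.txt", "?.txt"),                # ? matches exactly one character
    ("ab.txt", "?.txt"),               # two characters before ".txt", so no match
    ("file3.txt", "file[0-9].txt"),    # "3" is in the set 0-9
    ("fileA.txt", "file[!0-9].txt"),   # "A" is not in the set 0-9
]

results = [fnmatch.fnmatchcase(name, pattern) for name, pattern in checks]

print(results)
```

This prints [True, True, False, True, True].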

Examples

*.txt

Description: This pattern matches any filename that ends with ".txt".

Code

import fnmatch

file_names = ['file.txt', 'data.txt', 'log.txt', 'file.csv', 'text.txt.bak', 'notes.txt.old']

output = [item for item in file_names if fnmatch.fnmatch(item, "*.txt")]

print(output)

Output

['file.txt', 'data.txt', 'log.txt']
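
The list comprehension above can also be written with fnmatch.filter, which takes a list of names and a pattern and returns the names that match. A minimal sketch using the same sample list:

```python
import fnmatch

file_names = ['file.txt', 'data.txt', 'log.txt', 'file.csv', 'text.txt.bak', 'notes.txt.old']

# fnmatch.filter(names, pattern) returns the names that match the pattern
output = fnmatch.filter(file_names, "*.txt")

print(output)
```

The result is the same list as before: ['file.txt', 'data.txt', 'log.txt'].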

file?.txt

Description: This pattern matches filenames that start with "file", followed by any single character, and end with ".txt".

Code

import fnmatch

file_names = ['file1.txt', 'fileA.txt', 'file$.txt', 'file.txt', 'file10.txt', 'fileAB.txt']

output = [item for item in file_names if fnmatch.fnmatch(item, "file?.txt")]

print(output)

Output

['file1.txt', 'fileA.txt', 'file$.txt']

file[0-9].txt

Description: This pattern matches filenames that start with "file", followed by a single digit (0-9), and end with ".txt".

Code

import fnmatch

file_names = ['file1.txt', 'fileA.txt', 'file$.txt', 'file.txt', 'file10.txt', 'fileAB.txt']

output = [item for item in file_names if fnmatch.fnmatch(item, "file[0-9].txt")]

print(output)

Output

['file1.txt']
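
If you need to match the same pattern many times, or want to combine it with other regular-expression logic, fnmatch.translate converts a shell-style pattern into a regular expression string. The sketch below compiles that string with re and reuses it. The exact text translate returns can differ between Python versions, so the example does not print it.

```python
import fnmatch
import re

file_names = ['file1.txt', 'fileA.txt', 'file$.txt', 'file.txt', 'file10.txt', 'fileAB.txt']

# Convert the wildcard pattern into a regular expression and compile it once
regex = re.compile(fnmatch.translate("file[0-9].txt"))

# The compiled regex accepts the same names as the wildcard pattern
output = [item for item in file_names if regex.match(item)]

print(output)
```

This selects the same name as the fnmatch call above: ['file1.txt'].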

[!abc]*.py

Description: This pattern matches filenames whose first character is not "a", "b", or "c" and that end with ".py".

Code

import fnmatch

file_names = ['xyz.py', '123.py', 'script.py', 'abc.py', 'baba.py', 'config.py']

output = [item for item in file_names if fnmatch.fnmatch(item, "[!abc]*.py")]

print(output)

Output

['xyz.py', '123.py', 'script.py']

file[0-9][!x].*

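Description: This pattern matches filenames that start with "file", followed by a single digit (0-9), then any single character except "x", then a dot and any extension. Note that "file10.txt" matches because the "0" counts as the character that is not "x", while "file8.jpg" does not match because nothing sits between the digit and the dot.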

Code

import fnmatch

file_names = ["file1A.txt", "file7C.csv", "file0T.jpg", "file10.txt", "fileX.txt", "file8.jpg"]

output = [item for item in file_names if fnmatch.fnmatch(item, "file[0-9][!x].*")]

print(output)

Output

['file1A.txt', 'file7C.csv', 'file0T.jpg', 'file10.txt']

**/*.txt

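Description: fnmatch gives the double asterisk ** no special meaning; it behaves exactly like a single *. Because * in fnmatch also matches the "/" separator, this pattern matches any name that contains a "/" and ends with ".txt". Names without a "/", such as "file.txt", do not match. To search subdirectories on disk, the glob module with recursive=True is the usual tool.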

Code

import fnmatch

file_names = ["file.txt", "data.txt", "subdir/example.txt", "file.csv", "subdir/file.docx"]

output = [item for item in file_names if fnmatch.fnmatch(item, "**/*.txt")]

print(output)

Output

['subdir/example.txt']
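
fnmatch compares plain strings, so its * wildcard is not stopped by "/". The sketch below shows two consequences: a single * already reaches into subdirectories, and if you only care about the filename you can match against the last path component instead. The path list and the "e*.txt" pattern are made up for illustration, and posixpath.basename is used on the assumption that the paths use "/" as the separator.

```python
import fnmatch
import posixpath

paths = ["file.txt", "data.txt", "subdir/example.txt", "a/b/deep.txt", "subdir/file.docx"]

# "*" also matches "/", so a single asterisk reaches names at any depth
any_depth = [p for p in paths if fnmatch.fnmatchcase(p, "*.txt")]

# Match only the final path component, ignoring the directory part
by_basename = [p for p in paths if fnmatch.fnmatchcase(posixpath.basename(p), "e*.txt")]

print(any_depth)
print(by_basename)
```

The first list is ['file.txt', 'data.txt', 'subdir/example.txt', 'a/b/deep.txt'] and the second is ['subdir/example.txt'].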

I hope you have learned something from this post. Please share your valuable suggestions with afsal@parseltongue.co.in