pythonGH-125413: Add pathlib.Path.info attribute (python#127730)
Add `pathlib.Path.info` attribute, which stores an object implementing the `pathlib.types.PathInfo` protocol (also new). The object supports querying the file type and internally caching `os.stat()` results. Path objects generated by `Path.iterdir()` are initialised with status information from `os.DirEntry` objects, which is gleaned from scanning the parent directory.

The `PathInfo` protocol has four methods: `exists()`, `is_dir()`, `is_file()` and `is_symlink()`.
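A minimal sketch of how the four methods might be used when switching on file type. The helper names `classify` and `classify_children` are hypothetical, and the `getattr` fallback is an assumption added so the sketch also runs on interpreters older than 3.14, where `Path.info` does not exist:

```python
from pathlib import Path


def classify(path):
    """Return a short label for *path*, preferring the cached Path.info API."""
    # Path.info is new in Python 3.14. On older versions, fall back to the
    # path object itself, whose same-named methods query the filesystem on
    # every call instead of caching.
    info = getattr(path, "info", path)
    if info.is_symlink():
        return "symlink"
    if info.is_dir():
        return "directory"
    if info.exists():
        return "something else"
    return "not found"


def classify_children(directory):
    """Map each child name of *directory* to its label."""
    # Children produced by iterdir() carry file type information gathered
    # while scanning the parent directory (on 3.14+).
    return {child.name: classify(child) for child in Path(directory).iterdir()}
```

Calling `classify_children('.')` returns a dictionary keyed by child name. On 3.14 the labels for plain files and directories can often be produced without extra `os.stat()` calls, although that depends on the platform and on what the directory scan reports.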
barneygale authored Feb 8, 2025
1 parent a1417b2 commit 718ab66
Showing 10 changed files with 526 additions and 101 deletions.
85 changes: 85 additions & 0 deletions Doc/library/pathlib.rst
@@ -1177,6 +1177,38 @@ Querying file type and status
.. versionadded:: 3.5


.. attribute:: Path.info

A :class:`~pathlib.types.PathInfo` object that supports querying file type
information. The object exposes methods that cache their results, which can
help reduce the number of system calls needed when switching on file type.
For example::

>>> p = Path('src')
>>> if p.info.is_symlink():
... print('symlink')
... elif p.info.is_dir():
... print('directory')
... elif p.info.exists():
... print('something else')
... else:
... print('not found')
...
directory

If the path was generated from :meth:`Path.iterdir` then this attribute is
initialized with some information about the file type gleaned from scanning
the parent directory. Merely accessing :attr:`Path.info` does not perform
any filesystem queries.

To fetch up-to-date information, it's best to call :meth:`Path.is_dir`,
:meth:`~Path.is_file` and :meth:`~Path.is_symlink` rather than methods of
this attribute. There is no way to reset the cache; instead you can create
a new path object with an empty info cache via ``p = Path(p)``.

.. versionadded:: 3.14


Reading and writing files
^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -1903,3 +1935,56 @@ Below is a table mapping various :mod:`os` functions to their corresponding
.. [4] :func:`os.walk` always follows symlinks when categorizing paths into
*dirnames* and *filenames*, whereas :meth:`Path.walk` categorizes all
symlinks into *filenames* when *follow_symlinks* is false (the default.)
Protocols
---------

.. module:: pathlib.types
:synopsis: pathlib types for static type checking


The :mod:`pathlib.types` module provides types for static type checking.

.. versionadded:: 3.14


.. class:: PathInfo()

A :class:`typing.Protocol` describing the
:attr:`Path.info <pathlib.Path.info>` attribute. Implementations may
return cached results from their methods.

.. method:: exists(*, follow_symlinks=True)

Return ``True`` if the path is an existing file or directory, or any
other kind of file; return ``False`` if the path doesn't exist.

If *follow_symlinks* is ``False``, return ``True`` for symlinks without
checking if their targets exist.

.. method:: is_dir(*, follow_symlinks=True)

Return ``True`` if the path is a directory, or a symbolic link pointing
to a directory; return ``False`` if the path is (or points to) any other
kind of file, or if it doesn't exist.

If *follow_symlinks* is ``False``, return ``True`` only if the path
is a directory (without following symlinks); return ``False`` if the
path is any other kind of file, or if it doesn't exist.

.. method:: is_file(*, follow_symlinks=True)

Return ``True`` if the path is a file, or a symbolic link pointing to
a file; return ``False`` if the path is (or points to) a directory or
other non-file, or if it doesn't exist.

If *follow_symlinks* is ``False``, return ``True`` only if the path
is a file (without following symlinks); return ``False`` if the path
is a directory or other non-file, or if it doesn't exist.

.. method:: is_symlink()

Return ``True`` if the path is a symbolic link (even if broken); return
``False`` if the path is a directory or any kind of file, or if it
doesn't exist.
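Because `PathInfo` is a protocol, any object with these four methods satisfies it. The following is a rough sketch of one possible implementation that caches `os.stat()` results. The class name `StatCachingInfo` and its internals are hypothetical and are not the class pathlib uses internally:

```python
import os
import stat


class StatCachingInfo:
    """Hypothetical PathInfo-style object that caches os.stat() results."""

    def __init__(self, path):
        self._path = os.fspath(path)
        # Maps the follow_symlinks flag to an os.stat_result, or to None
        # when the stat call failed (for example, the path is missing).
        self._cache = {}

    def _mode(self, follow_symlinks):
        if follow_symlinks not in self._cache:
            try:
                result = os.stat(self._path, follow_symlinks=follow_symlinks)
            except (OSError, ValueError):
                result = None
            self._cache[follow_symlinks] = result
        result = self._cache[follow_symlinks]
        return None if result is None else result.st_mode

    def exists(self, *, follow_symlinks=True):
        return self._mode(follow_symlinks) is not None

    def is_dir(self, *, follow_symlinks=True):
        mode = self._mode(follow_symlinks)
        return mode is not None and stat.S_ISDIR(mode)

    def is_file(self, *, follow_symlinks=True):
        mode = self._mode(follow_symlinks)
        return mode is not None and stat.S_ISREG(mode)

    def is_symlink(self):
        # A symlink can only be detected without following it.
        mode = self._mode(False)
        return mode is not None and stat.S_ISLNK(mode)
```

As with `Path.info`, answers from this sketch can go stale: once a result is cached, later filesystem changes are not observed, so a fresh object is needed to re-query.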
9 changes: 9 additions & 0 deletions Doc/whatsnew/3.14.rst
@@ -617,6 +617,15 @@ pathlib

(Contributed by Barney Gale in :gh:`73991`.)

* Add :attr:`pathlib.Path.info` attribute, which stores an object
implementing the :class:`pathlib.types.PathInfo` protocol (also new). The
object supports querying the file type and internally caching
:func:`~os.stat` results. Path objects generated by
:meth:`~pathlib.Path.iterdir` are initialized with file type information
gleaned from scanning the parent directory.

(Contributed by Barney Gale in :gh:`125413`.)


pdb
---
47 changes: 30 additions & 17 deletions Lib/glob.py
@@ -348,7 +348,7 @@ def lexists(path):

@staticmethod
def scandir(path):
"""Implements os.scandir().
"""Like os.scandir(), but generates (entry, name, path) tuples.
"""
raise NotImplementedError

@@ -425,23 +425,18 @@ def wildcard_selector(self, part, parts):

def select_wildcard(path, exists=False):
try:
# We must close the scandir() object before proceeding to
# avoid exhausting file descriptors when globbing deep trees.
with self.scandir(path) as scandir_it:
entries = list(scandir_it)
entries = self.scandir(path)
except OSError:
pass
else:
prefix = self.add_slash(path)
for entry in entries:
if match is None or match(entry.name):
for entry, entry_name, entry_path in entries:
if match is None or match(entry_name):
if dir_only:
try:
if not entry.is_dir():
continue
except OSError:
continue
entry_path = self.concat_path(prefix, entry.name)
if dir_only:
yield from select_next(entry_path, exists=True)
else:
@@ -483,15 +478,11 @@ def select_recursive(path, exists=False):
def select_recursive_step(stack, match_pos):
path = stack.pop()
try:
# We must close the scandir() object before proceeding to
# avoid exhausting file descriptors when globbing deep trees.
with self.scandir(path) as scandir_it:
entries = list(scandir_it)
entries = self.scandir(path)
except OSError:
pass
else:
prefix = self.add_slash(path)
for entry in entries:
for entry, _entry_name, entry_path in entries:
is_dir = False
try:
if entry.is_dir(follow_symlinks=follow_symlinks):
@@ -500,7 +491,6 @@ def select_recursive_step(stack, match_pos):
pass

if is_dir or not dir_only:
entry_path = self.concat_path(prefix, entry.name)
if match is None or match(str(entry_path), match_pos):
if dir_only:
yield from select_next(entry_path, exists=True)
@@ -528,9 +518,16 @@ class _StringGlobber(_GlobberBase):
"""Provides shell-style pattern matching and globbing for string paths.
"""
lexists = staticmethod(os.path.lexists)
scandir = staticmethod(os.scandir)
concat_path = operator.add

@staticmethod
def scandir(path):
# We must close the scandir() object before proceeding to
# avoid exhausting file descriptors when globbing deep trees.
with os.scandir(path) as scandir_it:
entries = list(scandir_it)
return ((entry, entry.name, entry.path) for entry in entries)

if os.name == 'nt':
@staticmethod
def add_slash(pathname):
@@ -544,3 +541,19 @@ def add_slash(pathname):
if not pathname or pathname[-1] == '/':
return pathname
return f'{pathname}/'


class _PathGlobber(_GlobberBase):
"""Provides shell-style pattern matching and globbing for pathlib paths.
"""

lexists = operator.methodcaller('exists', follow_symlinks=False)
add_slash = operator.methodcaller('joinpath', '')

@staticmethod
def scandir(path):
return ((child.info, child.name, child) for child in path.iterdir())

@staticmethod
def concat_path(path, text):
return path.with_segments(str(path) + text)
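The new `scandir()` contract, yielding `(entry, name, path)` tuples, can be illustrated outside the globber classes. This is a simplified sketch for string paths only. The function names `scan_tuples` and `subdirectories` are hypothetical, and the sketch returns a list rather than the generator used in the change:

```python
import os


def scan_tuples(path):
    """Return (entry, name, path) tuples for the children of *path*."""
    # Drain and close the scandir() iterator before returning, so the
    # directory's file descriptor is not held open while callers recurse
    # into deep trees.
    with os.scandir(path) as scandir_it:
        entries = list(scandir_it)
    return [(entry, entry.name, entry.path) for entry in entries]


def subdirectories(path):
    """Return the sorted paths of the immediate subdirectories of *path*."""
    result = []
    for entry, _name, entry_path in scan_tuples(path):
        try:
            if entry.is_dir():
                result.append(entry_path)
        except OSError:
            # Treat entries whose type cannot be determined as non-directories.
            continue
    return sorted(result)
```

Supplying the child path in the tuple means the consumer no longer has to build it from a prefix and `entry.name`, which appears to be why the `concat_path(prefix, entry.name)` calls are dropped from the selector functions in this change.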
73 changes: 29 additions & 44 deletions Lib/pathlib/_abc.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,10 +13,9 @@

import functools
import io
import operator
import posixpath
from errno import EINVAL
from glob import _GlobberBase, _no_recurse_symlinks
from glob import _PathGlobber, _no_recurse_symlinks
from pathlib._os import copyfileobj


@@ -76,21 +75,6 @@ def magic_open(path, mode='r', buffering=-1, encoding=None, errors=None,
raise TypeError(f"{cls.__name__} can't be opened with mode {mode!r}")


class PathGlobber(_GlobberBase):
"""
Class providing shell-style globbing for path objects.
"""

lexists = operator.methodcaller('exists', follow_symlinks=False)
add_slash = operator.methodcaller('joinpath', '')
scandir = operator.methodcaller('_scandir')

@staticmethod
def concat_path(path, text):
"""Appends text to the given path."""
return path.with_segments(str(path) + text)


class CopyReader:
"""
Class that implements the "read" part of copying between path objects.
@@ -367,7 +351,7 @@ def full_match(self, pattern, *, case_sensitive=None):
pattern = self.with_segments(pattern)
if case_sensitive is None:
case_sensitive = _is_case_sensitive(self.parser)
globber = PathGlobber(pattern.parser.sep, case_sensitive, recursive=True)
globber = _PathGlobber(pattern.parser.sep, case_sensitive, recursive=True)
match = globber.compile(str(pattern))
return match(str(self)) is not None

@@ -388,33 +372,45 @@ class ReadablePath(JoinablePath):
"""
__slots__ = ()

@property
def info(self):
"""
A PathInfo object that exposes the file type and other file attributes
of this path.
"""
raise NotImplementedError

def exists(self, *, follow_symlinks=True):
"""
Whether this path exists.
This method normally follows symlinks; to check whether a symlink exists,
add the argument follow_symlinks=False.
"""
raise NotImplementedError
info = self.joinpath().info
return info.exists(follow_symlinks=follow_symlinks)

def is_dir(self, *, follow_symlinks=True):
"""
Whether this path is a directory.
"""
raise NotImplementedError
info = self.joinpath().info
return info.is_dir(follow_symlinks=follow_symlinks)

def is_file(self, *, follow_symlinks=True):
"""
Whether this path is a regular file (also True for symlinks pointing
to regular files).
"""
raise NotImplementedError
info = self.joinpath().info
return info.is_file(follow_symlinks=follow_symlinks)

def is_symlink(self):
"""
Whether this path is a symbolic link.
"""
raise NotImplementedError
info = self.joinpath().info
return info.is_symlink()

def __open_rb__(self, buffering=-1):
"""
@@ -437,15 +433,6 @@ def read_text(self, encoding=None, errors=None, newline=None):
with magic_open(self, mode='r', encoding=encoding, errors=errors, newline=newline) as f:
return f.read()

def _scandir(self):
"""Yield os.DirEntry-like objects of the directory contents.
The children are yielded in arbitrary order, and the
special entries '.' and '..' are not included.
"""
import contextlib
return contextlib.nullcontext(self.iterdir())

def iterdir(self):
"""Yield path objects of the directory contents.
@@ -471,7 +458,7 @@ def glob(self, pattern, *, case_sensitive=None, recurse_symlinks=True):
else:
case_pedantic = True
recursive = True if recurse_symlinks else _no_recurse_symlinks
globber = PathGlobber(self.parser.sep, case_sensitive, case_pedantic, recursive)
globber = _PathGlobber(self.parser.sep, case_sensitive, case_pedantic, recursive)
select = globber.selector(parts)
return select(self)

@@ -498,18 +485,16 @@ def walk(self, top_down=True, on_error=None, follow_symlinks=False):
if not top_down:
paths.append((path, dirnames, filenames))
try:
with path._scandir() as entries:
for entry in entries:
name = entry.name
try:
if entry.is_dir(follow_symlinks=follow_symlinks):
if not top_down:
paths.append(path.joinpath(name))
dirnames.append(name)
else:
filenames.append(name)
except OSError:
filenames.append(name)
for child in path.iterdir():
try:
if child.info.is_dir(follow_symlinks=follow_symlinks):
if not top_down:
paths.append(child)
dirnames.append(child.name)
else:
filenames.append(child.name)
except OSError:
filenames.append(child.name)
except OSError as error:
if on_error is not None:
on_error(error)