From 3136623cea579c5144310b4fc25248bc6baf7a50 Mon Sep 17 00:00:00 2001 From: David Peter Date: Tue, 11 Feb 2025 12:23:13 +0100 Subject: [PATCH 1/8] [red-knot] Document 'public type of undeclared symbols' behavior --- .../resources/mdtest/doc/README.md | 2 + .../doc/public_type_undeclared_symbols.md | 115 ++++++++++++++++++ 2 files changed, 117 insertions(+) create mode 100644 crates/red_knot_python_semantic/resources/mdtest/doc/README.md create mode 100644 crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md diff --git a/crates/red_knot_python_semantic/resources/mdtest/doc/README.md b/crates/red_knot_python_semantic/resources/mdtest/doc/README.md new file mode 100644 index 0000000000000..0168c8339c2e7 --- /dev/null +++ b/crates/red_knot_python_semantic/resources/mdtest/doc/README.md @@ -0,0 +1,2 @@ +This directory contains user-facing documentation, but also doubles as an extended test suite that +makes sure that our documentation stays up to date. diff --git a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md new file mode 100644 index 0000000000000..6ea6d163d6f48 --- /dev/null +++ b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md @@ -0,0 +1,115 @@ +# Public type of undeclared symbols + +## Summary + +One major deviation from the behavior of existing Python type checkers is our handling of 'public' +types for undeclared symbols. This is best illustrated with an example: + +```py +class Wrapper: + value = None + +wrapper = Wrapper() + +reveal_type(wrapper.value) # revealed: Unknown | None + +wrapper.value = 1 +``` + +Mypy and Pyright both infer a type of `None` for the type of `wrapper.value`. Consequently, both +tools raise an error when trying to assign `1` to `wrapper.value`. But there is nothing wrong with +this program. Raising an error here violates the [gradual guarantee] which states that *"Removing +type annotations (making the program more dynamic) should not result in additional static type +errors."*: If `value` were annotated with `int | None` here, Mypy and Pyright would not raise any +errors. + +By inferring `Unknown | None` instead, we allow arbitrary values to be assigned to `wrapper.value`. +This is a deliberate choice to prevent false positive errors on untyped code. + +More generally, we infer `Unknown | T_inferred` for undeclared symbols, where `T_inferred` is the +inferred type of the right-hand side of the assignment. This gradual type represents an *unknown* +fully-static type that is *at least as large as* `T_inferred`. It accurately describes our static +knowlege about this type. In the example above, we don't know what values `wrapper.value` could +possibly contain, but we *do know* that `None` is a possibility. This allows us to catch errors +where `wrapper.value` is used in a way that is incompatible with `None`: + +```py +def f(w: Wrapper) -> None: + # This is fine + x: int | None = w.value + + # This is not okay because `w.value` could be `None`. We therefore raise the following + # error: "Object of type `Unknown | None` is not assignable to `int`" + y: int = w.value +``` + +## False negatives + +In the first example, we demonstrated how our behavior prevents false positives. Howeer, it can also +prevent false negatives. The following example contains a bug, but Mypy and Pyright do not catch it. +To make this a bit more realistic, imagine that `OptionalInt` is important from an external, untyped +module: + +`optional_int.py`: + +```py +class OptionalInt: + value = 10 + +def reset(o): + o.value = None +``` + +It is then used like this: + +```py +from typing_extensions import assert_type +from optional_int import OptionalInt, reset + +o = OptionalInt() + +reset(o) # Oh no... + +# This assertion is incorrect, but Mypy and Pyright do not catch it. We raise the following +# error: "Actual type `Unknown | Literal[10]` is not the same as asserted type `int`" +assert_type(o.value, int) + +print(o.value // 2) # Runtime error! +``` + +To be fair, we only catch this due to the `assert_type` call. But the type of +`Unknown | Literal[10]` for `o.value` reflects more accurately what the possible values of `o.value` +are. + +## Stricter behavior + +Users can always opt in to stricter behavior by adding type annotations. For the `OptionalInt` +class, this would probably be: + +```py +class OptionalInt: + value: int | None = 10 + +# The following public type is now +# revealed: int | None +reveal_type(OptionalInt.value) +``` + +## What is meant by 'public' type? + +We refer to the 'public' type of a symbol if we assign a type to the *use* of a symbol from another +scope. *Within* the same scope, we use a narrower type of `T_inferred` for undeclared symbols, as +there is no way that the symbol could have been reassigned. For example: + +```py +class Wrapper: + value = None + + # Type as seen from the same scope: + reveal_type(value) # revealed: None + +# Type as seen from another scope: +reveal_type(Wrapper.value) # revealed: Unknown | None +``` + +[gradual guarantee]: https://typing.readthedocs.io/en/latest/spec/concepts.html#the-gradual-guarantee From beb44b8f36acc7a8f55766957064ac25ba7d7bfb Mon Sep 17 00:00:00 2001 From: David Peter Date: Tue, 11 Feb 2025 12:40:36 +0100 Subject: [PATCH 2/8] Wording and typos --- .../mdtest/doc/public_type_undeclared_symbols.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md index 6ea6d163d6f48..e42e48cd57c3c 100644 --- a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md +++ b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md @@ -17,10 +17,10 @@ wrapper.value = 1 ``` Mypy and Pyright both infer a type of `None` for the type of `wrapper.value`. Consequently, both -tools raise an error when trying to assign `1` to `wrapper.value`. But there is nothing wrong with -this program. Raising an error here violates the [gradual guarantee] which states that *"Removing +tools emit an error when trying to assign `1` to `wrapper.value`. But there is nothing wrong with +this program. Emitting an error here violates the [gradual guarantee] which states that *"Removing type annotations (making the program more dynamic) should not result in additional static type -errors."*: If `value` were annotated with `int | None` here, Mypy and Pyright would not raise any +errors."*: If `value` were annotated with `int | None` here, Mypy and Pyright would not emit any errors. By inferring `Unknown | None` instead, we allow arbitrary values to be assigned to `wrapper.value`. @@ -45,10 +45,10 @@ def f(w: Wrapper) -> None: ## False negatives -In the first example, we demonstrated how our behavior prevents false positives. Howeer, it can also -prevent false negatives. The following example contains a bug, but Mypy and Pyright do not catch it. -To make this a bit more realistic, imagine that `OptionalInt` is important from an external, untyped -module: +In the first example, we demonstrated how our behavior prevents false positives. However, it can +also prevent false negatives. The following example contains a bug, but Mypy and Pyright do not +catch it. To make this a bit more realistic, imagine that `OptionalInt` is imported from an +external, untyped module: `optional_int.py`: From 6f6b7105972bcd615575687edc4a045e690c6e5f Mon Sep 17 00:00:00 2001 From: David Peter Date: Tue, 11 Feb 2025 15:38:23 +0100 Subject: [PATCH 3/8] Improve description of public types --- .../mdtest/doc/public_type_undeclared_symbols.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md index e42e48cd57c3c..b9af27ee6e9a9 100644 --- a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md +++ b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md @@ -97,9 +97,12 @@ reveal_type(OptionalInt.value) ## What is meant by 'public' type? -We refer to the 'public' type of a symbol if we assign a type to the *use* of a symbol from another -scope. *Within* the same scope, we use a narrower type of `T_inferred` for undeclared symbols, as -there is no way that the symbol could have been reassigned. For example: +We apply different semantics depending on whether a symbol is accessed from the same scope in which +it was originally defined, or whether it is accessed from an external scope. External scopes will +see the symbol's "public type", which has been discussed above. But within the same scope the symbol +was defined in, we use a narrower type of `T_inferred` for undeclared symbols. This is because, from +the perspective of this scope, there is no way that the value of the symbol could have been +reassigned from external scopes. For example: ```py class Wrapper: From 008c600e36ea247a8664ff7a1447b58a9dfaf0e2 Mon Sep 17 00:00:00 2001 From: David Peter Date: Tue, 11 Feb 2025 22:59:08 +0100 Subject: [PATCH 4/8] Fix typo --- .../resources/mdtest/doc/public_type_undeclared_symbols.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md index b9af27ee6e9a9..db981b4cd66df 100644 --- a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md +++ b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md @@ -29,7 +29,7 @@ This is a deliberate choice to prevent false positive errors on untyped code. More generally, we infer `Unknown | T_inferred` for undeclared symbols, where `T_inferred` is the inferred type of the right-hand side of the assignment. This gradual type represents an *unknown* fully-static type that is *at least as large as* `T_inferred`. It accurately describes our static -knowlege about this type. In the example above, we don't know what values `wrapper.value` could +knowledge about this type. In the example above, we don't know what values `wrapper.value` could possibly contain, but we *do know* that `None` is a possibility. This allows us to catch errors where `wrapper.value` is used in a way that is incompatible with `None`: From f708bb025bb473e7b3211a01578e7085e3e47323 Mon Sep 17 00:00:00 2001 From: David Peter Date: Tue, 11 Feb 2025 23:00:34 +0100 Subject: [PATCH 5/8] =?UTF-8?q?Use=20chr(=E2=80=A6)=20function=20call=20to?= =?UTF-8?q?=20demonstrate=20the=20assignability=20error?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../mdtest/doc/public_type_undeclared_symbols.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md index db981b4cd66df..f332dc1c271f8 100644 --- a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md +++ b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md @@ -36,11 +36,11 @@ where `wrapper.value` is used in a way that is incompatible with `None`: ```py def f(w: Wrapper) -> None: # This is fine - x: int | None = w.value + v: int | None = w.value - # This is not okay because `w.value` could be `None`. We therefore raise the following - # error: "Object of type `Unknown | None` is not assignable to `int`" - y: int = w.value + # This function call is incorrect, because `w.value` could be `None`. We therefore emit the following + # error: "`Unknown | None` cannot be assigned to parameter 1 (`i`) of function `chr`; expected type `int`" + c = chr(w.value) ``` ## False negatives From e51d3773a89aa116df6a239ccc961946e3d2f0c0 Mon Sep 17 00:00:00 2001 From: David Peter Date: Tue, 11 Feb 2025 23:02:27 +0100 Subject: [PATCH 6/8] Rewrite the 'false negative' section --- .../doc/public_type_undeclared_symbols.md | 24 +++++++++---------- 1 file changed, 11 insertions(+), 13 deletions(-) diff --git a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md index f332dc1c271f8..d4942ecdc26f5 100644 --- a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md +++ b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md @@ -43,12 +43,12 @@ def f(w: Wrapper) -> None: c = chr(w.value) ``` -## False negatives +## Explicit lack of knowledge -In the first example, we demonstrated how our behavior prevents false positives. However, it can -also prevent false negatives. The following example contains a bug, but Mypy and Pyright do not -catch it. To make this a bit more realistic, imagine that `OptionalInt` is imported from an -external, untyped module: +The following example demonstrates how Mypy and Pyright's type inference of fully-static types in +these situations can lead to false-negatives, even though everything appears to be (statically) +typed. To make this a bit more realistic, imagine that `OptionalInt` is imported from an external, +untyped module: `optional_int.py`: @@ -63,23 +63,21 @@ def reset(o): It is then used like this: ```py -from typing_extensions import assert_type from optional_int import OptionalInt, reset o = OptionalInt() - reset(o) # Oh no... -# This assertion is incorrect, but Mypy and Pyright do not catch it. We raise the following -# error: "Actual type `Unknown | Literal[10]` is not the same as asserted type `int`" -assert_type(o.value, int) +# Mypy and Pyright infer a fully-static type of `int` here, which appears to make the +# subsequent division operation safe -- but it is not. We infer the following type: +reveal_type(o.value) # revealed: Unknown | Literal[10] print(o.value // 2) # Runtime error! ``` -To be fair, we only catch this due to the `assert_type` call. But the type of -`Unknown | Literal[10]` for `o.value` reflects more accurately what the possible values of `o.value` -are. +We do not catch this mistake either, but we accurately reflect our lack of knowledge about +`o.value`. Together with a possible future type-checker mode that would detect the prevalance of +dynamic types, this could help developers catch such mistakes. ## Stricter behavior From fdde684ff2fa3d9199bec3fdc7e17113b702b372 Mon Sep 17 00:00:00 2001 From: David Peter Date: Tue, 11 Feb 2025 23:02:35 +0100 Subject: [PATCH 7/8] Add incompatible-assignment demonstration --- .../mdtest/doc/public_type_undeclared_symbols.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md index d4942ecdc26f5..29b53cc805234 100644 --- a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md +++ b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md @@ -88,9 +88,15 @@ class, this would probably be: class OptionalInt: value: int | None = 10 +o = OptionalInt() + # The following public type is now # revealed: int | None -reveal_type(OptionalInt.value) +reveal_type(o.value) + +# Incompatible assignments are now caught: +# error: "Object of type `Literal["a"]` is not assignable to attribute `value` of type `int | None`" +o.value = "a" ``` ## What is meant by 'public' type? From b8c0b5a77ef5b6c37121868b5513d27681d0b8bf Mon Sep 17 00:00:00 2001 From: David Peter Date: Wed, 12 Feb 2025 08:42:21 +0100 Subject: [PATCH 8/8] Use custom accepts_int function instead of builtin chr --- .../mdtest/doc/public_type_undeclared_symbols.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md index 29b53cc805234..c2f589619937f 100644 --- a/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md +++ b/crates/red_knot_python_semantic/resources/mdtest/doc/public_type_undeclared_symbols.md @@ -34,13 +34,16 @@ possibly contain, but we *do know* that `None` is a possibility. This allows us where `wrapper.value` is used in a way that is incompatible with `None`: ```py +def accepts_int(i: int) -> None: + pass + def f(w: Wrapper) -> None: # This is fine v: int | None = w.value # This function call is incorrect, because `w.value` could be `None`. We therefore emit the following - # error: "`Unknown | None` cannot be assigned to parameter 1 (`i`) of function `chr`; expected type `int`" - c = chr(w.value) + # error: "`Unknown | None` cannot be assigned to parameter 1 (`i`) of function `accepts_int`; expected type `int`" + c = accepts_int(w.value) ``` ## Explicit lack of knowledge @@ -76,7 +79,7 @@ print(o.value // 2) # Runtime error! ``` We do not catch this mistake either, but we accurately reflect our lack of knowledge about -`o.value`. Together with a possible future type-checker mode that would detect the prevalance of +`o.value`. Together with a possible future type-checker mode that would detect the prevalence of dynamic types, this could help developers catch such mistakes. ## Stricter behavior