Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in Generated SQLAlchemy Model for CHAR Column #285

Open
2 tasks done
hyoj0942 opened this issue Sep 21, 2023 · 15 comments · May be fixed by #286
Open
2 tasks done

Error in Generated SQLAlchemy Model for CHAR Column #285

hyoj0942 opened this issue Sep 21, 2023 · 15 comments · May be fixed by #286
Assignees
Labels
Milestone

Comments

@hyoj0942
Copy link

hyoj0942 commented Sep 21, 2023

Things to check first

  • I have searched the existing issues and didn't find my bug already reported there

  • I have checked that my bug is still present in the latest release

Sqlacodegen version

2.3.0.post1

SQLAlchemy version

2.0.15

RDBMS vendor

MySQL (or compatible)

What happened?

Summary

I encountered an issue when using the sqlacodegen library to generate SQLAlchemy models from my database schema. The generated code for a CHAR column contains an error that causes a TypeError during application execution.

Steps to Reproduce

  1. Use the sqlacodegen library to generate SQLAlchemy models from a database schema that includes a CHAR column with specific collation settings.
  2. Attempt to use the generated SQLAlchemy model in an application.
  3. Observe the TypeError with the following traceback:
[ERROR] TypeError: init() takes from 1 to 2 positional arguments but 3 were given
Traceback (most recent call last):
...
...
File "models.py", line XX, in <module>
result_code = Column(CHAR(1, "utf8mb3_bin"), nullable=False)

Expected Behavior

I expected the generated SQLAlchemy model to correctly represent the CHAR column with the specified collation settings and not produce a TypeError during application execution.

Actual Behavior

The generated code for the CHAR column includes incorrect syntax, resulting in a TypeError when the model is used in the application.

How I solved

Instead of entering the arguments in order in the CHAR method, the collation keyword was defined and used.

result_code = Column(CHAR(1, collation="utf8mb3_bin"), nullable=False)

Environment

  • SQLAlchemy Version: 2.0.15
  • sqlacodegen Version: 2.3.0.post1

Additional Information

I believe the issue may be related to how sqlacodegen handles collation settings for CHAR columns. The generated code attempts to pass three arguments to the Column constructor instead of the expected two, which leads to the TypeError. Manually adjusting the generated code to specify the collation settings separately resolves the issue.

Database schema for reproducing the bug

Environment

  • MySQL Version: 8.0.34
  • Infra: AWS RDS
CREATE TABLE `TableNameForIssue` (
  `id` int NOT NULL AUTO_INCREMENT,
  `plate_number` varchar(100) COLLATE utf8mb3_bin NOT NULL,
  `result_code` char(1) COLLATE utf8mb3_bin NOT NULL,
  `result_msg` varchar(100) COLLATE utf8mb3_bin DEFAULT NULL,
  `createdAt` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `updatedAt` datetime DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `TableNameForIssue_createdAt_IDX` (`createdAt`) USING BTREE,
  KEY `TableNameForIssue_plate_number_IDX` (`plate_number`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_bin COMMENT='comment'
@hyoj0942 hyoj0942 added the bug label Sep 21, 2023
@agronholm
Copy link
Owner

Have you tried with the latest 3.0 release candidate?

@hyoj0942
Copy link
Author

hyoj0942 commented Sep 21, 2023

Yes, I just tested it with version 3.0.0rc3 and it's the same

Note: Part of SQLAlchemy

# dialects\mysql\types.py
...
class CHAR(_StringType, sqltypes.CHAR):
    """MySQL CHAR type, for fixed-length character data."""

    __visit_name__ = "CHAR"

    def __init__(self, length=None, **kwargs):
        """Construct a CHAR.

        :param length: Maximum data length, in characters.

        :param binary: Optional, use the default binary collation for the
          national character set.  This does not affect the type of data
          stored, use a BINARY type for binary data.

        :param collation: Optional, request a particular collation.  Must be
          compatible with the national character set.

        """
        super(CHAR, self).__init__(length=length, **kwargs)
...

hyoj0942 added a commit to hyoj0942/sqlacodegen that referenced this issue Sep 27, 2023
@sheinbergon
Copy link
Collaborator

@agronholm this looks important. Maybe not for this milestone, but for the one after that

@sheinbergon sheinbergon added this to the 3.1 milestone Feb 22, 2025
@agronholm
Copy link
Owner

I can take a stab at this tomorrow if I have time.

@sheinbergon
Copy link
Collaborator

I can take a stab at this tomorrow if I have time.

I can also have a look. Just wanted to make sure it doesn't get overlooked. I marked it for milestone 3.1.

@sheinbergon sheinbergon modified the milestones: 3.1, 3.0 Feb 28, 2025
@sheinbergon sheinbergon self-assigned this Feb 28, 2025
@sheinbergon
Copy link
Collaborator

@agronholm looking at this issue I can report that:

  • No codegen errors
  • No runtime errors
  • No duplicate imports

The only remaining issue here, is the collation and charset data, while being reflected. is not part of the rendered column types. This is due to the fact the mysql.CHAR passes collation using kwargs, so signature identification taking place here cannot pickup on them. It does detect the kw args, but that's about all.

If I were to add some parent type signature parsing, it'd probably solve this issue .wdyt?

@agronholm
Copy link
Owner

Finding the correct kwargs to add is a massive PITA. Does one of its parent classes take such keyword arguments?

@agronholm
Copy link
Owner

To answer my own question, yes. But it takes plenty of other keyword arguments too. It seems just like I feared - that there is no way to automatically determine the correct keyword arguments to render.

@sheinbergon
Copy link
Collaborator

sheinbergon commented Mar 2, 2025

In this case - the type hierarchy is mysql.CHAR -> sqltypes.CHAR -> sqltypes.String
I think it should not be too hard to implement a logic that, when encountering kwargs parse named arguments all the way to the simplest base type and then try and extract them from the passed kwargs dict. Do you see a scenario where this approach might break?

@agronholm
Copy link
Owner

In this case - the type hierarchy is mysql.CHAR -> sqltypes.CHAR -> sqltypes.String I think it should not be too hard to implement a logic that, when encountering kwargs parse named arguments all the way to the simplest base type and then try and extract them from the passed kwargs dict. Do you see a scenario where this approach might break?

What passed kwargs dict? Where do we get that?

@sheinbergon
Copy link
Collaborator

The signature and parameters extracted from a mysql.CHAR instance. Not sure I understand your question

@agronholm
Copy link
Owner

I'm asking how we can reliably get the values passed to the class initializer. The classes are in no way obligated to retain attributes with the same names.

@sheinbergon
Copy link
Collaborator

sheinbergon commented Mar 3, 2025

We're already getting the reflected type properly from SQLAlchemy. Meaning, the values it holds already rely on how implementation classes declare constructor field names and pass them up the type hierarchy. All we need to do is follow hierarchy constructor field trail in a best effort manner in order to make sure types are being rendered following the same principal. I believe it should work

@agronholm
Copy link
Owner

We're already getting the reflected type properly from SQLAlchemy. Meaning, the values it holds already rely on how implementation classes declare constructor field names and pass them up the type hierarchy. All we need to do is follow hierarchy constructor field trail in a best effort manner in order to make sure types are being rendered following the same principal. I believe it should work

OK so in other words, follow the MRO, get initializer signatures, extract argument names and their default values, check for attributes with matching names whose values differ from the defaults?

@sheinbergon
Copy link
Collaborator

Pretty much so

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants