Type error on field #21

Open
lukevers opened this issue Jul 29, 2024 · 3 comments
@lukevers

Hey,

I was testing this out and ran into an issue. I have a formula field in Airtable that returns a number. Validation in the target fails because the value is not a string (and was not cast to one):

[Screenshot 2024-07-29 at 2 29 43 PM]

Here are my logs:

root@docker-desktop:/project# meltano run tap-airtable target-jsonl
2024-07-29T18:25:46.303044Z [info     ] Environment 'dev' is active   
2024-07-29T18:25:47.103235Z [warning  ] No state was found, complete import.
2024-07-29T18:25:49.473064Z [info     ] 2024-07-29 18:25:49,472 | INFO     | tap-airtable.active_journeys | Beginning full_table sync of 'active_journeys'... cmd_type=elb consumer=False name=tap-airtable producer=True stdio=stderr string_id=tap-airtable
2024-07-29T18:25:49.473868Z [info     ] 2024-07-29 18:25:49,472 | INFO     | tap-airtable.active_journeys | Tap has custom mapper. Using 1 provided map(s). cmd_type=elb consumer=False name=tap-airtable producer=True stdio=stderr string_id=tap-airtable
2024-07-29T18:25:49.915153Z [info     ] Traceback (most recent call last): cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.915836Z [info     ]   File "/project/.meltano/loaders/target-jsonl/venv/bin/target-jsonl", line 8, in <module> cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.916610Z [info     ]     sys.exit(main())           cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.917122Z [info     ]   File "/project/.meltano/loaders/target-jsonl/venv/lib/python3.10/site-packages/target_jsonl.py", line 92, in main cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.917589Z [info     ]     state = persist_messages(  cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.917983Z [info     ]   File "/project/.meltano/loaders/target-jsonl/venv/lib/python3.10/site-packages/target_jsonl.py", line 54, in persist_messages cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.919601Z [info     ]     validators[o['stream']].validate((o['record'])) cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.920150Z [info     ]   File "/project/.meltano/loaders/target-jsonl/venv/lib/python3.10/site-packages/jsonschema/validators.py", line 130, in validate cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.920646Z [info     ]     raise error                cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.921338Z [info     ] jsonschema.exceptions.ValidationError: 6 is not of type 'string', 'null' cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.921995Z [info     ]                                cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.922677Z [info     ] Failed validating 'type' in schema['properties']['days_idle']: cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.923281Z [info     ]     {'type': ['string', 'null']} cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.923736Z [info     ]                                cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.924801Z [info     ] On instance['days_idle']:      cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.925646Z [info     ]     6                          cmd_type=elb consumer=True name=target-jsonl producer=False stdio=stderr string_id=target-jsonl
2024-07-29T18:25:49.949321Z [error    ] Loader failed                 
2024-07-29T18:25:49.949836Z [error    ] Block run completed.           block_type=ExtractLoadBlocks err=RunnerError('Loader failed') exit_codes={<PluginType.LOADERS: 'loaders'>: 1} set_number=0 success=False
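
To make the failure in the log concrete, here is a minimal stand-in for the JSON Schema `type` check. It is a simplified sketch written for illustration, not the jsonschema library itself, and `json_type_names` and `matches_type` are hypothetical helper names. It shows why the value 6 is rejected by `['string', 'null']` and would be accepted once `'number'` is allowed.

```python
def json_type_names(value):
    """Return the JSON Schema type names that describe a Python value."""
    if value is None:
        return {"null"}
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"boolean"}
    if isinstance(value, int):
        return {"integer", "number"}
    if isinstance(value, float):
        return {"number"}
    if isinstance(value, str):
        return {"string"}
    if isinstance(value, list):
        return {"array"}
    if isinstance(value, dict):
        return {"object"}
    return set()


def matches_type(value, allowed):
    """True if the value has at least one of the allowed JSON Schema types."""
    return bool(json_type_names(value) & set(allowed))


days_idle = 6  # the value from the traceback
print(matches_type(days_idle, ["string", "null"]))            # False
print(matches_type(days_idle, ["string", "number", "null"]))  # True
```

So the tap declared `days_idle` as a nullable string while the record carried a number, and the target's validator rejected the record.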
@lukevers
Author

lukevers commented Jul 29, 2024

So I was able to get through that specific error by making some changes to the types:

AirtableOneOfType = th.OneOf(
    th.StringType,
    th.NumberType,
    th.BooleanType,
    th.DateTimeType,
    th.DateType,
)

AirtableAnyType = th.OneOf(
    AirtableOneOfType,
    th.ArrayType(AirtableOneOfType),
)

Then in AIRTABLE_TO_SINGER_MAPPING I updated:

    "formula": AirtableAnyType,

I then had the same error on "lookup" so did that too:

    "lookup": AirtableAnyType,

Still having some problems, but getting somewhere.

If you take a look at the TypeScript types for fields, they're pretty chaotic. In Airtable, a lot of these fields can hold things other than strings (in my case, formulas and lookups were actually numbers):

export interface FieldSet {
    [key: string]: undefined | string | number | boolean | Collaborator | ReadonlyArray<Collaborator> | ReadonlyArray<string> | ReadonlyArray<Attachment>;
}

I think the solution here is either:

  1. Convert field types to the same or a similar structure as their TypeScript SDK types (there might be a Python one; I haven't looked)
  2. Cast these values to strings and keep them as strings

Or something else haha.
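
As a rough illustration of option 2, here is a minimal sketch of casting selected fields to strings before a record is emitted. `stringify_fields` is a hypothetical helper, not something the tap provides, and using JSON encoding for non-string values is an assumption made here so that lists and booleans stay parseable.

```python
import json


def stringify_fields(record, field_names):
    """Return a copy of `record` with the named fields cast to strings.

    None is left as None (so a nullable string schema still validates),
    existing strings are kept as-is, and everything else is JSON-encoded.
    """
    result = dict(record)
    for name in field_names:
        value = result.get(name)
        if value is not None and not isinstance(value, str):
            result[name] = json.dumps(value)
    return result


record = {"name": "Journey A", "days_idle": 6, "tags": [1, 2], "notes": None}
print(stringify_fields(record, ["days_idle", "tags", "notes"]))
```

The trade-off is that downstream consumers lose the numeric type and would have to parse the strings back.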

@lukevers
Author

I ended up making a few more changes to get things working on my end.

  1. In the config schema I added a new field for excluding specific base -> table -> column fields:
th.Property(
    "exclude",
    th.ObjectType(
        additional_properties=th.ObjectType(
            additional_properties=th.ArrayType(th.StringType)
        )
    ),
    description="Exclude fields from specific tables in bases",
    required=False,
)

In the config it looks like this (field names are the slugified versions of the column names):

config:
  exclude:
    base_id:
      table_id:
        - field_name1
        - field_name2
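
A minimal sketch of how such an `exclude` mapping could be applied to a record, assuming the nested base -> table -> field-list shape shown in the config. `drop_excluded_fields` is a hypothetical helper for illustration and is not the tap's actual implementation.

```python
def drop_excluded_fields(record, exclude, base_id, table_id):
    """Return a copy of `record` without the fields excluded for this base/table.

    `exclude` has the shape {base_id: {table_id: [field_name, ...]}}.
    Missing bases or tables simply mean nothing is excluded.
    """
    excluded = set(exclude.get(base_id, {}).get(table_id, []))
    return {key: value for key, value in record.items() if key not in excluded}


exclude = {"base_id": {"table_id": ["field_name1", "field_name2"]}}
record = {"id": "rec1", "field_name1": 6, "field_name3": "kept"}

print(drop_excluded_fields(record, exclude, "base_id", "table_id"))
# {'id': 'rec1', 'field_name3': 'kept'}
```

The same set of excluded names would also need to be removed from the stream's schema so that schema and records stay in sync.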

I then had to keep making changes in types.py. These are the updates I ended up making:

AirtableCollaborator = th.ObjectType(
    th.Property("id", th.StringType),
    th.Property("email", th.StringType),
    th.Property("name", th.StringType),
    th.Property("permissionLevel", th.StringType),
    th.Property("profilePicUrl", th.StringType),
)

AirtableButtonType = th.ObjectType(
    th.Property("label", th.StringType),
    th.Property("url", th.StringType),
)

AirtableOneOfType = th.OneOf(
    th.StringType,
    th.NumberType,
    th.BooleanType,
    th.DateTimeType,
    th.DateType,
    th.ArrayType(th.StringType),
    th.ArrayType(th.NumberType),
    th.ArrayType(th.BooleanType),
    th.ArrayType(th.DateTimeType),
    th.ArrayType(th.DateType),
)

AirtableAnyType = th.OneOf(
    AirtableOneOfType,
    th.ArrayType(AirtableOneOfType),
)

AIRTABLE_TO_SINGER_MAPPING: dict[str, Any] = {
    "singleLineText": th.StringType,
    "email": th.StringType,
    "url": th.StringType,
    "multilineText": th.StringType,
    "number": th.NumberType,
    "percent": th.OneOf(th.StringType, th.NumberType),
    "currency": th.OneOf(th.StringType, th.NumberType),
    "singleSelect": th.StringType,
    "multipleSelects": th.ArrayType(th.StringType),
    "singleCollaborator": AirtableCollaborator,
    "multipleCollaborators": th.ArrayType(AirtableCollaborator),
    "multipleRecordLinks": th.ArrayType(AirtableAnyType),
    "date": th.DateType,
    "dateTime": th.DateTimeType,
    "phoneNumber": th.StringType,
    "multipleAttachments": th.ArrayType(AirtableAttachment),
    "checkbox": th.BooleanType,
    "formula": AirtableAnyType,
    "createdTime": th.DateTimeType,
    "rollup": AirtableAnyType,
    "count": AirtableAnyType,
    "lookup": AirtableAnyType,
    "multipleLookupValues": th.ArrayType(AirtableOneOfType),
    "autoNumber": th.OneOf(th.StringType, th.NumberType),
    "barcode": th.StringType,
    "rating": th.StringType,
    "richText": th.StringType,
    "duration": th.StringType,
    "lastModifiedTime": th.DateTimeType,
    "button": AirtableButtonType,
    "createdBy": AirtableCollaborator,
    "lastModifiedBy": th.StringType,
    "externalSyncSource": th.StringType,
    "aiText": th.StringType,
}

I know this is quite a few changes, so I won't open a PR right now. Happy to open one if anyone else runs into these problems.

@tomasvotava
Owner

Hey @lukevers, thanks a lot for posting this!

TBH I haven't tried anything with formulas yet, so it's only natural that it doesn't work out of the box, and I'm glad you found this issue. Since, if I understand correctly, a formula can output almost anything, would it be sufficient to type formulas as the Any type in JSON Schema?
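
For illustration, here is a minimal sketch of what "Any" could mean at the JSON Schema level: the empty schema `{}` places no constraints, so any value validates against it. The mapping below is a hypothetical, trimmed-down stand-in written as plain dicts, not the tap's real `AIRTABLE_TO_SINGER_MAPPING`, and `build_properties` is an assumed helper name.

```python
import copy

# Hypothetical subset of Airtable field types mapped to JSON Schema fragments.
FIELD_TYPE_TO_SCHEMA = {
    "singleLineText": {"type": ["string", "null"]},
    "number": {"type": ["number", "null"]},
    "checkbox": {"type": ["boolean", "null"]},
    # Empty schema: no constraints, so strings, numbers, arrays, etc. all pass.
    "formula": {},
    "lookup": {},
}


def build_properties(fields):
    """Map {field_name: airtable_type} to a JSON Schema 'properties' dict.

    Unknown field types fall back to the empty (accept-anything) schema.
    """
    return {
        name: copy.deepcopy(FIELD_TYPE_TO_SCHEMA.get(field_type, {}))
        for name, field_type in fields.items()
    }


print(build_properties({"name": "singleLineText", "days_idle": "formula"}))
# {'name': {'type': ['string', 'null']}, 'days_idle': {}}
```

The trade-off is that targets which derive column types from the schema may get no type hint for these fields.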
