Encoding helpers #52
Hm. I am just dealing with ISO 8859-1 encoding in my files. But yes, the specification for all of them is in the header.
ok. AFAIU, it would suffice to read the 4 bytes of the interchange and decide which encoding to take, and then, read the rest of the stream in that encoding.
I have files starting with "UNB+ANSI:1+ME123456" - mostly without an UNA header, and none of those UNO[x] specifiers. An example is [in the test data files](https://github.com/nerdocs/pydifact/blob/master/tests/data/patient2.edi).
How to deal with that?
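One way to approach the "read the first bytes, then pick the encoding" idea is sketched below. This is only an illustration, not pydifact's API: `sniff_encoding` and `SYNTAX_ID_TO_CODEC` are hypothetical names, the mapping covers only a subset of syntax identifiers, and treating UNOY as UTF-8 is an assumption. Files without a known identifier (such as the `UNB+ANSI:1+...` ones) fall back to a caller-supplied default.

```python
import re

# Assumed mapping from EDIFACT syntax identifiers to Python codec names.
# Only a subset is listed; UNOY -> utf-8 is a common convention, not a guarantee.
SYNTAX_ID_TO_CODEC = {
    "UNOA": "ascii",
    "UNOB": "ascii",
    "UNOC": "iso-8859-1",
    "UNOD": "iso-8859-2",
    "UNOE": "iso-8859-5",
    "UNOF": "iso-8859-7",
    "UNOY": "utf-8",
}


def sniff_encoding(raw: bytes, default: str = "iso-8859-1") -> str:
    """Guess the codec from the 4-letter syntax identifier following 'UNB'."""
    head = raw[:64]
    # An optional UNA service string advice is 9 bytes: 'UNA' plus 6 characters.
    if head.startswith(b"UNA"):
        head = head[9:].lstrip(b"\r\n ")
    # The byte after 'UNB' is the data element separator, whatever it is.
    match = re.match(rb"UNB.([A-Z]{4})", head)
    if match is None:
        return default
    syntax_id = match.group(1).decode("ascii")
    # Unknown identifiers (e.g. 'ANSI') fall back to the default.
    return SYNTAX_ID_TO_CODEC.get(syntax_id, default)
```

The rest of the stream could then be decoded with `raw.decode(sniff_encoding(raw))`. The header itself is safe to inspect as bytes because all the listed charsets agree on the ASCII range.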
It seems legitimate to make an assumption (like UTF-8? Unless the standard has another, incompatible default?) and to offer a way to force a decoding charset. The forcing option would also be marginally useful for dealing with messages where the declared syntax identifier (e.g. UNOA) does not match the actual encoding.
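The precedence suggested here (a forced charset wins, then whatever the header declares, then a default) could look roughly like this. `decode_edifact` and its parameters are hypothetical names for illustration, and the UTF-8 default is just the assumption floated above, not something taken from the standard.

```python
from typing import Optional


def decode_edifact(
    raw: bytes,
    declared: Optional[str] = None,
    force: Optional[str] = None,
    default: str = "utf-8",
) -> str:
    """Decode raw bytes; precedence is force > declared > default."""
    codec = force or declared or default
    # Strict decoding: a mismatch raises UnicodeDecodeError instead of
    # silently producing garbage.
    return raw.decode(codec)
```

For example, a message that declares an ASCII-only syntax identifier but actually contains ISO 8859-1 bytes fails with `declared="ascii"` and decodes once the caller passes `force="iso-8859-1"`.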
I suspect that your example is ANSI X12 and not EDIFACT. So is it in the scope of pydifact anyway?
It is definitely part of what I want to cope with, because I need to deal with that kind of file... But I'm afraid this is EDIFACT. It's a file I got myself (I just changed names to pseudonymize them) - but here in medical systems, many companies don't care about standards...
I kinda struggle with EDIFACT encoding, but here is what I came up with:
data:
deserializing helper:
I wonder what pydifact could include in its scope in terms of:

- `Interchange.serialize_to_bytes()`
- a helper with automatic encoding selection based on the syntax identifier?

Any thoughts appreciated :-).
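To make the serializing side of the question concrete, here is a minimal sketch of what such a helper might do. It is not pydifact's API: the free function `serialize_to_bytes`, its `syntax_id` parameter, and the `CODEC_FOR_SYNTAX_ID` table are all assumptions for illustration, and the table is deliberately incomplete.

```python
# Hypothetical, partial table; UNOY -> utf-8 is an assumption.
CODEC_FOR_SYNTAX_ID = {
    "UNOA": "ascii",
    "UNOB": "ascii",
    "UNOC": "iso-8859-1",
    "UNOY": "utf-8",
}


def serialize_to_bytes(serialized: str, syntax_id: str) -> bytes:
    """Encode an already serialized interchange using the codec implied by syntax_id."""
    try:
        codec = CODEC_FOR_SYNTAX_ID[syntax_id]
    except KeyError:
        raise ValueError(
            f"no codec known for syntax identifier {syntax_id!r}"
        ) from None
    # Strict encoding makes content outside the declared charset fail loudly.
    return serialized.encode(codec, errors="strict")
```

With this shape, text containing `Ü` encodes fine under UNOC but raises `UnicodeEncodeError` under UNOA, which surfaces a mismatch between the declared identifier and the content at write time rather than at the receiver.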