
Model file export #4305

Open · wants to merge 11 commits into main

Conversation

omer-candan commented Jul 8, 2024

Write models to .mps files

We can export linear solver models and get a string as a result. However, when working with very large models, the process (.NET in my case) fails when the export method is called, due to the string type's size limitation. In our use case we don't actually need the exported model in memory; we write the string to a file. A method that writes directly to a file helps us overcome the limitation.
The file write method has since been added by @lperron, which simplifies this PR.
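A minimal self-contained sketch of the motivation (illustrative only; the section strings and file name are placeholders, not the OR-Tools exporter API): building the whole export in one string means the full text must fit in memory at once, while streaming each piece to a file keeps memory use bounded.

```cpp
#include <fstream>
#include <string>
#include <vector>

int main() {
  // Stand-ins for the pieces an MPS exporter emits (header, rows, columns, ...).
  const std::vector<std::string> sections = {"NAME model\n", "ROWS\n", "ENDATA\n"};

  // String-based export: the entire text lives in memory, and the .NET wrapper
  // additionally has to fit it inside a single string, which fails for very
  // large models.
  std::string as_string;
  for (const auto& s : sections) as_string += s;

  // File-based export: each piece is written as it is produced, so memory use
  // stays bounded no matter how large the model is.
  std::ofstream out("model.mps");
  for (const auto& s : sections) out << s;
  return 0;
}
```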

lperron (Collaborator) commented Jul 9, 2024

This is interesting, but I cannot use it as is: we cannot use iostream. The complex path is to implement gzip reading/writing in the File interface, which is a pain.

omer-candan (Author):

Thanks @lperron. My first attempt did not use iostream: I had to duplicate the ExportModelAsLpFormat, AppendComments, and AppendConstraint functions to write directly to .gz files. I guess that would also not be OK, because of too much duplication.
Did you mean something similar?

lperron (Collaborator) commented Jul 9, 2024

So, the best way to integrate this is to add a field to the File class (ortools/base/file.h) that indicates whether the file is a gzip one.
In that case, the relevant methods need a duplicated code path that uses the gzip methods instead of the C FILE* methods.

For instance, the MPS reader uses the FileLine class, which in turn calls the raw File::Read() API.

For the MPS writer, I would make sure that we flush the output string to the file regularly, and that this flush is done by the File API, which internally would use the gzip API.

Am I clear?
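A minimal sketch of this design, assuming zlib's gzFile API; the class name, the is_gzip flag, and the ".gz" suffix check are assumptions for illustration, not the actual ortools/base/file.h code. The point is that a caller such as an MPS writer flushing its buffer regularly only ever calls Write(), and the gzip-vs-plain branching stays inside the File abstraction.

```cpp
#include <cstdio>
#include <string>
#include <zlib.h>

class SketchFile {
 public:
  // Treat any path ending in ".gz" as a gzip file (assumption for this sketch).
  // Error handling (failed open, short writes) is omitted to keep it short.
  explicit SketchFile(const std::string& path)
      : is_gzip_(path.size() > 3 &&
                 path.compare(path.size() - 3, 3, ".gz") == 0) {
    if (is_gzip_) gz_ = gzopen(path.c_str(), "wb");
    else fp_ = std::fopen(path.c_str(), "wb");
  }

  // Single entry point for callers; internally branches to gzwrite() or fwrite().
  bool Write(const std::string& data) {
    if (is_gzip_) {
      return gzwrite(gz_, data.data(), static_cast<unsigned>(data.size())) ==
             static_cast<int>(data.size());
    }
    return std::fwrite(data.data(), 1, data.size(), fp_) == data.size();
  }

  void Close() {
    if (is_gzip_) gzclose(gz_);
    else std::fclose(fp_);
  }

 private:
  const bool is_gzip_;
  gzFile gz_ = nullptr;
  std::FILE* fp_ = nullptr;
};
```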

omer-candan (Author):

Yes, thanks again! I'll try my best to work on those parts.

lperron (Collaborator) commented Jul 12, 2024

Please sync with main; you will have a conflict.
I have implemented MpModelProtoExporter::WriteModelToMpsFile() and hooked it up to model_builder (Python, Java, .NET).
The implementation is not that robust to very large models, but it is better than before.

The only missing piece is gzip support in File (and hooking it up to linear_solver C# if you want it).

omer-candan (Author):

@lperron Thank you so much. I resolved the conflicts, called your function from linear_solver, and made gzip export optional via a bool parameter.
I had a difficult time with the compression part. Your method can write the 13 GB (1.2 GB compressed) model file from my tests in a single call, but that was not possible with the new gzip write (gzwrite): the compressed output files were cut off and usually could not be decompressed. Writing in multiple chunks solved the problem. Is this an acceptable approach? If so, I'd like your opinion on the chunk size and compression level (currently 1 = fastest) that I chose.
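A sketch of the chunked-write approach described above, assuming zlib's gzFile API; the 16 MiB chunk size and the level-1 "wb1" mode are examples of the kind of values being asked about, not the PR's actual choices. On typical platforms gzwrite() takes a 32-bit unsigned length, so a single call cannot describe a 13 GB buffer; looping over bounded chunks keeps every call in range.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <zlib.h>

// Writes `data` to a gzip file in bounded chunks. Returns false on any failure.
bool WriteStringToGzipFile(const std::string& path, const std::string& data) {
  gzFile gz = gzopen(path.c_str(), "wb1");  // "wb1" = write, compression level 1 (fastest).
  if (gz == nullptr) return false;

  constexpr std::size_t kChunkSize = 16 << 20;  // 16 MiB per gzwrite() call.
  for (std::size_t offset = 0; offset < data.size(); offset += kChunkSize) {
    const std::size_t len = std::min(kChunkSize, data.size() - offset);
    if (gzwrite(gz, data.data() + offset, static_cast<unsigned>(len)) !=
        static_cast<int>(len)) {
      gzclose(gz);
      return false;
    }
  }
  // gzclose() flushes the remaining compressed data; skipping it is one way
  // to end up with truncated, non-decompressable output.
  return gzclose(gz) == Z_OK;
}
```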

omer-candan marked this pull request as ready for review on August 21, 2024.