Code returns inconsistent results with CGAL 5.5 #3
I had segmentation faults on join when using my testing code (that was a very similar modified copy of the official test code) compiling against the development version of CGAL with the attached solution file. This is the smallest one I could obtain. The crash is fairly rare on small invalid solutions and becomes very common on instances with more than 10,000 vertices. The crash happens on the join function that fills a vector of polygons with holes, as in the official code. I then switched my code to use the Polygon_set_2 join method instead and the problem disappeared. |
Yes, this is about what we have seen with that version of CGAL. |
I do not see any reason why it would be a bug in our code. However, I could not find any pattern in the cases where it happens that would let me build a small example. It does happen on every execution of the problematic files, though. |
I just asked Efi and he suggested to try (UsePolylines=) |
Ah - but does he think that this behavior is working as intended or would they be interested in a bug report? |
With this I get a |
The following C++-code results in a segmentation fault with CGAL 5.5 cgal_problem.cpp.zip. It essentially converts the solution directly to a vector of cgal polygons without any further dependencies. |
I think that even on the version of CGAL used by pyutils23, the problem still exists, even if it is more rare. I'm attaching a solution that gives a segmentation fault, and I believe it is a valid solution. Using Polygon_set_2::join seems to solve the problem. |
That would be surprising: If I remember correctly, Efi told me that the old join is thoroughly tested. |
So, maybe the problem is somewhere else and deserves a separate issue. All I know is that when I test this solution with pyutils23, I consistently get a segmentation fault (could be anywhere). |
I would prefer it to be our fault, because then it is easier to fix. 😅 |
I have no idea how to debug this mixed python C++, or even how to compile everything with debug information. I think it'll be easier for you to test this file when you have the time. It's not urgent. I was just surprised that it gives a segmentation fault and it was the only file to do so out of several hundreds. When I tested it with a separate code by myself (that is only really meant to test coverage, not if the polygons stick outside for example), the file seems ok. So, I thought it was valuable information for the future. |
I cannot reproduce the segmentation fault locally. The instance/solution combination, for me, is verified correctly (i.e., claimed to be valid). |
That's surprising. I test it with:
|
Looks solid. Even under debug builds, it still works for me. There are, however, a few sanitizer warnings from the undefined behavior sanitizer that look like some sort of type confusion within CGAL/Surface_sweep_2/Arr_no_intersection_insertion_ss_visitor.h (or may be false positives, though I'm not sure what would cause such false positives). Are you absolutely sure that you built against the 5.3.2 version of CGAL, or is there even a slight chance that somehow, the newer headers ended up being the ones dragged in for some reason? |
I have no idea which version of CGAL is used. How can I find out? I simply ran pip install cgshop2023-pyutils as a normal user on a Fedora machine without any CGAL installed system-wide. |
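For a C++ test program that you compile yourself, one way to see which CGAL headers were actually picked up is to print the version macro from `<CGAL/version.h>`. This is only a minimal sketch: `cgal_header_version` is a hypothetical helper name, the `__has_include` guard is there so the file still compiles on a machine without CGAL, and it reports the headers seen at compile time, not what a prebuilt pip package was built against.

```cpp
#include <string>

// __has_include is standard since C++17; the guard lets this file compile
// even when no CGAL headers are on the include path.
#if __has_include(<CGAL/version.h>)
#  include <CGAL/version.h>
#endif

// Returns the version of the CGAL headers seen at compile time,
// or "not found" if no CGAL headers were on the include path.
// (cgal_header_version is a hypothetical helper name.)
inline std::string cgal_header_version() {
#ifdef CGAL_VERSION_STR
    return CGAL_VERSION_STR;
#else
    return "not found";
#endif
}
```

Printing the result of `cgal_header_version()` at the start of the test program would show whether system-wide headers or the intended ones were used for that build.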
Then it should be the exact same version we are using. I am worried: As far as I know, there are only two different algorithms used. One of them, we already know to be buggy. If there also is a bug in the other one, just calling it (indirectly) in another way won't be a reliable workaround. |
@gfonsecabr You can kinda do a debug build as follows: Pull the latest version of the repository. There is some minor change to the way cmake looks up find modules, which should make a CMake build of the C++ side easier. Create a build directory in the top-level directory of the repository, e.g., called 'cmake-build-debug' or so, and enter it. Execute 'conan install .. -s build_type=Debug --build=missing' in it. Then execute 'cmake -DCMAKE_BUILD_TYPE=Debug ..' in it, followed by 'cmake --build .' This should produce a debug build in the build directory; the tricky part is (temporarily) replacing the installed version of the C++ module by the debug version to test it/be able to see where the problem occurs. For that, you'd have to find the installation location of the package you installed (probably in some conda environment). |
Also, could you possibly tell me exactly what C++ compiler/version you are using? Maybe I can reproduce the issue on a linux system, then. |
I'm using g++ (GCC) 12.2.1 20221121 (Red Hat 12.2.1-4) |
I've tried to reproduce on Linux with GCC 12 (unfortunately, I only have 12.1, not 12.2...); still no luck. The instance verifies just fine. I guess you already tried to reinstall the package? |
Okay, now we're getting somewhere (needed to patch python with patchelf to add libstdc++ as fake dependency to python3 to get address sanitizer to work in the mixed python-and-c++ setting, and LD_PRELOAD to preload libasan into python3). While it doesn't segfault, it definitely does something wrong. Address Sanitizer recognizes at least one heap use-after-free and a lot of the already-known errors that look like type confusion between a halfedge and a face. |
Even with CGAL 5.5 and UsePolylines = CGAL::Tag_false, we sometimes have segfaults on our testing machine (rtx01); whether a segfault occurs or not depends, among other things, on debug vs. release mode. |
The implementation of 2D Boolean ops of CGAL depends on the 2D Arrangement package of CGAL. |
Did you test if doing a random permutation of the input set of polygons changes anything? |
Not sure whether you referred the question to me; in case you did, I haven't tested anything with segments yet.
Last thing, I don't understand the output of the sanitizer (perhaps I need to read it again), but the last line contains '~Gps_on_surface_base_2'. So just for laughs, can you please comment out the body of this destructor and try again... |
I mean that a possible workaround for the contest could be doing a random permutation of the solution polygons. This would be very likely to work if the polygons are added one by one and their union is updated. With the current join function that computes the union of many polygons, it depends a lot on the algorithm used (the order shouldn't change anything if a sweep line approach is used internally), but I believe that the join function with iterators is only a shortcut to apply join many times. |
Right, just a shortcut.
Under the hood, a Boolean op on two polygons translates to the application
of the overlay() function on the two 2D-arrangements that represent the two
input polygons, respectively.
For a set of n polygons, overlay is applied n-1 times (not necessarily
sequentially).
|
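To illustrate the remark that n inputs need n-1 binary operations that are "not necessarily sequential", here is a minimal sketch of one possible non-sequential schedule: a balanced, divide-and-conquer reduction. The thread does not say which schedule CGAL actually uses, so this is an assumption for illustration only; `reduce_pairwise` is a hypothetical name and `op` merely stands in for an overlay of two arrangements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Combines items[lo, hi) with a binary operation by splitting the range in
// half, so the operations form a balanced tree instead of a left-to-right
// chain. `op_count` is incremented once per binary operation performed.
// (reduce_pairwise is a hypothetical helper, not a CGAL function.)
template <typename T, typename BinaryOp>
T reduce_pairwise(const std::vector<T>& items, std::size_t lo, std::size_t hi,
                  BinaryOp op, std::size_t& op_count) {
    assert(lo < hi);                      // the range must not be empty
    if (hi - lo == 1) return items[lo];   // a single item needs no operation
    const std::size_t mid = lo + (hi - lo) / 2;
    T left = reduce_pairwise(items, lo, mid, op, op_count);
    T right = reduce_pairwise(items, mid, hi, op, op_count);
    ++op_count;                           // one binary operation per merge
    return op(left, right);
}
```

Whatever the split, a range of n items always ends up with exactly n-1 merges; only the grouping of the operands changes, which is why the input order can influence which intermediate results are built.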
Using a random permutation, we were able to verify one of the submissions. The other four submissions still make problems after very many shuffles. Also tried working with Debug-compilation mode. |
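The shuffling workaround could look roughly like the sketch below. It assumes the solution polygons are held in a `std::vector`; `shuffled_copy` is a hypothetical helper name, and the explicit seed is my own addition so that an order that triggers (or avoids) the crash can be reproduced later.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>
#include <vector>

// Returns a copy of `items` in a pseudo-random order determined by `seed`.
// Logging the seed makes a particular order reproducible on the same
// standard library implementation.
// (shuffled_copy is a hypothetical helper name.)
template <typename T>
std::vector<T> shuffled_copy(std::vector<T> items, std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::shuffle(items.begin(), items.end(), rng);
    return items;
}
```

A verification loop would then call the union on `shuffled_copy(polygons, seed)` for a handful of seeds. As the results above show, this is a workaround at best and not a reliable fix.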
Hello, I only today learned about your problem through a mail sent by Sándor. |
Great! I was lucky to share an office with Efi for the last three months, so he consulted us directly regarding this problem. |
what is "huge"? |
Well, the solution files for which this happens range between 6 and 30 MB. |
You can open an issue here and I'll try to reduce it. You can put the data at https://gist.github.com |
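For shrinking such a large failing input, a simple greedy reduction is one option. The sketch below is an assumption about how one might do it, not the procedure anyone in this thread used: `reduce_failing_input` and `still_fails` are hypothetical names, and because a segmentation fault cannot be caught in-process, `still_fails` would in practice have to run the candidate input in a child process and report whether it crashed.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Greedily removes one item at a time as long as the remaining input still
// triggers the failure. The result is 1-minimal: removing any single
// remaining item makes the failure disappear.
// (reduce_failing_input and still_fails are hypothetical names.)
template <typename T, typename Predicate>
std::vector<T> reduce_failing_input(std::vector<T> items, Predicate still_fails) {
    bool shrunk = true;
    while (shrunk) {
        shrunk = false;
        for (std::size_t i = 0; i < items.size(); ++i) {
            std::vector<T> candidate = items;
            candidate.erase(candidate.begin() + static_cast<std::ptrdiff_t>(i));
            if (still_fails(candidate)) {
                items = std::move(candidate);  // keep the smaller failing input
                shrunk = true;
                break;                         // restart the scan from the front
            }
        }
    }
    return items;
}
```

This needs a quadratic number of test runs in the worst case, so for inputs with tens of thousands of polygons it would make sense to first try removing larger chunks and only then fall back to single items.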
Hello, @sloriot has identified the problem and you can find his fix in the PR CGAL/cgal#7243 |
Just to clear things up, this has absolutely nothing to do with the
current issue you are facing.
…On Fri, Feb 3, 2023, 16:12 Andreas Fabri ***@***.***> wrote:
Hello, @sloriot has identified the problem and you can find his fix in
the PR CGAL/cgal#7243
We would be glad if you gave it a try.
And the next time you encounter a problem with CGAL please file an issue.
We consider bug reports as contributions, as only if we learn about a bug
we can fix it.
|
You mean the "fix" does not fix the problem described in this issue, but only a subproblem you, Efi, filed here? |
The "fix" addresses a problem found about 3 months ago, which was worked around by falling back to the old code of the free Boolean op functions, which does not use polylines. It stopped being an issue once this workaround was in place, which happened 10 minutes after it was brought to my attention (again ~3 months ago). BTW, this problem occurs only when using the free functions. When using CGAL::Polygon_set_2, for example, segments (and not polylines) are used by default. About 2 weeks ago a new issue was encountered. As far as I understand, it was (and still is) hard to reproduce. Dominik told me that despite this issue, the contest verification process has completed by now. He also told me that, being a "good citizen" in our community, he or his colleagues would nevertheless try to pursue it, so that it can be properly addressed. (sorry for the many words...) |
Exactly, we have been falling back to the old code, which worked for most solutions. However, we still got some rare segmentation faults, apparently within CGAL. At least one participant also reported similar problems within his own code. I have already spent some time trying to track down the origin of the problem, and it doesn't make much sense to me so far. The big problem with this bug hunt is that it is not only extremely rare and system-dependent, but it has only appeared in complete verification runs, which are mixed Python/C++ code (PyBind11) and complex to debug properly (we cannot really provide a reproducible example based on that). We may let a student assistant create a pure C++ version, which can then be analyzed more easily, but I can imagine that the bug would then suddenly no longer appear. I won't rule out completely that this remaining problem is a bug on our side, but the segmentation fault happens within CGAL, and we check beforehand that the polygons are all nice and simple. |
When compiling the code with GCC and CGAL 5.5, we get wrong results from CGAL::join and sometimes segmentation faults from deep within CGAL. We should extract the corresponding instances and write a minimal piece of code to reproduce this error. If we cannot do this, the bug is probably on our side. If we can, it may actually be a CGAL bug and we should file a bug report.
Additionally, make sure that the Conan version of CGAL is still used even if the system provides CGAL.