Add option for polygon/linestring in results #823
base: master
I haven't done a full review yet but I do have some general thoughts on the implementation:
Two other considerations come to mind but they are easily deferred to follow-up PRs:
|
Thanks. I will update the PR with the changes in this file soon. One question though about 'Elasticsearch is on its way out': I've been planning to update the Elastic client to the Java API so you can use an existing Elasticsearch cluster instead of the internal one (newer versions of Elasticsearch don't support the Transport client). Is my new PR still a good idea?
Will take this along as well.
Agreed
OK. I will make the default to return the polygon when it's available in the index. The option 'polygon=false' will return the centroid instead.
Will do.
I have to look into this issue.
I will look into this. |
We'll drop ES support completely and go with OpenSearch. Note that the OS version already supports an external OpenSearch cluster. The support is just somewhat rudimentary and HTTP-only. |
Would this PR return a MultiPolygon for queries like "hamburg"? See the Nominatim response. Currently Photon returns a slightly confusing, but correct extent as only a single extent is allowed https://photon.komoot.io/api/?q=hamburg ... or maybe we add a new field |
Yes, it would return a Polygon like Nominatim does. |
Done
Done
Done I'm looking forward to your response :) |
Sorry for the long delay in responding. I've decided to get a release out first before looking into this again. Because of this there is a minor merge conflict now regarding the dependencies. Also, the opensearch variant doesn't compile because |
A couple more comments after trying this out. The OpenSearch version currently doesn't work at all. It already fails to import.
The output for the Elasticsearch version displays a wrong projection:
...
"geometry": {
"coordinates":[9.5227103,47.1395576],
"type":"Point",
"crs":{"type":"name","properties":{"name":"EPSG:0"}}
}
...
Photon doesn't supply a CRS at all with the geometry. I'd leave it at that.
src/main/java/de/komoot/photon/nominatim/NominatimConnector.java
Will do this weekend. |
I've made some adjustments and have been trying to get the OpenSearch version running; however, it fails with an error:
I've been searching online, but can't seem to find the reason, since OpenSearch clearly has geo_shape support. In other news: I fixed the PR with your suggestions, but will perform some additional tests this weekend. |
That's where I gave up my experiments with this branch as well. It's possible that the embedded version of OpenSearch that we are using is missing a dependency needed for geo_shape. Maybe try running against a stand-alone OpenSearch and see if it works then? |
https://github.com/opensearch-project/geospatial ? Maybe it's enough to add the OpenSearch plugin as a dependency. |
So, I've found out that there were multiple issues with my build:
Therefore I switched to Testcontainers and everything is working-ish. The unit tests for OpenSearch work, but due to some small differences with Testcontainers, the Elasticsearch unit tests don't work... yet! (That's mainly because Testcontainers runs a Docker container with an OpenSearch instance on random ports and the unit tests have to connect to said random port, although it's possible to expose the default ports.) Is there anyone who can help me add the Geo plugin to the opensearch-runner, or would you be willing to switch to Testcontainers and Docker? If Photon switches to Testcontainers, I can make some changes to the Elasticsearch unit tests. |
You could open an issue at https://github.com/codelibs/opensearch-runner. Either it's already working or it's a valid feature request. |
It needs to work with opensearch-runner because that is also what Photon uses in embedded mode. From the code you cite it looks like you can supply your own module list by adding
|
With #855 merged you can now simply add the plugin to
|
Opensearch has fixed it and I've updated the PR. Both Opensearch tests and Es Embedded are 100% working. |
Still not completely finished with a detailed review but I added some comments.
I currently see two major issues that still need to be solved:
- the OS variant is currently hard-coded towards polygons with the result that streets cannot be found anymore. The code should be able to deal with any geometries including invalid ones.
- import in the ES variant with geometry enabled is very slow for me. Slower by a factor of 5. Any idea what is causing this?
The first issue does make me wonder if we shouldn't rather call the API parameter geometry and consistently go with that in the code. "polygon" is quite confusing for something that might also cause linestrings to appear. We could then have geometry=centroid/bbox/full.
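To make the geometry=centroid/bbox/full suggestion concrete, here is a minimal sketch of how such a parameter value could be parsed. The enum name `GeometryMode`, its `parse` helper, and the choice of centroid as the fallback for a missing value are all assumptions made for illustration; none of them come from the Photon code base.

```java
import java.util.Locale;

// Hypothetical enum: the constants mirror the values floated in the review
// (centroid/bbox/full). This is not an API that Photon is known to expose.
enum GeometryMode {
    CENTROID, BBOX, FULL;

    /**
     * Parses the raw query-string value, ignoring case and surrounding spaces.
     * Assumption: a missing or blank value falls back to CENTROID.
     */
    static GeometryMode parse(String raw) {
        if (raw == null || raw.isBlank()) {
            return CENTROID;
        }
        try {
            return valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            // Re-throw with a message that names the accepted values.
            throw new IllegalArgumentException(
                "Invalid geometry value '" + raw + "', expected centroid, bbox or full");
        }
    }
}
```

With this shape, `GeometryMode.parse("full")` yields `FULL`, while an old-style value such as `polygon` is rejected with a descriptive error instead of being silently ignored.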
// gen.writeObjectFieldStart("properties");
// gen.writeStringField("name", "EPSG:0");
// gen.writeEndObject();
// gen.writeEndObject();
You can't just assume that the geometry is of type polygon. There may be pretty much anything in there: point, linestring, multilinestring, polygon, multipolygon.
Looks like we need tests for all the different geometry types to make sure this works.
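As a rough illustration of not assuming a polygon, the sketch below maps a geometry to its GeoJSON type name and reports anything unrecognised instead of guessing. It assumes the geometry is available as a WKT (or EWKT) string, which may not match how the PR actually reads it; the class `GeometryTypes` and the method `geoJsonType` are hypothetical names.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Photon.
final class GeometryTypes {
    // WKT keyword -> GeoJSON type name for the five types named in the review.
    private static final Map<String, String> WKT_TO_GEOJSON = Map.of(
        "POINT", "Point",
        "LINESTRING", "LineString",
        "MULTILINESTRING", "MultiLineString",
        "POLYGON", "Polygon",
        "MULTIPOLYGON", "MultiPolygon");

    private GeometryTypes() {
    }

    /** Returns the GeoJSON type for a WKT/EWKT string, or null if it is not one of the five. */
    static String geoJsonType(String wkt) {
        if (wkt == null) {
            return null;
        }
        String s = wkt.trim();
        // Skip an optional EWKT prefix such as "SRID=4326;".
        int semicolon = s.indexOf(';');
        if (semicolon >= 0 && s.regionMatches(true, 0, "SRID=", 0, 5)) {
            s = s.substring(semicolon + 1).trim();
        }
        // The type keyword is the leading run of letters.
        int end = 0;
        while (end < s.length() && Character.isLetter(s.charAt(end))) {
            end++;
        }
        return WKT_TO_GEOJSON.get(s.substring(0, end).toUpperCase(Locale.ROOT));
    }
}
```

A test per geometry type could then assert on the returned name, and a `null` result gives the caller a place to fall back to the centroid for unsupported or invalid input.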
I've added support for LineString and Polygons for now. I have to examine the Nominatim database if they use other formats.
app/opensearch/src/test/java/de/komoot/photon/ESBaseTester.java
I have built in support for LineString and Polygons and have to analyze the Nominatim database for other geometries.
I will rename it to 'geometry' support in my next commit. |
Another thing: the tests are working for Elasticsearch as well as OpenSearch. However, while importing data, I get an 'Error during bulk import.' Debugging this from the command line is very hard, because I can't see the actual data that is being imported. I've been unable to run the import process in the debugger from IntelliJ (I keep getting the error 'Jar hell'). Is there any way I can run the import process from IntelliJ so that I can better debug the error? |
Nominatim has everything: Point, LineString, MultiLineString, Polygon, MultiPolygon. It definitely would be better to use a geojson builder instead of hand-coding the conversion. I see that you did that for ES. I do wonder though if that is the part that is making ES slow. That is why I was asking about the performance.
Not sure where the jar hell comes from. I've only seen this in the ES variant. Maybe you could look into improving error reporting here? This code really could have a more helpful message and print the errors that the bulk import returned. |
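One possible shape for more helpful error reporting is sketched below: collect the per-document failures of a bulk request and print them alongside the generic message. `ItemResult` is a hypothetical stand-in for whatever the Elasticsearch or OpenSearch client returns per item, and `describeFailures` is an illustrative name; the real client types and accessors will differ.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Photon or of any search client library.
final class BulkErrorReport {

    /** Stand-in for one item of a bulk response; error is null on success. */
    static final class ItemResult {
        final String documentId;
        final String error;

        ItemResult(String documentId, String error) {
            this.documentId = documentId;
            this.error = error;
        }
    }

    private BulkErrorReport() {
    }

    /** Builds a log message listing up to maxShown failed items; returns "" if nothing failed. */
    static String describeFailures(List<ItemResult> items, int maxShown) {
        List<ItemResult> failed = items.stream()
            .filter(item -> item.error != null)
            .collect(Collectors.toList());
        if (failed.isEmpty()) {
            return "";
        }
        String details = failed.stream()
            .limit(maxShown)
            .map(item -> "  doc " + item.documentId + ": " + item.error)
            .collect(Collectors.joining("\n"));
        // Summarise the remainder so a huge batch does not flood the log.
        String more = failed.size() > maxShown
            ? "\n  ... and " + (failed.size() - maxShown) + " more"
            : "";
        return "Error during bulk import: " + failed.size() + " of " + items.size()
            + " documents failed\n" + details + more;
    }
}
```

Capping the number of printed items keeps the log readable while still showing which document and which field triggered the first failures.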
At the moment, Photon only returns the point of a location, and not the polygon (see #259). This PR adds the option to store the polygon (i.e. geometry) in the Elasticsearch index and an API parameter polygon to return said polygon. If no polygon exists, the point is returned.
WARNING: This will increase the Elasticsearch index size! (~575GB for a Planet import).
To enable: add the command line argument -use-geometry-column whilst importing and add &polygon=true to the API call.
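For illustration, a client could build the request URL for the API call described above roughly as follows. The helper `PhotonQuery.searchUri` is hypothetical, and the base URL used in the usage note (a local instance at http://localhost:2322/api) is an assumption; only the `q` parameter and the `polygon=true` flag come from this thread.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical client-side helper, not part of Photon.
final class PhotonQuery {

    private PhotonQuery() {
    }

    /** Builds a search URI, appending polygon=true only when geometry output is requested. */
    static URI searchUri(String baseUrl, String query, boolean polygon) {
        // Form-encode the free-text query so spaces and special characters are safe.
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "?q=" + encoded + (polygon ? "&polygon=true" : ""));
    }
}
```

Calling `PhotonQuery.searchUri("http://localhost:2322/api", "hamburg", true)` produces a URI ending in `?q=hamburg&polygon=true`. The flag only has an effect if the index was imported with -use-geometry-column.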