Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Switch geodata providers #7393

Draft
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

camdecoster
Copy link
Contributor

@camdecoster camdecoster commented Mar 20, 2025

Description

Adds scripts to build topojson from UN sourced data and updates build process to run these scripts.

Closes #7334

Changes

  • Adds script for downloading shapefiles
  • Adds script for converting shapefiles into topojson
  • Updates topojson path references

Testing

  • Run npm run build_topojson and make sure the script completes successfully
  • Try loading the maps in ./dist/topojson in Mapshaper
  • Open the test dashboard
  • Try loading some of the geo plots and look for errors

Notes

  • The new maps are built from this UN data source
  • The current maps come from sane-topojson, which gets its data from Natural Earth Data
  • These new maps omit the subunits layer because that information isn't available in the UN geodata
  • The 'waterbodies' layer replaces the 'lakes' and 'rivers' layers
  • I haven't been able to verify the resolution of the data from the UN. There's a reference here to the "UNGeodata 25 million scale", but I haven't been able to confirm what that means.
  • Mapshaper can't handle antimeridian cutting, so the UN data looks weird when projected as WGS84 (source)
  • I want to credit sane-topojson for some of the design of these scripts. I'm attempting to create a (mostly) drop in replacement, so it made sense to follow those conventions.

TODO

  • Clean up shapefiles after converting to topojson
  • Fix Antarctica maps (or just remove them)
  • Commit new maps to ./dist/topojson folder
  • Fix broken tests (I'm sure there will be some)
  • Remove log statements in process_geodata.mjs (or hide them behind flag)
  • Add the 'usa' map (though it will be pretty bland without states)
  • Remove sane-topojson package
  • Verify resolutions
  • Only include actual coastlines in coastlines
  • Make sure 110m maps don't have any unintended artifacts (like South America)
  • Add UN geodata archive to repo and only attempt download if file can't be found
  • Add markdown log file in draftlogs: 7393_feat.md
  • Remove unnecessary properties info from final maps

@gvwilson gvwilson added feature something new P1 needed for current cycle labels Mar 20, 2025
@camdecoster camdecoster changed the title Switch geodata providers feat: Switch geodata providers Mar 20, 2025
import config from './config.json' assert { type: 'json' };

try {
// Download data from UN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As these origin URLs may change (link rot), I worry that this piece of the script could become out of date quickly. I'm not sure there's too much we can do here but I might recommend committing the current source file (but not adding it to the dist/. Maybe someone else has a better idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems reasonable. The script could be updated to look for the file first and only download if it doesn't exist.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marthacryan do you have an opinion or ideas on this? What's the ideal dev UX here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, in 3b47c67 I added the archive from the UN.

@ndrezn ndrezn requested a review from emilykl March 20, 2025 17:03
@gvwilson
Copy link
Contributor

  1. Sync with this branch.
  2. npm i to install/update packages.
  3. npm run build_topojson to build JSON files.
  4. Go to mapshaper.org.
  5. Load europe_50m.json.
  6. Initially looks like the lakes are misplaced (for example, the Caspian Sea seems to overlap the Russian border in a weird way in the screenshot).
  7. On closer inspection, maybe this is valid: I'm unable to find a map showing how national borders cross the Caspian Sea, and Russia might very well have a claim to the northwest section.
  8. The three lakes in the north check out: one is in Sweden, the other two are near St. Petersburg.
    Screenshot 2025-03-25 at 9 58 13 AM

@gvwilson
Copy link
Contributor

Q for @camdecoster : did the old usa map have the states? as you say, it's not much use without that…

@gvwilson
Copy link
Contributor

Trying to get npm run test-jasmine to work with the new topojson files, it looks like I'm tripping over a configuration issue.

  1. Original error message complained about not being able to find ../../../dist/topojsonworld_110m.json (note the missing '/' between 'topojson' and 'world').
  2. I modified getTopojsonPath in src/lib/topojson_utils.js as follows to insert the '/' if required:
 topojsonUtils.getTopojsonPath = function(topojsonURL, topojsonName) {
-    return topojsonURL + topojsonName + '.json';
+    if (topojsonURL.endsWith('/')) {
+       return topojsonURL + topojsonName + '.json';
+    } else {
+       return topojsonURL + '/' + topojsonName + '.json';
+    }
 };
  1. I also modified test/jasmine/karma.conf.js as follows because we're getting the JSON files from the local dist directory not the installed sane-topojson package:
-var pathToSaneTopojsonDist = path.join(__dirname, '..', '..', 'node_modules', 'sane-topojson', 'dist');
+var pathToTopojsonDist = path.join(__dirname, '..', '..', 'dist', 'topojson');

// ... and further down, edit to match the new variable name

-        {pattern: pathToSaneTopojsonDist + '/**', included: false, watched: false, served: true}
+        {pattern: pathToTopojsonDist + '/**', included: false, watched: false, served: true}

This still produces error messages like the one shown below saying that the JSON files can't be found:

	Failed: plotly.js could not find topojson file at ../../../dist/topojson/world_110m.json. Make sure the *topojsonURL* plot config option is set properly.
	Error: plotly.js could not find topojson file at ../../../dist/topojson/world_110m.json. Make sure the *topojsonURL* plot config option is set properly.
	    at Object.<anonymous> (/Users/gvwilson/plotly/plotly.js/src/plots/geo/geo.js:137:35 <- /private/var/folders/w2/l51fjbjd25n9zbwkz9fw9jp00000gn/T/0c95a0b81672d3a2e3f59fc0534657ef-bundle.js:147433:31)
	    at Object.event (/Users/gvwilson/plotly/plotly.js/node_modules/@plotly/d3/d3.js:504:42 <- /private/var/folders/w2/l51fjbjd25n9zbwkz9fw9jp00000gn/T/0c95a0b81672d3a2e3f59fc0534657ef-bundle.js:4875:48)
	    at XMLHttpRequest.respond (/Users/gvwilson/plotly/plotly.js/node_modules/@plotly/d3/d3.js:1951:24 <- /private/var/folders/w2/l51fjbjd25n9zbwkz9fw9jp00000gn/T/0c95a0b81672d3a2e3f59fc0534657ef-bundle.js:6326:30)
25 03 2025 10:40:03.170:WARN [web-server]: 404: /dist/topojson/world_110m.json
[note: the message above is then repeated several times]

I've tried variations on the path in test/jasmine/tests/geo_test.js specified by this line:

Plotly.setPlotConfig({ topojsonURL: '../../../dist/topojson' });

Adding an extra .. or removing one of the ones that's there doesn't affect the outcome.

@camdecoster
Copy link
Contributor Author

@gvwilson yes the old maps had 'subunits' layers for a few regions (USA and Brazil did for sure). The UN maps don't include that information. Thanks for looking into the tests. I'm working on an update to get the tests working but I haven't pushed that yet. I'll try to get that out later today.

@ndrezn
Copy link
Member

ndrezn commented Mar 25, 2025

@etpinard when you have a chance could you give this PR a look? Would love your eyes. Thank you!

In the short term what was the source for US states/Canadian provinces in the old dataset?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature something new P1 needed for current cycle
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use UN GeoJSON data for geo charts rather than Natural Earth
3 participants