Keeping the country list up-to-date #44
Thank you for the thoughtful issue. I'll try to cover everything that you touched on, but let me know if I miss anything.

First, I'll provide a little historical context. My two primary motivations for writing this library, both related to work, were decoding (and normalizing) countries provided by firewall logs (PAN and Fortigate) and similarly decoding (and normalizing) countries listed in GeoLite's data set. For this reason, there is much more emphasis on the decoding functions in this library than on the encoding functions. If I recall correctly, support for the XK country code was added because I started seeing it in GeoLite's CSV file one day, and I needed to be able to handle it.

As a more general guiding philosophy for this library, it makes sense to lean in the direction of supporting more country codes rather than fewer. If the decode functions in this library result in territorial distinctions that are not acceptable for someone's application, it's easy for that user to replace occurrences of one country with another. But if we go the other way and combine two country codes during decoding, there's no good way for a user to recover the original distinct territories.

The countries.csv file in this repo is just something I found somewhere when I was writing the library. It's not special, and it might have things that are incorrect. It's probably better to use the one from the country-codes repo you linked as a source of truth. If you look in

Back to the original question about AC and TA: yes, I think it absolutely makes sense to add them. They need numeric codes as well. My cursory search online didn't turn up anything, but please PR a change if you have information for them.

There are no build instructions for this library, and you may have already figured this part out, but just in case:
That will regenerate Haskell source code based on the CSV and aliases.txt.
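To illustrate the point above about replacing one country with another after decoding, here is a minimal sketch in Python. It is not part of this library: the `MERGE_INTO_PARENT` table and the `coarsen` helper are hypothetical names, and the input is assumed to be a list of alpha-2 codes that some decoder has already produced.

```python
# Hypothetical post-processing step applied by the user after decoding.
# Nothing here is part of the library's API.
MERGE_INTO_PARENT = {
    "AC": "SH",  # Ascension Island -> Saint Helena, Ascension and Tristan da Cunha
    "TA": "SH",  # Tristan da Cunha -> same parent code
}


def coarsen(alpha2: str) -> str:
    """Map a fine-grained code to its parent; leave all other codes unchanged."""
    return MERGE_INTO_PARENT.get(alpha2, alpha2)


# Assumed decoder output that keeps the territories distinct.
decoded = ["AC", "SH", "TA", "XK"]
coarse = [coarsen(code) for code in decoded]
print(coarse)
```

The mapping only works in this direction: once three different inputs have all become SH, nothing in the output says which of them was originally AC or TA.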
Hi, I just tried using this library to check against the ISO-3166 codes supported by the libphonenumber library, which uses the C bindings of Google's libphonenumber.
It appears there are two regions that are in that list but not in here (which, it appears, are both sub-divisions of SH, but also have their own codes):
- AC - Ascension Island
- TA - Tristan da Cunha

Since we have the temporarily assigned XK added to this list, should we also include these two above?
--
On a side note:
The countries.csv file in the repository has not been updated for the last 8 years. The list of supported regions for the libphonenumber library hasn't changed for the last 4 years either. But is it possible to get some details on what conditions are required to make changes to the CSV, and also on how that CSV is being generated?

You probably know about it, but there is a repository called datasets/country-codes which contains a list of countries with a lot of details at https://github.com/datasets/country-codes/blob/main/data/country-codes.csv (which is also missing the two ISO codes AC and TA, as well as XK, which the current CSV includes but which the datasets project [declined](https://github.com/datasets/country-codes/issues/66) to include for now).

I tried to re-generate it with some Python code via that source:
(which also compares the two versions against three relevant codes)
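A comparison of that kind could be sketched roughly as follows. This is not the script referred to above, only an illustration: the helper names `alpha2_codes` and `report` are hypothetical, the column names `alpha-2` (current CSV) and `ISO3166-1-Alpha-2` (datasets CSV) are assumptions about the two files, and the inline CSV text is made-up sample data standing in for the real files.

```python
import csv
import io


def alpha2_codes(csv_text: str, column: str) -> set[str]:
    """Collect the non-empty alpha-2 codes found in the given column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column].strip().upper() for row in reader if row.get(column)}


def report(codes_of_interest, old_codes, new_codes):
    """For each code, record whether it is present in the old and in the new list."""
    return {code: (code in old_codes, code in new_codes) for code in codes_of_interest}


# Made-up sample data; in practice these would be the contents of the two CSV files.
old_csv = "name,alpha-2\nSaint Helena,SH\nKosovo,XK\n"
new_csv = "ISO3166-1-Alpha-2,name\nSH,Saint Helena\n"

old_codes = alpha2_codes(old_csv, "alpha-2")
new_codes = alpha2_codes(new_csv, "ISO3166-1-Alpha-2")

for code, (in_old, in_new) in report(("AC", "TA", "XK"), old_codes, new_codes).items():
    print(f"{code}: old={in_old} new={in_new}")
```

Reading the files with the standard `csv` module keeps every code as a plain string, so a code such as NA (Namibia) is not mistaken for a missing value.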
But the downside is that I see a lot of changes in the sub-region and sub-region-code columns. In any case, I'm just going to drop the generated CSV here in case it's of any use.
countries-new.csv