-
Notifications
You must be signed in to change notification settings - Fork 377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLDR-11570 Update Ancient Greek writing systems #4058
Conversation
@@ -1628,7 +1629,7 @@ XXX Code for transations where no currency is involved | |||
<language type="gos" scripts="Latn"/> | |||
<language type="got" scripts="Goth" alt="secondary"/> | |||
<language type="grb" scripts="Latn"/> | |||
<language type="grc" scripts="Cprt Grek Linb" alt="secondary"/> | |||
<language type="grc" scripts="Cprt Grek" alt="secondary"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be primary Grek and secondary Cprt (which as far as i can tell isn't even 100% historically certain)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The list here is alphabetical not ordered by frequency. Another reason I want to redo tag/attribute hierarchy here.
I'll look into removing the auto-alphabetical ordering for the script attribute. It may not be too tricky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I cannot find the right piece of code to do that.
Now that UTW's done I can work more on my proposal to redo this section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that was the job of language_script.tsv
but i really don't know any more-
yeah
grc Ancient Greek secondary Cprt Cypriot
grc Ancient Greek secondary Grek Greek
grc Ancient Greek secondary Linb Linear B
@macchiati why is Grek secondary here with no primary? ^
cldr/tools/cldr-code/src/main/resources/org/unicode/cldr/util/data/language_script.tsv
Line 284 in 4898ee6
grc Ancient Greek secondary Grek Greek |
@@ -324,6 +324,7 @@ public static void main(String[] args) throws IOException { | |||
"unr_Beng_IN", | |||
"ttt_Latn_AZ", | |||
"pnt_Grek_GR", | |||
"grc_Grek_GR", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may need to add grc_Cprt_CY too
tools/cldr-code/src/main/resources/org/unicode/cldr/util/data/language_script.tsv
Show resolved
Hide resolved
Ugh, more cryptic error messages:
C'mon, tell us why it was 'expected'. anyway, i'm not sure if there's another fix besides a manual override |
707edbf
to
8b2f7b2
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
8b2f7b2
to
5649abe
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
@@ -1629,7 +1631,7 @@ XXX Code for transations where no currency is involved | |||
<language type="gos" scripts="Latn"/> | |||
<language type="got" scripts="Goth" alt="secondary"/> | |||
<language type="grb" scripts="Latn"/> | |||
<language type="grc" scripts="Cprt Grek Linb" alt="secondary"/> | |||
<language type="grc" scripts="Grek" alt="secondary"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
secondary still… but, better than before
5649abe
to
4dfeab0
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
@@ -123,13 +123,13 @@ Dupl; 33; 1BC20; FR; 1; EXCLUSION; NO; NO; NO; YES; NO | |||
Egyp; 33; 13153; EG; 3; EXCLUSION; NO; NO; YES; YES; NO | |||
Elba; 33; 10500; AL; 1; EXCLUSION; NO; NO; NO; NO; NO | |||
Elym; 33; 10FF1; IR; 1; EXCLUSION; YES; NO; NO; NO; NO | |||
Gara; 33; 10D5D; SN; 1; EXCLUSION; YES; NO; YES; NO; YES # provisional data for future Unicode 16.0 script | |||
Gara; 33; 10D5D; SN; 1; EXCLUSION; YES; NO; YES; NO; YES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer provisional since Unicode 16.0 is out now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say to put the age here but that's kind of pointless because age data is a unicode property.
"tag": "grc", | ||
"tags": [ "grc-GR", "grc-Grek" ], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no - please don't change langtags.json! it's an external file (hence /external/
… ) - fyi @mhosken @DavidLRowe @emily-roth
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries. I also submitted a PR to their project silnrsi/langtags#21
But I don't have to get ahead of myself, I'll undo the json changes. I wasn't sure whether to be this aggressive or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's better to override with code or other data than to modify external files- makes it hard to reproduce the situation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
langtags.json shouldn't be patched.
sorry i didn't see that langtags.json itself was modified and shouldn't be. |
CLDR-11570 Retain order when using ConvertLanguageData When building the file, it changes the order of supplementalData's scripts. This changing a TreeSet (orders nodes by keys) to a LinkedHashSet (serialized output stays in originally inputed order). Scripts to run: ` mvn package -DskipTests=true && java -jar tools/cldr-code/target/cldr-code.jar ConvertLanguageData && java -jar tools/cldr-code/target/cldr-code.jar GenerateLikelySubtags ` CLDR-11570 Make grc_Cprt map to CY CLDR-11570 Fix test for grc_Cprt Last commit I edited a file but made a slight mistake. This fixes that and hopefully addresses the test issue. Code ran: mvn package -DskipTests=true && java -jar tools/cldr-code/target/cldr-code.jar ConvertLanguageData && java -jar tools/cldr-code/target/cldr-code.jar GenerateLikelySubtags && java -jar tools/cldr-code/target/cldr-code.jar GenerateTestData Note `ht` Changes look like they come from changes after adding the locale in a different pull request.
Jeez, if it exists at all I get the bug. Thereby, I'm removing it again and created https://unicode-org.atlassian.net/browse/CLDR-18087 for the long-term fix.
ffcc9e6
to
57eef6b
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
@conradarcturus did you mean to include French changes or is that just because of merging in main? |
french changes seem to be from already merged #3985 |
CLDR-11570
Ancient Greek is only written in Greek alphabet rather than Linear B or Cypriot Syllabary -- so this change corrects that. In fact, Linear B was never used for Ancient Greek -- rather it was used by its predecessor Mycenaean Greek (ancient greek to the ancient greeks). The Cypriot Syllabary was used primarily for a different language, Eteocypriot -- its not related to Greek or Indo-European. There is some Ancient Greek text (particularly the Arcadocypriot dialect) in the Cypriot Syllabary, and it's even present in the Idalion Tablet and the Idalion bilingual but its pretty rare.
Unfortunately I cannot list "Cypriot Syllabary" as secondary without it bumping out the Greek Alphabet because Cprt > Grek alphabetically. Thereby I made a separate Jira ticket to fix ambiguities in the xml https://unicode-org.atlassian.net/browse/CLDR-18087. This ticket just focused on the Ancient Greek fix so we can fix the bug.
Scripts ran:
ALLOW_MANY_COMMITS=true