Language coverage details #19

thlinard · 2017-02-23T17:50:40Z

In Charset Coverage Details, I'd like to see the old and new (2016) Google sets separated. Having Latin, Cyrillic and Greek next to Cyrillic Plus, Latin Expert, etc. is a bit confusing.

Same for Charset Coverage Details.

Language support by all available Char Sets is sometimes erroneous. Greek Core doesn't include Latin Plus, for example.

graphicore · 2017-02-23T18:31:05Z

> In Charset Coverage Details, I'd like to see the old and new (2016) Google sets separated. Having Latin, Cyrillic and Greek next to Cyrillic Plus, Latin Expert, etc. is a bit confusing.

I see. The easiest option would be for me to just sort the legacy sets to the bottom, but I can also make a clean separation. I'm not sure how the GF API will handle the legacy encodings next to the novel ones, but that will be important to the users of the font specimen in the end.

> Language support by all available Char Sets is sometimes erroneous. Greek Core doesn't include Latin Plus, for example.

Aha, OK. We should probably discuss this at google/fonts. If Greek Core doesn't include Latin Plus, where is it taking its (standard) punctuation from? I have some similar questions on my list. The discussion in google/fonts#624 is related.

Note that the online version uses the files of google/fonts#642, where I included Latin Plus in Greek Core, which surely could be wrong.

thlinard · 2017-02-24T15:55:00Z

> We should probably discuss this at google/fonts. If Greek Core doesn't include Latin Plus, where is it taking its (standard) punctuation from?

Hum… From Latin Core? But Latin Core seems to exist only virtually. Probably this should be clarified.

graphicore · 2017-02-24T16:05:31Z

Yeah, I'm in the process of writing something up. There are a few issues I have with this charset analysis. As a matter of fact, at the moment you posted I had just created this list:

0x0021 ! EXCLAMATION MARK
0x0022 " QUOTATION MARK
0x0026 & AMPERSAND
0x0028 ( LEFT PARENTHESIS
0x0029 ) RIGHT PARENTHESIS
0x002A * ASTERISK
0x002C , COMMA
0x002D - HYPHEN-MINUS
0x002E . FULL STOP
0x002F / SOLIDUS
0x003A : COLON
0x003B ; SEMICOLON
0x0040 @ COMMERCIAL AT
0x005B [ LEFT SQUARE BRACKET
0x005C \ REVERSE SOLIDUS
0x005D ] RIGHT SQUARE BRACKET
0x00A7 § SECTION SIGN
0x00AB « LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
0x00BB » RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
0x0301 ́ COMBINING ACUTE ACCENT
0x0308 ̈ COMBINING DIAERESIS
0x2010 ‐ HYPHEN
0x2013 – EN DASH
0x2014 — EM DASH
0x2026 … HORIZONTAL ELLIPSIS

These are the chars missing from Greek Core when asking the CLDR.

> But Latin Core seems to exist only virtually. Probably this should be clarified.

In google/fonts#624 I came to the same conclusion :-) The question for me is whether to pack this into the google/fonts#642 PR or do it in a new PR. Also, from 642 I should probably remove the inclusion of Latin Plus in Greek Core, heh?

(just updated the charlist above: removed duplicates, sorted)
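
The dedupe, sort, and compare steps behind a list like the one above can be sketched in JavaScript. This is only a minimal illustration, not the actual specimenTools code: `missingChars`, `formatChar`, and the sample arrays are hypothetical, and Unicode character names are left out because plain JavaScript has no built-in lookup for them.

```javascript
// Hypothetical helper: chars required for a language that a charset lacks.
// Both arguments are assumed to be arrays of single-character strings.
function missingChars(required, charset) {
  const have = new Set(charset);
  // Going through a Set removes duplicates from the required list.
  const missing = [...new Set(required)].filter((ch) => !have.has(ch));
  // Sort by code point so the output order is stable.
  return missing.sort((a, b) => a.codePointAt(0) - b.codePointAt(0));
}

// Hypothetical formatter producing the "0x0021 !" style used in the list.
function formatChar(ch) {
  const hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
  return `0x${hex} ${ch}`;
}

// Made-up sample data, not the real CLDR or Greek Core contents.
const sampleRequired = ['α', '!', ';', '!', 'β'];
const sampleCharset = ['α', 'β'];
console.log(missingChars(sampleRequired, sampleCharset).map(formatChar).join('\n'));
// prints:
// 0x0021 !
// 0x003B ;
```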

thlinard · 2017-02-24T17:07:34Z

> These are the chars missing from Greek Core when asking the CLDR.

And not the Arabic numerals?

> Also, from 642 I should probably remove the inclusion of Latin Plus in Greek Core, heh?

Yes, probably. One set (GF Greek Pro) needs some characters from the GF Latin Plus and Pro sets, as stated in the README.md, but unless Latin Pro is intended as a prerequisite for all GF, it's too much for Greek coverage.

graphicore · 2017-02-24T17:33:38Z

> And not the Arabic numerals?

Interesting. I'm using this: https://github.com/unicode-cldr/cldr-misc-modern/blob/master/main/el/characters.json
From that file I use the keys main.characters.exemplarCharacters and main.characters.punctuation. I also apply the JavaScript String.prototype.toUpperCase function to all chars, which should do the right thing: change the char if Unicode defines an uppercase form, otherwise leave it as is. There are no numerals in this document, though. Similarly, no numerals are defined for Arabic either: https://github.com/unicode-cldr/cldr-misc-modern/blob/master/main/ar/characters.json
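
A rough sketch of the uppercase step, assuming the chars have already been extracted from the CLDR values into an array (parsing the CLDR set syntax is not shown). The helper name `withUppercase` is hypothetical. One detail worth knowing is that toUpperCase can return more than one code point for a single input char, so the sketch splits its result by code point; whether that matters for a given charset analysis is an open question here.

```javascript
// Hypothetical helper: extend a list of chars with their uppercase forms.
// `chars` is assumed to be an array of single-character strings.
function withUppercase(chars) {
  const result = new Set();
  for (const ch of chars) {
    result.add(ch);
    // toUpperCase returns the char unchanged when there is no uppercase
    // form, and may return several code points for some inputs.
    // Iterating a string with for...of walks it by code point.
    for (const upper of ch.toUpperCase()) {
      result.add(upper);
    }
  }
  return result;
}

// '\u03B1' is GREEK SMALL LETTER ALPHA,
// '\u0390' is GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS.
const expanded = withUppercase(['\u03B1', '\u0390', '!']);
console.log(expanded.has('\u0391')); // GREEK CAPITAL LETTER ALPHA was added
console.log(expanded.has('\u0301')); // combining acute, from the expansion of U+0390
```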

Good find, thanks!

The information should be somewhere, maybe in https://github.com/unicode-cldr/cldr-numbers-modern? But at first glance it seems to define number formatting instead. Do you know where to look for the numerals in the CLDR?

> > Also, from 642 I should probably remove the inclusion of Latin Plus in Greek Core, heh?
>
> Yes, probably.

Will do.

> One set (GF Greek Pro) needs some characters from GF Latin Plus and Pro sets

I've seen that. This needs a decision. Either we create a kind of "technical" Namelist file, so that we don't repeat ourselves (if this is feasible; it would be quite a bummer to end up with one Namelist per char), or we just include these chars in GF Greek Pro. "Technical" Namelist files wouldn't be available via the Fonts API; they would just be for us to define charsets.
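
To illustrate the "don't repeat ourselves" option, here is a small sketch of charsets that include other charsets. The data model (`charsets`, `includes`, `chars`) and the function `resolveCharset` are made up for this example and are not the Namelist file format; the set contents are placeholders as well.

```javascript
// Hypothetical data model: each charset has its own chars plus a list of
// other charsets it includes. Names and contents are placeholders.
const charsets = {
  'latin-core': { includes: [], chars: ['!', ',', '.'] },
  'greek-core': { includes: ['latin-core'], chars: ['α', 'β'] },
};

// Collect the chars of a charset together with everything it includes.
function resolveCharset(name, seen = new Set()) {
  if (seen.has(name)) return new Set(); // guard against include cycles
  seen.add(name);
  const def = charsets[name];
  if (!def) throw new Error(`Unknown charset: ${name}`);
  const result = new Set(def.chars);
  for (const included of def.includes) {
    for (const ch of resolveCharset(included, seen)) {
      result.add(ch);
    }
  }
  return result;
}

console.log([...resolveCharset('greek-core')].join(' '));
```

With a model like this, a shared "technical" set is defined once and pulled in wherever it is needed, at the cost of having to resolve includes before a charset can be used.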

I wrote something up yesterday for Dave to look at; it's relevant to this discussion as well, sort of:

#20 It suggests that we can support languages even if we don't support the whole GF-charset. This could have implications for how we define GF-charsets.

thlinard · 2017-02-24T17:51:05Z

> Do you know where to look for the numerals in the CLDR?

It seems to be https://github.com/unicode-cldr/cldr-core/blob/master/supplemental/numberingSystems.json

graphicore · 2017-02-24T17:59:31Z

Ah, great, thanks. It's linked to the locales via cldr-numbers-modern:

excerpt

      "numbers": {
        "defaultNumberingSystem": "arab",
        "otherNumberingSystems": {
          "native": "arab"
        },

for el:

      "numbers": {
        "defaultNumberingSystem": "latn",
        "otherNumberingSystems": {
          "native": "latn",
          "traditional": "grek"
        },

thlinard · 2017-02-24T18:12:43Z

Oh, they called "latn" the Arabic numerals, I suppose… And "arab" the Indic numerals used in the Arabic script.

thlinard · 2017-02-24T18:13:39Z

The list still lacks basic characters, like # % < > + = × ÷

graphicore · 2017-02-24T18:29:33Z

> Oh, they called "latn" the Arabic numerals, I suppose… And "arab" the Indic numerals used in the Arabic script.

Yeah, right, but it seems to do the right thing anyways:

      "latn": {
        "_digits": "0123456789",
        "_type": "numeric"
      },

Though they sometimes make it more complicated for me, with "_type": "algorithmic" … :

      "grek": {
        "_rules": "greek-upper",
        "_type": "algorithmic"
      },
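
Putting the two lookups together, resolving the digits for a locale could look roughly like the sketch below. The objects are trimmed-down, hand-written copies of the excerpts quoted in this thread, and `digitsForLocale` is a hypothetical helper, not CLDR or specimenTools API. Algorithmic systems are simply skipped here, which is an assumption about what a charset tool might want.

```javascript
// Hand-written copies of the excerpts above, reduced to the needed keys.
const numberingSystems = {
  latn: { _digits: '0123456789', _type: 'numeric' },
  grek: { _rules: 'greek-upper', _type: 'algorithmic' },
};
const localeNumbers = {
  el: { defaultNumberingSystem: 'latn' },
};

// Hypothetical helper: digits of a locale's default numbering system,
// or null if the locale is unknown or the system has no fixed digit list.
function digitsForLocale(locale) {
  const numbers = localeNumbers[locale];
  if (!numbers) return null;
  const system = numberingSystems[numbers.defaultNumberingSystem];
  // Algorithmic systems are described by rules, not by "_digits".
  if (!system || system._type !== 'numeric') return null;
  return [...system._digits];
}

console.log(digitsForLocale('el').join(''));
// prints: 0123456789
```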

> The list still lacks basic characters, like # % < > + = × ÷

I guess the question is whether these are needed to write the language. I'm not really deep into the concepts of CLDR.

graphicore mentioned this issue Feb 24, 2017

Improve and extend Namelist file format documentation. google/fonts#642

Closed

graphicore added a commit to graphicore/googleFonts that referenced this issue Mar 7, 2017

[tools/encodings] include GF-latin-core in GF-greek-core. See graphic…

11b3cd0

…ore/specimenTools#19

graphicore added a commit to graphicore/googleFonts that referenced this issue Mar 13, 2017

[tools/encodings] include GF-latin-core in GF-greek-core. See graphic…

5e61c78

…ore/specimenTools#19

graphicore added a commit that referenced this issue Mar 17, 2017

[build] Add numerals to charsets. See #19

327e45b

graphicore added a commit that referenced this issue Oct 10, 2017

[build] Add numerals to charsets. See #19

fa8a07e
