Some suggestions for code improvement #2

Open
philshem opened this issue Apr 9, 2015 · 4 comments

philshem commented Apr 9, 2015

I'm a big fan of the code you've written to make this great dataset usable. I've even had to ask a major international company to install "sexmachine", so that's how useful it has been *. While using the code, I've come up with some suggestions for improvement. I'm happy to implement some of them if you think they are useful.

  • Country naming support: Countries are currently defined by name in detector.py,

    COUNTRIES = u"""great_britain ireland usa italy ... other_countries""".split()

    but they could additionally be mapped by ISO 3166-1 alpha-2 codes. For example,

    d.get_gender(u"Andrea", 'IT')

    would return the same as

    d.get_gender(u"Andrea", 'italy')

  • Distinguish "not found" and "androgynous": Names that are not found should be distinguished from names that are androgynous. For example,

    print d.get_gender(u"Pauley")
    print d.get_gender(u"dssad12jkasdl")

This should print "andy" and then "unknown", instead of printing "andy" twice.

  • Include more data: Return the per-country gender and probability for a name. For example:

    print d.get_gender(u"Andrea")

would return something like:

italy, male, 99%
usa, mostly_female, 85%
global, female, 95%
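
As a rough illustration of how the three suggestions could fit together, here is a minimal, self-contained sketch. It does not use the sexmachine API: `TOY_DATA`, `ISO_TO_COUNTRY`, `normalize_country`, `get_gender` and `gender_by_country` are all hypothetical names, and the toy entries and probabilities are made up for the example rather than taken from the real dataset.

```python
# Toy stand-in for the name data. Entries and probabilities are invented
# for illustration only; they are NOT taken from the real dataset.
TOY_DATA = {
    "andrea": {"italy": ("male", 0.99), "usa": ("mostly_female", 0.85)},
    "pauley": {"usa": ("andy", 0.50)},
}

# Hypothetical alias table: ISO 3166-1 alpha-2 code -> COUNTRIES-style key.
ISO_TO_COUNTRY = {
    "IT": "italy",
    "US": "usa",
    "GB": "great_britain",
    "IE": "ireland",
}


def normalize_country(country):
    """Accept either an ISO alpha-2 code or an existing country key."""
    if country is None:
        return None
    return ISO_TO_COUNTRY.get(country.upper(), country.lower())


def get_gender(name, country=None):
    """Return 'unknown' for names missing from the data, so that 'andy'
    is reserved for names that are present and androgynous."""
    entry = TOY_DATA.get(name.lower())
    if entry is None:
        return "unknown"
    key = normalize_country(country)
    if key is None:
        # No country given: use the record with the highest probability.
        return max(entry.values(), key=lambda rec: rec[1])[0]
    if key not in entry:
        # Assumption: no record for this country also counts as unknown.
        return "unknown"
    return entry[key][0]


def gender_by_country(name):
    """Return sorted (country, gender, probability) rows for a name."""
    entry = TOY_DATA.get(name.lower(), {})
    return sorted((c, g, p) for c, (g, p) in entry.items())
```

With this sketch, `get_gender("Andrea", "IT")` and `get_gender("Andrea", "italy")` go through the same lookup, a name that is absent from the data yields "unknown" rather than "andy", and `gender_by_country` exposes the per-country rows. How a real implementation would compute the probabilities or a "global" row is left open here.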

* The Ruby package was renamed gender_detector based on the blog post here.

@alisonjo315

I think those are really thoughtful and good suggestions, so I'm just expressing my interest and support (oh and maybe a little bit bumping this thread). I'd be happy to help, too, if that makes a difference.

@dukebody

The maintainer doesn't seem very responsive. Perhaps we can continue this at https://github.com/eddies/sexmachine ? @eddies

dukebody referenced this issue in lead-ratings/gender-guesser Nov 25, 2015

dukebody commented Dec 6, 2015

@philshem @alisonjo2786 I've implemented the 'Distinguish "not found" and "androgynous"' feature in the dev version of https://github.com/lead-ratings/gender-guesser. Please take a look and see if we can continue the development there.

@alisonjo315

Thank you @dukebody! I can't review this week, but I will make a note to try to review it this weekend or next week. Sorry, and thanks!
