Less than common #24

Open
jerryharrison opened this issue Jan 30, 2019 · 5 comments

Comments

@jerryharrison

Hi Damon,

I use your js module for parsing addresses -- and it's awesome, great job and thank you for maintaining it.

I recently ran into several cases where an uncommon street type such as Run caused the parser to fail. Personally, I'd never heard of that street type, and if I had, I would have just assumed it was Run St or Rd... that's bad on my part for assuming.

Cutting to the chase, I would like to know if I can submit a PR with an updated list of street types per the USPS, the body in the US that determines valid street types. I can also update/add a "beefier" unit designation list, since addressit right now only accepts APT or APARTMENT.

I understand if you do not wish to bloat the module with US suffixes as you're from Australia; I would just rather contribute than make another module.

US Street Type List: https://pe.usps.com/text/pub28/28apc_002.htm
Unit Designation: https://pe.usps.com/text/pub28/28apc_003.htm

Thank you!

@markstos
Collaborator

markstos commented Jan 30, 2019 via email

@markstos
Collaborator

@jerryharrison It would also be interesting to compare this list against what contact-parser implements. It appears that contact-parser also misses "Run" in its regex:

https://github.com/thrustlabs/contact-parser/blob/master/src/contact-parser.coffee
https://github.com/thrustlabs/contact-parser
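
A comparison like that could be scripted roughly as follows. This is only a sketch: `streetTypePattern` is a made-up stand-in for a hard-coded regex (it is not copied from contact-parser or addressit), `findMissingSuffixes` is a hypothetical helper, and `sampleSuffixes` is a tiny illustrative sample rather than the full USPS list linked above.

```javascript
// Hypothetical stand-in for a hard-coded street-type regex.
// NOT the actual pattern used by contact-parser or addressit.
const streetTypePattern = /^(st|street|rd|road|ave|avenue|blvd|boulevard)$/i;

// Small illustrative sample of suffix abbreviations; the full list is in
// the USPS street type list linked in the issue.
const sampleSuffixes = ['ST', 'RD', 'AVE', 'BLVD', 'RUN', 'TRCE'];

// Return every suffix the pattern fails to recognise.
function findMissingSuffixes(suffixes, pattern) {
  return suffixes.filter((suffix) => !pattern.test(suffix));
}

// Logs the suffixes this sample pattern misses: RUN and TRCE.
console.log(findMissingSuffixes(sampleSuffixes, streetTypePattern));
```

Running the real regex against the full USPS list in the same way would show exactly which street types a PR needs to add.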

@missinglink
Collaborator

heya, I'd recommend this dictionary.

@jerryharrison
Author

@missinglink Great source! Duh! I already use libpostal on the server side for our Go server.

@DamonOehlman
Owner

@missinglink yep, great reference, thanks for dropping it into the issue. I still have at the back of my mind the intention to rewrite addressit without the regexes, and that's an excellent resource for potentially kicking that off.

@jerryharrison Ideally I'd like to implement a system that would support libpostal-style pluggable dictionaries, as I think that would be fantastic. It would take some thinking about how to do this with minimal bloat for a client-side usage scenario.

Given that most bundlers (webpack, rollup, et al) support dead code elimination at module boundaries, it's likely that something like the following might be a good pattern (note - I'm completely spitballing potential APIs here):

import { parse } from 'addressit';
import { streetNames } from 'addressit/dictionaries/en';

const result = parse(..., { streetNames });

It's likely that the parse function would also use the en street names as a default option if none was provided just to make the library work "with sensible defaults".
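
To make the "pluggable dictionary with a sensible default" idea a bit more concrete, here is a minimal self-contained sketch. Everything in it is an assumption for illustration: `parseStreet`, `buildLookup`, and `enStreetNames` are hypothetical names, the dictionary contents are a toy sample, and none of this reflects addressit's actual internals or API. Only the `streetNames` option name is borrowed from the snippet above.

```javascript
// Hypothetical sketch only: these names are not part of addressit's real API.

// Toy stand-in for an "en" dictionary: canonical street type -> accepted variants.
const enStreetNames = {
  street: ['st', 'street'],
  road: ['rd', 'road'],
  avenue: ['ave', 'avenue'],
  run: ['run'],
};

// Build a reverse lookup from each lowercase variant to its canonical type.
function buildLookup(dictionary) {
  const lookup = new Map();
  for (const [canonical, variants] of Object.entries(dictionary)) {
    for (const variant of variants) {
      lookup.set(variant.toLowerCase(), canonical);
    }
  }
  return lookup;
}

// Parse "<number> <name...> <street type>" with the supplied dictionary,
// falling back to the English dictionary when no option is provided.
function parseStreet(input, { streetNames = enStreetNames } = {}) {
  const lookup = buildLookup(streetNames);
  const tokens = input.trim().split(/\s+/);
  // Treat the last token (minus a trailing period) as the street type candidate.
  const last = tokens[tokens.length - 1].toLowerCase().replace(/\.$/, '');
  const streetType = lookup.get(last) || null;
  const hasNumber = /^\d+[a-z]?$/i.test(tokens[0]);
  // Drop the leading number and, if recognised, the trailing street type.
  const nameTokens = tokens.slice(hasNumber ? 1 : 0, streetType ? -1 : undefined);
  return {
    number: hasNumber ? tokens[0] : null,
    name: nameTokens.join(' '),
    streetType,
  };
}

// Default dictionary recognises "Run" as a street type.
console.log(parseStreet('12 Fox Run'));
// A caller-supplied dictionary without "run" leaves it in the street name.
console.log(parseStreet('12 Fox Run', { streetNames: { street: ['st'] } }));
```

Because the dictionary is just a plain object passed in as an option, a bundler only needs to include the dictionaries a caller actually imports, which is the point of the module-boundary approach described above.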

Anyway... like I said, this is just spitballing some potential approaches. I think the majority of folks are pretty happy with contact-parser, but if there is some interest in moving addressit forward with reworked internals I'd be happy to commit some time to that.
