Part-of-Speech Tagging with Dynamic Algorithms

Overview

This project implements and compares three Part-of-Speech (POS) tagging algorithms (Eager, Viterbi, and Individually Most Probable Tags) on English, Swedish, and Korean. Comparing their accuracy across languages with different morphology and syntax shows how linguistic structure affects each algorithm's performance.

Project Goals

Implement three distinct POS tagging algorithms of varying complexity: Eager, Viterbi, and Individually Most Probable Tags.
Train and evaluate these algorithms using multilingual corpora from the Universal Dependencies Treebank.
Uncover linguistic insights by analyzing algorithm performance across English, Swedish, and Korean.
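
To make the difference between the three algorithms concrete, here is a minimal sketch on a toy hidden Markov model. It is not the code from this repository: the probability tables, the two-tag tagset, and the function names (eager, viterbi, most_probable_tags) are made up for illustration, and all scores are kept in log space to avoid underflow.

```python
import math

# Toy model: every number below is invented for illustration only.
TAGS = ["NOUN", "VERB"]
START = {"NOUN": 0.5, "VERB": 0.5}
TRANS = {"NOUN": {"NOUN": 0.1, "VERB": 0.9},
         "VERB": {"NOUN": 0.9, "VERB": 0.1}}
EMIT = {"NOUN": {"can": 0.6, "fish": 0.4},
        "VERB": {"can": 0.4, "fish": 0.1, "swim": 0.5}}
NEG_INF = float("-inf")

def log(p):
    return math.log(p) if p > 0 else NEG_INF

def step(prev, tag, word):
    """log P(tag | prev) + log P(word | tag); prev=None is the sentence start."""
    trans = START[tag] if prev is None else TRANS[prev][tag]
    return log(trans) + log(EMIT[tag].get(word, 0.0))

def eager(words):
    """Greedy left-to-right: commit to the best tag given the previous choice."""
    tags, prev = [], None
    for word in words:
        prev = max(TAGS, key=lambda t: step(prev, t, word))
        tags.append(prev)
    return tags

def viterbi(words):
    """Most probable tag sequence as a whole, found by dynamic programming."""
    best = {t: step(None, t, words[0]) for t in TAGS}
    backpointers = []
    for word in words[1:]:
        pointer, new_best = {}, {}
        for t in TAGS:
            pointer[t] = max(TAGS, key=lambda p: best[p] + step(p, t, word))
            new_best[t] = best[pointer[t]] + step(pointer[t], t, word)
        backpointers.append(pointer)
        best = new_best
    tag = max(TAGS, key=best.get)
    path = [tag]
    for pointer in reversed(backpointers):
        tag = pointer[tag]
        path.append(tag)
    return path[::-1]

def logsumexp(values):
    """Stable log(sum(exp(v))) for a list of log-probabilities."""
    top = max(values)
    if top == NEG_INF:
        return NEG_INF
    return top + math.log(sum(math.exp(v - top) for v in values))

def most_probable_tags(words):
    """Pick, per position, the tag with the highest forward-backward marginal."""
    fwd = [{t: step(None, t, words[0]) for t in TAGS}]
    for word in words[1:]:
        fwd.append({t: logsumexp([fwd[-1][p] + step(p, t, word) for p in TAGS])
                    for t in TAGS})
    bwd = [{t: 0.0 for t in TAGS}]
    for word in reversed(words[1:]):
        bwd.insert(0, {p: logsumexp([step(p, t, word) + bwd[0][t] for t in TAGS])
                       for p in TAGS})
    return [max(TAGS, key=lambda t: fwd[i][t] + bwd[i][t])
            for i in range(len(words))]

sentence = ["can", "fish"]
print(eager(sentence))               # ['NOUN', 'VERB']
print(viterbi(sentence))             # ['VERB', 'NOUN']
print(most_probable_tags(sentence))  # ['VERB', 'NOUN']
```

With these toy numbers the eager tagger commits to NOUN for "can" and cannot recover, while Viterbi and the per-position marginals both prefer VERB NOUN. On real data the three can agree or disagree in other ways.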

Key Findings

Algorithm Performance at a Glance

Language	Eager Accuracy (%)	Viterbi Accuracy (%)	Individually Most Probable Tags Accuracy (%)
English	88.6	91.3	88.6
Swedish	85.7	90.2	85.7
Korean	80.8	79.2	80.8
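
The figures above are accuracy percentages. A minimal sketch of how such a score can be computed, assuming token-level accuracy (the share of tokens whose predicted tag matches the gold tag); the function name tagging_accuracy and the sample tags are illustrative, not taken from the project code.

```python
def tagging_accuracy(gold, predicted):
    """Percentage of tokens whose predicted tag equals the gold tag.

    gold and predicted are lists of sentences, each a list of tag strings.
    """
    correct = total = 0
    for gold_tags, predicted_tags in zip(gold, predicted):
        if len(gold_tags) != len(predicted_tags):
            raise ValueError("gold and predicted sentences differ in length")
        correct += sum(g == p for g, p in zip(gold_tags, predicted_tags))
        total += len(gold_tags)
    return 100.0 * correct / total if total else 0.0

gold = [["DET", "NOUN", "VERB"], ["PRON", "VERB"]]
predicted = [["DET", "NOUN", "NOUN"], ["PRON", "VERB"]]
print(tagging_accuracy(gold, predicted))  # 80.0
```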

How to Use This Project

1. Install Dependencies

pip install conllu
pip install nltk

2. Run the Script

python3 pos_tagging.py

Technologies Used

Python: Primary programming language for implementation.
conllu: For parsing CoNLL-U treebank files and preparing corpora.
NLTK: To calculate emission and transition probabilities.
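
As a rough illustration of the NLTK step, the sketch below estimates transition and emission probabilities from two hand-written tagged sentences. It assumes plain maximum-likelihood estimates (MLEProbDist) with no smoothing, plus hypothetical sentence-boundary markers <s> and </s>; the project itself may use a different estimator and reads its data from the treebanks rather than from an inline list.

```python
from nltk.probability import ConditionalFreqDist, ConditionalProbDist, MLEProbDist

# Hand-written sample data standing in for sentences read from a treebank.
tagged_sentences = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]

emission_pairs = []    # (tag, word) pairs
transition_pairs = []  # (previous tag, tag) pairs
for sentence in tagged_sentences:
    # <s> and </s> are assumed boundary markers, not part of the tagset.
    tags = ["<s>"] + [tag for _, tag in sentence] + ["</s>"]
    emission_pairs.extend((tag, word.lower()) for word, tag in sentence)
    transition_pairs.extend(zip(tags, tags[1:]))

# Unsmoothed relative frequencies; unseen events get probability 0.
emissions = ConditionalProbDist(ConditionalFreqDist(emission_pairs), MLEProbDist)
transitions = ConditionalProbDist(ConditionalFreqDist(transition_pairs), MLEProbDist)

print(transitions["DET"].prob("NOUN"))  # 1.0
print(emissions["NOUN"].prob("dog"))    # 0.5
```

Unsmoothed estimates assign zero probability to unseen words and tag pairs, so a real tagger would normally swap MLEProbDist for a smoothed estimator.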

Acknowledgements

Thanks to the Universal Dependencies project for providing high-quality multilingual treebanks, which made this exploration of POS tagging possible.

Want to find out more?

For more insights, read the associated blog post:
