Final Project for CS579 Computational Linguistics at KAIST, Fall 2021, by Janggun Lee.
Konglog is an implementation of the Korean Loanword Orthography in Prolog and Python. It aims to faithfully encode the rules of the orthography and to provide a simple API that anyone can use.
- Install Python 3. `swiplserver` is tested on 3.7 and above, and NLTK only supports up to 3.9, so pick a version from 3.7 to 3.9.
- Install SWI-Prolog.
- Install `swiplserver`.
  - Check the `swiplserver` docs and make sure you set up SWI-Prolog correctly.
- Install NLTK.
- Download the `cmudict` corpus by running the following Python script. This downloads only the necessary data.
import nltk
nltk.download('cmudict')
- If the download fails to start and reports `[SSL:CERTIFICATE_VERIFY_FAILED]`, check this comment for a solution.
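To confirm that NLTK can actually see the corpus before running Konglog, a quick check along the following lines may help. This is a minimal sketch: `cmudict_available` is a hypothetical helper name that is not part of Konglog, and it assumes the corpus sits at NLTK's usual `corpora/cmudict` resource path.

```python
def cmudict_available() -> bool:
    """Return True if NLTK can locate the cmudict corpus, False otherwise."""
    try:
        # Imported inside the function so a missing NLTK install is reported
        # as False instead of raising at import time.
        import nltk
    except ImportError:
        return False
    try:
        # nltk.data.find raises LookupError when the resource is not installed.
        nltk.data.find("corpora/cmudict")
    except LookupError:
        return False
    return True


if __name__ == "__main__":
    if cmudict_available():
        print("cmudict found")
    else:
        print("cmudict missing; run nltk.download('cmudict')")
```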
Konglog provides a simple function, `eng_to_kong`, that takes an English word as input and returns its Konglish translation. A very simple example is shown below.
import konglog

def main():
    word = "shrimp"
    print(konglog.eng_to_kong(word))  # prints the Konglish translation

if __name__ == "__main__":
    main()
For a more complete example, run the script in `__init__.py` with `python3 __init__.py`. This starts a CLI where you can type in words to translate, as shown below.
CS579-Project % python3 __init__.py
Welcome to Konglog mini example.
Type an English word you want to translate.
English word, or N to exit: shrimp
Translating to Konglish. This may take a bit...
"shrimp" is translated into "슈림프"
Type an English word you want to translate.
English word, or N to exit: N
Thanks for trying Konglog!
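The transcript notes that a translation "may take a bit". If the same words are translated repeatedly, memoizing the results may save time. The sketch below wraps any word-to-word translator in a cache. `cached_translator` is a hypothetical helper that is not part of Konglog, and it assumes the wrapped function always returns the same output for the same input.

```python
from functools import lru_cache
from typing import Callable


def cached_translator(
    translate: Callable[[str], str], maxsize: int = 1024
) -> Callable[[str], str]:
    """Wrap a translator so each distinct word is only translated once."""

    @lru_cache(maxsize=maxsize)
    def _cached(word: str) -> str:
        # Only reached on a cache miss; hits are served by lru_cache.
        return translate(word)

    return _cached
```

With Konglog installed, this would be used as `fast = cached_translator(konglog.eng_to_kong)`, after which repeated calls such as `fast("shrimp")` reuse the first result.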
Konglog has three main steps in its architecture, depicted in the picture below.
- First, the input word is translated into phonemes by looking it up in the CMU Pronouncing Dictionary, provided by NLTK.
- Second, the phonemes are translated into jaeum (consonants) and moeum (vowels). Translation rules are provided in `ipa.pl`.
- Finally, the jaeum and moeum are combined into Hangul syllables. The tools for this combination are in `unicode.py`, and are taken from hangulutils.
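The first and last steps can be sketched in plain Python. This is an illustration under stated assumptions, not Konglog's actual code: `lookup_phonemes` and `compose` are hypothetical names, the pronunciation dictionary is passed in so that a small stand-in can replace `nltk.corpus.cmudict.dict()`, and `compose` uses the standard Unicode Hangul composition arithmetic, which may differ from what `unicode.py` does. The second step depends on the rules in `ipa.pl` and is not reproduced here, so the jamo in the usage example are picked by hand.

```python
import re

# Jamo listed in the order used by the Unicode Hangul Syllables block.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"  # 19 initial consonants
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"  # 21 vowels
# 27 final consonants, preceded by "" for "no final".
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def lookup_phonemes(word, pron_dict):
    """Step 1: first listed pronunciation of `word`, with stress digits removed.

    `pron_dict` maps lowercase words to lists of ARPAbet pronunciations,
    the shape returned by nltk.corpus.cmudict.dict().
    """
    phones = pron_dict[word.lower()][0]
    return [re.sub(r"\d", "", phone) for phone in phones]


def compose(initial, vowel, final=""):
    """Step 3: combine jamo into one precomposed Hangul syllable."""
    index = (
        CHOSEONG.index(initial) * 21 + JUNGSEONG.index(vowel)
    ) * 28 + JONGSEONG.index(final)
    return chr(0xAC00 + index)


if __name__ == "__main__":
    # Stand-in for the real dictionary, holding a single entry.
    sample = {"shrimp": [["SH", "R", "IH1", "M", "P"]]}
    print(lookup_phonemes("shrimp", sample))
    # Hand-picked jamo for the three syllables of the README's example output.
    print(compose("ㅅ", "ㅠ") + compose("ㄹ", "ㅣ", "ㅁ") + compose("ㅍ", "ㅡ"))
```

Passing the dictionary as an argument keeps the lookup testable without downloading the corpus. With NLTK set up, `nltk.corpus.cmudict.dict()` can be passed in directly.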