Commit

Migrate to pdm (#39)
* Update dependencies

Signed-off-by: Olga Bulat <[email protected]>

* Move to pdm

Signed-off-by: Olga Bulat <[email protected]>

* Remove install nltk from ci

Signed-off-by: Olga Bulat <[email protected]>

* Migrate to pdm

Signed-off-by: Olga Bulat <[email protected]>

---------

Signed-off-by: Olga Bulat <[email protected]>
obulat authored Jul 14, 2024
1 parent 2e52c68 commit 527113f
Showing 81 changed files with 1,619 additions and 17,021 deletions.
36 changes: 0 additions & 36 deletions .github/workflows/build.yml

This file was deleted.

30 changes: 30 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,30 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: build

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.9, '3.10', '3.11', '3.12']

    steps:
      - uses: actions/checkout@v4
      - name: Set up PDM
        uses: pdm-project/setup-pdm@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: pdm install

      - name: Test with pytest and coverage
        run: pdm run pytest

      - name: Codecov
        uses: codecov/[email protected]

15 changes: 8 additions & 7 deletions .github/workflows/publish_to_pypi.yml
@@ -13,19 +13,20 @@ jobs:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v2
uses: actions/setup-python@v4
with:
python-version: '3.x'

- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install setuptools wheel twine
run: pdm install

- name: Build and publish to real pypi
env:
TWINE_USERNAME: __token__
TWINE_PASSWORD: ${{ secrets.pypi_password }}
run: |
python setup.py sdist bdist_wheel
twine upload dist/*
pdm build
pdm publish
4 changes: 4 additions & 0 deletions .gitignore
@@ -102,3 +102,7 @@ ENV/
.mypy_cache/
/vvenv/
/.idea/

# pdm stuff
.pdm-python
.pdm-build/
12 changes: 0 additions & 12 deletions MANIFEST.in

This file was deleted.

88 changes: 0 additions & 88 deletions Makefile

This file was deleted.

50 changes: 50 additions & 0 deletions README.md
@@ -0,0 +1,50 @@
# Zeyrek: Morphological Analyzer and Lemmatizer

![PyPI - Version](https://img.shields.io/pypi/v/zeyrek)

Zeyrek is a partial port of the [Zemberek library](https://github.com/ahmetaa/zemberek-nlp) to Python for lemmatizing
and analyzing Turkish words. It is in the alpha stage, and the API
is likely to change.


* Free software: MIT license
* Documentation: https://zeyrek.readthedocs.io.


## Basic Usage

To use Zeyrek, first create an instance of the `MorphAnalyzer` class:

```python
import zeyrek
analyzer = zeyrek.MorphAnalyzer()
```

Then call its `analyze` method on words or texts to get all possible analyses:

```python
print(analyzer.analyze('benim'))
Parse(word='benim', lemma='ben', pos='Noun', morphemes=['Noun', 'A3sg', 'P1sg'], formatted='[ben:Noun] ben:Noun+A3sg+im:P1sg')
Parse(word='benim', lemma='ben', pos='Pron', morphemes=['Pron', 'A1sg', 'Gen'], formatted='[ben:Pron,Pers] ben:Pron+A1sg+im:Gen')
Parse(word='benim', lemma='ben', pos='Verb', morphemes=['Noun', 'A3sg', 'Zero', 'Verb', 'Pres', 'A1sg'], formatted='[ben:Noun] ben:Noun+A3sg|Zero→Verb+Pres+im:A1sg')
Parse(word='benim', lemma='ben', pos='Verb', morphemes=['Pron', 'A1sg', 'Zero', 'Verb', 'Pres', 'A1sg'], formatted='[ben:Pron,Pers] ben:Pron+A1sg|Zero→Verb+Pres+im:A1sg')
```
If you only need the base forms of words (their lemmas), call `lemmatize`. It returns a list
of tuples, each pairing the word itself with a list of its possible lemmas:

```python
print(analyzer.lemmatize('benim'))
[('benim', ['ben'])]
```
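
When a text contains the same word more than once, it can help to merge the results into a single lookup table. The following is a minimal sketch, assuming only the list-of-tuples shape shown above; `collect_lemmas` is a hypothetical helper and is not part of Zeyrek, and the sample data is copied from the example output rather than produced by running the analyzer.

```python
from collections import defaultdict


def collect_lemmas(lemmatized):
    """Merge (word, lemmas) pairs into a dict of word -> sorted unique lemmas.

    Hypothetical helper, not part of Zeyrek. It assumes the input has the
    shape shown above: a list of (word, [lemma, ...]) tuples.
    """
    merged = defaultdict(set)
    for word, lemmas in lemmatized:
        merged[word].update(lemmas)
    return {word: sorted(lemmas) for word, lemmas in merged.items()}


# Sample input copied from the example output above.
sample = [('benim', ['ben'])]
print(collect_lemmas(sample))
```

In real use, the argument would be the return value of `analyzer.lemmatize(...)`.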


## Credits

This package is a Python port of part of the [Zemberek](https://github.com/ahmetaa/zemberek-nlp) package by [Ahmet A. Akın](https://github.com/ahmetaa).


This package was created with
[Cookiecutter](https://github.com/audreyr/cookiecutter) and the
[audreyr/cookiecutter-pypackage](https://github.com/audreyr/cookiecutter-pypackage)
project template.

59 changes: 0 additions & 59 deletions README.rst

This file was deleted.

Binary file removed docs/_build/doctrees/environment.pickle
Binary file not shown.
Binary file removed docs/_build/doctrees/index.doctree
Binary file not shown.
4 changes: 0 additions & 4 deletions docs/_build/html/.buildinfo

This file was deleted.
