Commit

Merge branch 'init/search-lindera'
attakei committed Dec 10, 2024
2 parents 6f9b031 + 007ae11 commit 2cec57d
Showing 7 changed files with 165 additions and 1 deletion.
1 change: 1 addition & 0 deletions .github/workflows/main.yml
@@ -72,6 +72,7 @@ jobs:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      - name: 'Configure dependencies'
        run: |
          echo '3.11' > .python-version
          uv sync --frozen --all-extras
      - name: 'Run tests'
        run: |
12 changes: 12 additions & 0 deletions docs/components/index.rst
@@ -0,0 +1,12 @@
==========
Components
==========

Index of toybox components that are configured directly in ``conf.py``.
Each subpage has an overview, usage notes, and a **demo**.

.. toctree::
   :maxdepth: 1
   :glob:

   */index
45 changes: 45 additions & 0 deletions docs/components/lindera_search/index.rst
@@ -0,0 +1,45 @@
==============================
atsphinx.toybox.lindera_search
==============================

.. note:: Added in v2024.12.10

Overview
========

This component provides a splitter that lets Sphinx's Japanese search tokenize text with Lindera.

Usage
=====

Installation
------------

This component depends on an optional extra.
Install the package with the ``lindera-search`` extra.

.. code-block:: console

   pip install 'atsphinx-toybox[lindera-search]'

Set configuration
-----------------

Add the following to your Sphinx ``conf.py``.

.. code-block:: python
   :caption: conf.py

   html_search_options = {
       "type": "atsphinx.toybox.lindera_search.LinderaSplitter",
   }

Demo
====

.. todo: Write it.

Refs
====

* `lindera-py <https://pypi.org/project/lindera-py/>`_
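
A slightly fuller ``conf.py`` sketch may help when trying the page above. Sphinx only consults the ``type`` entry of ``html_search_options`` when the search language is Japanese, and the search language defaults to ``language``. The ``language = "ja"`` line is therefore an assumption about the target project, not something this commit adds.

```python
# conf.py (sketch, not part of this commit)

# Assumption: the project is written in Japanese. The search language
# follows this setting unless html_search_language overrides it.
language = "ja"

# Dotted path to the splitter class added in this commit.
html_search_options = {
    "type": "atsphinx.toybox.lindera_search.LinderaSplitter",
}
```
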
1 change: 1 addition & 0 deletions docs/index.rst
@@ -12,4 +12,5 @@ This is attakei's alpha-level Sphinx-extensions.

   guide
   extensions/index
   components/index
   changes
5 changes: 4 additions & 1 deletion pyproject.toml
@@ -42,6 +42,9 @@ Documentation = "https://atsphinx.github.io/toybox"
sass = [
    "httpx>=0.28.0",
]
lindera-search = [
    "lindera-py>=0.38.4",
]

[build-system]
requires = ["hatchling"]
@@ -77,5 +80,5 @@ only-includes = ["src/atsphinx"]
exclude = 'conf\.py$'

[[tool.mypy.overrides]]
-module = ['docutils', 'docutils.*']
+module = ['docutils', 'docutils.*', 'lindera_py']
ignore_missing_imports = true
16 changes: 16 additions & 0 deletions src/atsphinx/toybox/lindera_search/__init__.py
@@ -0,0 +1,16 @@
"""Tokenize override by Lindera."""

from lindera_py import Segmenter, Tokenizer, load_dictionary
from sphinx.search.ja import BaseSplitter


class LinderaSplitter(BaseSplitter):
    """Simple splitter class using Lindera as tokeniser."""

    def __init__(self, options: dict[str, str]) -> None:  # noqa: D107
        self.dictionary = load_dictionary("ipadic")
        self.segmenter = Segmenter("normal", self.dictionary)
        self.tokenizer = Tokenizer(self.segmenter)

    def split(self, input: str) -> list[str]:  # noqa: D102
        return [token.text for token in self.tokenizer.tokenize(input)]
86 changes: 86 additions & 0 deletions uv.lock

Large diffs are not rendered by default.
