Metadata-Version: 2.4 Name: giellaltlextools Version: 0.3.2 Summary: Test and process lexicon data for giellalt projects License: GPL-3.0 License-File: LICENSE Author: Flammie A Pirinen Author-email: flammie@iki.fi Requires-Python: >=3.9,<4.0 Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3) Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.9 Classifier: Programming Language :: Python :: 3.10 Classifier: Programming Language :: Python :: 3.11 Classifier: Programming Language :: Python :: 3.12 Classifier: Programming Language :: Python :: 3.13 Classifier: Programming Language :: Python :: 3.14 Requires-Dist: cython (>=3.0.0,<4.0.0) Requires-Dist: pyhfst (>=1.3.0,<2.0.0) Project-URL: Homepage, https://divvun.github.io/giellaltlextools Project-URL: Repository, https://github.com/divvun/giellaltlextools Description-Content-Type: text/markdown # GiellaLTLexTools Scripts for testing lexicony stuff in giellalt plus some processing lexc python scripts. ## Dependencies Uses [pyhfst](https://github.com/Rootroo-ltd/pyhfst) to load HFST automata. Run `poetry install` to install dependencies. Spell-checker testing uses [divvunspell](https://github.com/divvun/divvunspell) binaries. You can install divvunspell with [cargo](https://www.rust-lang.org/tools/install). ## Installation You can install giellaltlextools with [pipx](https://pipx.pypa.io): `pipx install git+https://github.com/divvun/giellaltlextools`. ## Technical Details This project uses Poetry's build system to ensure optimal pyhfst installation. The project is configured to automatically optimize `pyhfst` installation with Cython for better performance: - **Build System**: Declares Cython as a build-time requirement - **Build Script**: `scripts/build.py` automatically handles pyhfst optimization - **Dependencies**: Cython is included as both a runtime and build dependency The build script runs automatically during `poetry install` and `poetry build`, ensuring pyhfst is always installed with Cython support when available. ## Usage Mainly from `make check` in GiellaLT infra. There are currently three programs installed: - `gtlemmatest` for testing that a generator generates lemmas found from a lexc file - `gtparadigmteset` for testing that a generator generates full paradigms of the lemmas - `gtspelltest` for testing that a spell checker accepts lemmas from lexc files. ### Lemma testing ```console $ gtlemmatest -l src/fst/morphology/stems/nouns.lexc \ -a src/fst/analyser-gt-desc.hfstol \ -g src/fst/generator-gt-desc.hfstol \ -t +N+Sg+Nom -t +N+Pl+Nom ``` The lexc files should mainly contain lexc lines that contain full lemma forms. ### Paradigm testing ```console $ gtparadigmtest -l src/fst/morphology/stens/nouns.lexc \ -p src/fst/morphology/test/testnounparadigm.txt \ -g src/fst/generator-gt-desc.hfstol ``` The lexc files should mainly contain lexc lines that contain full lemma forms. ### Spell-checker lemma testing ```console $ gtspelltest -z tools/spellcheckers/se.zhfst -D divvunspell \ src/fst/morphology/stems/*.lexc ``` The lexc files should mainly contain lexc lines that contain full lemma forms.