Computer Engineer

Muhammed Kumcu

NLP · Automatic Speech Recognition · Turkic-language AI

Computer Engineering graduate from Marmara University. I build open, reproducible NLP and speech-recognition systems and datasets for Turkic languages — bridging academic research with engineering practice.

About

I work mainly on low-resource Turkic NLP and ASR — fine-tuning speech models, building morphological tools, and releasing open datasets, models and benchmarks. I also have hands-on experience in database administration and high-performance database architectures, and I develop end-to-end data solutions with Python and modern web tools.

Publications

E. Aydın, M. Kumcu, “TurkmenFST: A Comprehensive Rule-Based Morphological Analysis and Generation System for the Turkmen Language,” 14th International Conference on Computer Processing of Turkic Languages (TurkLang 2026), Astana, Kazakhstan, 2026.

Selected projects

TurkMedSTT — Turkish Medical Speech Recognition · 2026 · TÜBİTAK 2209-A

Two-stage LoRA fine-tuning of Whisper Large V3 for general and medical Turkish; a 20-model ASR benchmark and an AcoSemantic evaluation. WER reduced by 34.7% (relative) on an independent test set.

Python · PyTorch · Whisper · LoRA/PEFT · Hugging Face

TurkmenFST — Morphological analyzer & lexicon · 2026

Rule-based analysis and generation for Turkmen, with the largest open-source Turkmen lexicon (30,000+ entries) served through a web interface. Published at TurkLang 2026.

Python · NLP · Flask

Multimodal Deepfake Detection · 2026

Multi-task detector on FakeAVCeleb fusing video (Xception), audio (Wav2Vec 2.0) and lip-sync streams via late fusion, with an ablation across modalities.

Python · PyTorch · Computer Vision

PulsarMetric — Health-tech startup · Co-founder

A two-branch digital-health idea: ECG analysis (arrhythmia detection) and an ECG training platform for medical students. Accepted into the Yıldız Kaşifleri (YTÜ) and Tech İstanbul NOVA (İBB) entrepreneurship programs.

ECG · ML · Product

More projects on GitHub →