Uğur Kaval

AI/ML Engineer & Full Stack Developer building innovative solutions with modern technologies.

© 2026 Uğur Kaval. All rights reserved.

Tag

#LLM

12 articles tagged with this topic.

Beyond the Linter: Engineering AI-First Review Pipelines in 2026
AI/ML

Manual code reviews are a bottleneck that costs your team millions in lost velocity. Here is how I built a multi-agent AI pipeline that catches race conditions, generates property-based tests, and has reduced our MTTR by 42%.

June 1, 2026 · 6 min read

Beyond Vector Search: Building Production Knowledge Graphs with LLMs
AI/ML

Vector embeddings are hitting a wall. Learn how to build a robust, queryable knowledge graph from unstructured text using LLMs, Pydantic, and graph databases for true multi-hop reasoning.

May 28, 2026 · 5 min read

Context is Everything: Engineering Persistent Memory for LLM Agents
AI/ML

Forget basic chat history. Scaling conversational AI in 2026 requires semantic retrieval, windowed summaries, and stateful graph management. Here is how I built a production-grade memory system using LangGraph and vector compaction.

May 12, 2026 · 6 min read

Scaling Engineering Velocity: Building Autonomous Code Review Pipelines in 2026
AI/ML

Stop wasting senior engineering hours on syntax and basic logic. I'll show you how we integrated GPT-5 and Llama 4 into our CI/CD to automate 80% of code reviews and unit test generation.

April 30, 2026 · 6 min read

Stop Shipping LLMs Blind: Building Production-Grade Evaluation Frameworks
AI/ML

Most LLM features die in production because teams treat testing like a vibe check. Here is how to build a rigorous, automated evaluation pipeline using G-Eval, DeepEval, and custom synthetic data generators.

April 14, 2026 · 5 min read

Stop Using Fixed-Size Chunking: Building Production RAG Pipelines That Actually Work
AI/ML

Fixed-size chunking is the quickest way to ruin a RAG pipeline. Learn how to implement semantic splitting and context-rich metadata injection to build production-grade retrieval systems.

April 2, 2026 · 6 min read

Engineering Reliable AI Agents: A Practical Guide to Tool Use and Function Calling
AI/ML

Stop treating AI agents like chatbots and start treating them like distributed systems. Here is how to implement tool calling that actually works in production without the hallucinations.

March 29, 2026 · 6 min read

Beyond the Vibe Check: Engineering a Production-Grade LLM Evaluation Framework
AI/ML

Stop relying on manual 'vibe checks' for your LLM outputs. Here is how I built a robust, automated evaluation pipeline using G-Eval, RAGAS, and custom LLM-as-a-judge patterns for production-scale deployments.

March 25, 2026 · 6 min read

Building Production-Grade Knowledge Graphs: Beyond the LLM Extraction Hype
AI/ML

Stop building flat RAG systems. Learn how to extract high-fidelity entities and relationships from unstructured text using Pydantic, DSPy, and Neo4j to build a graph-augmented LLM stack that actually scales.

March 17, 2026 · 6 min read

Building Evaluation Frameworks for LLM Applications: Beyond the Vibe Check
AI/ML

Stop guessing if your prompt changes are working. Learn how to build a production-grade evaluation pipeline using LLM-as-a-judge, synthetic data, and automated regression testing.

March 9, 2026 · 5 min read

Beyond Fixed-Size Windows: Production Chunking Strategies for RAG in 2026
AI/ML

Fixed-size chunking is the reason your RAG pipeline fails on complex queries. Learn how to implement semantic, late, and recursive chunking strategies that preserve context and boost retrieval precision.

March 1, 2026 · 5 min read

Fine-Tuning Large Language Models: A Practical Guide
Deep Learning

Learn to fine-tune LLMs for your specific use case. Covers LoRA, QLoRA, and best practices for efficient training.

November 18, 2024 · 2 min read