AI Infrastructure Architect & Enterprise Solution Architect

Tag: Inference Optimization

The Hidden Context Window Problem in RAG Systems: A Real Production Incident with vLLM and Qwen3

When Your 32K Context LLM Fails at 4K Tokens: A Production vLLM Troubleshooting Guide

One of the most common misconceptions in Generative AI systems is: “The model supports 32K context, so my application automatically supports 32K context.” In production, that assumption can lead to unexpected failures. Recently, we encountered a production issue…

Amit Patriwala

June 22, 2026

Google Gemma 4 12B: The Model That Signals a Bigger Shift in AI Infrastructure

Over the past year, I’ve spent a considerable amount of time working with both local and production AI environments. On one side, I’ve been experimenting with local LLMs using Ollama, testing quantized models, and exploring how much intelligence can realistically run on developer laptops. On the other side, I’ve been deploying production workloads…

Amit Patriwala

June 19, 2026
