JIYON P J

Building AITfES: Designing a Practical AI System for Engineering Workflows

Overview

AITfES (AI Tool for Engineering Systems) began as an attempt to answer a simple question:

Can AI be made reliable enough to assist in real engineering decision-making?

Rather than building a generic chatbot, the objective was to design a system that could:

Work within defined engineering domains
Provide context-aware responses
Maintain a reasonable level of accuracy and control

This led to the adoption of a Retrieval-Augmented Generation (RAG) architecture.

Why RAG Instead of a Standard LLM?

Early experimentation with standalone LLMs (via the Gemini API) exposed two key limitations:

Responses were fluent, but not grounded in domain-specific data
Hallucinated details appeared in answers, which is unacceptable in engineering contexts

RAG was chosen because it:

Anchors responses to retrieved, relevant data
Allows control over knowledge boundaries
Scales by adding structured data rather than retraining the model

The decision was less about following a trend and more about reducing risk in technical outputs.

System Architecture

The system was structured into three primary layers:

Data Layer — Engineering data ingestion and structuring
Retrieval Layer — Semantic search using a vector database
Application Layer — User interface and interaction logic

Core technologies:

Gemini API (LLM inference)
Pinecone (vector database for semantic retrieval)
SvelteKit (frontend and server-side logic)

This separation ensured modularity and made debugging significantly easier.
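To make the layering concrete, here is a minimal sketch of how the three layers could be wired together. It is written in Python for brevity and is not the AITfES code: the names `Match`, `answer`, `fake_retrieve`, and `fake_generate` are hypothetical, and the two stubs stand in for the real vector database and LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Match:
    text: str
    score: float


# Hypothetical layer boundaries: each layer is just a callable, so any of
# them can be replaced by a stub while the others are being debugged.
Retriever = Callable[[str], List[Match]]  # retrieval layer (vector search)
Generator = Callable[[str], str]          # LLM inference


def answer(question: str, retrieve: Retriever, generate: Generator) -> str:
    """Application-layer flow: retrieve context, build a prompt, generate."""
    matches = retrieve(question)
    context = "\n".join(f"- {m.text}" for m in matches)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)


# Stubs standing in for the real services, for illustration only.
def fake_retrieve(question: str) -> List[Match]:
    return [Match("Sample record about torque limits.", 0.91)]


def fake_generate(prompt: str) -> str:
    return f"(model output for a {len(prompt)}-character prompt)"


print(answer("What is the torque limit?", fake_retrieve, fake_generate))
```

Because the application layer only depends on the two callables, a failing component can be isolated by swapping in a stub, which is one plausible reading of why the separation made debugging easier.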

Data Pipeline: Scaling Was Not Optional

The system initially operated on a small dataset (~148 records), which was insufficient for meaningful retrieval.

What Changed

The pipeline was redesigned to:

Automate ingestion of structured engineering data
Normalize and clean inputs
Generate embeddings consistently

This scaled the dataset from roughly 148 to 12,000+ records.
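The three pipeline steps above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the actual AITfES pipeline: the cleaning rule (whitespace collapsing), the hash-based de-duplication, and the `toy_embed` function are all hypothetical placeholders, and a real pipeline would call an embedding model instead.

```python
import hashlib
import re
from typing import Callable, Dict, Iterable, List

Embedder = Callable[[List[str]], List[List[float]]]


def normalize(text: str) -> str:
    """Collapse whitespace and trim; a stand-in for real cleaning rules."""
    return re.sub(r"\s+", " ", text).strip()


def prepare_records(raw: Iterable[str]) -> List[Dict[str, str]]:
    """Normalize inputs, drop empties, and de-duplicate by content hash."""
    seen = set()
    records = []
    for item in raw:
        text = normalize(item)
        if not text:
            continue
        record_id = hashlib.sha256(text.lower().encode("utf-8")).hexdigest()[:16]
        if record_id in seen:
            continue  # same content already ingested (case-insensitive)
        seen.add(record_id)
        records.append({"id": record_id, "text": text})
    return records


def embed_in_batches(records: List[Dict[str, str]], embed: Embedder,
                     batch_size: int = 100) -> List[dict]:
    """Embed batch by batch so every record goes through the same code path."""
    out = []
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        vectors = embed([r["text"] for r in batch])
        for record, vector in zip(batch, vectors):
            out.append({**record, "values": vector})
    return out


# Toy embedder for illustration only (text length and space count).
def toy_embed(texts: List[str]) -> List[List[float]]:
    return [[float(len(t)), float(t.count(" "))] for t in texts]


rows = prepare_records(["  Beam  load\ttable ", "beam load table", "", "Pump curve"])
vectors = embed_in_batches(rows, toy_embed, batch_size=2)
print(len(vectors))  # 2: the duplicate and the empty input are dropped
```

Deriving the record ID from the cleaned content means re-running ingestion on the same source does not create duplicate entries, which matters once ingestion is automated.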

Lesson

RAG performance depends heavily on data quality and coverage.

A well-structured dataset contributes more to output reliability than prompt tuning alone.

Retrieval Layer: Precision Over Volume

Using Pinecone enabled semantic search, but early results revealed:

High recall, but sometimes low precision (loosely relevant matches)
Context noise affecting LLM output

Adjustments Made

Tuned retrieval parameters (top-k results)
Improved chunking strategy for documents
Focused on semantic coherence over raw data volume
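The write-up does not say which chunking strategy was adopted, so the following is only one possible approach, sketched in Python: pack whole paragraphs into chunks up to a size limit, so that no chunk cuts a paragraph in half. The function name `chunk_paragraphs` and the `max_chars` limit are assumptions made for illustration.

```python
from typing import List


def chunk_paragraphs(text: str, max_chars: int = 800) -> List[str]:
    """Pack whole paragraphs into chunks of at most max_chars characters.

    A paragraph is never split, so a single paragraph longer than
    max_chars becomes its own (oversized) chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk before it overflows
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


doc = "Pump P-101 overview.\n\nRated flow and head.\n\nMaintenance schedule."
for chunk in chunk_paragraphs(doc, max_chars=45):
    print(repr(chunk))
```

Keeping paragraph boundaries intact is a cheap proxy for semantic coherence: each embedded chunk tends to cover one topic, which reduces the context noise passed to the LLM.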

Lesson

More data alone does not improve results; more precise retrieval does.
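One common way to favor precision over volume is to cap the number of results (top-k) and also drop matches below a similarity threshold. The sketch below does this in memory with cosine similarity; it is not Pinecone's API, and the `min_score` value and the function names are illustrative assumptions rather than the settings used in AITfES.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float],
          items: List[Tuple[str, Sequence[float]]],
          k: int = 3,
          min_score: float = 0.75) -> List[Tuple[str, float]]:
    """Return up to k best matches, discarding any below min_score."""
    scored = [(cosine(query, vec), text) for text, vec in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(text, score) for score, text in scored[:k] if score >= min_score]


items = [("A", [1.0, 0.0]), ("B", [1.0, 1.0]), ("C", [0.0, 1.0])]
print(top_k([1.0, 0.0], items, k=2, min_score=0.75))  # only "A" passes
print(top_k([1.0, 0.0], items, k=2, min_score=0.50))  # "A" and "B" pass
```

With a threshold in place, a query with few good matches yields a short context instead of a padded one, so weakly related records never reach the prompt.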

Prompt Design: Controlling the LLM

A key challenge was ensuring that the LLM:

Stayed within retrieved context
Avoided speculative responses

Approach

Structured prompts with explicit instructions
Constrained response formats
Reinforced context usage
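As an illustration of these three points, a prompt builder might look like the sketch below. The wording of the rules, the refusal sentence, and the output format are hypothetical examples, not the prompts used in AITfES.

```python
from typing import List

# Hypothetical refusal sentence the model is told to use verbatim.
REFUSAL = "Not found in the provided context."


def build_prompt(question: str, passages: List[str]) -> str:
    """Build a prompt with explicit rules, a fixed format, and numbered context."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "You are an assistant for engineering questions.\n"
        "Rules:\n"
        "1. Answer using only the numbered context passages below.\n"
        f"2. If the context does not contain the answer, reply exactly: {REFUSAL}\n"
        "3. Cite the passage numbers you used, for example [1].\n"
        "4. Reply in this format:\n"
        "   Answer: <one or two sentences>\n"
        "   Sources: <passage numbers>\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\n"
    )


print(build_prompt("What is the rated load?", ["Rated load is 5 kN."]))
```

A fixed refusal sentence and a fixed reply format also make the output checkable in code: the application layer can detect the refusal string or a missing `Sources:` line and react, rather than trusting free-form text.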

Lesson

Prompting is not about clever phrasing; it is about system control.

Without constraints, even a strong model behaves unpredictably in technical domains.

Full-Stack Integration

The system integrates:

LLM inference (Gemini API)
Retrieval (Pinecone)
Interface and routing (SvelteKit)

This was not just an implementation step; it exposed architectural considerations:

Latency across components
Error propagation between layers
Need for graceful fallback mechanisms
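The three concerns above can be handled together in a thin wrapper around the retrieve-then-generate flow. The following Python sketch is a generic pattern under stated assumptions, not the AITfES implementation: the retry counts, the backoff delays, the fallback message, and the names `with_retries` and `safe_answer` are all hypothetical.

```python
import time
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")

FALLBACK = "The assistant is temporarily unavailable. Please try again shortly."


def with_retries(call: Callable[[], T], attempts: int = 3,
                 base_delay: float = 0.2) -> T:
    """Retry a flaky call with exponential backoff; re-raise on the last failure."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller choose the fallback
            time.sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


def safe_answer(question: str,
                retrieve: Callable[[str], List[str]],
                generate: Callable[[str, List[str]], str],
                base_delay: float = 0.2) -> Dict[str, object]:
    """Run both stages, record latency, and degrade gracefully on failure."""
    started = time.perf_counter()
    stage = "retrieval"
    result: Dict[str, object]
    try:
        passages = with_retries(lambda: retrieve(question), base_delay=base_delay)
        stage = "generation"
        text = with_retries(lambda: generate(question, passages), base_delay=base_delay)
        result = {"ok": True, "answer": text, "failed_stage": None}
    except Exception:
        # Report which layer failed instead of leaking a raw error to the UI.
        result = {"ok": False, "answer": FALLBACK, "failed_stage": stage}
    result["latency_ms"] = round((time.perf_counter() - started) * 1000, 1)
    return result


def broken_retrieve(question: str) -> List[str]:
    raise TimeoutError("vector database did not respond")


print(safe_answer("q", broken_retrieve, lambda q, p: "unused", base_delay=0.0))
```

Note that when retrieval fails, the sketch returns the fallback message rather than calling the model without context, which keeps the failure mode consistent with the goal of grounded answers.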

Lesson

Building AI systems is less about models and more about orchestration.

What Worked

Modular architecture simplified iteration
RAG significantly improved response relevance
Scaling the dataset had an immediate impact on output quality

What Didn’t Work Initially

Over-reliance on raw LLM capability
Poor early data structuring
Weak retrieval tuning

Each of these led to unreliable outputs until corrected.

Key Takeaways

AI systems for engineering must be controlled, not just powerful
Data pipelines are as critical as model selection
Retrieval quality defines system usefulness
Prompting is a design discipline, not an afterthought
System integration introduces real-world constraints often ignored in prototypes

Closing Note

AITfES is not positioned as a finished product, but as a working system that demonstrates:

How AI can be integrated into engineering workflows
What trade-offs are required to make it reliable
Where further improvements are necessary

The focus remains on building systems that are practical, scalable, and grounded in real use cases, rather than purely experimental.