We have been working on an agentic AI project that brings intelligence to private enterprise data using OpenAI. The private dataset is a combination of structured, semi-structured, and unstructured data (PDF documents). The goal is to let users ask natural language questions about this data.
This is a typical use case of agentic AI, so I am sharing some of the early technical choices we came across. The broad idea is to store the data in a vector database so that we can pull out the right context for OpenAI to answer the user's questions.
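The retrieve-then-answer flow described above can be sketched with a toy in-memory store. Everything here is illustrative: `embed()` is a stand-in bag-of-words hash, not a real embedding model, and a production system would use an actual vector database.

```python
import math

# Toy in-memory "vector store": each entry pairs a text chunk with its
# embedding. embed() is a bag-of-words hash standing in for a real
# embedding model so the sketch runs anywhere.
def embed(text: str) -> list[float]:
    vec = [0.0] * 128
    for word in text.lower().split():
        vec[hash(word) % 128] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, store: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    # Rank stored chunks by similarity to the question embedding.
    q = embed(question)
    ranked = sorted(store, key=lambda entry: cosine(q, entry[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

chunks = [
    "Invoice 1042 was paid on 2024-03-01.",
    "The vendor onboarding policy requires two approvals.",
]
store = [(c, embed(c)) for c in chunks]
context = retrieve("When was invoice 1042 paid?", store, k=1)
# `context` is what gets placed into the prompt sent to the LLM.
```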
Data Storage Strategy
Since our data ranges from structured to unstructured, we had to figure out how to store it in the vector database. The main choices concern what constitutes an entity, the size of each embedded document (chunk), and what metadata to attach to each document.
Since we have structured data, it pays to leverage that structure as well, so we can answer precise questions rather than lose the opportunity. Treating structured data as if it were unstructured leads to a loss of meaning.
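One way to preserve structure, sketched below, is to embed a textual rendering of a record while keeping the original fields as metadata for exact filtering. The field names (`employee_id`, `dept`, etc.) are illustrative, not our actual schema.

```python
# Keep structure: embed a textual rendering of a structured record, but
# preserve the original fields as metadata so precise queries can still
# filter on them. Field names here are illustrative only.
record = {"employee_id": "E-1043", "dept": "Finance", "joined": "2021-06-01"}

def to_chunk(rec: dict) -> dict:
    text = "; ".join(f"{k}: {v}" for k, v in rec.items())
    return {
        "text": text,           # what gets embedded
        "metadata": dict(rec),  # what stays queryable as-is
    }

chunk = to_chunk(record)
```

This way a question like "who joined Finance in 2021?" can be answered by metadata filtering plus semantic search, instead of hoping the embedding alone captures the fields.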
Embedding Generation
Local Models vs OpenAI - Generate embeddings either with local models or with OpenAI. Choosing the right approach lets you optimize for embedding quality.
Critical Quality Factor - This is an important step, since the embeddings generated play a major role in the quality of the context you can assemble later when answering user questions.
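Keeping the embedding backend swappable makes it easy to compare local models against OpenAI. A minimal sketch of such an interface, with a throwaway character-frequency "model" standing in for a real local model:

```python
from typing import Protocol

class Embedder(Protocol):
    # Any backend (local model or an OpenAI-backed client) just needs
    # to implement this one method.
    def embed(self, texts: list[str]) -> list[list[float]]: ...

class LocalEmbedder:
    """Stand-in for a local embedding model; character-frequency
    vectors are used here only so the sketch runs anywhere."""
    def embed(self, texts: list[str]) -> list[list[float]]:
        return [
            [text.count(chr(c)) / max(len(text), 1) for c in range(97, 123)]
            for text in texts
        ]

def build_index(embedder: Embedder, chunks: list[str]) -> list[tuple[str, list[float]]]:
    return list(zip(chunks, embedder.embed(chunks)))

index = build_index(LocalEmbedder(), ["quarterly revenue report", "hiring policy"])
```

Swapping in an OpenAI-backed embedder then only requires another class with the same `embed` method, so quality comparisons don't touch the rest of the pipeline.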
Personal Information Handling
Users may expect both their natural language queries and the AI responses to contain enterprise-private personal information. You may not want to share that personal information with OpenAI. Hence, removing it before sending text to the LLM and adding it back into the user-facing response is something you will need to handle.
- Remove Personal Information - We wanted to remove personal information, so we used local models to do it.
- Named Entity Recognition - This also involved using region/locale-specific models for named entity recognition in the unstructured and semi-structured data.
- Indic Model Implementation - We used an Indic model from Hugging Face to help with this.
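The redact-then-restore flow can be sketched with placeholder substitution. In our setup a local NER model supplies the spans; the toy regex below (capitalized name pairs) is only there so the example is self-contained.

```python
import re

# Hypothetical redaction flow: a local NER model would supply the PII
# spans; a toy regex for capitalized name pairs stands in for it here.
def redact(text: str) -> tuple[str, dict]:
    mapping = {}
    def repl(match):
        token = f"<PII_{len(mapping)}>"
        mapping[token] = match.group(0)  # remember original for restore
        return token
    redacted = re.sub(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", repl, text)
    return redacted, mapping

def restore(text: str, mapping: dict) -> str:
    # Patch the original values back into the LLM's response.
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

safe, mapping = redact("Asha Verma approved the invoice.")
# `safe` is what goes to the LLM; restore() is applied to the answer.
```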
Leveraging Mastra Framework
A tool/framework like Mastra is really helpful for two reasons.
- Structured Building Blocks - The framework provides a structure that helps new teams work on agentic use cases. It provides building blocks that teams can depend on when they are starting off in this new space.
- Web Console Benefits - The web console is really helpful for viewing component information and especially for testing agents in a modular fashion without involving the full application. If you are using tools, it helps with viewing tool-call sequences instead of piecing them together from the logs yourself.
There are some issues with Mastra, such as its Zod integration, and its use in a mono-repo is not stable or well-supported. But we still liked using it.
Development and Model Selection
During development one has to do a lot of testing of prompts and wiring of application components. Often the focus during development is not on final output quality but on application correctness.
Hence, in such cases, using a smaller model such as GPT's mini or nano variants may make sense.
The Critical Importance of Testing
Developers should test many different types of natural language questions. By observing the steps the AI takes, one can see how it solves each differently, reflect, and make the solution better for the end user.
During the development cycle, it is usually easy to notice that the AI is not responding satisfactorily because of the prompt, the context, a tool description, or the input/output schema of structured output. Through this process of testing and fixing, one can improve the output to a great extent. This is an essential step in development, and iterating over it is critical. It is a new step in software development that one will need to recognize.
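The iteration described above can be made repeatable with a tiny regression-style suite: run representative questions through the pipeline and assert that known facts appear in the answers. The `answer_fn` stub below stands in for the real agent call.

```python
# Minimal regression-style check for answer quality. answer_fn is a
# canned stub standing in for the real agent call.
def answer_fn(question: str) -> str:
    canned = {
        "When was invoice 1042 paid?": "Invoice 1042 was paid on 2024-03-01.",
    }
    return canned.get(question, "I don't know.")

def run_suite(cases: list[tuple[str, str]], answer) -> list[tuple[str, str]]:
    # Each case pairs a question with a substring the answer must contain.
    failures = []
    for question, must_contain in cases:
        reply = answer(question)
        if must_contain not in reply:
            failures.append((question, reply))
    return failures

cases = [("When was invoice 1042 paid?", "2024-03-01")]
failures = run_suite(cases, answer_fn)
```

Re-running such a suite after each prompt, context, or tool-description change makes regressions visible instead of anecdotal.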
QA Evolution
Similarly, we see that it is going to be very important for testers to learn the same skills and help improve the quality of the system. For agentic AI applications, testing now includes testing the quality of output, which is a new aspect introduced into the software development process.