Knowledge Ready
Master ML concepts and tech stack fundamentals. Understand the theory behind ML algorithms and production technologies.
ML/DL Frameworks (8 questions)
What is overfitting? How do you prevent it?
ML Concept: 3-5 minutes to answer
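When answering, it helps to show one concrete prevention technique. Below is a minimal numpy sketch (not from the source) of L2 regularization via closed-form ridge regression: with few samples and many features, an unregularized fit inflates the weights to chase noise, and the penalty shrinks them back.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression: solve (X^T X + alpha*I) w = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 15))             # few samples, many features: overfit risk
y = X[:, 0] + 0.1 * rng.normal(size=20)   # only the first feature matters

w_plain = ridge_fit(X, y, alpha=1e-8)     # effectively ordinary least squares
w_ridge = ridge_fit(X, y, alpha=10.0)     # regularized

# The penalty shrinks the overall weight norm, curbing overfitting.
print(np.linalg.norm(w_plain) > np.linalg.norm(w_ridge))  # True
```

Other answers worth mentioning alongside: more data, dropout, early stopping, and cross-validation.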
Explain the bias-variance tradeoff.
ML Concept: 4-6 minutes to answer
Explain backpropagation. How does it work?
ML Concept: 5-7 minutes to answer
Explain precision, recall, and F1 score. When to optimize for which?
ML Concept: 4-5 minutes to answer
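A good answer derives all three metrics from confusion-matrix counts; this small sketch (illustrative, not from the source) does exactly that.

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# 80 true positives, 20 false positives, 40 false negatives
p, r, f = prf1(80, 20, 40)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.8 0.667 0.727
```

Rule of thumb for the "when" part: optimize precision when false positives are costly (spam filtering), recall when false negatives are costly (disease screening), F1 when both matter and classes are imbalanced.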
Explain PCA (Principal Component Analysis). When and how do you use it?
ML Concept: 5-7 minutes to answer
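PCA is easy to demonstrate from scratch; this numpy sketch (an illustration, not the only formulation) computes it via SVD of the centered data, which is numerically preferable to eigendecomposing the covariance matrix.

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]           # principal directions (rows)
    projected = Xc @ components.T            # reduced representation
    explained_var = (S ** 2) / (len(X) - 1)  # variance along each direction
    return projected, explained_var[:n_components]

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
X[:, 1] = 3 * X[:, 0]                        # correlated column: effective rank drops
Z, var = pca(X, n_components=2)
print(Z.shape)  # (100, 2)
```

Typical uses to mention: dimensionality reduction before distance-based models, decorrelation, visualization, and noise filtering; always standardize features first when scales differ.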
How do you choose the number of epochs? Explain early stopping and gradient descent optimizers.
ML Concept: 5-7 minutes to answer
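The early-stopping half of this question boils down to a patience counter over validation loss. A framework-free sketch (hypothetical; real trainers like Keras or PyTorch Lightning provide callbacks for this):

```python
def train_with_early_stopping(val_losses, patience=3):
    """Stop when validation loss hasn't improved for `patience` epochs.
    `val_losses` stands in for per-epoch validation results."""
    best, best_epoch, wait = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0   # improvement: reset counter
        else:
            wait += 1
            if wait >= patience:                      # no improvement: stop
                break
    return best_epoch, best

# Loss improves until epoch 3, then plateaus -> stop, keep epoch-3 weights.
print(train_with_early_stopping([0.9, 0.7, 0.6, 0.55, 0.56, 0.57, 0.58]))
# (3, 0.55)
```

So "number of epochs" is best treated as an upper bound, with early stopping choosing the effective count; pair this with a brief comparison of SGD, momentum, and Adam for the optimizer half.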
Explain LoRA and QLoRA. How do they work and when would you use them?
LoRA/QLoRA - Tests understanding of efficient fine-tuning techniques
Tell me about your last ML project. What challenges did you face and how did you solve them?
Project Experience - Tests practical project experience and problem-solving
AI/LLM Tools (40 questions)
Compare zero-shot, few-shot, and chain-of-thought prompting. When would you use each?
ML Concept: 4-6 minutes to answer
When would you fine-tune an LLM vs using prompting?
ML Concept: 5-7 minutes to answer
Explain function calling in LLMs. How does the model decide when to use tools?
ML Concept: 5-6 minutes to answer
What causes LLM hallucinations? How do you detect and prevent them?
ML Concept: 5-7 minutes to answer
How do you evaluate LLM outputs? What metrics matter?
ML Concept: 5-7 minutes to answer
How do you manage context windows in LLM applications?
ML Concept: 4-6 minutes to answer
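One common strategy to describe is budget-based truncation: keep the system prompt plus the most recent turns that fit. A minimal sketch under stated assumptions (the word-count `count_tokens` is a stand-in; real applications use the model's tokenizer, and may also summarize dropped turns):

```python
def fit_context(messages, budget, count_tokens=lambda m: len(m.split())):
    """Keep the system message plus the most recent turns that fit the budget."""
    system, rest = messages[0], messages[1:]
    kept, used = [], count_tokens(system)
    for msg in reversed(rest):                # walk newest-first
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return [system] + list(reversed(kept))

msgs = ["you are helpful", "turn one is long " * 5, "turn two", "turn three"]
print(fit_context(msgs, budget=10))
# ['you are helpful', 'turn two', 'turn three']
```

Other techniques worth naming: rolling summarization of older turns, retrieval of only relevant history, and prompt caching.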
Compare GPT-4, Claude, and open-source LLMs. When would you use each?
ML Concept: 4-6 minutes to answer
How do you ensure LLMs output valid structured data (JSON, specific formats)?
ML Concept: 5-7 minutes to answer
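Beyond provider features like JSON mode and constrained decoding, a common application-side pattern is validate-and-retry. A sketch with a stubbed model call (`call_model` is hypothetical; any LLM client fits the slot):

```python
import json

def get_json(call_model, schema_keys, max_retries=3):
    """Ask for JSON, validate required keys, and retry on failure."""
    prompt = "Reply with JSON containing keys: " + ", ".join(schema_keys)
    for attempt in range(max_retries):
        raw = call_model(prompt)
        try:
            data = json.loads(raw)
            if all(k in data for k in schema_keys):
                return data
        except json.JSONDecodeError:
            pass
        prompt += "\nPrevious reply was invalid JSON. Try again."
    raise ValueError("model never produced valid JSON")

# Stub model: fails once, then returns valid JSON.
replies = iter(['not json', '{"name": "ada", "score": 3}'])
print(get_json(lambda p: next(replies), ["name", "score"]))
# {'name': 'ada', 'score': 3}
```

A strong answer layers these: schema in the prompt, provider-enforced structured output where available, and validation with retries as the safety net.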
AI Orchestration: How do you coordinate multiple AI agents in workflows?
ML Concept: 8-10 minutes to answer
When would you use fine-tuning vs RAG vs prompt engineering? How do you decide?
ML Concept: 5-7 minutes to answer
How do you prepare training data for fine-tuning? How much data do you need and how do you ensure quality?
ML Concept: 5-7 minutes to answer
Can you walk us through a recent AI or GenAI project you've worked on?
GenAI - Tests practical AI experience, ability to communicate technical work clearly, and understanding of end-to-end AI system development
How is AutoGen built behind the scenes? Can you build a conversational multi-agent system without it?
AutoGen - Tests deep understanding of multi-agent architectures beyond just using frameworks. Shows you understand the underlying patterns and can build from scratch if needed.
What is Google ADK (Agent Development Kit)? How does it compare to other agent frameworks?
Google ADK - Tests knowledge of Google's agent framework and ability to compare different approaches to building AI agents
What do you know about memory management in AI agents? What is LangMem?
LangMem - Tests understanding of long-term memory patterns for agents, which is critical for building personalized, context-aware AI systems that improve over time.
How do you integrate with external APIs in Python and Java? Compare the approaches and best practices.
Python/Java - Tests practical knowledge of API integration patterns, error handling, and production best practices in both languages
How do you deliver a multi-agent system to production?
Multi-Agent Systems - Tests end-to-end understanding of productionizing AI systems, including deployment, observability, error handling, and scaling considerations.
How do you think about integrating AI capabilities into existing products or systems?
AI Integration - Tests strategic thinking about AI adoption, understanding of integration patterns, and awareness of practical challenges beyond just 'adding an LLM'
What's the difference between Tools and MCP (Model Context Protocol)?
MCP - Tests understanding of LLM integration patterns and Anthropic's emerging protocol
Have you implemented an MCP server? Walk me through how you would build one.
MCP - Tests practical experience with Anthropic's MCP ecosystem
What is the A2A (Agent-to-Agent) protocol? How does it compare to MCP?
A2A - Tests awareness of emerging agent communication standards
Tell me about your LangGraph experience. How do you ensure correct results and visualize agent connections?
LangGraph - Tests practical multi-agent development experience
How do you deploy ML models in production? Walk me through your approach.
Model Deployment - Tests end-to-end MLOps understanding
What model parameters do you adjust when using LLM APIs? What API types are available?
LLM APIs - Tests practical LLM API experience
Compare GPT-4o vs GPT-4o-mini vs other models. When do you use each?
Model Selection - Tests model selection judgment and cost awareness
What vector databases have you used? Explain the different similarity metrics.
Vector Databases - Tests RAG implementation experience
You don't have customer data yet. How do you build and validate an ML system?
Data Strategy - Tests practical ML development without ideal data conditions
How do you evaluate RAG system accuracy? Have you used RAGAS?
RAG Evaluation - Tests RAG evaluation methodology
How do you evaluate LLM outputs? What metrics and methods do you use?
LLM Evaluation - Tests understanding of LLM quality assessment beyond simple accuracy
How do you detect and prevent hallucinations in LLM applications?
LLM Evaluation - Critical for production LLM systems where factual accuracy matters
How do you run A/B tests for LLM-powered features?
LLM Evaluation - Tests understanding of production ML experimentation
How do you evaluate and test an LLM-based system before deploying to production? What metrics do you track?
LLM Evaluation - Tests comprehensive understanding of LLM system quality, safety, and operational readiness
How do you handle model versioning and rollback in production? What happens if a new model performs worse than expected?
MLOps - Tests understanding of production ML lifecycle, risk management, and operational maturity
Walk me through building a production fine-tuning pipeline from start to finish. What are the key steps?
Fine-Tuning - Tests end-to-end production ML experience - interviewers want to know you can deliver a complete, production-ready fine-tuned model, not just run training scripts
What is a knowledge graph and how does it differ from a traditional relational database?
Knowledge Graphs - Tests understanding of graph-based data structures and their advantages for representing complex relationships - critical for RAG systems, entity resolution, and semantic search
What is an ontology in the context of Knowledge Graphs? Why is it important?
Knowledge Graphs - Tests deeper understanding of knowledge representation - ontologies are the schema/contract that makes knowledge graphs semantically meaningful and interoperable
How can Knowledge Graphs help reduce hallucinations in LLM applications?
Knowledge Graphs - Tests practical understanding of grounding LLMs with structured knowledge - a critical production concern as hallucinations can cause real business damage
Can you explain what MCP (Model Context Protocol) is and why Anthropic created it? What problem does it solve?
MCP - Tests awareness of emerging standards in AI tooling and understanding of the integration challenges MCP addresses - increasingly relevant as AI agents become more sophisticated
Briefly describe an AI-powered system or feature you've shipped to production. What problem did it solve, and what was your role?
AI/LLM - Tests real-world AI experience, ability to communicate impact clearly, and understanding of end-to-end AI system delivery
When would you choose A2A over LangGraph?
A2A/LangGraph - Tests understanding of multi-agent architecture patterns and when to use distributed vs orchestrated approaches
MLOps & Production (7 questions)
Tell me about a time you had to design an API or service. How did you approach it and what decisions did you make around scalability and maintainability?
System Design - Tests end-to-end system design thinking, ability to make architectural decisions, and understanding of production concerns like scalability and maintainability
Have you built and maintained production-grade ML pipelines and implemented privacy/security controls for sensitive data?
MLOps - Tests real-world MLOps experience and understanding of data privacy/security requirements critical for enterprise ML systems
What performance/load testing tools have you used? How do you create realistic load tests?
Load Testing - Tests understanding of production readiness and ability to validate system performance before deployment
What do you do when traffic is too high? How do you handle traffic spikes?
Scaling - Tests ability to design resilient systems and handle production incidents
How do you identify and fix slow endpoints?
Performance Optimization - Tests debugging skills and systematic approach to performance optimization
What challenges have you faced in full-stack development and deployment? How did you solve them?
Full-Stack Deployment - Tests real-world experience with end-to-end system development and production deployment
Describe a technical tradeoff you had to make when building or scaling a system (AI-related or otherwise). What were the constraints, and how did you decide?
System Design - Tests real-world engineering judgment, ability to analyze constraints, and communicate technical decisions clearly
Data Science (8 questions)
How do you approach feature engineering for ML models? Walk me through your process.
Feature Engineering - Feature engineering is often the biggest driver of model performance. Tests practical ML experience.
How do you ensure data quality in production ML pipelines? What tools and practices do you use?
Data Quality - Data quality issues are the #1 cause of ML system failures. Tests production readiness.
Explain your approach to data aggregation for analytics and ML features. How do you handle different time windows?
Data Aggregation - Aggregations are fundamental for feature engineering and analytics. Tests SQL and data modeling skills.
How do you validate data at different stages of a pipeline? What tools do you use?
Data Validation - Data validation prevents garbage-in-garbage-out. Tests understanding of data pipeline best practices.
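A concrete answer can sketch a per-stage schema check before naming tools like Great Expectations or pandera. This minimal version (an assumption-laden illustration, not any specific library's API) validates types and required columns and collects errors rather than failing on the first one:

```python
def validate_batch(rows, schema):
    """Check each row against {column: (type, required)} and collect errors."""
    errors = []
    for i, row in enumerate(rows):
        for col, (typ, required) in schema.items():
            value = row.get(col)
            if value is None:
                if required:
                    errors.append(f"row {i}: missing required column '{col}'")
            elif not isinstance(value, typ):
                errors.append(f"row {i}: '{col}' should be {typ.__name__}")
    return errors

schema = {"user_id": (int, True), "amount": (float, True), "note": (str, False)}
rows = [{"user_id": 1, "amount": 9.5}, {"user_id": "x", "amount": 2.0}]
print(validate_batch(rows, schema))  # ["row 1: 'user_id' should be int"]
```

Mention running checks at each stage (ingestion, post-transform, pre-serving) so bad data is caught closest to its source.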
How do you optimize slow SQL queries for large datasets? Walk me through your debugging process.
SQL - SQL optimization is critical for data engineering. Tests practical database performance skills.
Compare ETL vs ELT approaches. When would you use each?
Data Pipelines - Understanding data pipeline architectures is fundamental for data engineering roles.
How do you handle missing data in ML pipelines? What are the different strategies?
Data Preprocessing - Missing data is ubiquitous. How you handle it significantly impacts model performance.
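The main strategies (deletion, mean/median imputation, missingness indicators, model-based imputation) are easy to illustrate; here is a numpy sketch of two of them (illustrative only; scikit-learn's imputers cover the production cases):

```python
import numpy as np

def impute(X, strategy="mean"):
    """Column-wise imputation of NaNs: mean, median, or a missing-indicator."""
    X = X.astype(float).copy()
    mask = np.isnan(X)
    if strategy == "indicator":               # keep missingness itself as a feature
        filled = np.where(mask, 0.0, X)
        return np.hstack([filled, mask.astype(float)])
    fill = np.nanmean(X, axis=0) if strategy == "mean" else np.nanmedian(X, axis=0)
    X[mask] = np.take(fill, np.nonzero(mask)[1])
    return X

X = np.array([[1.0, np.nan], [3.0, 4.0], [np.nan, 8.0]])
print(impute(X, "mean"))
# [[1. 6.], [3. 4.], [2. 8.]]
```

A strong answer also notes that imputation statistics must be fit on training data only and reused at inference time, or the pipeline leaks.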
How do you detect and handle data drift in production ML systems?
MLOps - Data drift is a primary cause of model degradation. Tests production ML maturity.
Programming Languages (12 questions)
When do you use Design Patterns in your programming? Which ones are your 'go-to' or favorite patterns?
Software Engineering - Tests understanding of software architecture principles, code organization, and ability to apply proven solutions to common problems. Shows maturity in software design.
How do you make sure high traffic is possible in FastAPI? How do you make it scalable?
FastAPI - Tests understanding of horizontal vs vertical scaling, stateless design, and architectural patterns for building systems that can grow with demand.
What are the differences between FastAPI and Flask? When would you choose one over the other?
Python Web Frameworks - Tests understanding of Python web frameworks, async programming, and ability to make architectural decisions for API development.
What types of caches exist in a system?
System Design - Tests understanding of system architecture, performance optimization, and ability to design scalable systems. Caching is fundamental to building performant applications.
How do you handle high traffic in FastAPI?
FastAPI - Tests understanding of scalability, async programming, caching, and production deployment strategies. Critical for building robust ML/API services.
You mention 'Python (Advanced)' on your resume. Explain Python's GIL and how it affects multi-threading in ML workloads.
Python - Tests deep Python knowledge and understanding of concurrency
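A small demonstration helps anchor the GIL discussion. This sketch (illustrative; the numbers are arbitrary) runs pure-Python CPU-bound work in a thread pool, which the GIL effectively serializes:

```python
from concurrent.futures import ThreadPoolExecutor

def cpu_bound(n):
    """Pure-Python arithmetic: holds the GIL, so threads can't run it in parallel."""
    return sum(i * i for i in range(n))

# Threads help I/O-bound work (the GIL is released while waiting on I/O),
# but CPU-bound Python code like this is effectively serialized by the GIL.
# For CPU-bound ML preprocessing, reach for multiprocessing or native-code
# libraries (numpy, etc.) that release the GIL inside C routines.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(cpu_bound, [10_000] * 4))

print(results[0] == sum(i * i for i in range(10_000)))  # True
```

The key interview point: in ML workloads, most heavy lifting happens in GIL-releasing native code, so threads are often fine for serving, while multiprocessing suits CPU-bound Python preprocessing.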
Explain Python's type hints and how you use them in production ML code. Why are they important?
Python - Tests modern Python practices and code quality
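A short typed example makes the "why" concrete: hints document the contract and let mypy/pyright catch misuse before runtime. (A minimal sketch; the names are hypothetical.)

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    label: str
    score: float

def top_prediction(scores: dict[str, float]) -> Prediction:
    """The annotations are the contract: callers pass label->score, get a Prediction."""
    label, score = max(scores.items(), key=lambda kv: kv[1])
    return Prediction(label=label, score=score)

print(top_prediction({"cat": 0.7, "dog": 0.3}))
# Prediction(label='cat', score=0.7)
```

In production ML code this pays off at model I/O boundaries, where frameworks like pydantic and FastAPI also use the annotations for runtime validation and API schemas.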
Walk me through your code structure for a production ML API. What design patterns do you use?
Python - Tests software engineering maturity and production experience
How does async work in Python and Java? Compare their async models.
Python/Java - Tests understanding of asynchronous programming across languages
How does concurrency work in Python and Java? Explain the key differences.
Python/Java - Tests understanding of concurrent programming models and trade-offs
How do you implement a production service in Python and Java? Walk me through the key components.
Python/Java - Tests practical experience building production backend services
Your ML interview website is built with TypeScript. Why TypeScript over JavaScript for this project?
TypeScript - Tests understanding of TypeScript benefits and frontend architecture
Web & APIs (3 questions)
You mention React experience from Android/React Native. How did you apply that to building web interfaces for your ML projects?
React - Tests ability to transfer skills and build production UIs
Why did you choose Next.js for your ML interview website instead of plain React?
Next.js - Tests understanding of framework tradeoffs and SSR/SSG
You've built 4 production FastAPI services. What are your REST API design principles?
REST APIs - Tests API design maturity and best practices
Databases (3 questions)
Have you used Google BigQuery? How do you optimize queries and manage costs in BigQuery?
BigQuery - Tests experience with cloud data warehouses and understanding of BigQuery's unique architecture, pricing model, and optimization techniques
You use pgvector for semantic search in your RAG chatbot. How does it work and why PostgreSQL over a dedicated vector DB?
PostgreSQL - Tests understanding of vector search and database tradeoffs
You mention 35% cache hit rate with Redis. Walk me through your caching strategy.
Redis - Tests understanding of caching strategies and optimization
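The pattern behind most such answers is cache-aside with a TTL. A self-contained sketch where a plain dict stands in for Redis (the class and names are hypothetical; real code would use `redis-py` with `SETEX`/`GET`):

```python
import time

class TTLCache:
    """Cache-aside with per-key TTL; a plain dict stands in for Redis here."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expiry timestamp)

    def get_or_compute(self, key, compute):
        entry = self.store.get(key)
        if entry and entry[1] > time.time():       # hit: value still fresh
            return entry[0], True
        value = compute()                          # miss: fall through to source
        self.store[key] = (value, time.time() + self.ttl)
        return value, False

cache = TTLCache(ttl_seconds=60)
v1, hit1 = cache.get_or_compute("user:42", lambda: "expensive result")
v2, hit2 = cache.get_or_compute("user:42", lambda: "expensive result")
print(hit1, hit2)  # False True
```

A strong answer then covers what drives the hit rate: key design, TTL tuning against staleness tolerance, and which queries are worth caching at all.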
NLP & Computer Vision (5 questions)
What is the attention mechanism in transformers?
ML Concept: 5-7 minutes to answer
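Scaled dot-product attention fits in a few lines of numpy, which is a strong way to close this answer (single-head, no masking; real transformers add multi-head projections and causal masks):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out, w = attention(Q, K, V)
print(out.shape, np.allclose(w.sum(axis=-1), 1.0))  # (4, 8) True
```

The sqrt(d_k) scaling keeps the logits from saturating the softmax as dimensionality grows, which is a detail interviewers often probe.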
Explain RAG (Retrieval-Augmented Generation). When and how would you use it?
ML Concept: 5-7 minutes to answer
You fine-tuned BERT for sentiment analysis (89% F1). Explain the fine-tuning process step-by-step.
BERT - Tests practical NLP experience and understanding of transfer learning
You used ResNet50 for image classification (94% accuracy). Why ResNet over other architectures?
ResNet - Tests understanding of CV architectures and transfer learning
You implemented Grad-CAM for model interpretability. How does it work and why is it useful?
Grad-CAM - Tests understanding of model interpretability and explainability