🤖 Machine Learning Interview Questions & Answers (2025)
Basic Level Questions
What is Machine Learning? Machine Learning is a branch of AI in which systems learn patterns from data instead of following explicitly programmed rules.
Types of Machine Learning? Supervised, Unsupervised, Semi-supervised, and Reinforcement Learning.
What is supervised learning? A model is trained on labeled data (inputs paired with known outputs) to predict outcomes for new, unseen inputs.
What is unsupervised learning? A model is trained on unlabeled data to find structure such as clusters or associations.
Define reinforcement learning. An agent learns to make decisions by taking actions in an environment and receiving rewards, with the goal of maximizing cumulative reward.
What is overfitting? When a model performs well on training data but poorly on new data because it has memorized noise instead of learning general patterns.
What is underfitting? When a model is too simple to capture data patterns, resulting in poor performance on both training and test sets.
Examples of ML applications? Spam detection, recommendation systems, fraud detection, speech recognition, medical diagnosis.
What is a model in ML? A mathematical representation of learned patterns that can make predictions on new inputs.
Difference between AI and ML? AI is the broad field of building intelligent machines; ML is a subset focused on learning from data.
Intermediate Level Questions
Explain decision trees. A model that splits data into branches based on feature conditions until it reaches a leaf that gives the decision or prediction.
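How a tree chooses a split can be illustrated with an impurity measure. The sketch below uses Gini impurity, which is one common criterion and an assumption here (the answer above does not name one); the function names are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_impurity(left, right):
    """Size-weighted impurity of the two branches produced by a candidate split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
```

A split that sends all of one class left and all of the other right has weighted impurity 0, so a tree builder using this criterion would prefer it over a split that leaves both branches mixed.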
What is Random Forest? An ensemble method that trains many decision trees on bootstrap samples with random feature subsets and combines their predictions (majority vote or average) to improve accuracy and reduce overfitting.
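What is the bias-variance trade-off? Balancing error from bias (overly simple assumptions, leading to underfitting) against error from variance (sensitivity to the particular training data, leading to overfitting) by choosing an appropriate model complexity.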
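What is logistic regression used for? Classification. The binary form passes a linear combination of the features through the logistic (sigmoid) function to produce a probability; the multi-class (multinomial) form uses softmax instead.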
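A minimal sketch of the binary case, assuming the weights and bias have already been learned; the function and parameter names are illustrative, not from any particular library.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Probability of the positive class for one example."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def predict_label(weights, bias, features, threshold=0.5):
    """Turn the probability into a 0/1 label using a decision threshold."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0
```

The 0.5 threshold is only a default assumption; in practice it is often tuned to trade precision against recall.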
Difference between classification and regression. Classification predicts categories; regression predicts continuous values.
What is regularization? A technique to prevent overfitting by adding a penalty on model complexity to the loss function, e.g., L1 (sum of absolute weights) or L2 (sum of squared weights).
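As a rough sketch, the penalty is simply added to the data loss. Mean squared error is used here as an assumed base loss, and `lam` (the penalty strength) is a hypothetical hyperparameter name.

```python
def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def l1_penalty(weights, lam):
    """L1 (lasso-style) penalty: lam * sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 (ridge-style) penalty: lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)

def regularized_loss(y_true, y_pred, weights, lam, kind="l2"):
    """Data loss plus the chosen complexity penalty."""
    penalty = l1_penalty(weights, lam) if kind == "l1" else l2_penalty(weights, lam)
    return mse(y_true, y_pred) + penalty
```

Larger `lam` pushes the optimizer toward smaller weights; L1 tends to drive some weights exactly to zero, while L2 shrinks them smoothly.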
What is feature scaling? Standardizing (zero mean, unit variance) or normalizing (rescaling to a fixed range such as [0, 1]) input features to improve training stability and convergence speed.
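Both variants can be sketched for a single feature column using only the standard library. Returning zeros for a constant column is an assumption made here to avoid division by zero, not a universal convention.

```python
import statistics

def min_max_scale(values):
    """Normalize values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - mean) / std for v in values]
```

In a real pipeline the minimum, maximum, mean, and standard deviation would be computed on the training set only and then reused for validation and test data.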
What is PCA? Principal Component Analysis reduces dimensionality by projecting features onto uncorrelated (orthogonal) components ordered by the variance they capture, then keeping the first few.
What are support vector machines? Classifiers that find the optimal hyperplane, i.e., the one that separates the classes with the maximum margin.
What is a neural network? A set of interconnected nodes (neurons) arranged in layers, loosely inspired by biological neurons, whose weights are trained to learn patterns in data.
Explain k-means clustering. An unsupervised algorithm that partitions data into k clusters by minimizing within-cluster variance (the sum of squared distances from each point to its cluster centroid).
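The usual way to minimize that objective is to alternate between assigning points to the nearest centroid and recomputing centroids (Lloyd's algorithm). The sketch below works on one-dimensional data to stay short; the fixed iteration count and caller-supplied initial centroids are simplifying assumptions.

```python
def kmeans_1d(points, initial_centroids, iterations=10):
    """Simplified k-means on 1-D data with a fixed number of iterations."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: (p - centroids[i]) ** 2)
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids

centers = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
```

With these inputs the centroids settle at the means of the two obvious groups. Results generally depend on initialization, which is why practical implementations run several random restarts.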
What is cross-validation? A method to evaluate model performance by repeatedly training on some subsets of the data and testing on the held-out remainder, e.g., k-fold cross-validation.
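A sketch of how k-fold splitting might generate index sets, assuming no shuffling or stratification; `k_fold_indices` is a hypothetical helper name.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs covering all samples."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds

folds = k_fold_indices(5, 2)
```

Each sample appears in exactly one test fold; the model is trained k times and the k scores are averaged.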
What is gradient descent? An iterative optimization algorithm that reduces the loss by updating weights in the direction opposite to the gradient, scaled by the learning rate.
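The update rule is w ← w − learning_rate × gradient. A minimal sketch on a one-parameter toy loss, f(w) = (w − 3)², chosen here purely for illustration:

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w

# Toy loss f(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

On this loss each step shrinks the distance to the minimum by a constant factor; too large a learning rate (here, above 1.0) would make the iterates diverge instead.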
What are hyperparameters? Settings such as learning rate, batch size, and number of epochs that are chosen before training starts and are not learned from the data.
What is a confusion matrix? A table of predicted versus actual classes showing counts of correct and incorrect predictions (true/false positives and negatives), used to calculate precision, recall, etc.
Define precision and recall. Precision: proportion of positive predictions that are actually positive, TP / (TP + FP); Recall: proportion of actual positives that are correctly predicted, TP / (TP + FN).
What is F1-score? The harmonic mean of precision and recall, 2PR / (P + R), useful for imbalanced datasets.
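The confusion-matrix counts and the three metrics above can be sketched together for binary labels (1 = positive, 0 = negative). Returning 0.0 when a denominator is zero is an assumption made here; libraries differ on that edge case.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Here 2 of the 4 positive predictions are correct (precision 0.5) and 2 of the 3 actual positives are found (recall 2/3), so F1 sits between the two, closer to the lower value.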
What is a ROC curve? A plot of true positive rate against false positive rate across classification thresholds, used to visualize performance trade-offs.
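Each point on the curve corresponds to one threshold applied to the model's scores. A small sketch, assuming binary labels, at least one example of each class, and a caller-supplied list of thresholds:

```python
def roc_points(y_true, scores, thresholds):
    """Return a list of (false_positive_rate, true_positive_rate) pairs."""
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    points = []
    for threshold in thresholds:
        # Predict positive whenever the score reaches the threshold.
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold)
        fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points

curve = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8], [0.0, 0.5, 1.0])
```

Lowering the threshold moves along the curve toward (1, 1), catching more positives at the cost of more false alarms.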
What is ensemble learning? Combining multiple models to produce better predictive performance than a single model.
Advanced Level Questions
Explain XGBoost. Extreme Gradient Boosting is an efficient, scalable implementation of gradient boosted decision trees, known for high performance in ML competitions.
What are bagging and boosting? Bagging trains models independently (so they can run in parallel) on bootstrap samples of the data and aggregates their predictions; boosting trains models sequentially, each one correcting the errors of its predecessors.
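The two building blocks of bagging, resampling and aggregation, can be sketched as follows. Sampling with replacement and majority voting are the standard choices for classification; the helper names and the fixed seed are assumptions for illustration, and the base models themselves are left out.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Aggregate class predictions from several models by taking the most common."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)  # seeded so the sketch is reproducible
data = [1, 2, 3, 4, 5]
sample = bootstrap_sample(data, rng)
```

Each base model would be trained on its own bootstrap sample, and `majority_vote` would then combine their predictions for a new input.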
What is stacking in ensemble learning? Feeding the predictions of multiple base models into a meta-model that learns how to combine them to improve accuracy.
What is the curse of dimensionality? As the number of features grows, data becomes sparse in the feature space, making models prone to overfitting and computation slow.
What are word embeddings in ML? Vector representations of words capturing semantic meaning, useful in NLP tasks.
Explain reinforcement learning algorithms. Examples include Q-learning, Deep Q-Networks, and Policy Gradient methods, in which agents learn optimal actions through rewards.
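For Q-learning specifically, the core is a single update applied after every transition: Q(s, a) ← Q(s, a) + α [r + γ max Q(s', ·) − Q(s, a)]. A tabular sketch, with the table stored as nested dictionaries and the state and action names invented for the example:

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update in place and return the new Q(state, action)."""
    # Value of the best action in the next state (0.0 if the state is unknown or terminal).
    next_values = q.get(next_state, {})
    best_next = max(next_values.values()) if next_values else 0.0
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]

q_table = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 1.0, "right": 2.0},
}
new_value = q_update(q_table, "s0", "right", reward=1.0, next_state="s1")
```

Here `alpha` is the learning rate and `gamma` the discount factor. Deep Q-Networks replace the table with a neural network that approximates the same values.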
What is deep reinforcement learning? Combines deep neural networks with reinforcement learning to make decisions in complex environments.
What is transfer learning in ML? Using a model trained on one task as a starting point for a different but related task, reducing the required data and training time.
Difference between batch and online learning. Batch learning trains on the whole dataset at once; online learning updates the model incrementally as new data arrives.
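The contrast can be shown with the simplest possible "model", an estimate of the mean. This is a deliberately tiny stand-in chosen for illustration; a real online learner would update model weights per example in the same incremental way.

```python
def batch_mean(values):
    """Batch style: needs the whole dataset available at once."""
    return sum(values) / len(values)

class OnlineMean:
    """Online style: sees one value at a time and keeps only a running summary."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        # Move the estimate toward the new value by a shrinking step size.
        self.mean += (value - self.mean) / self.count
        return self.mean

stream = [2.0, 4.0, 9.0]
online = OnlineMean()
for x in stream:
    online.update(x)
```

After the full stream both approaches agree, but the online version never had to store the data and had a usable estimate after every single observation.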
Explain model interpretability. Understanding how a model arrives at its decisions, using techniques such as SHAP and LIME to support transparency and trust.
What is anomaly detection? Identifying rare instances in data that differ significantly from the majority, e.g., fraud detection.
What is learning rate scheduling? Adjusting the learning rate over epochs (typically decreasing it) to improve convergence and avoid overshooting minima.
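Two common schedules, step decay and exponential decay, sketched as plain functions of the epoch number. The default values for `drop`, `epochs_per_drop`, and `decay_rate` are arbitrary assumptions.

```python
import math

def step_decay(initial_lr, epoch, drop=0.5, epochs_per_drop=10):
    """Multiply the learning rate by `drop` once every `epochs_per_drop` epochs."""
    return initial_lr * drop ** (epoch // epochs_per_drop)

def exponential_decay(initial_lr, epoch, decay_rate=0.1):
    """Shrink the learning rate smoothly: lr = initial_lr * exp(-decay_rate * epoch)."""
    return initial_lr * math.exp(-decay_rate * epoch)
```

With the defaults, `step_decay(0.1, epoch)` stays at 0.1 for epochs 0 to 9, halves at epoch 10, and halves again at epoch 20.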
What are generative models? Models that learn the distribution of data to generate new, similar samples, e.g., GANs, VAEs.
What is model deployment? The process of integrating a trained ML model into a production environment so it can serve real-world predictions.
Explain federated learning. A decentralized approach where models are trained locally on multiple devices and only model updates are shared and aggregated, so raw data is never centralized.
What is adversarial ML? The study of attacks in which inputs are deliberately manipulated to fool models, and of defenses against them; it highlights security vulnerabilities in ML systems.
What is real-time inference? Making predictions with low latency as new data arrives, critical for applications like fraud detection.
Explain concept drift. When the relationship between input features and the target variable changes over time, so a model trained on older data degrades and must be adapted or retrained.
What is meta-learning? “Learning to learn”: developing models that can adapt to new tasks quickly from minimal data.