Linear Algebra

Linear algebra is foundational to machine learning because many ML algorithms, including deep learning, rely on vector and matrix operations.
Linear algebra also helps in data compression and efficient computation.

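Eigenvectors of a linear transformation are the nonzero vectors that the transformation only scales: they stay on the same line through the origin (and flip direction if the scale factor is negative).
Eigenvalues are the corresponding scale factors: for a matrix A, an eigenvector v and its eigenvalue λ satisfy Av = λv.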
They are important in PCA, spectral clustering, and other ML algorithms.
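
As a minimal sketch of the definition above, the following uses power iteration to approximate the dominant eigenvalue and eigenvector of a small matrix, then checks that Av and λv agree. The matrix A, the starting vector, and the function names are assumptions made for illustration; in practice a numerical library would normally be used instead.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def power_iteration(matrix, iterations=200):
    """Approximate the dominant eigenvalue and a unit eigenvector."""
    # Arbitrary non-zero starting vector (an assumption of this sketch).
    vector = [float(i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # For a unit vector v, the Rayleigh quotient v.(Av) estimates the eigenvalue.
    eigenvalue = sum(x * y for x, y in zip(mat_vec(matrix, vector), vector))
    return eigenvalue, vector


A = [[2.0, 1.0], [1.0, 2.0]]  # hypothetical symmetric matrix
eigenvalue, eigenvector = power_iteration(A)
scaled = mat_vec(A, eigenvector)  # A v, which should match eigenvalue * v
```

Power iteration only finds the eigenvalue of largest magnitude, which is enough to show the "scale without changing direction" property.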

Statistics

Knows how to set up A/B statistical tests from concrete business problems.
Knows which statistical tests to perform based on the problem.
Knows how to do hypothesis testing, e.g. a chi-square test for BCR, or a t-test or z-test for revenue.
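
A chi-square test on a rate metric might look like the sketch below, which tests whether conversion is independent of variant in a 2x2 table. The counts are hypothetical, no continuity correction is applied, and the p-value shortcut is valid only for one degree of freedom.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square tail probability equals
    # the two-sided normal tail: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are variants, columns are (converted, not converted).
observed = [[120, 880], [150, 850]]
statistic, p_value = chi_square_2x2(observed)
```

For larger tables or small expected counts, a statistics library routine or an exact test would be the safer choice.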

Understands and can explain different approaches at a high level (hypothesis testing, belief distributions) and some common drawbacks (p-hacking, computational requirements).

Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event. It’s used in ML for probabilistic models like Naive Bayes, Bayesian networks, etc.
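
To make the theorem concrete, the sketch below computes P(H | E) = P(E | H) P(H) / P(E) for a single binary hypothesis. The spam-filter numbers and the function name are hypothetical, chosen only to show how a small prior limits the posterior.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) from P(H), P(E | H), and P(E | not H) via Bayes' theorem."""
    # Total probability of the evidence across both hypotheses.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical spam filter: 1% of messages are spam, a given word appears
# in 90% of spam and in 5% of legitimate messages.
p_spam_given_word = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
```

Even with a strong likelihood ratio, the posterior stays well below 50% here because the prior is so low.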

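A Type I error is a false positive: we reject a null hypothesis that is actually true. Its probability is the significance level α.
A Type II error is a false negative: we fail to reject a null hypothesis that is actually false. Its probability is β, and 1 − β is the power of the test.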
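
The two error rates can be illustrated with a two-sided one-sample z-test, sketched below under the assumption of a known standard deviation. The effect size, sigma, and sample size are hypothetical; with a true effect of zero the rejection probability is the Type I error rate, and with a nonzero effect one minus the rejection probability is the Type II error rate.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test for a true mean shift `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma
    # Probability that the test statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


alpha = 0.05
# With no true effect, every rejection is a false positive.
type_i_rate = z_test_power(effect=0.0, sigma=1.0, n=50, alpha=alpha)
# With a true effect, failing to reject is a false negative.
power = z_test_power(effect=0.5, sigma=1.0, n=50, alpha=alpha)
type_ii_rate = 1 - power
```

Increasing n or the effect size raises the power and so lowers the Type II error rate, while the Type I rate stays fixed at alpha.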

Understands the importance of sample size and effect magnitude and can translate this to a given context. Is cognizant of p-value hacking and the proper interpretation of p-values within the context of decision making.
The p-value is the probability under the null hypothesis of obtaining a result equal to or more extreme than what was actually observed. If a p-value is 0.06, it means there’s a 6% chance of observing a result at least as extreme given that the null hypothesis is true. It is not the probability that the null hypothesis itself is true.
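
The definition above can be sketched for a z statistic, assuming the statistic is standard normal under the null hypothesis. The observed value 1.88 is hypothetical, picked because it yields a p-value close to the 0.06 used in the text.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability, under the null, of a statistic at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical observed statistic; z = 1.88 gives a p-value close to 0.06.
p_value = two_sided_p_value(1.88)
reject_at_5_percent = p_value < 0.05
```

At a 5% significance level this result would not lead to rejection, although a borderline value like this is exactly where sample size and effect magnitude should inform the decision.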

T-test: compare averages between two groups
Chi-square test: for categorical data
ANOVA (Analysis of Variance): to determine difference in means for > 2 groups
Mann-Whitney U test: non-parametric alternative to the independent two-sample t-test
Wilcoxon signed-rank test: non-parametric alternative to the paired t-test
Kruskal-Wallis test: non-parametric alternative to ANOVA
Other non-parametric methods include bootstrapping, permutation tests, etc.
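
As one example of the non-parametric methods listed above, here is a minimal sketch of a two-sided permutation test for a difference in means. The data, the number of resamples, and the seed are hypothetical; the test makes no normality assumption and instead compares the observed difference with differences obtained by shuffling group labels.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=2000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Shuffling the pooled data reassigns group labels at random.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Hypothetical measurements for two groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [11.0, 12.0, 13.0, 14.0, 15.0]
p_value = permutation_test(group_a, group_b)
```

The cost is computational: accuracy of the estimated p-value improves with the number of resamples.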

Assuming the acceptance rates are approximately normally distributed, we could use an independent two-sample t-test to test the hypothesis if we have two independent groups of bookings (cash vs. go-pay).
If we have paired samples, for instance, the same drivers accepting both cash and go-pay bookings, a paired t-test would be appropriate.
Recognises correlation does not necessarily imply causation (e.g. cities with low go-pay penetration might have higher acceptance rates).
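
The difference between the independent and paired designs above can be sketched by computing both t statistics on the same data. The acceptance rates are hypothetical, and the independent version uses Welch's form (unequal variances) as an assumption. Only the statistics are computed here; turning them into p-values needs the Student's t distribution, for which a library such as SciPy (scipy.stats.ttest_ind, scipy.stats.ttest_rel) would typically be used.

```python
from math import sqrt
from statistics import mean, stdev, variance


def welch_t_statistic(sample_a, sample_b):
    """t statistic for two independent samples (Welch, unequal variances)."""
    standard_error = sqrt(variance(sample_a) / len(sample_a)
                          + variance(sample_b) / len(sample_b))
    return (mean(sample_b) - mean(sample_a)) / standard_error


def paired_t_statistic(before, after):
    """t statistic for paired samples, computed on per-pair differences."""
    differences = [b - a for a, b in zip(before, after)]
    return mean(differences) / (stdev(differences) / sqrt(len(differences)))


# Hypothetical acceptance rates for the same four drivers.
cash = [0.60, 0.70, 0.80, 0.90]
go_pay = [0.62, 0.73, 0.82, 0.93]
t_independent = welch_t_statistic(cash, go_pay)
t_paired = paired_t_statistic(cash, go_pay)
```

When drivers differ a lot from each other but each driver shifts by a similar amount, the paired statistic is much larger because pairing removes the between-driver variation.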

An A/B test is an experiment where two versions (A and B) are compared, which are identical except for one variation that might impact a user’s behavior.
The answer should include setting up the experiment, selecting the target audience, determining the key metric, and statistical analysis of the result.
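
The planning and analysis steps above could be sketched as follows for a conversion-rate metric. The baseline and target rates, the counts, and the function names are hypothetical; the sample-size formula is the usual normal approximation for comparing two proportions, and the analysis is a pooled two-proportion z-test.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate users needed per variant to detect the given lift."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    variance_sum = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return ceil((z_alpha + z_power) ** 2 * variance_sum
                / (p_treatment - p_control) ** 2)


def two_proportion_z_test(conversions_a, n_a, conversions_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    rate_a, rate_b = conversions_a / n_a, conversions_b / n_b
    # Under the null both variants share one pooled conversion rate.
    pooled = (conversions_a + conversions_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical plan: detect a lift from 10% to 12% conversion.
n_required = sample_size_per_group(0.10, 0.12)
# Hypothetical outcome after running the experiment.
z, p_value = two_proportion_z_test(400, 4000, 480, 4000)
```

Fixing the sample size before the experiment starts, and analysing only once it is reached, is one simple guard against p-hacking through repeated peeking.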