DL Concepts

What are the various parameters to consider when tuning a specific deep learning model?

Ask the candidate how they tune a neural network model, especially one with many degrees of freedom.
Have them walk through which parameters were fixed and which were tuned.
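
For reference, a minimal sketch of one answer a candidate might describe: hold some settings fixed and random-search the rest. The parameter names, ranges, and the `toy_validation_loss` stand-in are illustrative assumptions, not a recommended search space.

```python
import math
import random

# Assumption: these settings are held fixed for every trial...
FIXED = {"optimizer": "adam", "epochs": 20}

# ...and these are sampled fresh for each trial (hypothetical ranges).
SEARCH_SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-5, -1),  # log-uniform
    "batch_size": lambda rng: rng.choice([32, 64, 128, 256]),
    "dropout": lambda rng: rng.uniform(0.0, 0.5),
    "num_layers": lambda rng: rng.randint(2, 6),
}


def sample_config(rng):
    """Combine the fixed settings with one random draw of the tuned ones."""
    config = dict(FIXED)
    for name, sampler in SEARCH_SPACE.items():
        config[name] = sampler(rng)
    return config


def random_search(evaluate, num_trials=20, seed=0):
    """Return the sampled config with the lowest score from `evaluate`."""
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(num_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_validation_loss(config):
    """Stand-in for 'train the model, return validation loss' (made up)."""
    return abs(math.log10(config["learning_rate"]) + 3) + config["dropout"]


best_config, best_score = random_search(toy_validation_loss, num_trials=50)
```

A strong candidate will usually explain why some parameters were worth fixing (cost, prior knowledge) and why others were searched on a log scale.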

Specific model’s architecture

Ask the candidate to walk through how they determined a model’s architecture and which inductive biases it introduces that make it well suited to specific problems.

What are activation functions?

The candidate understands why non-linearity is needed and can name a few functions: ReLU, tanh, sigmoid, softmax, etc.
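
For reference, a small sketch of the functions named above in plain Python, plus a scalar illustration of why non-linearity matters. The scalar "layers" and their weights are simplifying assumptions for illustration only.

```python
import math


def relu(x):
    """Rectified linear unit: passes positives through, zeroes negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    """Hyperbolic tangent, output in (-1, 1)."""
    return math.tanh(x)


def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Without an activation, two stacked linear "layers" collapse into one:
w1, w2, x = 0.5, -3.0, 2.0
stacked_linear = w2 * (w1 * x)
single_linear = (w2 * w1) * x      # the same function of x

# With ReLU in between, the result is no longer a single linear map.
stacked_relu = w2 * relu(w1 * x)
```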

If you encounter exploding gradient convergence issues, how would you diagnose and deal with this?

Diagnosis: NaN loss. The candidate understands the need to scale features to between 0 and 1 or to apply batch normalization.
Other techniques include gradient clipping and appropriate weight initialization.
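
For reference, a minimal sketch of gradient clipping by global norm with a non-finite check, in plain Python. Gradients are represented as nested lists, and the function names and threshold are assumptions for illustration; deep learning frameworks ship their own utilities for this.

```python
import math


def global_norm(grads):
    """L2 norm over all gradient values across all layers."""
    return math.sqrt(sum(g * g for layer in grads for g in layer))


def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients so their global norm is at most max_norm.

    Returns (clipped_grads, original_norm). Raises if the norm is NaN or
    infinite, which is the usual symptom behind a NaN loss.
    """
    norm = global_norm(grads)
    if not math.isfinite(norm):
        raise ValueError("non-finite gradient norm; gradients have exploded")
    if norm <= max_norm:
        return grads, norm
    scale = max_norm / norm
    return [[g * scale for g in layer] for layer in grads], norm


# Two hypothetical layers whose combined gradient norm is 5.0.
example_grads = [[3.0, 4.0], [0.0]]
clipped, original_norm = clip_by_global_norm(example_grads, max_norm=1.0)
```

Logging the pre-clipping norm at each step is a cheap way to see whether gradients are growing well before the loss turns into NaN.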

If you encounter vanishing gradient convergence issues, e.g. a loss that stops improving, how would you diagnose and deal with this?

Diagnosis: the loss fails to converge / gradients (and therefore weight updates) in the early layers are close to 0.
Solution: reduce the number of layers and use appropriate weight initialization.
Adopt the ReLU activation function. For sequential models, use an LSTM rather than a plain RNN.
Skip connections often help gradient flow as well.
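
For reference, a toy sketch of why these remedies work. It multiplies the local derivatives along a chain of scalar layers, as backpropagation would. The depth, the weight of 1.0, and the fixed pre-activation value are assumptions chosen only to make the effect visible.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_grad(x):
    """Derivative of the sigmoid; never larger than 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def chain_gradient(act_grad, pre_activations, weight=1.0, skip=False):
    """Product of per-layer derivatives through a chain of scalar layers.

    Each layer computes f(weight * h), so its local derivative is
    weight * f'(z). With skip=True the layer is h + f(weight * h), so the
    local derivative becomes 1 + weight * f'(z).
    """
    grad = 1.0
    for z in pre_activations:
        local = weight * act_grad(z)
        if skip:
            local += 1.0
        grad *= local
    return grad


depth = 20
zs = [0.5] * depth  # assumed pre-activation at every layer

grad_sigmoid = chain_gradient(sigmoid_grad, zs)          # shrinks with depth
grad_relu = chain_gradient(relu_grad, zs)                # stays at 1.0 here
grad_sigmoid_skip = chain_gradient(sigmoid_grad, zs, skip=True)  # no shrinkage
```

In this toy setting the sigmoid chain is bounded by 0.25 per layer, which is the effect a candidate should be able to explain when asked why depth and saturating activations make the problem worse.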