Wiki: Lecture 4 - Word Window Classification and Neural Networks


Lecture Video


Discussion pointers:

  • Why take the exponential?
  • Why is it important to normalise?
  • What does minimising the log probability do?
  • Why do we take the negative in the cross-entropy error?
  • How does the regularization term change the loss function?
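The discussion questions above can be grounded in a minimal NumPy sketch of the softmax and the regularized cross-entropy loss (the function names, the `reg_lambda` parameter, and the L2 form of the regularizer are illustrative assumptions, not taken from the lecture):

```python
import numpy as np

def softmax(scores):
    # Take the exponential so every score becomes positive; shifting by
    # the max first improves numerical stability without changing the result.
    exps = np.exp(scores - np.max(scores))
    # Normalise so the outputs sum to 1 and form a probability distribution.
    return exps / np.sum(exps)

def cross_entropy_loss(scores, true_class, weights, reg_lambda=0.01):
    probs = softmax(scores)
    # Negative log probability of the true class: log p is negative for
    # p < 1, so we negate it to get a loss that is small when the model
    # assigns high probability to the correct class. Minimising this
    # pushes that probability towards 1.
    data_loss = -np.log(probs[true_class])
    # The L2 regularization term adds a penalty on large weights,
    # changing the loss so that overfitting solutions cost more.
    reg_loss = 0.5 * reg_lambda * np.sum(weights ** 2)
    return data_loss + reg_loss
```

With `reg_lambda=0.0` the loss reduces to the plain cross-entropy error; increasing it trades data fit against weight magnitude.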

Additional Learning Materials:


  • Replicate this Neural Network in Numpy Tutorial without looking at the notebook; complete the exercises at the end of the notebook.
  • Reply to this thread with an overview of activation functions commonly used in Neural Networks
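As a starting point for that overview, three of the most common activation functions can be sketched in NumPy (the selection and comments are a summary, not an exhaustive list):

```python
import numpy as np

def sigmoid(x):
    # Squashes inputs to (0, 1); historically common, but saturates
    # for large |x|, which can slow gradient-based learning.
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Zero-centred counterpart of the sigmoid, output in (-1, 1);
    # often preferred over sigmoid for hidden layers.
    return np.tanh(x)

def relu(x):
    # Rectified linear unit: cheap to compute and non-saturating
    # for positive inputs; a common default in modern networks.
    return np.maximum(0.0, x)
```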