Intro to ML: Neural Networks Lecture 2 Part 1

Quiz • Josiah Wang • Mathematics, Computers, Fun • University • Hard
6 questions
1.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Mean squared error is a common loss function for which task?
Regression
Classification
None of these
Regression and Classification
Answer explanation
Mean squared error is the expected squared distance between the model's predictions and the true values, so it is well suited to regression tasks, where the label space is continuous. Minimising MSE is equivalent to maximum likelihood estimation under the assumption that the targets are normally distributed around the model's predictions. In binary classification tasks the labels instead follow a Bernoulli distribution, for which cross-entropy is the matching loss.
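As a rough illustration of the two losses mentioned in this explanation, the sketch below computes MSE on continuous targets and binary cross-entropy (the negative Bernoulli log-likelihood) on 0/1 labels. The function names and all numbers are made up for the example and are not taken from the lecture.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average squared distance between predictions and targets."""
    return float(np.mean((y_true - y_pred) ** 2))

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Negative Bernoulli log-likelihood, averaged over the instances."""
    p = np.clip(p_pred, eps, 1.0 - eps)  # keep log() away from 0
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

# Regression: continuous targets and predictions (illustrative values).
y_reg = np.array([1.5, 0.2, 3.1])
y_hat = np.array([1.0, 0.0, 3.0])
reg_loss = mse(y_reg, y_hat)

# Binary classification: 0/1 labels and predicted probabilities (illustrative values).
y_cls = np.array([1.0, 0.0, 1.0])
p_hat = np.array([0.9, 0.2, 0.6])
cls_loss = binary_cross_entropy(y_cls, p_hat)

print(reg_loss, cls_loss)
```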
2.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Is the following statement True or False? Multi-class and Multi-label classification are the same thing.
True
False
Answer explanation
Multi-class classification: each instance is assigned to exactly one of two or more classes.
Multi-label classification: each instance is assigned a set of target labels. Think of this as predicting a series of properties of an instance which are not mutually exclusive.
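One common way to see the difference is in how the same output scores are turned into predictions. The sketch below assumes a softmax followed by argmax for the multi-class case and an independent sigmoid with a 0.5 threshold for the multi-label case. The scores, the threshold and the helper names are assumptions chosen for illustration.

```python
import numpy as np

def softmax(logits):
    """Turn each row of scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def sigmoid(logits):
    """Squash each score independently into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-logits))

# Two instances, three classes/labels (illustrative scores).
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 1.2, 1.1]])

# Multi-class: the classes compete, exactly one is chosen per instance.
class_probs = softmax(logits)
predicted_class = np.argmax(class_probs, axis=1)

# Multi-label: each label is decided on its own, so several can be active.
label_probs = sigmoid(logits)
predicted_labels = (label_probs > 0.5).astype(int)

print(predicted_class)
print(predicted_labels)
```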
3.
MULTIPLE CHOICE QUESTION
2 mins • 1 pt
The weight matrix W of a neural network linear layer has shape (x, y). A forward pass through this layer can be represented as follows:
Z=XW
where X is the batched data of dimensions (batch size, input features) and W is the weight matrix. Select the correct assignments for (x, y).
x = batch size, y = number of input features to that layer
x = number of input features to that layer, y = batch size
x = number of input features to that layer, y = number of neurons in that layer
x = batch size, y = number of neurons in that layer
None of the above
Answer explanation
Every neuron in a fully connected layer is connected to every input via a weight, so each neuron has a weight vector whose length equals the number of input features. Stacking one such vector per neuron gives a weight matrix of shape (input features, number of neurons). For Z=XW the neighbouring (inner) dimensions need to match: (batch size, input features) times (input features, number of neurons) gives a Z of shape (batch size, number of neurons).
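The shape argument can be checked directly with a small NumPy sketch. The sizes below (batch of 4, 3 input features, 5 neurons) are arbitrary values chosen for the example, and the bias term is left out to match Z=XW.

```python
import numpy as np

# Arbitrary sizes for illustration only.
batch_size, n_inputs, n_neurons = 4, 3, 5

rng = np.random.default_rng(0)
X = rng.normal(size=(batch_size, n_inputs))   # batched data
W = rng.normal(size=(n_inputs, n_neurons))    # one column of weights per neuron

# Inner dimensions (n_inputs) match, so the product is defined.
Z = X @ W

print(X.shape, W.shape, Z.shape)
```

Each column of W holds the weights of one neuron, so column j of Z contains that neuron's pre-activation for every instance in the batch.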
4.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
If a neural network has a single output neuron, then the model may be used for:
Binary classification
Regression
Binary classification or regression
None of these
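As a minimal sketch of what a single output neuron computes, the code below produces one number per instance. That number can be read directly as a regression prediction, or passed through a sigmoid and thresholded to give a binary class. The weights, inputs and the 0.5 threshold are assumptions made up for this example.

```python
import numpy as np

def single_output(x, w, b):
    """One neuron: a weighted sum of the inputs plus a bias."""
    return x @ w + b

# Two instances with two features each (illustrative values).
x = np.array([[0.5, -2.0],
              [2.0, 1.0]])
w = np.array([1.0, 0.5])
b = 0.1

z = single_output(x, w, b)          # one number per instance

# Reading 1: use the raw output as a continuous prediction.
regression_prediction = z

# Reading 2: squash to a probability and threshold it.
probability = 1.0 / (1.0 + np.exp(-z))
binary_prediction = (probability >= 0.5).astype(int)

print(regression_prediction, binary_prediction)
```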
5.
MULTIPLE CHOICE QUESTION
5 mins • 1 pt
Here we have a computational graph representing a series of operations. The green text (the values above the lines) represents the forward pass, i.e. the values at each stage in the graph during forward propagation. The red text (the values below the lines) represents the gradient signals which have been passed back down the computational graph after some loss has been calculated. Calculate the missing gradients a, b and c using backpropagation (slides 20-31 onwards).
a = -0.2, b = 0.2, c = 0.4
a = 0.2, b = -0.2, c = 0.4
a = 0.4, b = 0.4, c = 0.2
a = -0.2, b = -0.2, c = 0.2
Answer explanation
The key with this computational graph is to apply the chain rule at each node. Please see the attached solution for more information.
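To show what applying the chain rule at each node looks like, here is a sketch on a small hypothetical graph, f = (x + y) * z. This is not the graph from the question and the input values are invented. Each node multiplies the gradient arriving from above by its own local gradient before passing it on.

```python
def forward(x, y, z):
    """Forward pass through the hypothetical graph f = (x + y) * z."""
    q = x + y          # add node
    f = q * z          # multiply node
    return q, f

def backward(q, z, upstream=1.0):
    """Backward pass: chain rule applied node by node."""
    # Multiply node: the local gradient with respect to one input is the other input.
    df_dq = z * upstream
    df_dz = q * upstream
    # Add node: the local gradient is 1 for both inputs, so the signal is copied.
    df_dx = 1.0 * df_dq
    df_dy = 1.0 * df_dq
    return df_dx, df_dy, df_dz

# Illustrative input values.
x, y, z = -2.0, 5.0, -4.0
q, f = forward(x, y, z)
df_dx, df_dy, df_dz = backward(q, z)

print(f, df_dx, df_dy, df_dz)
```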
6.
MULTIPLE SELECT QUESTION
3 mins • 1 pt
Which of the following are correct statements about the models which created the plots i) and ii)?
The model which created plot i) is demonstrating high bias and a higher capacity model is needed.
The model which created plot ii) is demonstrating high bias and a higher capacity model is needed.
The model which created plot ii) is demonstrating high variance and requires some form of regularisation to avoid overfitting.
The model which created plot i) is demonstrating high variance and requires some form of regularisation to avoid overfitting.
Answer explanation
i) is demonstrating high bias because 1) the validation and training errors are both high and at a similar level, and 2) the training error increases as more training data is added. This happens because the capacity of the model limits its ability to fit all the modalities present in the larger dataset.
ii) is demonstrating high variance because the validation error is much higher than the training error. The model has overfit to spurious patterns in the training set which are not representative of the true distribution (represented by the validation set).
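The two regimes can be reproduced on toy data. The sketch below is not the setup behind plots i) and ii): it assumes a noisy sine curve, 12 training points, and polynomial models of degree 1 (low capacity) and degree 11 (high capacity). All of these choices are made up for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_targets(x):
    """Toy ground truth (a sine wave) plus Gaussian noise."""
    return np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

x_train = np.linspace(0.0, 1.0, 12)
y_train = make_targets(x_train)
x_val = np.linspace(0.02, 0.98, 200)
y_val = make_targets(x_val)

def train_and_validation_mse(degree):
    """Fit a polynomial of the given degree and report both errors."""
    model = Polynomial.fit(x_train, y_train, degree)
    train_mse = float(np.mean((model(x_train) - y_train) ** 2))
    val_mse = float(np.mean((model(x_val) - y_val) ** 2))
    return train_mse, val_mse

# Low capacity: cannot follow the sine, so the training error stays high.
low_train, low_val = train_and_validation_mse(1)

# High capacity: passes through the noisy training points, so the training
# error is near zero while the validation error is not.
high_train, high_val = train_and_validation_mse(11)

print("degree 1 :", low_train, low_val)
print("degree 11:", high_train, high_val)
```

Comparing the gap between training and validation error for the two models mirrors the reasoning above: a small gap with high errors suggests high bias, and a large gap suggests high variance.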