Deep Learning: Generative Models

Assessment

Created by

Josiah Wang

Mathematics, Science, Computers

University

33 plays

Hard

10 questions

1.

Multiple Select

15 mins

1 pt

Which of the following statements justify the Maximum Likelihood approach?

It returns a model that assigns high probability to observed data

It minimises the KL divergence KL[p_data || p_model]

It minimises the KL divergence KL[p_model || p_data]

It minimises the reconstruction error of the data

Answer explanation

The likelihood function is defined as "the likelihood of the model parameters having generated the observed data", so MLE corresponds to finding the parameters that best explain the data, i.e. that assign high probability to it.

Regarding the two KL options, refer to the definition of KL divergence: maximising the likelihood minimises KL[p_data || p_model], not KL[p_model || p_data].

Maximum Likelihood minimises the reconstruction error only if the model likelihood itself describes a reconstruction process, e.g. a cross-entropy loss.
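
To see the KL connection concretely, here is a short derivation using the standard definition of KL divergence (the notation is mine, not from the quiz):

$\mathrm{KL}[p_{\mathrm{data}} \,\|\, p_{\mathrm{model}}] = \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log p_{\mathrm{data}}(x)] - \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log p_{\mathrm{model}}(x)]$

The first term does not depend on the model parameters, so minimising this KL is the same as maximising $\mathbb{E}_{x \sim p_{\mathrm{data}}}[\log p_{\mathrm{model}}(x)]$, which the average log-likelihood of the training data estimates. The reversed divergence $\mathrm{KL}[p_{\mathrm{model}} \,\|\, p_{\mathrm{data}}]$ takes its expectation under the model instead, and is not what MLE minimises.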

2.

Multiple Select

15 mins

1 pt

Which of the following statements, when combined together, explain why we cannot train VAEs using Maximum Likelihood Estimation?

The decoder is parameterised by a neural network so it is highly non-linear

The latent variable is continuous

MLE requires evaluating the marginal distribution on data

There are too many datapoints in the dataset

Answer explanation

MLE requires evaluating the marginal likelihood

$p(x) = \int p(x \mid z)\, p(z)\, dz$

to marginalise out the latent variable. This integral is intractable due to the nature of $p(x \mid z)$, which is parameterised by a highly non-linear neural network.

The option 'The latent variable is continuous' is not sufficient on its own: consider probabilistic PCA, where the latent variable is Gaussian, the "decoder" is linear, and the marginal likelihood has a closed form.

Regarding the option about there being too many datapoints: that would be intractability due to large-scale data, not the intractability of the marginal likelihood for each individual datapoint.
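
For contrast with the linear case mentioned above, here is a sketch of why probabilistic PCA keeps the same integral tractable (standard pPCA facts, not from the quiz):

$x = Wz + \mu + \epsilon$, with $z \sim N(0, I)$ and $\epsilon \sim N(0, \sigma^2 I)$, gives the closed-form marginal $p(x) = N(x;\ \mu,\ WW^\top + \sigma^2 I)$.

Once the decoder mean is a non-linear neural network $f_\theta(z)$, the integrand $p(x \mid z)\,p(z)$ no longer combines into any known density, and the integral has to be approximated, e.g. via the variational bound that VAEs use.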

3.

Multiple Select

15 mins

1 pt

Which of the following statements are true for the VAE objective?

It is a lower-bound to the maximum likelihood objective

The gap between the VAE objective and the maximum likelihood objective is KL[p(z)||q(z|x)]

The KL term can always be viewed as a regulariser for the VAE encoder

The optimum of the VAE decoder is also the MLE optimum

Answer explanation

The gap between the VAE and Maximum Likelihood objective is not KL[p(z)||q(z|x)] -- check the definitions: it is the KL divergence between the approximate posterior q(z|x) and the true posterior p(z|x).


The KL term acts as a regulariser when the prior is fixed with no learnable parameters. If the prior is learnable, it can be pulled towards the q distribution, so the regularisation effect is unclear.


The optimum of the VAE equals the MLE optimum only if q is the true posterior, so the correctness of this statement depends on the form of q.
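
To make the gap explicit, here is the standard ELBO identity (a derivation added for reference, not part of the original explanation):

$\log p_\theta(x) = \mathbb{E}_{q_\phi(z|x)}\!\left[\log \frac{p_\theta(x, z)}{q_\phi(z|x)}\right] + \mathrm{KL}[q_\phi(z|x) \,\|\, p_\theta(z|x)]$

The first term is the VAE objective (the ELBO), so the gap to the log-likelihood is $\mathrm{KL}[q_\phi(z|x) \,\|\, p_\theta(z|x)]$, which vanishes exactly when $q_\phi(z|x)$ matches the true posterior.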

4.

Multiple Select

15 mins

1 pt

In the famous “Chinese room” Turing-test example, a man sits inside a room doing English-to-Chinese translation, and volunteers outside the room are asked to guess, based on the English-to-Chinese translation results, whether the man in the room understands Chinese. You are one of the volunteers. You know the man in the room is English, so you assume a priori that he does not understand Chinese with probability 0.8. Now, given that the translation result is correct, how would you guess whether he understands Chinese?

I’m sure he definitely understands Chinese

He probably doesn’t understand Chinese (with probability 0.8)

Give me more info about the correct translation rates for those who only speak English

Give me more info about the correct translation rates for those who speak both English and Chinese

Answer explanation

The goal of this question is to guide students to think about Bayes’ optimal classifier. This requires information about p(translation is correct | the man only speaks English) and p(translation is correct | the man speaks both English and Chinese).
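
As a sketch of the Bayes computation this points at (the conditional rates below are hypothetical placeholders, not given in the question): let $C$ denote "the man understands Chinese", with prior $p(C) = 0.2$, and let $T$ denote "the translation is correct". Then

$p(C \mid T) = \dfrac{p(T \mid C)\, p(C)}{p(T \mid C)\, p(C) + p(T \mid \neg C)\, p(\neg C)}$

If one assumed, say, $p(T \mid C) = 0.95$ and $p(T \mid \neg C) = 0.5$, the posterior would be $0.19 / 0.59 \approx 0.32$; with different rates the conclusion changes, which is why those two conditional probabilities are exactly the missing information.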

5.

Multiple Choice

15 mins

1 pt

Which best represents the reparameterisation trick?

$y = \mu + \sigma\epsilon$, where $\epsilon \sim N(0, I)$

$y \sim N(\mu, \sigma)$

$y \sim N(E(x), \epsilon)$

None of the above

Answer explanation

You cannot backpropagate through a stochastic node. The reparameterisation trick lets you emulate sampling from a distribution while keeping the main computational graph ($\mu$ and $\sigma$) deterministic and therefore differentiable.
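
A minimal sketch of the trick (assuming PyTorch; the function and variable names are illustrative, not from the quiz):

import torch

def reparameterise(mu, log_var):
    # z = mu + sigma * eps with eps ~ N(0, I): the randomness lives only in eps,
    # while mu and sigma stay on the deterministic, differentiable path.
    std = torch.exp(0.5 * log_var)
    eps = torch.randn_like(std)
    return mu + std * eps

mu = torch.zeros(4, requires_grad=True)
log_var = torch.zeros(4, requires_grad=True)
z = reparameterise(mu, log_var)
z.sum().backward()  # gradients reach mu and log_var through the sample

Drawing z with a black-box sampler for $N(\mu, \sigma^2)$ would instead place the sampling operation between the parameters and the loss, and no gradient could be propagated through it.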
