SummerSchool-Quiz8

Quiz • Computers • University • Hard
Irfan Ahmad
9 questions
1.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Transformer models do not have recurrent units but can still perform sequence modeling.
True
False
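Question 1 points at the key property of transformers: attention processes the whole sequence in parallel, with no recurrent state carried from step to step. Below is a minimal NumPy sketch of scaled dot-product self-attention (weight names and shapes are illustrative, not taken from the quiz) showing that each output position is computed directly from all input positions rather than from a hidden state passed along the sequence.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a whole sequence at once.

    X has shape (seq_len, d_model). No hidden state is carried between
    positions, unlike an RNN: every output row attends to all input rows.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                 # project inputs to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # pairwise similarity between positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # each output mixes information from all positions

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(5, d))                          # a toy "sequence" of 5 token embeddings
out = self_attention(X, rng.normal(size=(d, d)),
                     rng.normal(size=(d, d)),
                     rng.normal(size=(d, d)))
print(out.shape)                                     # (5, 8): one context vector per position
```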
2.
MULTIPLE CHOICE QUESTION
45 sec • 1 pt
As the number of training examples goes to infinity, your model will have:
Low bias
High bias
Same bias
Depends on the model’s variance
3.
MULTIPLE CHOICE QUESTION
1 min • 1 pt
Compared to an encoder-decoder model that does not use an attention mechanism, we expect the attention model to have the greatest advantage when:
The input sequence length is large.
The input sequence length is small.
The vocabulary size is large.
The vocabulary size is small.
4.
MULTIPLE CHOICE QUESTION
1 min • 1 pt
You have a friend whose mood is heavily dependent on the current and past few days’ weather. You’ve collected data for the past 365 days on the weather, which you represent as a sequence x<1>, …, x<365>. You’ve also collected data on your friend’s mood, which you represent as y<1>, …, y<365>. You’d like to build a model to map from x→y. Should you use a Unidirectional RNN or Bidirectional RNN for this problem?
Bidirectional RNN, because this allows the prediction of mood on day t to take into account more information
Bidirectional RNN, because this allows backpropagation to compute more accurate gradients
Unidirectional RNN, because the value of y<t> depends only on x<1>,…,x<t>, but not on x<t+1>,…,x<365>
Unidirectional RNN, because the value of y<t> depends only on x<t>, and not on other days
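The options above hinge on whether the prediction at day t may depend on future inputs. As a reference point, here is a minimal NumPy sketch of a plain unidirectional RNN forward pass (weight names and shapes are illustrative); the prediction at step t is a function of x<1>, …, x<t> only.

```python
import numpy as np

def unidirectional_rnn(xs, Wx, Wh, Wy):
    """Forward pass of a plain unidirectional RNN.

    y[t] is computed from h[t], which summarizes x[1..t] only; inputs
    after step t never influence it.
    """
    h = np.zeros(Wh.shape[0])
    ys = []
    for x in xs:                          # left-to-right over the sequence
        h = np.tanh(Wx @ x + Wh @ h)      # hidden state accumulates past context
        ys.append(Wy @ h)                 # prediction for the current time step
    return np.stack(ys)

rng = np.random.default_rng(0)
d_in, d_h, d_out = 3, 4, 1
xs = rng.normal(size=(7, d_in))           # 7 time steps of toy "weather" features
ys = unidirectional_rnn(xs,
                        rng.normal(size=(d_h, d_in)),
                        rng.normal(size=(d_h, d_h)),
                        rng.normal(size=(d_out, d_h)))
print(ys.shape)                           # (7, 1): one prediction per day
```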
5.
MULTIPLE SELECT QUESTION
1 min • 2 pts
In beam search, if you increase the beam width, which of the following would you expect to be true?
Beam search will run more slowly
Beam search will use up more memory
Beam search will generally find better solutions
Beam search will converge after fewer steps
Beam search will run much faster as more options can be considered
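For context on the beam-width trade-off asked about above, here is a minimal, library-free sketch of beam search; step_log_probs is a hypothetical scoring function standing in for a trained decoder. A wider beam keeps more partial hypotheses per step, so each step does more work and stores more state, but the final hypothesis generally scores at least as well.

```python
import numpy as np

def beam_search(step_log_probs, beam_width, max_len, bos=0):
    """Generic beam search over a step-scoring function.

    step_log_probs(prefix) returns log-probabilities for the next token.
    A larger beam_width keeps more candidate prefixes per step, which costs
    more time and memory but generally yields higher-scoring sequences.
    """
    beams = [([bos], 0.0)]                             # (prefix, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok, lp in enumerate(step_log_probs(prefix)):
                candidates.append((prefix + [tok], score + lp))
        # keep only the beam_width best partial sequences
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0]

# Toy usage: a fixed 4-token distribution regardless of prefix (illustration only).
toy = lambda prefix: np.log(np.array([0.1, 0.2, 0.3, 0.4]))
print(beam_search(toy, beam_width=3, max_len=5))
```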
6.
MULTIPLE CHOICE QUESTION
45 sec • 1 pt
How does the decoder module of the transformer model avoid attending to tokens that have not yet appeared in the output sequence?
Multi-head attention
Positional encoding
Self attention
Masking future positions
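For reference, here is a minimal NumPy sketch of the look-ahead (causal) mask the last option refers to: entries above the diagonal of the score matrix are set to -inf before the softmax, so a decoder position receives zero attention weight on positions that come after it.

```python
import numpy as np

def causal_attention_weights(scores):
    """Apply a look-ahead (causal) mask before the softmax.

    scores: (seq_len, seq_len) raw decoder attention scores. Entries above
    the diagonal are set to -inf, so position i gets zero weight on the
    positions j > i that have not been generated yet.
    """
    seq_len = scores.shape[0]
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)  # True strictly above the diagonal
    masked = np.where(future, -np.inf, scores)
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))   # numerically stabilised softmax
    return weights / weights.sum(axis=-1, keepdims=True)

print(np.round(causal_attention_weights(np.zeros((3, 3))), 2))
# row i only distributes weight over positions 0..i
```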
7.
MULTIPLE CHOICE QUESTION
45 sec • 1 pt
Which concept in the transformer allows sequence-order information to be incorporated into the input tokens?
Multi-head attention
Positional encoding
Self attention
Masking future positions before the softmax step
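For reference, the sinusoidal positional encoding from "Attention Is All You Need" is one concrete way to inject order information into otherwise order-agnostic attention layers; a minimal NumPy sketch follows (dimension names are illustrative).

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings as in 'Attention Is All You Need'.

    Each position gets a unique pattern of sines and cosines, which is
    added to the token embeddings so the attention layers can tell
    positions apart.
    """
    positions = np.arange(seq_len)[:, None]              # (seq_len, 1)
    dims = np.arange(d_model)[None, :]                    # (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                      # (seq_len, d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])                 # even dimensions: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])                 # odd dimensions: cosine
    return pe

print(sinusoidal_positional_encoding(4, 8).shape)         # (4, 8)
```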
8.
MULTIPLE SELECT QUESTION
1 min • 2 pts
Which of the following are symptoms of overfitting?
Large estimated weights
Good generalization to previously unseen data
Simple decision boundary
Complex decision boundary
9.
MULTIPLE CHOICE QUESTION
45 sec • 1 pt
Teacher forcing uses the actual output from the training dataset at time step t as input in the next time step (t+1), instead of the output generated by your model.
True
False
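To make the feeding pattern in question 9 concrete, here is a minimal sketch of a teacher-forced decoding loop; step_fn is a hypothetical decoder step, and only the way the next input is chosen matters: the ground-truth token from the training data is fed at step t+1 instead of the model's own prediction.

```python
def teacher_forced_decode(targets, step_fn, bos=0):
    """One pass over a target sequence with teacher forcing.

    step_fn(prev_token, state) -> (output, new_state) is a hypothetical
    decoder step. At step t+1 the input is the ground-truth token
    targets[t], not the token the model itself just predicted.
    """
    state = None
    prev = bos
    outputs = []
    for gold in targets:
        out, state = step_fn(prev, state)   # predict the token for the current step
        outputs.append(out)
        prev = gold                         # teacher forcing: feed the true token, not the prediction
    return outputs

# Toy usage with a dummy step function that just echoes its input token.
print(teacher_forced_decode([5, 6, 7], lambda prev, state: (prev, state)))  # [0, 5, 6]
```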