
Transformers Quiz

Authored by Sowresh Mecheri-Senthil



8 questions


1.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What does autoregressive mean?

It refers to a model that predicts future values based on past values.

It refers to a model that uses external variables to make predictions.

It refers to a model that only considers the most recent data point.

It refers to a model that doesn't rely on historical data.
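To make the correct answer concrete, here is a minimal Python sketch (not part of the quiz; the AR(2) form and its weights are invented for illustration) in which each new value is predicted purely from past values.

```python
# Toy autoregressive (AR(2)) model: every new value is predicted
# from previously generated values only (weights are made up).
def ar2_predict(history, w1=0.6, w2=0.3):
    """Predict the next value from the two most recent values."""
    return w1 * history[-1] + w2 * history[-2]

series = [1.0, 1.2]           # seed values
for _ in range(5):            # autoregressively extend the series
    series.append(ar2_predict(series))

print(series)  # each new point depends only on earlier points
```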

2.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is attention in the context of Machine Learning?

A mechanism that allows the model to generate shorter output sequences

A feature of the model that allows it to train faster

The encoder of a model

A mechanism that allows the model to focus on specific parts of the input
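As an illustration of the correct answer, here is a minimal NumPy sketch (assumed for this rewrite, with random inputs) of scaled dot-product attention: the softmaxed weights determine how strongly each query focuses on each part of the input.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each row of `weights` says how
    much one query position focuses on each input position."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores)        # rows sum to 1
    return weights @ V               # weighted mix of the values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(attention(Q, K, V).shape)      # (4, 8)
```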

3.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is masking?

Increasing the dimensionality of the input data

Hiding certain parts of the input sequence

Hiding some of the model's parameters

Hindering the model's ability to learn effectively
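A small NumPy sketch of the correct answer (illustrative only; the scores are random): a causal mask hides future parts of the input sequence by zeroing their attention weights.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

seq_len = 5
# Causal mask: True above the diagonal, i.e. position i must not see j > i.
mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)

scores = np.random.default_rng(0).normal(size=(seq_len, seq_len))
scores[mask] = -np.inf            # masked positions vanish after softmax
weights = softmax(scores)
print(np.round(weights, 2))       # zeros above the diagonal
```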

4.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What does an embedding layer do?

Transforms discrete encoded tokens into continuous vector representations

A specific attention mechanism for transformers

The sequence length of input text

Increases the dimensionality of the input sequence
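To illustrate the correct answer: an embedding layer is essentially a learned lookup table, as in this NumPy sketch (the table values here are random stand-ins for learned parameters).

```python
import numpy as np

vocab_size, embed_dim = 10, 4
rng = np.random.default_rng(0)
# One continuous vector per discrete token id; in a real model
# these entries are learned during training.
embedding_table = rng.normal(size=(vocab_size, embed_dim))

token_ids = np.array([3, 1, 4])        # discrete encoded tokens
vectors = embedding_table[token_ids]   # continuous vector representations
print(vectors.shape)                   # (3, 4)
```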

5.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is self-attention?

When every token attends to itself in the input sequence

When every token attends to every other token in the same input sequence

When every token attends to every other token in a different input sequence

When every token attends only to itself
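A minimal sketch of the correct answer (random weights, purely illustrative): in self-attention, queries, keys, and values are all projections of the same sequence, so every token attends to every other token in that sequence.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))            # one input sequence, 6 tokens

# Q, K, V all come from the SAME sequence x.
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv

weights = softmax(Q @ K.T / np.sqrt(8))
print(weights.shape)                   # (6, 6): x attending to itself
```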

6.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is cross-attention?

When every token attends to itself in the input sequence

When every token attends to every other token in the same input sequence

When every token attends to every other token in a different input sequence

When every token attends only to itself
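Contrast with the self-attention sketch above: in cross-attention the queries come from one sequence while keys and values come from a different one (again a NumPy sketch with random weights, not part of the quiz).

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
decoder_seq = rng.normal(size=(4, 8))  # e.g. the partial output
encoder_seq = rng.normal(size=(7, 8))  # e.g. the source sentence

# Q from one sequence; K and V from a DIFFERENT sequence.
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
Q = decoder_seq @ Wq
K, V = encoder_seq @ Wk, encoder_seq @ Wv

weights = softmax(Q @ K.T / np.sqrt(8))
print(weights.shape)                   # (4, 7): decoder attends to encoder
```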

7.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is the point of the skip connections found in the transformer architecture?

They reduce the number of computations required, speeding up training

They improve the interpretability of the model's predictions

They provide shortcuts for gradients to flow directly through the network, addressing the vanishing gradient problem

They increase the depth of the model, allowing it to learn more complex features
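To ground the correct answer, a short NumPy sketch (the sublayer here is a stand-in, not the actual transformer block): a skip connection adds the input back to a sublayer's output, giving gradients a direct path through the network.

```python
import numpy as np

def sublayer(x, W):
    """Stand-in for an attention or feed-forward block."""
    return np.tanh(x @ W)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 8))

# Residual / skip connection: out = x + f(x). The identity term
# lets gradients flow straight through, easing vanishing gradients.
out = x + sublayer(x, W)
print(out.shape)  # (4, 8)
```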

Access to the remaining question requires creating a free account.
