
Transformer Quiz
Quiz • Computers • Professional Development • Medium
Comprehensive Viva
10 questions
1.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
What is the primary advantage of using self-attention in Transformers?
It reduces the model size
It eliminates the need for labeled data
It allows for parallel processing of tokens
It restricts the context to local information only
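For illustration only (not part of the quiz): a minimal NumPy sketch of scaled dot-product self-attention, showing that every token's output is produced by a few whole-matrix operations rather than a token-by-token loop, which is what enables parallel processing.

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        # X holds the embeddings of all n tokens; every step below operates on the whole matrix
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])          # (n, n) pairwise scores, computed at once
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        return weights @ V                               # outputs for all n tokens in parallel

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 8))                          # 5 tokens, model width 8
    W = [rng.normal(size=(8, 8)) for _ in range(3)]
    print(self_attention(X, *W).shape)                   # (5, 8)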
2.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
In the Transformer model, what does the “multi-head” part of multi-head attention refer to?
Multiple outputs per token
Multiple attention layers stacked together
Multiple parallel attention computations with different projections
Attention heads used only during inference
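A hedged sketch of the "multi-head" idea (shapes and names illustrative, assuming the model width divides evenly by the head count): each head applies its own projection slice and computes its own attention map, independently of the other heads.

    import numpy as np

    def split_heads(X, W, n_heads):
        # one projection, then viewed as n_heads smaller vectors per token
        n = X.shape[0]
        return (X @ W).reshape(n, n_heads, -1).transpose(1, 0, 2)   # (n_heads, n, d_head)

    def per_head_scores(X, Wq, Wk, n_heads):
        Q = split_heads(X, Wq, n_heads)                  # each head works on its own projected slice
        K = split_heads(X, Wk, n_heads)
        return Q @ K.transpose(0, 2, 1) / np.sqrt(Q.shape[-1])      # (n_heads, n, n): one attention map per head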
3.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
What is the computational complexity of the self-attention mechanism with respect to the sequence length n?
O(n)
O(log n)
O(n²)
O(1)
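As a quick back-of-the-envelope illustration of the quadratic cost: the attention score matrix has one entry per (query, key) pair, so a sequence of 1,024 tokens yields 1,024 × 1,024 ≈ 1.05 million scores per head, and doubling the length to 2,048 tokens roughly quadruples that to ≈ 4.2 million.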
4.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
How many decoder blocks are used in the original Transformer architecture for machine translation?
4
6
8
12
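For reference (not stated in the quiz itself): the base model in the original "Attention Is All You Need" paper stacks six encoder blocks and six decoder blocks. A rough config-style sketch of its published base hyperparameters, with illustrative key names:

    transformer_base = {
        "num_encoder_blocks": 6,
        "num_decoder_blocks": 6,
        "d_model": 512,     # width of the token representations
        "num_heads": 8,     # attention heads per attention layer
        "d_ff": 2048,       # inner size of the position-wise feed-forward network
        "dropout": 0.1,
    }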
5.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Which of the following best describes how the Transformer decoder generates output during inference?
It attends to all positions in the input and output sequences
It attends to input tokens and already-generated output tokens
It uses only the encoder’s final output
It processes the entire sequence at once
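A minimal greedy-decoding sketch of autoregressive inference (the decode_step function is hypothetical, standing in for one decoder forward pass): at every step the decoder conditions on the encoder output plus the tokens it has generated so far, and emits one new token.

    def greedy_decode(encoder_output, decode_step, bos_id, eos_id, max_len=50):
        # decode_step(encoder_output, generated) -> list of scores over the vocabulary for the next token
        generated = [bos_id]
        for _ in range(max_len):
            scores = decode_step(encoder_output, generated)   # sees the source plus everything generated so far
            next_id = max(range(len(scores)), key=scores.__getitem__)
            generated.append(next_id)
            if next_id == eos_id:                             # stop once end-of-sequence is produced
                break
        return generated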
6.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
How is the final representation of each token computed in multi-head self-attention?
By summing the outputs of all attention heads
By averaging token embeddings
By concatenating attention head outputs and projecting them
By selecting the maximum attention value
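In the notation of the original paper, the head outputs are concatenated and passed through an output projection:

    \mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^O,
    \qquad \mathrm{head}_i = \mathrm{Attention}(Q W_i^Q,\ K W_i^K,\ V W_i^V)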
7.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
What is the key difference between encoder self-attention and encoder-decoder attention?
Encoder self-attention is masked, encoder-decoder attention is not
Encoder-decoder attention uses keys and values from the encoder, queries from the decoder
Encoder-decoder attention is only used during pre-training
Encoder-decoder attention uses keys and values from the decoder, queries from the encoder
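A short NumPy sketch of the cross-attention case (shapes illustrative, not from the quiz): the queries come from the decoder states while the keys and values come from the encoder output, so each decoder position can look back at the source sentence.

    import numpy as np

    def cross_attention(decoder_states, encoder_output, Wq, Wk, Wv):
        Q = decoder_states @ Wq                      # queries come from the decoder side
        K = encoder_output @ Wk                      # keys come from the encoder side
        V = encoder_output @ Wv                      # values come from the encoder side
        scores = Q @ K.T / np.sqrt(K.shape[-1])      # (target_len, source_len)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        return w @ V                                 # one source-aware context vector per decoder position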