Data Preprocessing Techniques Quiz

Assessment

Quiz

Other

University

Hard

Created by

Anisha Mahato

8 questions

1.

MULTIPLE CHOICE QUESTION

45 sec • 2 pts

What is tokenization in data preprocessing?

The process of reducing words to their root form

The process of converting words into valid dictionary forms

The process of splitting text into smaller units

The process of removing stop words
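For reference, tokenization can be sketched in a few lines. This is a minimal illustration rather than a production tokenizer: the function name `tokenize` and the choice to lowercase and keep only letters, digits, and apostrophes are assumptions made for the example.

```python
import re

def tokenize(text):
    # Lowercase the text, then keep each run of letters, digits,
    # or apostrophes as one token; everything else acts as a separator.
    return re.findall(r"[a-z0-9']+", text.lower())

print(tokenize("Stop words aren't always useless!"))
```

Real tokenizers usually treat punctuation, contractions, and non-English scripts more carefully than this regular expression does.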

2.

MULTIPLE CHOICE QUESTION

45 sec • 2 pts

What does lemmatization consider that stemming does not?

The number of syllables in the word

The frequency of the word in the text

The length of the word

The context and part of speech of the word
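The contrast between stemming and lemmatization can be illustrated with a toy sketch. Both functions below are hypothetical simplifications: `crude_stem` strips a fixed list of suffixes, and `toy_lemmatize` looks words up in a tiny hand-written table (`LEMMAS`) keyed by word and part of speech. Real stemmers and lemmatizers use far richer rules and dictionaries.

```python
def crude_stem(word):
    # Strip the first matching suffix. No dictionary and no part of speech
    # are consulted, so the result may not be a real word.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Hypothetical lemma table keyed by (word, part of speech).
LEMMAS = {
    ("meeting", "VERB"): "meet",
    ("meeting", "NOUN"): "meeting",
    ("better", "ADJ"): "good",
}

def toy_lemmatize(word, pos):
    # Return the dictionary form for this word in this grammatical role,
    # falling back to the word itself when it is not in the table.
    return LEMMAS.get((word, pos), word)

print(crude_stem("meeting"), crude_stem("studies"))
print(toy_lemmatize("meeting", "VERB"), toy_lemmatize("meeting", "NOUN"))
```

The stemmer gives the same output for "meeting" whatever its role in the sentence, while the lookup returns a different form for the verb and the noun.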

3.

MULTIPLE CHOICE QUESTION

45 sec • 2 pts

What are stop words?

Words that carry significant meaning

Words that are always stemmed

Common words that can be removed from text

Words that are always lemmatized
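Stop-word filtering can be sketched as a simple set lookup. The `STOP_WORDS` set below is a small illustrative subset chosen for this example, not a standard list; NLP libraries ship much longer lists.

```python
# Illustrative subset only; real stop-word lists are much longer.
STOP_WORDS = {"the", "is", "a", "of", "and", "in", "to"}

def remove_stop_words(tokens):
    # Keep only tokens that are not in the stop-word set (case-insensitive).
    return [t for t in tokens if t.lower() not in STOP_WORDS]

print(remove_stop_words(["The", "cat", "is", "in", "the", "hat"]))
```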

4.

MULTIPLE CHOICE QUESTION

45 sec • 2 pts

Which of the following is NOT a preprocessing technique mentioned?

Lemmatization

Stemming

Tokenization

Normalization

5.

MULTIPLE CHOICE QUESTION

45 sec • 2 pts

When does overfitting happen?

Low bias and high variance

High bias and low variance

Low bias and low variance

High bias and high variance

6.

MULTIPLE CHOICE QUESTION

45 sec • 2 pts

Which of the following statements about unsupervised learning is FALSE?

It can identify patterns in the absence of labeled data

It requires a predefined target output

Clustering is a commonly used unsupervised learning method

It is often used for dimensionality reduction

7.

MULTIPLE CHOICE QUESTION

45 sec • 2 pts

True or false: TF-IDF down-weights words that occur commonly across many documents.

True

False
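The weighting can be made concrete with a minimal sketch. It assumes the plain textbook variant, with term frequency as count divided by document length and inverse document frequency as log(N / df); libraries often apply smoothing, so their numbers will differ. The function name `tf_idf` and the sample documents are made up for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    # docs is a list of token lists.
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            # tf = count / document length, idf = log(N / df)
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "bird", "sang"],
]
scores = tf_idf(docs)
print(scores[0])
```

In this sample, "the" appears in every document, so its idf is log(3 / 3) = 0 and its score is zero, while a term that appears in only one document gets a positive score.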

8.

MULTIPLE CHOICE QUESTION

45 sec • 2 pts

What is the main characteristic of the Bag of Words model?

It captures word order in a document

It converts text into numerical vectors based on word frequency

It assigns higher weights to rare words

It only considers adjacent word pairs in the text
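A Bag of Words representation can be sketched as follows. This is a minimal illustration under simple assumptions: documents are already tokenized, the vocabulary is sorted alphabetically, and each vector holds raw counts. The function name `bag_of_words` is chosen for the example.

```python
from collections import Counter

def bag_of_words(docs):
    # Build a sorted vocabulary from every token in every document.
    vocab = sorted({tok for doc in docs for tok in doc})

    # Represent each document as a vector of term counts over that vocabulary.
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

vocab, vectors = bag_of_words([
    ["dog", "bites", "man"],
    ["man", "bites", "dog"],
])
print(vocab)
print(vectors)
```

The two sample sentences contain the same words in a different order, and they map to identical vectors.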