
Advanced DF PySpark
Authored by Bianca Cirio
Computers
Professional Development
10 questions
1.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Which method is used to create a new column in a DataFrame based on a condition?
withColumn()
select()
filter()
groupBy()
2.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
The lit function in PySpark can be used to create a column with a constant value, but it cannot be used within expressions involving other columns.
True
False
3.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Pandas UDFs (also known as vectorized UDFs) in PySpark are generally faster than regular PySpark UDFs because they operate on a single row at a time.
True
False
4.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
In PySpark, MapType can be used to create a column containing key-value pairs, and both the keys and values must be of the same data type.
True
False
5.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Which method is used to create an alias for a column in PySpark?
alias()
withColumn()
select()
groupBy()
6.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
What does the collect_list() function do in PySpark?
Collects all elements into a list and removes duplicates
Collects all elements into a list without removing duplicates
Collects all elements into a set and removes duplicates
Collects all elements into a set without removing duplicates
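Note: the list-versus-set distinction these options turn on can be sketched in plain Python. This is only an analogy and does not call Spark. The sample rows and the names `rows`, `as_list`, and `as_set` are made up for illustration. In Spark itself, the order of collected elements is not guaranteed after a shuffle.

```python
from collections import defaultdict

# Hypothetical sample data: (group key, value) pairs, with one repeated pair.
rows = [("a", 1), ("a", 1), ("a", 2), ("b", 3)]

as_list = defaultdict(list)  # list-style collection: every value is kept
as_set = defaultdict(set)    # set-style collection: repeated values collapse

for key, value in rows:
    as_list[key].append(value)
    as_set[key].add(value)

# Compare how many values each style holds for group "a".
list_size_a = len(as_list["a"])
set_size_a = len(as_set["a"])
```

Group "a" ends up with three values in the list-style result and two in the set-style result, because the repeated value is stored only once in a set.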
7.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Which of the following is a valid transformation operation in PySpark?
collect()
show()
filter()
count()
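Note: the difference between lazy transformations and actions can be sketched with a plain-Python generator. This is an analogy under stated assumptions, not Spark code. The helpers `lazy_filter` and `count` are hypothetical stand-ins, and `evaluated` exists only to record when work happens.

```python
# Plain-Python analogy only: a generator stands in for a lazily built plan.
evaluated = []  # records which rows were actually processed


def lazy_filter(rows, predicate):
    """Transformation-like step: describes the work but does not run it yet."""
    for row in rows:
        evaluated.append(row)
        if predicate(row):
            yield row


def count(rows):
    """Action-like step: forces the pipeline to run and returns a result."""
    return sum(1 for _ in rows)


pipeline = lazy_filter([1, 2, 3, 4, 5], lambda x: x % 2 == 1)
work_before_action = len(evaluated)  # nothing has been processed yet
odd_count = count(pipeline)          # the action triggers evaluation
work_after_action = len(evaluated)   # every row has now been processed
```

Building the pipeline does no work on its own. Rows are processed only once the action-like step asks for a result.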