
PySpark Quiz Round
Authored by Ankita Chatterjee
Other
Professional Development
Used 1+ times
11 questions
1.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Which of the following is a transformation operation in PySpark?
count()
filter()
reduce()
collect()
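The difference between a transformation and an action is that a transformation is lazy and only describes work, while an action forces that work to run. A minimal sketch of the idea in plain Python follows. It is an analogy only, not Spark code: the built-in filter() stands in for a lazy transformation, list() stands in for an action, and the names is_even and calls are hypothetical.

```python
# Analogy only: Python's built-in filter() is lazy, much like an RDD transformation.
calls = []

def is_even(n):
    calls.append(n)  # record every time the predicate actually runs
    return n % 2 == 0

lazy_result = filter(is_even, range(6))  # describes the work, evaluates nothing
calls_before = len(calls)

evens = list(lazy_result)  # forcing the iterator plays the role of an action
calls_after = len(calls)

print(calls_before, evens, calls_after)
```

The predicate does not run until the result is materialised, which is the behaviour the question is probing.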
2.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Which of the following is true of an RDD?
An RDD is a programming paradigm
RDD in Apache Spark is an immutable collection of objects
It is a database
None of the above
3.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
What is the output of the following code?
words_list = sc.parallelize(["pyspark", "quiz", "questions", "at", "quiz.com"])
filtered_words = words_list.filter(lambda x: 'quiz' in x)
matched_words = filtered_words.collect()
print(matched_words)
["quiz", "quiz.com"]
["quiz"]
["quiz.com"]
Error
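The predicate in this question can be checked without a Spark cluster. The sketch below is a plain-Python stand-in, assuming a list in place of the RDD and a hypothetical helper named contains_quiz in place of the lambda. It tests only the substring logic, not Spark itself.

```python
# Plain-Python stand-in for the RDD in the question (no Spark needed).
words = ["pyspark", "quiz", "questions", "at", "quiz.com"]

def contains_quiz(word: str) -> bool:
    # Same test as the lambda passed to filter(): substring membership.
    return "quiz" in word

# The list comprehension mirrors filter(...) followed by collect().
matched_words = [w for w in words if contains_quiz(w)]
print(matched_words)
```

Note that "questions" does not contain the substring "quiz", so it is not matched.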
4.
MULTIPLE CHOICE QUESTION
30 sec • 2 pts
Consider a DataFrame "df" (here "sf" is assumed to be an alias for pyspark.sql.functions). What does the expression '[.]{2,}' match in the following transformation?
df = df.withColumn('var_addrss', sf.regexp_replace('var_addrss', '[.]{2,}', ''))
A single dot (".") followed by 2 integers
A single dot (".") followed by the integer '2'
A dot (".") appearing two or more times consecutively
None of these
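To see what this pattern matches, it can be tried with Python's standard re module. This is a sketch under an assumption: Spark's regexp_replace uses Java regular expressions, but for this simple pattern Python's re behaves the same way. The helper name strip_dot_runs and the sample strings are hypothetical.

```python
import re

def strip_dot_runs(text: str) -> str:
    # [.] is a literal dot; {2,} means "two or more in a row".
    return re.sub(r"[.]{2,}", "", text)

print(strip_dot_runs("12 Main St... Apt. 4"))
print(strip_dot_runs("a.b"))
```

A single dot is left alone, while any run of two or more dots is removed.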
5.
MULTIPLE CHOICE QUESTION
30 sec • 2 pts
Consider a DataFrame "df" (here "sf" is assumed to be an alias for pyspark.sql.functions). What does the expression '^[0]*' match in the following transformation?
df = df.withColumn('var_addrss', sf.regexp_replace('var_addrss', '^[0]*', ''))
The value starts with 0 or is followed by a sequence of 0s
The value starts with 0 and ends with 0
The value starts with 0 and is followed by a sequence of 0s
The value starts with anything other than 0
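The effect of this pattern can also be tried with Python's re module. As before, this is a sketch that assumes Python's re and the Java regex engine behind regexp_replace agree for this simple pattern. The helper name strip_leading_zeros and the sample values are hypothetical.

```python
import re

def strip_leading_zeros(text: str) -> str:
    # ^ anchors at the start of the value; [0]* matches zero or more '0' characters.
    return re.sub(r"^[0]*", "", text)

print(strip_leading_zeros("000123"))
print(strip_leading_zeros("10020"))
```

Only the zeros at the very start are removed; zeros elsewhere in the value are untouched.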
6.
MULTIPLE SELECT QUESTION
45 sec • 1 pt
Assume we have a DataFrame "df" with an 'age' column.
Which of the following display "df" sorted by the 'age' column in descending order?
display(df.orderBy(df.age.desc()))
display(df.sort(df.age.desc()))
display(df.orderBy(df.age, sort = desc()))
None of these
7.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
What will the data types of the columns be for the following PySpark DataFrame "df"?
df = spark.read.format("csv").option("header", "true").option("inferSchema", "false").option("delimiter", ",").load("/mnt/temp/test.csv")
Data types of columns will be int
Data types of columns will be read as per the data types defined in the file
Data types of all columns will be string
None of the above
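With schema inference turned off, a CSV reader has no basis for guessing types, so every field arrives as text. As a loose analogy only, Python's standard csv module behaves the same way. The sample content below is hypothetical, since the real /mnt/temp/test.csv is not shown in the quiz.

```python
import csv
import io

# Hypothetical file content; the real /mnt/temp/test.csv is not shown in the quiz.
sample = "name,age\nAsha,31\nRavi,27\n"

# DictReader uses the first line as the header, like option("header", "true").
rows = list(csv.DictReader(io.StringIO(sample)))

# Without any type inference, every field is a string, including 'age'.
field_types = {key: type(value).__name__ for key, value in rows[0].items()}
print(field_types)
```

Numeric-looking values such as the ages stay as strings until they are explicitly cast.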