BIG Data Management-1

Assignment 1 

Due Date: 

Assignment type: Individual 

Submission documents: 

A Word or PDF document with the results (or screenshots of the results), along with the code files, to be uploaded to the LMS.

Assignment 1 contains three questions that ask you to get familiar with aspects of Apache Spark. The first two questions require Spark programming; the last asks you to read existing code and explain it in simple terms.

Q1. Consider the two data files users.csv and transactions.csv. The users file has the following fields:

a) UserID 

b) EmailID 

c) NativeLanguage 

d) Location 

The transactions file has the following fields:

a) Transaction_ID 

b) Product_ID 

c) UserID 

d) Price 

e) Product_Description 

Using Spark Core (i.e., without using Spark SQL), find:

a) The number of unique locations in which each product is sold.

b) The products bought by each user.

c) The total spending by each user on each product.

Remember, you must use Spark Core for this question. (15 Marks)
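The three Q1 computations come down to keyed pair transformations. The following is a minimal plain-Python sketch of that logic, not a Spark program and not a model answer. The sample rows and function names are hypothetical, since the assignment does not show the contents of either file. In PySpark, the corresponding RDD operations would typically be `map`, `join`, `distinct`, `groupByKey`, and `reduceByKey`.

```python
from collections import defaultdict

# Hypothetical sample rows; the real CSV contents are not given in the assignment.
# users: (UserID, EmailID, NativeLanguage, Location)
USERS = [
    ("u1", "a@example.com", "English", "US"),
    ("u2", "b@example.com", "Hindi", "IN"),
    ("u3", "c@example.com", "French", "US"),
]
# transactions: (Transaction_ID, Product_ID, UserID, Price, Product_Description)
TRANSACTIONS = [
    ("t1", "p1", "u1", 10.0, "pen"),
    ("t2", "p1", "u2", 10.0, "pen"),
    ("t3", "p1", "u3", 10.0, "pen"),
    ("t4", "p2", "u1", 25.0, "book"),
    ("t5", "p2", "u1", 25.0, "book"),
]


def unique_location_counts(users, transactions):
    """(a) Number of distinct user locations per product."""
    # UserID -> Location lookup; stands in for joining the two datasets on UserID.
    location_of = {user_id: location for user_id, _, _, location in users}
    # Distinct (Product_ID, Location) pairs, then one count per pair.
    pairs = {
        (product_id, location_of[user_id])
        for _, product_id, user_id, _, _ in transactions
        if user_id in location_of
    }
    counts = defaultdict(int)
    for product_id, _ in pairs:
        counts[product_id] += 1
    return dict(counts)


def products_by_user(transactions):
    """(b) Set of Product_IDs bought by each user."""
    bought = defaultdict(set)
    for _, product_id, user_id, _, _ in transactions:
        bought[user_id].add(product_id)
    return dict(bought)


def spending_by_user_and_product(transactions):
    """(c) Total price paid, keyed by (UserID, Product_ID)."""
    totals = defaultdict(float)
    for _, product_id, user_id, price, _ in transactions:
        totals[(user_id, product_id)] += price
    return dict(totals)
```

Keying (c) by the composite `(UserID, Product_ID)` tuple is the same idea as using a tuple key with `reduceByKey` on an RDD: one addition per transaction, with no nested grouping.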

Q2. Consider the dataset file Olympics.csv. This file contains information about the Olympic Games, the players participating in them, and the medals they won. Using Spark Core and the data file, compute the following:

  1. The total medals each country won in a particular sport (such as Gymnastics).
  2. The number of medals India won in each Olympic Games.
  3. The top 3 countries by total medals for each Olympic Games year.

Remember, you must use Spark Core for this question. (10 Marks)
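The Q2 aggregations follow the same filter, key, and sum pattern, with a per-year sort for the top 3. Below is a minimal plain-Python sketch of that logic, not a Spark program. The assignment does not list the columns of Olympics.csv, so the row layout `(country, year, sport, medals)`, the sample values, and the function names are all assumptions made for illustration. With PySpark RDDs, the equivalent steps would typically use `filter`, `map`, `reduceByKey`, and `groupByKey`.

```python
from collections import defaultdict

# Assumed row layout: (country, year, sport, medals). The real columns of
# Olympics.csv are not given in the assignment; these rows are invented.
ROWS = [
    ("USA", 2008, "Gymnastics", 4),
    ("China", 2008, "Gymnastics", 6),
    ("India", 2008, "Shooting", 1),
    ("India", 2012, "Wrestling", 2),
    ("USA", 2012, "Swimming", 5),
    ("China", 2012, "Diving", 3),
    ("Japan", 2012, "Judo", 1),
    ("India", 2012, "Shooting", 2),
]


def medals_by_country_for_sport(rows, sport):
    """(1) Total medals per country, restricted to one sport."""
    totals = defaultdict(int)
    for country, _, row_sport, medals in rows:
        if row_sport == sport:
            totals[country] += medals
    return dict(totals)


def medals_by_year_for_country(rows, country):
    """(2) Total medals per Games year, restricted to one country."""
    totals = defaultdict(int)
    for row_country, year, _, medals in rows:
        if row_country == country:
            totals[year] += medals
    return dict(totals)


def top_countries_by_year(rows, n=3):
    """(3) Top-n (country, total) pairs for each Games year."""
    # First sum medals per (year, country) pair.
    totals = defaultdict(int)
    for country, year, _, medals in rows:
        totals[(year, country)] += medals
    # Regroup the per-pair totals under their year.
    per_year = defaultdict(list)
    for (year, country), total in totals.items():
        per_year[year].append((country, total))
    # Sort by total descending, breaking ties by country name for determinism.
    return {
        year: sorted(entries, key=lambda e: (-e[1], e[0]))[:n]
        for year, entries in per_year.items()
    }
```

Summing per `(year, country)` before ranking keeps the sort small: each year's list holds one entry per country rather than one per row.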

Q3. Consider the Movie Recommendation code and problem discussed in class (Sessions 5 and 6). Provide a brief write-up covering the problem, the steps needed to arrive at the solution (a recommendation system), and how exactly those steps are implemented in the code. You may use the PPT file that discusses the broad solution. In your write-up, also explain what each line of code does; explaining what each block of code does is not sufficient. (25 Marks)
