Machine Learning – Collaborative Filtering
You are to use collaborative filtering techniques to predict which political party voters who have not been polled will vote for in an upcoming election. We assume that we have a large data store of voters and many attributes about the voters. Attributes include age_group (with values of young, middle, old), gender, income_bracket (with values of under_50K, 50_150K, 150_300K, over_300K), marital_status, number_of_children, profession (with many different values), education_level (with values of no_high_school, high_school, bachelors, masters, doctor), number_of_automobiles, political_party, and state. Also assume that many voters have already been polled and that the party they stated they would vote for is also stored in the data store.

a. Design a schema for a structured cloud table such as Accumulo to represent this data.

b. Write pseudocode for determining similarity, called VoterSimilarity(), with signature:

UserSimilarity similarity = VoterSimilarity(voterA, voterB);

Assume that categorical facts such as gender or state have a similarity value of either 0 (different values) or 1 (same value). Assume that attributes whose values lie along a spectrum, such as age_group or education_level, have a value of 1 for an exact match; 0 for one end of the spectrum to the other, such as young to old or no_high_school to doctor; or the fraction of difference for other measurements, for example no_high_school to high_school is ¾, no_high_school to bachelors is ½, high_school to doctor is ¼, etc. To determine overall similarity between two voters, just add up the similarity scores for each of the attributes of the two voters and compare the overall scores.

c. Given a voter who has not been polled, write pseudocode to find all nearest neighbors among voters who have been polled that pass a certain threshold. The signature is:

Neighborhood neighborhood = nearestNeighbors(threshold, voterA /* not polled */, allVoters /* all voters */);

Assume that you have access to the VoterSimilarity() method from the previous section.

d. Write pseudocode for selecting how a voter will vote. A simple metric is to determine whether the majority of the neighborhood were polled Democrat or Republican.

Vote vote = PredictVote(voterA, neighborhood);
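Parts b through d could be sketched in Python roughly as follows. This is only one possible reading of the assignment, not a reference solution. It assumes each voter is a plain dict keyed by the attribute names above, that the poll answer lives in a hypothetical polled_vote field (None when the voter has not been polled), that income_bracket is treated as a spectrum like age_group and education_level, that the two count attributes are compared by exact match, and that "pass a certain threshold" means a total score greater than or equal to the threshold. The names ORDINAL_SCALES and CATEGORICAL are illustrative.

```python
from collections import Counter

# Ordered scales for spectrum attributes; values come from the assignment text.
# Treating income_bracket as a spectrum is an assumption.
ORDINAL_SCALES = {
    "age_group": ["young", "middle", "old"],
    "income_bracket": ["under_50K", "50_150K", "150_300K", "over_300K"],
    "education_level": ["no_high_school", "high_school", "bachelors",
                        "masters", "doctor"],
}

# Attributes scored 1 on an exact match and 0 otherwise.
# Comparing the two count attributes this way is an assumption.
CATEGORICAL = ["gender", "marital_status", "number_of_children", "profession",
               "number_of_automobiles", "political_party", "state"]


def ordinal_similarity(scale, a, b):
    """Return 1 for an exact match, 0 for opposite ends, linear in between."""
    distance = abs(scale.index(a) - scale.index(b))
    return 1 - distance / (len(scale) - 1)


def voter_similarity(voter_a, voter_b):
    """Sum the per-attribute similarity scores of two voters."""
    score = 0.0
    for attr, scale in ORDINAL_SCALES.items():
        score += ordinal_similarity(scale, voter_a[attr], voter_b[attr])
    for attr in CATEGORICAL:
        score += 1.0 if voter_a[attr] == voter_b[attr] else 0.0
    return score


def nearest_neighbors(threshold, voter_a, all_voters):
    """Return polled voters whose similarity to voter_a is >= threshold."""
    return [
        v for v in all_voters
        if v is not voter_a
        and v.get("polled_vote") is not None  # hypothetical poll field
        and voter_similarity(voter_a, v) >= threshold
    ]


def predict_vote(voter_a, neighborhood):
    """Return the most common polled vote in the neighborhood, or None.

    voter_a is unused here; it is kept only to mirror the given signature.
    Ties go to whichever party was counted first.
    """
    if not neighborhood:
        return None
    counts = Counter(v["polled_vote"] for v in neighborhood)
    return counts.most_common(1)[0][0]
```

With ten attributes in total, two identical voters score 10 under this sketch, so a threshold such as 8 or 9 keeps only close matches. An empty neighborhood yields None rather than a guessed party.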
