SIT719 Security and Privacy Issues in Analytics
Distinction/High Distinction Task 9.1: Location-based Privacy Protection
Overview
Trajectory data is valuable for many crowdsourcing tasks. For example, Uber and other services use drivers’ geolocation to match clients’ requests. However, there are serious concerns about the privacy of publishing geolocation data.
In this Distinction/High Distinction Task, you will experiment with machine learning classification algorithms. Please see more details in the Task Description. Before attempting this task, please make sure you are already up to date with all previous Credit and Pass tasks.
Task Description
Instructions:
Suppose that you are hired by a large company that uses its users’ geolocation data as a reference for allocating crowdsourcing tasks. The company has developed good algorithms for allocating tasks based on accurate geolocation data. One of the new business requirements is that each client can visualize a few nearby crowdsourcing workers before finalizing the order. Displaying geolocation on Google Maps and similar services is doable. However, accurate geolocation data is sensitive and cannot be disclosed directly to clients. Therefore, your boss has asked you to develop a demo system that protects the privacy of the geolocation data.
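The requirement above can be pictured with a minimal sketch: the server ranks workers by their exact positions but releases only coarsened positions to the client. Grid snapping is used here purely as an illustrative placeholder for whichever protection method you select. The names `Worker`, `snap_to_grid`, `nearby_workers_for_display`, the worker ids, and the 0.01-degree cell size are all assumptions made for this example, not part of the task.

```python
import math
from typing import NamedTuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


class Worker(NamedTuple):
    worker_id: int  # hypothetical identifier
    lat: float
    lon: float


def approx_distance_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation; adequate for ranking nearby points."""
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return EARTH_RADIUS_M * math.hypot(dx, dy)


def snap_to_grid(lat, lon, cell_deg):
    """Replace a point with the centre of the grid cell that contains it."""
    snapped_lat = (math.floor(lat / cell_deg) + 0.5) * cell_deg
    snapped_lon = (math.floor(lon / cell_deg) + 0.5) * cell_deg
    return snapped_lat, snapped_lon


def nearby_workers_for_display(client_lat, client_lon, workers, k=3, cell_deg=0.01):
    """Rank on exact positions, but return only coarsened positions."""
    ranked = sorted(
        workers,
        key=lambda w: approx_distance_m(client_lat, client_lon, w.lat, w.lon),
    )
    return [(w.worker_id, *snap_to_grid(w.lat, w.lon, cell_deg)) for w in ranked[:k]]


# Hypothetical workers placed at coordinates taken from the sample check-ins.
workers = [
    Worker(1, 53.3648119, -2.2723465833),
    Worker(2, 53.360511233, -2.276369017),
    Worker(3, 53.3679640626, -2.2792943689),
    Worker(4, 54.0, -2.0),
]
shown = nearby_workers_for_display(53.3650, -2.2720, workers, k=2)
```

Each entry of `shown` is a `(worker_id, lat, lon)` tuple whose coordinates are cell centres rather than true positions. A larger `cell_deg` hides more but makes the map less useful, which is the privacy/utility trade-off the report is expected to quantify.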
Since this is a demo system, the well-known “Gowalla” dataset is provided to simulate the crowdsourcing workers. It is available at https://snap.stanford.edu/data/loc-gowalla.html. The Gowalla dataset consists of multiple users’ check-in data with timestamps in five columns; some sample data look like this:
[user] | [check-in time] | [latitude] | [longitude] | [location id] |
196514 | 2010-07-24T13:45:06Z | 53.3648119 | -2.2723465833 | 145064 |
196514 | 2010-07-24T13:44:58Z | 53.360511233 | -2.276369017 | 1275991 |
196514 | 2010-07-24T13:44:46Z | 53.3653895945 | -2.2754087046 | 376497 |
196514 | 2010-07-24T13:44:38Z | 53.3663709833 | -2.2700764333 | 98503 |
196514 | 2010-07-24T13:44:26Z | 53.3674087524 | -2.2783813477 | 1043431 |
196514 | 2010-07-24T13:44:08Z | 53.3675663377 | -2.278631763 | 881734 |
196514 | 2010-07-24T13:43:18Z | 53.3679640626 | -2.2792943689 | 207763 |
196514 | 2010-07-24T13:41:10Z | 53.364905 | -2.270824 | 1042822 |
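To get familiar with the five-column layout shown above, a short parsing sketch may help. It uses only the Python standard library and assumes the raw file is tab-separated with one check-in per line; verify the delimiter against your download. The `CheckIn` class and the `read_checkins` function are hypothetical names introduced for this example.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CheckIn:
    user: int
    time: datetime
    lat: float
    lon: float
    location_id: int


def read_checkins(lines, delimiter="\t"):
    """Yield CheckIn records from an iterable of text lines.

    The tab delimiter is an assumption about the raw file format.
    """
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) != 5:
            continue  # skip blank or malformed lines
        user, stamp, lat, lon, location_id = row
        # Timestamps look like 2010-07-24T13:45:06Z (UTC).
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        yield CheckIn(int(user), when, float(lat), float(lon), int(location_id))


# Two rows from the sample above, written with tab separators.
sample = io.StringIO(
    "196514\t2010-07-24T13:45:06Z\t53.3648119\t-2.2723465833\t145064\n"
    "196514\t2010-07-24T13:44:58Z\t53.360511233\t-2.276369017\t1275991\n"
)
checkins = list(read_checkins(sample))
```

For the downloaded file, pass an open text-mode file handle instead of `sample`; a gzip archive can be opened with `gzip.open(path, "rt")`.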
You will need to download the dataset and familiarize yourself with it before performing the following actions:
- Find three privacy protection methods for location-based data from at least three published papers, including papers published on arxiv.org.
- Write a short literature review (approximately 500 words) to compare the identified methods in at least three aspects that are relevant to data privacy and utility.
- Identify meaningful performance metrics based on your critical literature review and comparison before proposing how to measure these metrics.
- Implement or apply the existing implementations of privacy protection methods on the Gowalla dataset.
- Report the performance metrics that are identified in Step 3.
- Demonstrate the proposed solutions with a few case studies using Google Maps.
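As a hedged illustration of the implementation and measurement steps above, the sketch below perturbs a point with planar Laplace noise, a mechanism from the geo-indistinguishability literature, and reports the mean displacement as one possible utility metric. The choice of mechanism and metric is an assumption for this example; the task leaves both to your literature review. The names `planar_laplace`, `haversine_m`, `mean_displacement_m`, and the parameter `epsilon_per_m` are hypothetical.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def planar_laplace(lat, lon, epsilon_per_m, rng=random):
    """Perturb a point with planar Laplace noise (smaller epsilon, more noise)."""
    theta = rng.uniform(0.0, 2.0 * math.pi)  # direction is uniform
    # The radius of planar Laplace noise follows Gamma(shape=2, scale=1/epsilon).
    radius_m = rng.gammavariate(2.0, 1.0 / epsilon_per_m)
    # Flat-earth conversion from metres to degrees; fine for short offsets
    # away from the poles.
    dlat = math.degrees(radius_m * math.cos(theta) / EARTH_RADIUS_M)
    dlon = math.degrees(
        radius_m * math.sin(theta) / (EARTH_RADIUS_M * math.cos(math.radians(lat)))
    )
    return lat + dlat, lon + dlon


def mean_displacement_m(points, epsilon_per_m, rng=random):
    """Utility loss: average distance between true and reported points."""
    errors = [
        haversine_m(lat, lon, *planar_laplace(lat, lon, epsilon_per_m, rng))
        for lat, lon in points
    ]
    return sum(errors) / len(errors)


rng = random.Random(0)  # fixed seed so the experiment is repeatable
points = [(53.3648119, -2.2723465833)] * 2000
loss = mean_displacement_m(points, epsilon_per_m=0.01, rng=rng)
```

With this parameterization the expected displacement is 2 / `epsilon_per_m` metres, so `loss` should be close to 200 m here. Repeating the run for several values of `epsilon_per_m` gives one privacy/utility curve that can be reported alongside the other methods you compare.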
A simple illustration of your demo may look like the following, where Pi are the crowdsourcing workers and ai are the clients:
Once you have completed the above steps of the project, you need to deliver the outcome. In the real world, results are typically delivered as a product/tool/web-app, through a presentation, or by submitting a report. However, in our unit, we will consider a report and a demo only.
Here, you need to write a report (at least 2,000 words, including the above-mentioned literature review) based on the outcome and results you obtained by performing the above steps. The report will describe the literature review, the algorithms used, their working principles, key parameters, and the results. Results should consider all the key performance measures and present comparative results in the form of tables, graphs, etc. The demo should be a 5-minute pre-recorded presentation.
Submit the PDF report and the demo PPT through OnTrack. You also need to submit the code (Python scripts) separately, within the “Code for task 9.1” folder under the assignment tab of CloudDeakin.
Marking Rubric:
Criteria | Unsatisfactory – Beginning | Developing | Accomplished | Exemplary | Total |
Report Focus: Purpose/Position Statement | 0-7 points: Fails to clearly relate the report topic or is not clearly defined and/or the report lacks focus throughout. | 8-11 points: The report is too broad in scope (outside of the title topic) and/or the report is somewhat unclear and needs to be developed further. Focal point is not consistently maintained throughout the report. | 12-15 points: The report provides adequate direction with some degree of interest for the reader. The report states the position, and maintains the focal point of the analysis for the most part. | 16-20 points: The report provides direction for the discussion part of the analysis that is engaging and thought provoking. The report clearly and concisely states the position, and consistently maintains the focal point. | /20 |
Comparative Analysis and Discussion | 0-15 points: Demonstrates a lack of understanding and inadequate knowledge of the topic. Analysis is very superficial and contains flaws. The report is also not clear. | 16-20 points: Demonstrates general understanding of Python scripting. Analysis is good and has addressed all criteria. Comparative analysis is presented. Sufficient discussion is also presented. | 21-24 points: Demonstrates good level of understanding of Python scripting. Algorithms are fine-tuned and comprise good selection of algorithms. Comparative results are presented using standard performance measures. | 25-30 points: Demonstrates superior level of understanding of Python scripting and algorithms. Algorithms are fine-tuned with some novelty or hybridization or advanced and/or recent algorithm. Comparative results are presented using performance measures in a way that provides very clear and meaningful insights into the output. | /30 |
Demonstration | 0-6 points: Demonstration lacks coherent ideas and fails to demonstrate a working system. | 7-11 points: Demonstration includes a working system; however, the benefits of privacy protection are not clearly presented. | 12-15 points: Demo is working and clearly explained; however, there might be occasional mistakes or difficult points to understand. | 16-20 points: Professionally conducted demo, free from errors, excellent talk with deep knowledge of privacy protection for location data. | /20 |
Writing Quality & Adherence to Format Guidelines | 0-10 points: Report shows a below average/poor writing style lacking in elements of appropriate standard English. Frequent errors in spelling, grammar, punctuation, usage, and/or formatting. | 11-17 points: Report shows an average and/or casual writing style using standard English. Some errors in spelling, grammar, punctuation, usage, and/or formatting. | 18-21 points: Report shows above average writing style (can be considered good) and clarity in writing using standard English. Minor errors in grammar, punctuation, spelling, usage, and/or formatting. Author has demonstrated the use of scientific language and results are well explained. | 22-30 points: Article is well written and clear, in standard English characterized by elements of a strong writing style. Basically free from grammar, punctuation, spelling, usage, or formatting errors. Author has demonstrated advanced use of scientific language and results are well explained with insights. | /30 |
Rubric adopted from: Denise Kreiger, Instructional Design and Technology Services, SC&I, Rutgers University, 4/2014