NIT3171 Classification algorithm

NIT3171 Classification algorithm

[ad_1]

NIT3171
Classification algorithm part
2 – Naive Bayesclassifier
Traffic classification
Bassam Saleh
bassam.saleh@vu.edu.au
• •
Classification algorithm part 2 – Naive Bayesclassifier
Traffic classification
OUTLINE
CLASSIFICATION ALGORITHM
PART 2 – NAIVE BAYESCLASSIFIER
Naive Bayesclassifiers – areafamily of simpleprobabilistic
classifiers basedon applyingBayes’theorem with strong
(naive) independence assumptionsbetween thefeatures.
CLASSIFICATIONALGORITHM
PART 2 – NAIVE BAYESCLASSIFIER
Example:gender classification
training data set
CLASSIFICATION ALGORITHM
PART 2 – NAIVE BAYESCLASSIFIER

Gender height weight
male 175 – 185 cm 75 kg
male 175 – 185 cm 75 kg
male 170 – 175 cm 70 kg
female 160 – 165 cm 55 kg
female 150 – 155 cm 50 kg
female 160 – 165 cm 55 kg

What gender would it be with the given measurements?
How muchconfidence do we haveto makesuchadecision?
CLASSIFICATION ALGORITHM
PART 2 – NAIVE BAYESCLASSIFIER

Gender height weight
? 160-165 cm 55 kg

Bayes’theorem isstated mathematically asthe following equation:
where A andBare events andP(B) ≠0.
P(A) andP(B) are the probabilities of observingA andBwithout
regard to each other.
P(A| B),aconditional probability, isthe probability of observing
eventA giventhat Bistrue.
P(B| A) isthe probability of observing event Bgiventhat A is
true.
CLASSIFICATION ALGORITHM
PART 2 – NAIVE BAYESCLASSIFIER
Abstractly, naiveBayesisaconditional probability model: givenaproblem
instance to be classified,represented byavector x (comprising of n
independent variables), it assignsto this instance probabilities
p(Ck | x)
for eachof kpossible outcomes or classesCk.
UsingBayes’theorem, the conditional probability canbe decomposedas
CLASSIFICATION ALGORITHM
PART 2 – NAIVE BAYESCLASSIFIER
p(Ck | x) = p(Ck) p(x | C k)
p(x)
In plain English,usingBayesianprobability terminology, the
aboveequation canbe written as
CLASSIFICATION ALGORITHM
PART 2 – NAIVE BAYESCLASSIFIER
prior ⇥likelihood
posterior =
evidence
Since p(Ck | x) _ p(Ck) p(x | Ck),
aBayesclassifier,isthe function that assignsaclasslabel yfor
somekasfollows:
y = argmax p(Ck)p(x |Ck)
k21,…,K
Recallthe previous example,
we havetwo classes,C1=male, C2=female
p(C1)=3/6=0.5, p(C2)=3/6=0.5.
x=160-165cm
p(C1)p(x|C1)=0.5*0=0
p(C2)p(x|C2)=0.5*0.33 = 0.165
y-> C2
CLASSIFICATION ALGORITHM
PART 2 – NAIVE BAYESCLASSIFIER

C1=male P(height|C1)
175 – 185 cm 2/3=0.66
170 – 175 cm 1/3=0.33
C2=female p(height|C2)
160 – 165 cm 2/3=0.66
165 – 155 cm 1/3=0.33

TRAFFIC CLASSIFICATION
Traffic classification – isanautomated processwhich
categorises computer network traffic according to various
parameters (for example, basedon port number or protocol)
into anumber of traffic classes.
TRAFFIC CLASSIFICATION
Network traffic – isthe amount of datamoving acrossa
network at agivenpoint of time. Network datain computer
networks ismostly encapsulatedin network packets,which
provide the load in the network.
Network packet – isaformatted unit of datacarried bya
packet-switched network.
TRAFFIC CLASSIFICATION
A packetconsistsof control information anduserdata,which is
alsoknown asthe payload.
Control information – provides datafor delivering the payload,
for example:source anddestination network addresses,error
detection codes,andsequencinginformation.Typically, control
information isfound in packetheadersand trailers.
TRAFFIC CLASSIFICATION
TRAFFIC CLASSIFICATION
TRAFFIC CLASSIFICATION
Where arenetworkpackets?
IPV4 Header
TRAFFIC CLASSIFICATION
TRAFFIC CLASSIFICATION
What canwe do with network traffic?
• Network traffic control – managing,prioritising, controlling or reducing
the network traffic
• Network traffic measurement – measuring the amount andtype of
traffic on a particular network ✓
• Network traffic simulation – to measure the efficiency of a
communications network
• Traffic generation model – isastochastic model of the traffic flows or
data sources in acommunication computer network.
This iswhere traffic classification canbe heavily leveraged, in oder for
effective bandwidth management.
TRAFFIC CLASSIFICATION
Usecasesof trafficclassification:
• Conduct trend analysisto estimate the sizeandorigins of capacity
demand trends, for network planning andbandwidth allocation.
• Dynamically mark packets requiring specific QoS (Quality of Service)
• Conduct dynamic accesscontrol, e.g.,detect forbidden applications,
Denial-of-Service (DoS) attacks, andC2 (command andcontrol)
communications
• Conduct intrusion detection to detect suspicious activities related to
cyber security breaches due to malicious usersor virus.
TRAFFIC CLASSIFICATION
Example,Trend analysis
TRAFFIC CLASSIFICATION
Traffic classification methods
• Port numbers
• Deep packet inspection (DPI) – bit pattern match
• Statistical classification – statistical analysis ✓
TRAFFIC CLASSIFICATION
Traffic classification basedon portnumbers
In the internet protocol suite, aport isanendpoint of
communication in anoperating system.A port isalways
associatedwith anIPaddressof ahost andthe protocol type
of the communication, andthus completes the destination or
origination network addressof acommunication session.A
port isidentified for eachaddressandprotocol bya16-bit
number,commonly known asthe port number.
TRAFFIC CLASSIFICATION
21: FileTransferProtocol (FTP)
22: SecureShell (SSH)
23:Telnetremote login service
25: SimpleMailTransferProtocol (SMTP)
53: Domain Name System(DNS) service
80: Hypertext TransferProtocol (HTTP) usedin the WWW
110:PostOffice Protocol (POP3)
119:Network News TransferProtocol (NNTP)
123:Network Time Protocol (NTP)
143:Internet MessageAccessProtocol(IMAP)
161:Simple Network Management Protocol (SNMP)
194:Internet RelayChat (IRC)
443:HTTPSecure (HTTPS)
TRAFFIC CLASSIFICATION
Port numbers basedtraffic classification’s features
• Fast
• Low resource-consuming
• Supported bymanynetworkdevices
• Does not implement the application-layer payload,soit does
not compromise the users’ privacy
• Useful only for the applications andservices, which usefixed
port numbers
• Easyto cheat bychangingthe port number in the system
TRAFFIC CLASSIFICATION
Traffic classification basedonDPI
DPIisaform of computer network packetfiltering that
examinesthe datapart (and possiblyalsothe header) of a
packet asit passesaninspection point, searchingfor protocol
non-compliance, viruses,spam,intrusions, or defined criteria to
decide whether the packet maypassor if it needsto be routed
to adifferent destination.
TRAFFIC CLASSIFICATION
Example,DPI
TRAFFIC CLASSIFICATION
DPI basedtraffic classification’s features
• Inspectsthe actualpayloadof the packet
• Detects the applications andservicesregardlessof the port number,
on which they operate
• Lacksupport for manyapplications, asSkype,which isbadly
supported bymostclassifiers
• Slow
• Requiresalot of processingpower
• Signaturesmust be kept up to date, asthe applications changevery
frequently
• Encryption makesin manycasesthis method impossible
TRAFFIC CLASSIFICATION
DPIbasedtraffic classification’stools
commercial
•PACE (R&S)
•NBAR (CISCO)
open-source
•OpenDPI
•L7-filter
•nDPI
•Libprotoident
TRAFFIC CLASSIFICATION
Traffic classification basedon statisticalanalysis
Statistical traffic classification works on features extracted from
packetsheadersor payloads,leveragingvarious statistical/data
mining/machine learning techniques to classify traffic.
TRAFFIC CLASSIFICATION
Statistical traffic classification’sfeatures
• Relieson statistical analysisof attributes suchasbyte
frequencies,packet sizesandpacketinter-arrival times.
• Very often usesclassification algorithms suchasK-means,
NaiveBayesclassifier,C4.5,C5.0,J48,or randomforest
• Fasttechnique (compared to DPI classification)
• It candetect the classof yet unknown applications.
TRAFFIC CLASSIFICATION
Statistical traffic classification example (Naive Bayesclassifier)
TRAFFIC CLASSIFICATION

Category method bytes
Game CONNECT 501
Game CONNECT 880
Game CONNECT 895
Game CONNECT 895
Game CONNECT 830
Web AD GET 1495
Web AD CONNECT 885
Web AD GET 1620

What would be the category with the given packet?
TRAFFIC CLASSIFICATION

Category method bytes
? CONNECT 855

C1=“Game” C2=“Web AD”
p(C1)=5/8=0.625, p(C2)=3/8=0.375
x=CONNECT, 855
p(C1)p(x|C1)=0.625*p(CONNECT|C1)*p(885|
C1)=0.625*1*(4/5)=0.5
p(C2)p(x|C2)=0.375*p(CONNECT|C2)*p(885|
C2)=0.375*0.33*0.33=0.04
Sincep(C1)p(x|C1) > p(C2)p(x|C2),
method=CONNECT bytes=855 => category=C1
TRAFFIC CLASSIFICATION
Questions?

[Button id=”1″]

[ad_2]

Source link

"96% of our customers have reported a 90% and above score. You might want to place an order with us."

Essay Writing Service
Affordable prices

You might be focused on looking for a cheap essay writing service instead of searching for the perfect combination of quality and affordable rates. You need to be aware that a cheap essay does not mean a good essay, as qualified authors estimate their knowledge realistically. At the same time, it is all about balance. We are proud to offer rates among the best on the market and believe every student must have access to effective writing assistance for a cost that he or she finds affordable.

Caring support 24/7

If you need a cheap paper writing service, note that we combine affordable rates with excellent customer support. Our experienced support managers professionally resolve issues that might appear during your collaboration with our service. Apply to them with questions about orders, rates, payments, and more. Contact our managers via our website or email.

Non-plagiarized papers

“Please, write my paper, making it 100% unique.” We understand how vital it is for students to be sure their paper is original and written from scratch. To us, the reputation of a reliable service that offers non-plagiarized texts is vital. We stop collaborating with authors who get caught in plagiarism to avoid confusion. Besides, our customers’ satisfaction rate says it all.

© 2022 Homeworkcrew.com provides writing and research services for limited use only. All the materials from our website should be used with proper references and in accordance with Terms & Conditions.

Scroll to Top