
<ns0:uwmetadata xmlns:ns0="http://phaidra.univie.ac.at/XML/metadata/V1.0" xmlns:ns1="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0" xmlns:ns10="http://phaidra.univie.ac.at/XML/metadata/provenience/V1.0" xmlns:ns11="http://phaidra.univie.ac.at/XML/metadata/provenience/V1.0/entity" xmlns:ns12="http://phaidra.univie.ac.at/XML/metadata/digitalbook/V1.0" xmlns:ns13="http://phaidra.univie.ac.at/XML/metadata/etheses/V1.0" xmlns:ns2="http://phaidra.univie.ac.at/XML/metadata/extended/V1.0" xmlns:ns3="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/entity" xmlns:ns4="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/requirement" xmlns:ns5="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/educational" xmlns:ns6="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/annotation" xmlns:ns7="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/classification" xmlns:ns8="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/organization" xmlns:ns9="http://phaidra.univie.ac.at/XML/metadata/histkult/V1.0">
  <ns1:general>
    <ns1:identifier>o:36303</ns1:identifier>
    <ns1:title language="en">Detecting outliers in claim data using machine learning</ns1:title>
    <ns1:language>en</ns1:language>
    <ns1:description language="en">In general, outliers are items that differ significantly from the majority of other items they may be compared to. In this chapter, we use the term anomaly synonymously, though often also to indicate a single unusual property of an item: the individual oddities that make an item an outlier.

Outlier detection refers to the process of finding unusual items. For tabular data, this usually means identifying unusual rows in a table; for image data, unusual images; for text data, unusual documents; and similarly for other types of data. The specific definitions of &#8220;normal&#8221; and &#8220;unusual&#8221; can vary, but at a fundamental level, outlier detection operates on the assumption that the majority of items within a dataset can be considered normal, while those that differ significantly from the majority may be considered unusual, or outliers. For instance, when working with a database of claims, we assume that the majority of claims represent normal behaviour, and our goal is to locate the claims that stand out as distinct from these.

In statistical theory, outliers can arise from measurement errors, errors in data transmission, elements incorrectly included in the population from outside it, a flaw in an assumed theory, or genuine variability. Identifying outliers is crucial, as they can skew statistical analyses and lead to misleading conclusions. Applications of outlier detection include fraud detection, bot detection on social media, network security, financial auditing, regulatory oversight of financial markets, medical diagnosis, astronomy, data quality, and the development of autonomous vehicles.

It is important to note that not all outliers are necessarily problematic; in fact, many are not even interesting. Outlier detection is concerned not merely with removing noise but also with finding interesting database objects whose behaviour deviates considerably from the majority and which, as such, can provide new insights. Inconsistency can mean that a data object is contaminated, i.e. drawn from a distribution different from the model chosen to describe the data. But inconsistency can also mean that the presupposed model does not describe the data as well as was assumed when the model was selected. Both conclusions can have significant repercussions for the interpretation of the given observations.

There are several ways to find fraudulent records in tabular data. For economic or financial data, we can use a group of methods based on Benford&apos;s law, whose weakness is its sensitivity to sample size. In this chapter, we show how the same task can be solved using unsupervised learning techniques for detecting outliers in insurance data. By applying both techniques, the pool of suspicious records becomes smaller, which can lead to savings in audit work.</ns1:description>
    <ns1:keyword language="en">machine learning, outliers, insurance</ns1:keyword>
    <ns2:identifiers>
      <ns2:resource>91552100</ns2:resource>
      <ns2:identifier>170432009</ns2:identifier>
    </ns2:identifiers>
    <ns2:identifiers>
      <ns2:resource>1552100</ns2:resource>
      <ns2:identifier>978-86-403-1879-2</ns2:identifier>
    </ns2:identifiers>
  </ns1:general>
  <ns1:lifecycle>
    <ns1:upload_date>2025-06-11T10:03:06.906Z</ns1:upload_date>
    <ns1:status>44</ns1:status>
    <ns2:peer_reviewed>no</ns2:peer_reviewed>
    <ns1:contribute seq="0">
      <ns1:role>46</ns1:role>
      <ns1:entity seq="0">
        <ns3:firstname>Dragan</ns3:firstname>
        <ns3:lastname>Azdejković</ns3:lastname>
        <ns3:institution>Univerzitet u Beogradu Ekonomski fakultet</ns3:institution>
        <ns3:conor>12650343</ns3:conor>
        <ns3:orcid>0000-0002-1399-4302</ns3:orcid>
      </ns1:entity>
      <ns1:date>2025</ns1:date>
    </ns1:contribute>
  </ns1:lifecycle>
  <ns1:technical>
    <ns1:format>application/pdf</ns1:format>
    <ns1:size>822694</ns1:size>
    <ns1:location>https://phaidrabg.bg.ac.rs/o:36303</ns1:location>
  </ns1:technical>
  <ns1:rights>
    <ns1:cost>no</ns1:cost>
    <ns1:copyright>yes</ns1:copyright>
    <ns1:license>18</ns1:license>
  </ns1:rights>
  <ns1:classification>
    <ns1:purpose>70</ns1:purpose>
    <ns7:keyword language="en" seq="0">004.85</ns7:keyword>
  </ns1:classification>
  <ns1:organization>
    <ns8:hoschtyp>1556235</ns8:hoschtyp>
    <ns8:orgassignment>
      <ns8:faculty>11A03</ns8:faculty>
    </ns8:orgassignment>
  </ns1:organization>
  <ns12:digitalbook>
    <ns12:name_magazine language="sr">Innovations in insurance : from traditional to modern market</ns12:name_magazine>
    <ns12:from_page>385</ns12:from_page>
    <ns12:to_page>404</ns12:to_page>
    <ns12:publisherlocation>Belgrade</ns12:publisherlocation>
    <ns12:publisher>University of Belgrade, Faculty of economics and business, Publishing centre</ns12:publisher>
    <ns12:releaseyear>2025</ns12:releaseyear>
  </ns12:digitalbook>
</ns0:uwmetadata>
