
<ns0:uwmetadata xmlns:ns0="http://phaidra.univie.ac.at/XML/metadata/V1.0" xmlns:ns1="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0" xmlns:ns10="http://phaidra.univie.ac.at/XML/metadata/provenience/V1.0" xmlns:ns11="http://phaidra.univie.ac.at/XML/metadata/provenience/V1.0/entity" xmlns:ns12="http://phaidra.univie.ac.at/XML/metadata/digitalbook/V1.0" xmlns:ns13="http://phaidra.univie.ac.at/XML/metadata/etheses/V1.0" xmlns:ns2="http://phaidra.univie.ac.at/XML/metadata/extended/V1.0" xmlns:ns3="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/entity" xmlns:ns4="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/requirement" xmlns:ns5="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/educational" xmlns:ns6="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/annotation" xmlns:ns7="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/classification" xmlns:ns8="http://phaidra.univie.ac.at/XML/metadata/lom/V1.0/organization" xmlns:ns9="http://phaidra.univie.ac.at/XML/metadata/histkult/V1.0">
  <ns1:general>
    <ns1:identifier>o:30266</ns1:identifier>
    <ns1:title language="en">Creating a Stop Word Dictionary in Serbian</ns1:title>
    <ns1:language>en</ns1:language>
    <ns1:description language="en">Abstract:
Natural language processing techniques make it possible to obtain a great deal of information
from documents, for example through the extraction of document topics, the mapping of document
keywords, or content-based classification of documents. An important step in obtaining this
information is to separate the words that carry informative value in a sentence from those that
do not affect its meaning. Dictionaries of stop words, specific to each natural language, are used
to mark the words that do not carry meaning in a sentence. This paper presents the creation of a
stop word dictionary for Serbian. The influence of stop words on text processing is presented
on three different data sets. It is shown that using the proposed dictionary of Serbian stop words
reduces the data set dimension by 15% to 39%, while the quality of the obtained n-gram
language models is improved.

</ns1:description>
    <ns1:keyword language="en">stop words, Serbian, text mining, natural language processing, normalization</ns1:keyword>
    <ns2:identifiers>
      <ns2:resource>1552099</ns2:resource>
      <ns2:identifier>10.5937/SPSUNP2101017M</ns2:identifier>
    </ns2:identifiers>
    <ns2:identifiers>
      <ns2:resource>1552101</ns2:resource>
      <ns2:identifier>2217-5539</ns2:identifier>
    </ns2:identifiers>
  </ns1:general>
  <ns1:lifecycle>
    <ns1:upload_date>2023-06-23T11:47:48.738Z</ns1:upload_date>
    <ns1:status>44</ns1:status>
    <ns2:peer_reviewed>no</ns2:peer_reviewed>
    <ns1:contribute seq="0">
      <ns1:role>46</ns1:role>
      <ns1:entity seq="0">
        <ns3:firstname>Ulfeta A.</ns3:firstname>
        <ns3:lastname>Marovac</ns3:lastname>
        <ns3:institution>Državni univerzitet u Novom Pazaru</ns3:institution>
        <ns3:conor>86849033</ns3:conor>
        <ns3:orcid>0000-0001-7232-3755</ns3:orcid>
      </ns1:entity>
      <ns1:entity seq="1">
        <ns3:firstname>Aldina R.</ns3:firstname>
        <ns3:lastname>Avdić</ns3:lastname>
        <ns3:institution>Državni univerzitet u Novom Pazaru</ns3:institution>
        <ns3:type>person</ns3:type>
        <ns3:orcid>0000-0003-4312-3839</ns3:orcid>
      </ns1:entity>
      <ns1:entity seq="2">
        <ns3:firstname>Adela B.</ns3:firstname>
        <ns3:lastname>Ljajić</ns3:lastname>
        <ns3:type>person</ns3:type>
        <ns3:orcid>0000-0001-7326-059X</ns3:orcid>
      </ns1:entity>
    </ns1:contribute>
  </ns1:lifecycle>
  <ns1:technical>
    <ns1:format>application/pdf</ns1:format>
    <ns1:size>157429</ns1:size>
    <ns1:location>https://phaidrabg.bg.ac.rs/o:30266</ns1:location>
  </ns1:technical>
  <ns1:rights>
    <ns1:cost>no</ns1:cost>
    <ns1:copyright>yes</ns1:copyright>
    <ns1:license>21</ns1:license>
  </ns1:rights>
  <ns1:classification>
    <ns1:purpose>70</ns1:purpose>
  </ns1:classification>
  <ns1:organization>
    <ns8:hoschtyp>92000001</ns8:hoschtyp>
    <ns8:orgassignment>
      <ns8:faculty>20A01</ns8:faculty>
    </ns8:orgassignment>
  </ns1:organization>
  <ns12:digitalbook>
    <ns12:name_magazine language="en">Scientific Publications of the State University of Novi Pazar Series A: Applied Mathematics, Informatics and Mechanics</ns12:name_magazine>
    <ns12:volume>13</ns12:volume>
    <ns12:booklet>1</ns12:booklet>
    <ns12:from_page>17</ns12:from_page>
    <ns12:to_page>25</ns12:to_page>
    <ns12:releaseyear>2021</ns12:releaseyear>
  </ns12:digitalbook>
</ns0:uwmetadata>
