Please use this identifier to cite or link to this item: https://etd.cput.ac.za/handle/20.500.11838/3300
DC Field | Value | Language
dc.contributor.advisor | De la Harpe, A.C., Prof | en_US
dc.contributor.advisor | Bytheway, A.J., Prof | en_US
dc.contributor.advisor | Uys, C.S., Dr | en_US
dc.contributor.author | Fitzgerald, Kyle Andrew | en_US
dc.date.accessioned | 2021-07-02T13:06:37Z | -
dc.date.available | 2021-07-02T13:06:37Z | -
dc.date.issued | 2019 | -
dc.identifier.uri | http://etd.cput.ac.za/handle/20.500.11838/3300 | -
dc.description | Thesis (Doctor of Information and Communication Technology: Information Technology)--Cape Peninsula University of Technology, 2019 | en_US
dc.description.abstract | Information retrieval systems face challenges in handling mismatching vocabularies between queries and candidate source documents. As a result, these systems may retrieve some documents that are non-relevant and miss some that are relevant. This increases the time needed for research by forcing additional perusal of unsatisfactory results and additional searches using alternative vocabularies, which renders information retrieval systems less effective than they could be and inhibits productive research. The aim of this research was to design, build, and rigorously pilot test a hybrid indexing method that maintains phrase-term word ordinality and word proximity, and to compare the effectiveness of this method with the traditional inverted indexing method. The objectives were to prove statistically that the hybrid indexing method: i) increases the effectiveness of retrieving only those documents that are judged relevant by the user; ii) reduces errors in the identification of user-judged relevant documents, thus reducing the number of documents for the user to peruse; and iii) increases the quality of rejection of documents the user judges non-relevant, thus giving the user confidence in the judgement of the information retrieval system. A final objective was to determine whether this hybrid indexing method solves the problem of mismatching vocabulary between a query and a document, and satisfies the information needs of the user by retrieving only those documents from the collection that are relevant to the user. It must be noted that the results of the statistical analysis in this research are not the contribution to knowledge, as the statistics are used to prove that the hybrid indexing method worked. The indexing method itself is the contribution to the body of knowledge. The strategy was based on design science research and comprised both an exploratory and an explanatory study. Quantitative data were collected from the results of processing search queries through two information retrieval systems (one using the hybrid indexing method and the other the inverted indexing method) and from the results of a questionnaire completed by five participants during an experiment. The quantitative data were converted to binary and tested statistically using the mean values of precision, recall, and specificity, and the Kappa coefficient. The hybrid indexing method was presented and proved, with significance, to increase system effectiveness and specificity. Based on the results, the vocabulary mismatch problem between a query and a document was solved, but the information needs of the user were not satisfied. | en_US
dc.language.iso | en | en_US
dc.publisher | Cape Peninsula University of Technology | en_US
dc.subject | Hybrid token index | en_US
dc.subject | Information retrieval -- Research | en_US
dc.subject | Information storage and retrieval systems | en_US
dc.title | Hybridised indexing for research based information retrieval | en_US
dc.type | Thesis | en_US
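
The abstract above names four measures computed from binary relevance data: precision, recall, specificity, and a Kappa coefficient. The following is a minimal sketch of how such measures are commonly computed, not the thesis's own procedure. It assumes that 1 means "relevant" and 0 means "non-relevant", and that the Kappa coefficient is Cohen's kappa between the system's decision and the user's judgement; the record does not say which Kappa variant was used. The function names and the sample lists are hypothetical and are not data from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # system retrieved, user judged relevant
    fp: int  # system retrieved, user judged non-relevant
    fn: int  # system rejected, user judged relevant
    tn: int  # system rejected, user judged non-relevant


def confusion(system: list[int], user: list[int]) -> Confusion:
    """Count agreement cells for two equal-length binary lists."""
    if len(system) != len(user):
        raise ValueError("system and user lists must have the same length")
    pairs = list(zip(system, user))
    return Confusion(
        tp=sum(1 for s, u in pairs if s == 1 and u == 1),
        fp=sum(1 for s, u in pairs if s == 1 and u == 0),
        fn=sum(1 for s, u in pairs if s == 0 and u == 1),
        tn=sum(1 for s, u in pairs if s == 0 and u == 0),
    )


def precision(c: Confusion) -> float:
    """Share of retrieved documents that the user judged relevant."""
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def recall(c: Confusion) -> float:
    """Share of user-relevant documents that the system retrieved."""
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def specificity(c: Confusion) -> float:
    """Share of user non-relevant documents that the system rejected."""
    return c.tn / (c.tn + c.fp) if (c.tn + c.fp) else 0.0


def cohen_kappa(c: Confusion) -> float:
    """Cohen's kappa (assumed variant): agreement corrected for chance."""
    n = c.tp + c.fp + c.fn + c.tn
    if n == 0:
        return 0.0
    observed = (c.tp + c.tn) / n
    expected = ((c.tp + c.fp) * (c.tp + c.fn)
                + (c.fn + c.tn) * (c.fp + c.tn)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgements for ten documents (illustrative only).
system_decisions = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
user_judgements = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]
cells = confusion(system_decisions, user_judgements)
print(precision(cells), recall(cells), specificity(cells), cohen_kappa(cells))
```

In this sketch, specificity corresponds to the "rejection quality" of non-relevant documents mentioned in objective iii), and averaging each measure over a set of queries would give the mean values the abstract refers to.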
Appears in Collections: Information Technology - Doctoral Degree
Files in This Item:
File | Description | Size | Format
Fitzgerald_Kyle_205118801_Vol_1.pdf | Main Thesis File | 4.53 MB | Adobe PDF
Fitzgerald_Kyle_205118801_Vol._2.pdf | Appendices File | 5.94 MB | Adobe PDF

Items in Digital Knowledge are protected by copyright, with all rights reserved, unless otherwise indicated.