Please use this identifier to cite or link to this item:
https://etd.cput.ac.za/handle/20.500.11838/3300
Title: | Hybridised indexing for research based information retrieval |
Authors: | Fitzgerald, Kyle Andrew |
Keywords: | Hybrid token index; Information retrieval -- Research; Information storage and retrieval systems |
Issue Date: | 2019 |
Publisher: | Cape Peninsula University of Technology |
Abstract: | Challenges exist for information retrieval systems in handling mismatching vocabularies in queries and candidate source documents. As a result, these information retrieval systems may retrieve some documents that are non-relevant and miss some that are relevant. This increases the time for research by forcing additional perusal of unsatisfactory results, and additional searches using alternative vocabularies, which renders information retrieval systems less effective than they could be, and inhibits productive research. The aim of this research was to design, build, and rigorously pilot test a hybrid indexing method that maintains phrase-term word ordinality and word proximity, and to compare the effectiveness of this method with the traditional inverted indexing method. The objectives were to prove statistically that the hybrid indexing method: i) increases the effectiveness of retrieving only those documents that are judged relevant by the user; ii) reduces errors in incorrect identification of user-judged relevant documents, thus reducing the number of documents for the user to peruse; and iii) increases the rejection quality of user non-relevant documents, thus providing confidence to the user in the judgement of the information retrieval system. A final objective was to determine whether this hybrid indexing method solves the problem of mismatching vocabulary between a query and a document, and satisfies the information needs of the user by retrieving only those documents from the collection relevant to the user. It must be noted that the results from the statistical analysis in this research are not the contribution to knowledge, as the statistics are used to prove that the hybrid indexing method worked. This indexing method is the contribution to the body of knowledge. The strategy used was based on design science research performing both an exploratory and an explanatory study. Quantitative data were collected from the results of processing search queries through two information retrieval systems (one using the hybrid indexing method and the other the inverted indexing method) and from the results of a questionnaire completed by five participants during an experiment. The quantitative data were converted to binary and tested statistically using the means of precision, recall, and specificity, and the Kappa coefficient. The hybrid indexing method was presented and proved, with significance, to increase system effectiveness and specificity. Based on the results, the vocabulary mismatch problem between a query and a document was solved, but the information needs of the user were not satisfied. |
Description: | Thesis (Doctor of Information and Communication Technology: Information Technology)--Cape Peninsula University of Technology, 2019 |
URI: | http://etd.cput.ac.za/handle/20.500.11838/3300 |
Appears in Collections: | Information Technology - Doctoral Degree |
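The evaluation measures named in the abstract (precision, recall, specificity, and a Kappa coefficient, all computed on binary data) can be illustrated with a minimal Python sketch. This is not the thesis's code: the function names, the sample judgements, and the choice of Cohen's kappa for two raters are assumptions made for illustration only, since the abstract does not state which Kappa variant was used.

```python
from typing import Sequence, Tuple


def confusion_counts(retrieved: Sequence[int], relevant: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (tp, fp, fn, tn) from two equal-length binary lists.

    retrieved[i] is 1 if the system returned document i, else 0.
    relevant[i] is 1 if the user judged document i relevant, else 0.
    """
    tp = fp = fn = tn = 0
    for r, j in zip(retrieved, relevant):
        if r and j:
            tp += 1
        elif r and not j:
            fp += 1
        elif not r and j:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp: int, fp: int) -> float:
    # Share of retrieved documents that are relevant.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    # Share of relevant documents that were retrieved.
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn: int, fp: int) -> float:
    # Share of non-relevant documents that were correctly rejected.
    return tn / (tn + fp) if (tn + fp) else 0.0


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two binary raters (assumed variant, see lead-in)."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0


# Hypothetical judgements for eight documents (illustrative data only).
retrieved = [1, 1, 1, 0, 0, 0, 1, 0]
relevant = [1, 0, 1, 0, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(retrieved, relevant)
print(precision(tp, fp), recall(tp, fn), specificity(tn, fp))
print(cohen_kappa(retrieved, relevant))
```

Averaging these per-query values over a set of queries would give mean precision, recall, and specificity for each system, which is one plausible reading of how the two indexing methods were compared.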
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Fitzgerald_Kyle_205118801_Vol_1.pdf | Main Thesis File | 4.53 MB | Adobe PDF |
Fitzgerald_Kyle_205118801_Vol._2.pdf | Appendices File | 5.94 MB | Adobe PDF |
Items in Digital Knowledge are protected by copyright, with all rights reserved, unless otherwise indicated.