Please use this identifier to cite or link to this item: https://etd.cput.ac.za/handle/20.500.11838/3086
DC Field: Value [Language]
dc.contributor.advisor: Kabaso, Boniface, Dr
dc.contributor.author: Anikwue, Arinze
dc.date.accessioned: 2020-04-30T10:08:08Z
dc.date.available: 2020-04-30T10:08:08Z
dc.date.issued: 2019
dc.identifier.uri: http://hdl.handle.net/20.500.11838/3086
dc.description: Thesis (MTech (Information Technology))--Cape Peninsula University of Technology, 2019 [en_US]
dc.description.abstract: The proliferation of data from sources such as social media and sensor devices has become overwhelming for traditional data storage and analysis technologies to handle. This has prompted a radical improvement in data management techniques, tools and technologies to meet the increasing demand for the effective collection, storage and curation of large data sets. Most of these technologies are open-source. Big data is usually described in terms of very large datasets. However, a major feature of big data is its velocity: data flows in as a continuous stream and must be acted upon in real time to yield meaningful, relevant value. Although there is an explosion of technologies to handle big data, they are usually targeted at processing large (historic) datasets and real-time big data independently. Hence the need for a unified framework to handle both high-volume datasets and real-time big data, which has resulted in the development of models such as the Lambda architecture. Effective decision-making requires the processing of historic data as well as real-time data. Some decision-making involves complex processes that depend on the likelihood of events. To handle uncertainty, probabilistic systems were designed. Probabilistic systems use probabilistic models, developed with probability theories such as hidden Markov models, together with inference algorithms to process data and produce probabilistic scores. However, the development of these models requires extensive knowledge of statistics and machine learning, making it an uphill task to model real-life circumstances. A new research area called probabilistic programming has been introduced to alleviate this bottleneck. This research proposes the combination of modern open-source big data technologies with probabilistic programming and the Lambda architecture, on readily available hardware, to develop a highly fault-tolerant and scalable tool that processes both historic and real-time big data in real time: a common solution.
This system will empower decision makers with the capacity to make better-informed resolutions, especially in the face of uncertainty. The outcome of this research is a technology product, built and assessed using experimental evaluation methods. The research follows the Design Science Research (DSR) methodology, as it describes guidelines for the effective and rigorous construction and evaluation of an artefact. Probabilistic programming in the big data domain is still in its infancy; however, the developed artefact demonstrated the significant potential of probabilistic programming combined with the Lambda architecture in the processing of big data. [en_US]
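The abstract above describes the Lambda architecture's core idea: a batch layer recomputes views over historic data, a speed layer maintains incremental views over the live stream, and a serving layer merges both to answer queries in real time. A minimal, purely illustrative Python sketch of that idea follows; all names here are hypothetical and this is not the thesis's actual artefact, which is built on open-source big data tooling.

```python
# Illustrative sketch of the Lambda-architecture pattern (hypothetical names):
# batch layer + speed layer, merged by a serving layer. Event counting stands
# in for whatever computation the real system would perform.

from collections import Counter

def batch_layer(historic_events):
    """Recompute a complete view from the full historic dataset."""
    return Counter(historic_events)

def speed_layer(stream_events):
    """Maintain an incremental view over recent, not-yet-batched events."""
    return Counter(stream_events)

def serving_layer(batch_view, realtime_view):
    """Answer queries by merging the batch and real-time views."""
    return batch_view + realtime_view

historic = ["login", "login", "purchase"]   # already processed in batch
recent = ["login", "purchase", "purchase"]  # arriving on the stream
merged = serving_layer(batch_layer(historic), speed_layer(recent))
# merged counts reflect both historic and streaming data
```

The key design point the abstract makes is that neither layer alone suffices: the batch layer is accurate but stale, the speed layer is fresh but partial, and only their merge gives a complete real-time view.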
dc.language.iso: en [en_US]
dc.publisher: Cape Peninsula University of Technology [en_US]
dc.subject: Big Data [en_US]
dc.subject: big data processing [en_US]
dc.subject: probabilistic reasoning [en_US]
dc.subject: probabilistic programming [en_US]
dc.subject: Lambda architecture [en_US]
dc.title: Real-time probabilistic reasoning system using Lambda architecture [en_US]
dc.type: Thesis [en_US]
Appears in Collections: Information Technology - Master's Degree
Files in This Item:
File: Anikwue_Arinze_214177149.pdf (1.89 MB, Adobe PDF)

Page view(s): 716 (last week: 1; last month: 9; checked on Nov 27, 2024)
Download(s): 1,467 (checked on Nov 27, 2024)

Items in Digital Knowledge are protected by copyright, with all rights reserved, unless otherwise indicated.