Please use this identifier to cite or link to this item: https://etd.cput.ac.za/handle/20.500.11838/3086
Title: Real-time probabilistic reasoning system using Lambda architecture
Authors: Anikwue, Arinze 
Keywords: Big Data;big data processing;probabilistic reasoning;probabilistic programming;Lambda architecture
Issue Date: 2019
Publisher: Cape Peninsula University of Technology
Abstract: The proliferation of data from sources such as social media and sensor devices has overwhelmed traditional data storage and analysis technologies. This has prompted a radical improvement in data management techniques, tools and technologies to meet the increasing demand for effective collection, storage and curation of large data sets. Most of these technologies are open-source. Big data is usually described as very large datasets. However, a major feature of big data is its velocity: data flows in as a continuous stream and must be acted on in real time to yield meaningful, relevant value. Although there has been an explosion of technologies for handling big data, they usually target either large historic datasets or real-time big data, but not both. Hence the need for a unified framework that handles both high-volume datasets and real-time big data, which led to the development of models such as the Lambda architecture. Effective decision-making requires processing historic data as well as real-time data. Some decision-making involves complex processes that depend on the likelihood of events. Probabilistic systems were designed to handle this uncertainty. They use probabilistic models, such as hidden Markov models, together with inference algorithms to process data and produce probabilistic scores. However, developing these models requires extensive knowledge of statistics and machine learning, which makes modelling real-life circumstances an uphill task. A new research area called probabilistic programming has been introduced to alleviate this bottleneck. This research proposes combining modern open-source big data technologies with probabilistic programming and the Lambda architecture on readily available hardware to develop a highly fault-tolerant and scalable tool that processes both historic and real-time big data in real time, as a single common solution.
This system will empower decision makers to make better-informed decisions, especially in the face of uncertainty. The outcome of this research will be a technology product, built and assessed using experimental evaluation methods. This research will use the Design Science Research (DSR) methodology, as it provides guidelines for the effective and rigorous construction and evaluation of an artefact. Probabilistic programming in the big data domain is still in its infancy; however, the developed artefact demonstrated the significant potential of probabilistic programming combined with the Lambda architecture for processing big data.
Description: Thesis (MTech (Information Technology))--Cape Peninsula University of Technology, 2019
URI: http://hdl.handle.net/20.500.11838/3086
Appears in Collections: Information Technology - Master's Degree

Files in This Item:
Anikwue_Arinze_214177149.pdf (1.89 MB, Adobe PDF)

Items in Digital Knowledge are protected by copyright, with all rights reserved, unless otherwise indicated.