Big Data: Week 9 - Data Analysis Recap

Data Analysis Recap
HTML Creative Commons: Attribution-Noncommercial-No Derivative Works 4.0
    1 file in this resource
    Summary: Over the last three weeks, we have covered data exploration, data pre-processing and data analysis. In the data exploration stage, we use approaches such as visual exploration or statistics to understand what is in a dataset and what its characteristics are. These characteristics include the size or amount of data, the completeness of the data (e.g., the number of missing values), the correctness of the data (e.g., the presence of outliers), and possible relationships among data elements or variables (e.g., correlations, distributions). Based on the information obtained through data exploration, data pre-processing aims to improve the quality of the data by dealing with missing values, removing unusable parts of the data, correcting poorly formatted elements and defining relevant relationships across datasets. Common approaches to data pre-processing include missing value imputation, data scaling, normalization, feature selection and dimensionality reduction. After data exploration and data pre-processing, the data are ready for analysis. Machine learning algorithms are widely used in data analysis, and they can be grouped into different types; supervised learning and unsupervised learning are the two major types. In this module, we introduced three supervised machine learning algorithms, namely linear regression, SVM (Support Vector Machine) and NN (Neural Network), and one unsupervised machine learning algorithm, K-means clustering.
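
    The stages named in the summary can be illustrated with a minimal sketch. It is not taken from the module materials: the function names, the toy data and the use of plain Python rather than a machine learning library are all assumptions made for illustration. It shows mean imputation and min-max scaling (pre-processing), an ordinary least squares line fit (supervised learning) and a one-dimensional K-means (unsupervised learning).

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def nearest_index(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


def kmeans_1d(values, centroids, iterations=10):
    """K-means (Lloyd's algorithm) on 1-D data from given starting centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            clusters[nearest_index(v, centroids)].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    labels = [nearest_index(v, centroids) for v in values]
    return centroids, labels


# Hypothetical toy data: hours studied (one value missing) and test scores.
hours = impute_mean([1, 2, None, 4, 5])
scores = [2, 4, 6, 8, 10]
scaled_hours = min_max_scale(hours)
slope, intercept = fit_line(hours, scores)
centroids, labels = kmeans_1d([1, 2, 3, 10, 11, 12], centroids=[1, 12])
```

    In practice a library such as scikit-learn would normally be used for these steps; the hand-written versions above are only meant to make each step visible. The starting centroids for K-means are passed in explicitly here so that the result is reproducible.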
    Creators:
    Divisions: Academic > School of Computing, Engineering and Built Environment > Department of Computing
    Copyright holder: Copyright © Glasgow Caledonian University
    Viewing permissions: World
    Depositing User:
    Date Deposited: 04 Jan 2019 15:23
    Last Modified: 08 Jan 2020 14:10
    URI: https://edshare.gcu.ac.uk/id/eprint/4382
