-
Data Security
You may have seen news headlines about large data breaches.
In the big data era, the scale of data breaches has become huge, growing from millions of records to billions. More and more people are affected by even a single data breach. In the big data context, we are concerned not only with how to collect, store, and analyse data, but also with how to secure the information of the organisation and its customers.
Recent technologies, such as IoT, social networks, cloud computing, and data analytics, make it possible today to collect huge amounts of data. However, for data to be used to their full potential, data security and privacy are critical. Data security and privacy were widely investigated even before the big data era, but today we face new issues in securing and protecting data. Some of these challenges arise from growing privacy concerns about the use of such huge amounts of data, and from the need to reconcile privacy with the use of data. Others arise because the deployment of new data collection and processing devices, such as those used in IoT systems, increases the potential for attack.
-
What is Data Security
Data security means protecting data, such as a database, from destructive forces and from the unwanted actions of unauthorised users. Data security also protects data from corruption. It is a main priority for organisations of every size and type.
- Physical security, network security and security of computer systems and files all need to be considered to ensure security of data and prevent unauthorised access, changes to data, disclosure or destruction of data.
- Physical data security
Physical data security requires controlling access to rooms and buildings where data, computers or media are held; logging the removal of, and access to, media or hardcopy material in store rooms; transporting sensitive data only under exceptional circumstances, even for repair purposes, e.g. giving a failed hard drive containing sensitive data to a computer manufacturer may cause a breach of security.
- Network security
Network security requires not storing confidential data, such as those containing personal information, on servers or computers connected to an external network, particularly servers that host internet services; firewall protection and security-related upgrades and patches to operating systems to avoid viruses and malicious code.
- Security of Personal Data
Personal Data means data relating to a living individual who is or can be identified either from the data or from the data in conjunction with other information that is in, or is likely to come into, the possession of the data controller. Where the safeguarding of personal data is involved, data security is based on national legislation, e.g., the Data Protection Act 2018 (http://www.legislation.gov.uk/ukpga/2018/12/contents/enacted).
Personal data may also exist in non-digital format, for example as patient records, signed consent forms, or interview cover sheets containing names, addresses and signatures. These should be protected in the same secure way as digital files and stored separately from data, whether in digital or non-digital format.
- The Data Protection Act 2018 requires that everyone responsible for using personal data has to follow strict rules called ‘data protection principles’. They must make sure the information is:
- used fairly, lawfully and transparently
- used for specified, explicit purposes
- used in a way that is adequate, relevant and limited to only what is necessary
- accurate and, where necessary, kept up to date
- kept for no longer than is necessary
- handled in a way that ensures appropriate security, including protection against unlawful or unauthorised processing, access, loss, destruction or damage
- Data that contain personal information should be treated with higher levels of security than data which do not. Personal data security can be made easier by:
- anonymising or aggregating data
- separating data content according to security needs
- removing personal information, such as names and addresses, from data files and storing them separately
- encrypting data containing personal information before they are stored – encryption is certainly needed before transmission of such data.
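As a minimal sketch of the "removing personal information and storing it separately" idea above, the following Python splits each record into a research row and an identity row linked only by a random pseudonym. The field names, the sample records and the `split_personal_data` helper are all hypothetical, chosen for illustration. Encryption is not shown because it would need a vetted cryptography library rather than hand-written code.

```python
import secrets

def split_personal_data(records, personal_fields):
    """Split records into research rows and separately stored identity rows."""
    research_rows, identity_rows = [], []
    for record in records:
        # A random pseudonym reveals nothing about the person it stands for.
        pid = secrets.token_hex(8)
        identity_rows.append(
            {"pid": pid, **{f: record[f] for f in personal_fields}}
        )
        research_rows.append(
            {"pid": pid, **{k: v for k, v in record.items() if k not in personal_fields}}
        )
    return research_rows, identity_rows

# Hypothetical sample data.
records = [
    {"name": "Alice Example", "address": "1 High Street", "age": 34, "score": 7},
    {"name": "Bob Example", "address": "2 Low Road", "age": 51, "score": 4},
]
research, identities = split_personal_data(records, {"name", "address"})
print(research[0])  # contains pid, age and score, but no name or address
```

The two lists would then be stored in different locations with different access rules, so that the research file alone does not identify anyone directly.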
How confidential data or data containing personal information are stored may need to be addressed during informed consent procedures. This ensures that the persons to whom the personal data belong are informed and give their consent as to how the data are stored or transmitted. Organisations handling personal data need to register with the Information Commissioner.
When you are deleting data, remember:
DELETING FILES AND REFORMATTING A HARD DRIVE WILL NOT PREVENT THE POSSIBLE RECOVERY OF DATA THAT HAVE PREVIOUSLY BEEN ON THAT HARD DRIVE.
-
Risks to Internet privacy
Privacy means the ability of an individual or group to seclude themselves, or information about themselves, and thereby reveal themselves selectively. It is a personal, subjective condition: one person cannot decide for another what his or her sense of privacy should be.
Internet privacy is a subset of data privacy. It includes the ability to control what information one reveals about oneself over the Internet, and to control who can access that information. Internet privacy can involve either PII (Personally Identifying Information) or non-PII information such as a site visitor's behaviour on a website. One scenario is online social networks, where participants publish a profile of themselves in order to contact others or to be contacted. Potential risks range from identity theft to online and physical stalking, embarrassment, discrimination and blackmail.
- Your personal information can be revealed by your ISP (internet service provider), although ISPs are usually deterred from doing so by social pressure and law, or through email, your Internet browser, cookies, etc. Every time you visit a website, there’s a chance that several different corporations are following your every move. They see what you click on, which pages you visit, and where you head next after you visit their page. Below are some of the parties and mechanisms involved:
- Internet Service Providers - they are able to observe any Internet-related activity of the user
- Cookies (parcels of text sent by a server) - they track and maintain specific information about the user
- Data logging – it may include recording the times when the computer is in use, or which web sites are visited
- Spyware programs
- Web bug
- Phishing
- Malicious proxy server
- Search engines
- Internet privacy concerns
- Data are often collected silently - web allows large quantities of data to be collected inexpensively and unobtrusively
- Data from multiple sources may be merged - Non-identifiable information can become identifiable when merged
- Data collected for business purposes may be used in civil and criminal proceedings
- Users given no meaningful choice
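The point that non-identifiable information can become identifiable when merged can be illustrated with a small linkage sketch. Both datasets below are invented for illustration, and the choice of postcode, birth year and sex as linking attributes is an assumption, not something the notes prescribe.

```python
# Hypothetical "anonymised" dataset: names removed, other attributes kept.
health = [
    {"postcode": "AB1 2CD", "birth_year": 1985, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "EF3 4GH", "birth_year": 1990, "sex": "M", "diagnosis": "diabetes"},
]
# Hypothetical public register that holds names alongside the same attributes.
register = [
    {"name": "Alice Example", "postcode": "AB1 2CD", "birth_year": 1985, "sex": "F"},
    {"name": "Carol Example", "postcode": "ZZ9 9ZZ", "birth_year": 1970, "sex": "F"},
]
LINK_KEYS = ("postcode", "birth_year", "sex")

def link(anonymised, named, keys=LINK_KEYS):
    """Return (name, diagnosis) pairs for rows that match exactly one person."""
    index = {}
    for person in named:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = []
    for row in anonymised:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # a unique match re-identifies the row
            matches.append((names[0], row["diagnosis"]))
    return matches

reidentified = link(health, register)
print(reidentified)  # [('Alice Example', 'asthma')]
```

Neither dataset names a patient and a diagnosis together, yet the merge does, which is why aggregating or coarsening such attributes matters before data are released.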
-
The State of Security
- Data security is an issue in a world where everything is connected. Data security becomes more important when using cloud computing at all “levels”:
- infrastructure-as-a-service (IaaS)
- platform-as-a-service (PaaS)
- software-as-a-service (SaaS)
In today’s world of (Network-, Host-, and Application-level) infrastructure:
“75% of all attacks are now aimed at the application layer while 90% of security dollars are spent on network layer” – Gartner Report
“92% are application vulnerabilities instead of network vulnerabilities” – NIST
“The battle between hackers and security professionals has moved from the network layer to the applications themselves” – Network World
“64% of application developers are not confident in their ability to write secure applications” – Microsoft Development Research
“Hacking has moved from a hobbyist pursuit with a goal of notoriety to a criminal pursuit with a goal of money” – Counterpane Internet Security
- Examples of common attackers include:
- Hackers (vandalise or steal money)
- Terrorists (disrupt operations)
- Foreign governments (steal proprietary information)
- Competitors (steal proprietary information)
- Malicious employees (disrupt operations)
- Unaware employees
-
The CIA Triad
The CIA triad of information security is a simple but widely applicable security model. It is a benchmark model used to evaluate the information security of an organisation. The triad addresses security through three key areas of information systems: confidentiality, integrity and availability.
- Confidentiality
Confidentiality provides the ability to hide information from those people unauthorised to view it. It is perhaps the most obvious aspect of the CIA triad when it comes to security; but correspondingly, it is also the one which is attacked most often.
Cryptography and encryption methods are an example of an attempt to ensure confidentiality of data transferred from one computer to another.
- Integrity
Integrity describes the ability to ensure that data is an accurate and unchanged representation of the original secure information.
One type of security attack is to intercept some important data and make changes to it before sending it on to the intended receiver.
- Availability
Availability is important to ensure that the information concerned is readily accessible to the authorised viewer at all times.
Some types of security attack attempt to deny access to the appropriate user, either for the sake of inconveniencing them, or because there is some secondary effect. For example, by breaking the web site for a particular search engine, a rival may become more popular.
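To make the integrity property concrete, here is a minimal sketch of detecting a message that was altered in transit, using a keyed hash (HMAC) from the Python standard library. It assumes the sender and receiver already share a secret key. The key and the messages are hypothetical, and HMAC is one common technique among several rather than the only way to protect integrity.

```python
import hashlib
import hmac

def sign(message: bytes, key: bytes) -> bytes:
    """Compute an authentication tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(message, key), tag)

shared_key = b"hypothetical-shared-secret"  # assumption: agreed out of band
original = b"transfer 100 to account 12345"
tag = sign(original, shared_key)

# An attacker intercepts the message and changes the amount.
tampered = b"transfer 900 to account 12345"

intact_ok = verify(original, tag, shared_key)
tampered_ok = verify(tampered, tag, shared_key)
print(intact_ok, tampered_ok)  # True False
```

The receiver rejects the altered message because the attacker, lacking the key, cannot produce a matching tag.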
-
Some definitions
- Here are some commonly used terms in the security domain:
- Vulnerability – A weakness or flaw that may provide an opportunity to a threat agent
- Threat Agent – an entity that may act on a vulnerability
- Threat – Any potential danger
- Risk – the likelihood of a threat agent exploiting a discovered vulnerability
- Exposure – an instance of being compromised by a threat agent
- Countermeasure – an administrative operation, or logical mitigation against potential risks
Authentication and Authorisation
- Authentication is the process of determining the identity of a user. Three general methods are used in authentication. To verify your identity, you can provide:
- Something you know (e.g., password, PIN)
- Something you have (a token, e.g. a phone)
- Something you are (a biometric, e.g. a fingerprint)
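For the "something you know" factor, a system usually stores a salted, slow hash of the password rather than the password itself. The sketch below uses `hashlib.pbkdf2_hmac` from the Python standard library. The iteration count and the helper names are assumptions for illustration, and a production system would follow current guidance on algorithm and work factor.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune to hardware and current guidance

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); only these are stored, never the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest from the attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)

salt, stored = hash_password("correct horse battery staple")
print(check_password("correct horse battery staple", salt, stored))  # True
print(check_password("guess", salt, stored))                         # False
```

The per-user random salt means two users with the same password get different stored digests, and the iteration count slows down offline guessing if the stored values leak.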
Authorisation is the process of applying access control rules to a user process, determining whether or not a particular user process can access an object. The subject requesting access may be human or non-human, such as a process or another object. The subject may also be categorised by privilege level, such as an administrative user, manager, or anonymous user.
Accounting and Non-repudiation
Accounting, also known as auditing, is a means of measuring activity. It can be done by logging crucial elements of activity as they occur. Audit logs are a balancing act as they require resources to create, store and review.
Non-repudiation is the concept of preventing a subject from denying a previous action with an object in a system.
When authentication, authorisation and auditing are properly configured, the ability to prevent repudiation by a subject with respect to an action and an object is ensured.
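One way to make an audit log harder to rewrite after the fact is to chain entries together with hashes, so that altering an earlier entry invalidates everything after it. The following is a sketch of that idea only. The entry fields and function names are hypothetical, and a hash chain gives tamper evidence rather than full non-repudiation, which would typically also need digital signatures tied to each subject.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

def append_entry(log: list, subject: str, action: str, obj: str) -> None:
    """Record who did what to which object, linked to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"subject": subject, "action": action, "object": obj, "prev": prev_hash}
    log.append({**body, "hash": _digest(body)})

def chain_is_intact(log: list) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: entry[k] for k in ("subject", "action", "object", "prev")}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True

audit_log = []
append_entry(audit_log, "alice", "read", "patient_records")
append_entry(audit_log, "bob", "delete", "consent_form_17")
before = chain_is_intact(audit_log)

audit_log[0]["subject"] = "mallory"  # someone edits history
after = chain_is_intact(audit_log)
print(before, after)  # True False
```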
-
Data Governance
- Data governance (DG) refers to the overall management of the availability, usability, integrity, and security of the data employed in an enterprise. A sound data governance program includes:
- a governing body or council,
- a defined set of procedures,
- and a plan to execute those procedures.
Data governance combines the disciplines of data quality, data management, data policy management, business process management and risk management into a methodology that ensures important data assets are formally managed throughout an enterprise. It brings together cross-functional teams to make interdependent rules or to resolve issues or to provide services to data stakeholders.
- These cross-functional teams – Data Stewards and/or Data Governors – generally come from the Business side of operations. They set policy that IT and Data groups will follow as they:
- establish their architectures,
- implement their own best practices,
- and address requirements.
Data Governance can be considered the overall process of making this work.
Data Stewardship
Data Stewardship is concerned with taking care of data assets that do not belong to the stewards themselves. Data Stewards represent the concerns of others. Some may represent the needs of the entire organization. Others may be tasked with representing a smaller constituency, e.g., a business unit, department, or even a set of data themselves.
In some organizations, Data Stewards are senior representatives of stakeholder groups. As members of a Data Stewardship Council, they convene to make decisions about the treatment of data assets. In some other organizations Data Stewards operate independently, ensuring that the rules and controls are applied to data appropriately.
An accountability-focused definition of Data Stewardship is:
“the set of activities that ensure data-related work is performed according to policies and practices as established through governance.”
Access Management
Access Management is a discipline that focuses on ensuring that only approved roles are able to create, read, update, or delete data – and only using appropriate and controlled methods.
Data Governance programs often focus on supporting Access Management by aligning the requirements and constraints posed by Governance, Risk Management, Compliance, Security, and Privacy efforts.
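The idea that only approved roles may create, read, update, or delete data can be sketched as a simple role-to-permission table checked before each operation. The role names and the `PERMISSIONS` mapping below are hypothetical. Real systems usually load such policy from configuration set by the governance process rather than hard-coding it.

```python
# Hypothetical policy: which operations each role is approved for.
PERMISSIONS = {
    "administrator": {"create", "read", "update", "delete"},
    "manager": {"create", "read", "update"},
    "anonymous": {"read"},
}

def is_allowed(role: str, operation: str) -> bool:
    """Unknown roles get no permissions at all (deny by default)."""
    return operation in PERMISSIONS.get(role, set())

def delete_record(role: str, store: dict, key: str) -> None:
    """A controlled method: the check happens before the data is touched."""
    if not is_allowed(role, "delete"):
        raise PermissionError(f"role {role!r} may not delete data")
    del store[key]

customers = {"c1": "Alice Example"}
try:
    delete_record("manager", customers, "c1")
except PermissionError as err:
    print(err)  # role 'manager' may not delete data
delete_record("administrator", customers, "c1")
print(customers)  # {}
```

Denying by default means that a role missing from the table cannot do anything until someone explicitly approves it.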
Assurance
Assurance includes activities designed to reach a measure of confidence. Assurance is different from audit, which is more concerned with compliance to formal standards or requirements.
Control
Control is a means of managing a risk or ensuring that an objective is achieved. Controls can be preventative, detective, or corrective and can be fully automated, procedural, or technology-assisted human-initiated activities. They can include actions, devices, procedures, techniques, or other measures.
Change Control
Change Control includes a formal process used to ensure that a process, product, service, or technology component is modified only in accordance with agreed-upon rules. Many organizations have formal Change Control Boards that review and approve proposed modifications to technology infrastructures, systems, and applications.
Data Governance programs often strive to extend the scope of change control to include additions, modifications, or deletions to data models and values for reference/master data.
-
Data Architecture
Data Architecture is a discipline, process, and program focusing on integrating sets of information. It is one of the four Enterprise Architecture domains (alongside Application Architecture, Business Architecture, and Systems Architecture).
Data Modelling
Data modelling is the discipline, process, and organizational group that conducts analysis of data objects used in a business or other contexts. Data modelling identifies the relationships among these data objects, and creates models that depict those relationships.
Data Model
A data model documents and organises data, and deals with how it is stored and accessed and with the relationships among different types of data.
A data model may be abstract or concrete.
Enterprise Architecture (EA)
EA is a comprehensive framework used to manage and align an organization’s business processes, information technology (IT) software and hardware, local and wide area networks, people, operations and projects with the organization’s overall strategy.
- Enterprise Architecture is often subdivided into four architectural domains:
- Application Architecture,
- Business Architecture,
- Data Architecture,
- and Systems Architecture.
Other types of architectures (security, compliance, controls, etc.) may be considered as part of EA, or they may be aligned with EA.
In some organizations, EA is primarily focused on Business Architectures and Business Process Management.
Information Architecture
In its broadest definition, Information Architecture is a discipline, process, and/or program focusing on the design and organization of data, unstructured information, and documents. In the context of Enterprise Architecture, it is a synonym for Data Architecture, which is one of the four Enterprise Architectures. In the context of designing documents and web pages, it is the structuring of large sets of information, as opposed to the development of the content of any content unit within the larger set.
Data is at the heart of every business, and the consequences of data security problems can be very significant. Prevention is better than cure: identifying vulnerabilities and preventing potential dangers early is much cheaper than fixing problems later on. For software, fixing an issue after release is often estimated to be 50 to 200 times more expensive than fixing it in the test cycle.
-
Further reading
E. Bertino, E. Ferrari, “Big Data Security and Privacy”, A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years, Springer International Publishing 2017
N. Joshi and B. Kadhiwala, "Big data security and privacy issues — A survey," 2017 Innovations in Power and Advanced Computing Technologies (i-PACT), Vellore, 2017, pp. 1-5.
Y. Gahi, M. Guennoun and H. T. Mouftah, "Big Data Analytics: Security and privacy challenges," 2016 IEEE Symposium on Computers and Communication (ISCC), Messina, 2016, pp. 952-957.