01
  • 1.1 Overview

    1.1.1 Readings

    The Linux Command Line; 2013; W. E. Shotts Jr (http://linuxcommand.org/tlcl.php)
    Shell Scripting: Expert Recipes for Linux, Bash, and More; 2011; Steve Parker; Wrox; ISBN: 978-0-470-02448-5
    C Primer Plus 6th Ed; 2014; Stephen Prata; Addison Wesley; ISBN-13: 978-0-321-92842-9
    Advanced Programming in the UNIX Environment 3rd Ed.; 2013; Stevens & Rago; Addison Wesley; ISBN-13: 978-0-321-63773-4
    The Linux Programming Interface; Michael Kerrisk; 2010; No Starch Press; ISBN-13: 978-1-59327-220-3

    1.1.2 Topics covered during this course

    Operating Systems Concepts
    Processes
    File Systems
    Users and Security
    Threads
    TCP/UDP, IP and Sockets

    Programming the bash shell in detail

    C programming on Linux

    Systems Programming on Linux with C

  • 1.2 Operating system functionality

    Modern operating systems will provide most or all of the following services:
    multitasking
    multiuser
    main memory management
    disk storage management
    inter-process communication mechanisms
    windowing environment - GUI
    I/O device control and various internal/external bus standards (e.g. SCSI, USB)
    a "low-level" system call interface accessible to programs
    API libraries
    file utility programs
    command shells
    miscellaneous systems tools: assemblers, scripting languages, compilers, mail, editors...
    programming tools
    system administration and accountancy tools
    communication protocols e.g. TCP/IP
  • 1.3 UNIX

    1.3.1 Overview

    Has been around since 1969.

    Main use today is enterprise servers hosting web servers, file servers, databases, proprietary mission critical (legacy) software.
    in decline - Gartner: 16% server market share in 2012 declining to 9% by 2017
    appealing for highly resilient and reliable enterprise systems running on proprietary hardware (proprietary CPUs)
    legacy systems
    not going away yet but in general decline...

    1.3.2 Vendors

    IBM - AIX on POWER CPUs

    HP - HP-UX on PA-RISC

    Oracle (Sun Microsystems) - Solaris on SPARC

    Apple OS X - Intel x64 (usually on Mac but also on mainstream PC hardware to a limited extent)

    1.3.3 The GNU Project

    In the mid-1980s Richard Stallman started the GNU project intended to provide a free UNIX implementation.

    This initially encouraged the cooperative development of software tools for existing proprietary UNIX versions.

    Under the terms of the GNU General Public License (GPL) software produced must be made available as source code and freely distributable. GPL licensing applies also to any subsequent modifications thereof.

    Popular GNU Project software includes the GNU compiler collection (including the C compiler), glibc (the GNU C library) and the bash shell.
    … all software used in this course.
  • 1.4 Linux

    By the early 1990s virtually all UNIX software tools had been (re)implemented as GNU project equivalents but there was no completed kernel.

    The GNU kernel was not completed in time and Linus Torvalds's Linux kernel, released under the GPL, was adopted instead.

    therefore strictly “Linux” should only mean the kernel rather than the whole OS...

    The Linux kernel source was engineered to be independent of a specific CPU but runs mostly on x86-64 PCs and servers.

    also SPARC, POWER, HP PA-RISC...

    Linux largely conforms to the POSIX standards that UNIX systems follow.

    The bottom line is that from both a programmer’s and administrator’s point of view Linux looks and behaves like mainstream UNIX.

    The industry has a large number of IT support and software development professionals with UNIX experience who can learn and use Linux very quickly.

    It is attractive to many (including government) to have a free, community created OS which can run on mainstream, cheap hardware away from the control of a single company.

    The main future for UNIX/Linux appears to be Linux VMs for server software dynamically created in the cloud.

    e.g. on Amazon Web Services (AWS) or Microsoft Azure

    1.4.1 Linux Distros

    The term “Linux” is commonly used to mean the kernel, plus a wide range of other software (tools and libraries) that together make a complete OS.
    in the very early days of Linux, the user was required to assemble all of this software, create a file system, and correctly place and configure all of the software on that file system.
    this demanded considerable time and expertise.
    as a result, a market opened for Linux distributors, who created packages (distributions) to automate most of the installation process, creating a file system and installing the kernel and other required software for that distribution’s intended purpose.

    As of July 2016 the 3 most popular distros on http://distrowatch.com/ are:-

    Debian

    Ubuntu

    Mint:
    based on Ubuntu but a more complete desktop experience.
    has an option to use the lightweight Xfce desktop (better for use on a VM as less resource hungry)
    the most popular distro currently
    used in the labs
  • 1.5 UNIX/Linux Architecture

    Figure 3: UNIX / Linux Architecture

    The kernel manages the machine's hardware:
    memory resident after booting.
    is secure.
    contains the bodies of system calls which you can call from your program.
    handles I/O through device drivers.
    creates and manages processes.
    written in C and assembler.

    OS applications, including those you write, are written in C and make system calls:
    ls, ps, cat, man etc. etc. are all written in C.
    so is the shell (bash).

    Although it is possible to call system calls directly from your application it is easier to do so through library routines which add some extra functionality:
    again, these libraries are used in a C program.

    Alternatively, new applications can be written by plugging together the existing OS applications using the programming features of a shell:
    e.g. the bash shell
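
As a small illustration of plugging existing programs together, the sketch below chains sort, uniq, head and awk with pipes to find the most frequent entry in a list. The sample data is made up for the demo and simply stands in for something like the login-shell column of a password file.

```shell
#!/usr/bin/env bash
# Sample data (invented for this demo): one shell name per line.
shells='bash
sh
bash
csh
bash
sh'

# sort groups identical lines together, uniq -c counts each group,
# sort -rn puts the largest count first, head keeps the top line,
# awk reorders the two fields to "name count".
most_common=$(printf '%s\n' "$shells" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2, $1}')

echo "most common shell: $most_common"
```

Each program does one small job; the shell supplies only the pipes and the variable that captures the result.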
  • 1.6 Shells generally

    UNIX/Linux will usually support several shell alternatives.
    these support textual interaction with the computer through terminals

    Each will have a scripting language allowing system administration scripts to be written.

    the default shell is bound to your account details

    Alternatives around today are:-
    POSIX shell (sh or dash on modern Linux)
    bash
    an advanced modern shell maintaining backwards compatibility with the Korn shell and the earlier Bourne shell, itself somewhat based on Algol 68 syntax
    csh – the “C-shell”
    the shell scripting parts are more based on a C-like syntax

    1.6.1 Lab Material

    Inevitably what is presented in a module of this nature is a subset of important points with some detail in the important areas.

    In both bash and C programming there are more details, features, ways of doing things etc. than can possibly be presented, or than you will be able to take in, in the time available.

    With bash you will not be given explanations of what a lot of commands do.
    You will get directed to learning resources where you can explore and learn the details.
    in particular but not exclusively the lab material
    it is your responsibility to go and learn
  • 1.7 Shells: the bash shell (we will be using this)

    5 different “command types”:

    Standard commands are programs e.g. type find
    therefore these will have an entry in the file system
    a typical place would be /bin or /usr/bin but could be anywhere...
    these create a child process when run – i.e. fork a process (see later)
    include shell script programs
    Builtins are part of the shell implementation, for efficiency.
    include pwd, printf, echo, cd, getopts...
    see:- http://www.thegeekstuff.com/2010/08/bash-shell-builtin-commands/
    to list: compgen -b
    Functions are a way to build reusable units of shell script code.
    no child process is created when they are run
    usually added by some tools on installation so some already there
    or could be created by you in your scripts
    try: declare -f, declare -F
    Keywords are built-in shell script programming constructs and are reserved words e.g. for, while, do, select and !
    to list: compgen -k

    Aliases will be set up for some common commands.
    can be set for commands and builtins – see alias
    e.g. type ls
    to list: compgen -a
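
The command types can be told apart with the bash builtin type; with the -t option it prints a single word (alias, keyword, function, builtin or file). A minimal sketch follows, in which greet is a hypothetical function defined only for the demo. Aliases are left out because bash normally expands them only in interactive shells, so a script would not see them.

```shell
#!/usr/bin/env bash
# greet is a hypothetical function, defined only so there is one to inspect.
greet() { echo "hello"; }

# type -t prints one word describing how bash would interpret the name.
kind_of_cd=$(type -t cd)        # a builtin
kind_of_for=$(type -t for)      # a keyword
kind_of_greet=$(type -t greet)  # a function
kind_of_cat=$(type -t cat)      # normally an external program ("file")

for name in cd for greet cat; do
    printf '%-6s -> %s\n' "$name" "$(type -t "$name")"
done
```

Running the same loop in an interactive terminal with ls added to the list would usually report an alias for ls.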
    The sequence of directories to be searched for a program is determined by the PATH variable.
    i.e. echo $PATH
    /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
    : is the path separator

    it will not include the current working directory (.) by default for security reasons and hence the need for ./myscript.sh etc.
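
The PATH search described above can be sketched by hand. The function below, find_in_path (a name invented for this illustration, not a standard command), splits PATH on the : separator and reports the first directory holding an executable file of the given name. It is a simplified model of what which does, not its actual implementation.

```shell
#!/usr/bin/env bash
# find_in_path NAME: print the first executable called NAME found via $PATH.
# (hypothetical helper written for this illustration)
find_in_path() {
    local name=$1 dir
    local IFS=:                  # split $PATH on the path separator
    for dir in $PATH; do
        if [ -z "$dir" ]; then
            dir=.                # an empty PATH entry means the current directory
        fi
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1                     # not found in any PATH directory
}

find_in_path ls
```

Because the current directory is normally absent from PATH, a script in the working directory is not found by this search, which is why it has to be started as ./myscript.sh.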

    1.7.1 Globbing and Regular Expressions

    In shells the * and ? wildcard characters are used for file name expansion, sometimes also called globbing.
    there are a whole lot of other globbing constructs introduced in the lab material

    However regular expressions are used by other tools.
    in a limited sense they may look the same as globbing but they are not
    regular expressions in the shell are often associated with the grep family of commands to find matching text in a file or in output

    For more differences and a bit more on regular expressions: http://www.tldp.org/LDP/GNU-Linux-Tools-Summary/html/x11655.htm

    cat myfile | grep '^s.*n$'
    this command searches the file myfile for lines starting with an 's' and ending with an 'n' and prints them to the standard output
    note the regular expression is in quotes to switch off globbing
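
To make the difference concrete, the sketch below runs a glob and a regular expression side by side in a throwaway directory. The file names and the contents of myfile are invented for the demo.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
touch "$workdir/sun.txt" "$workdir/spoon" "$workdir/moon"
printf '%s\n' 'salmon' 'sun rises' 'person' 'sn' > "$workdir/myfile"

# Globbing: the shell itself expands s* into the matching FILE NAMES
# before echo runs. Here * means "any sequence of characters".
glob_matches=$(cd "$workdir" && echo s*)

# Regular expression: grep interprets the pattern against LINES of text.
# Here . means "any one character" and * means "zero or more of the previous item".
regex_matches=$(grep '^s.*n$' "$workdir/myfile")

echo "glob s* matched files : $glob_matches"
echo "regex matched lines   :"
echo "$regex_matches"

rm -rf "$workdir"
```

The glob reports the files spoon and sun.txt, while the regular expression reports the lines salmon and sn; the line sn matches because .* is allowed to match nothing at all.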

    1.7.2 Where things are

    You will need to know where things are on your computer:
    find: extremely useful with many options.
    find / -name <filename> 2>/dev/null
    go work out what this does!
    find will be discussed in Lecture 4
    whereis and which:
    whereis -b <command> - where executable is
    which <command> - searches your path as defined in $PATH and returns where the executable for the command to be executed is located
    locate:
    locate <command>
    a lot more sophisticated than whereis and is based on a prebuilt database of file names (refreshed by updatedb)
    what's installed:
    dpkg -l (distro specific: Debian-based distros such as Ubuntu and Mint)
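
A safe way to experiment with these tools is inside a scratch directory. The sketch below builds a small directory tree (the names are invented for the demo), searches it with find, and uses the shell's own command -v to ask where the sh executable lives.

```shell
#!/usr/bin/env bash
sandbox=$(mktemp -d)
mkdir -p "$sandbox/a/b"
touch "$sandbox/a/b/notes.txt" "$sandbox/a/other.txt"

# Search the whole tree below $sandbox for an entry with this exact name.
found=$(find "$sandbox" -name 'notes.txt' 2>/dev/null)
echo "find located: $found"

# command -v reports what the shell would run for a given command name.
shell_path=$(command -v sh)
echo "sh resolves to: $shell_path"

rm -rf "$sandbox"
```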

    See also:
    http://linux.about.com/od/commands/fl/How-To-Find-Linux-Commands-And-Programs-Using-Whereis.htm

    1.7.3 Linux Manual - Sections of the Manual Pages

    The manual sections are traditionally defined as follows: -
    1. User commands (Programs): Those commands that can be executed by the user from within a shell.
    2. System calls: Those functions which wrap operations performed by the kernel.
    3. Library calls: All library functions excluding the system call wrappers (most of the libc functions).
    4. Special files (devices): Files found in /dev which allow access to devices through the kernel.
    5. File formats and configuration files: Describes various human-readable file formats and configuration files.
    6. Games: Games and funny little programs available on the system.
    7. Overview, conventions, and miscellaneous: Overviews or descriptions of various topics, conventions and protocols, character set standards, the standard filesystem layout, and miscellaneous other things.
    8. System management commands: Commands like mount(8), many of which only root (the superuser) can execute.

    Note usage:-
    man passwd – the passwd command is passwd(1) – 1 is default section
    man 5 passwd – the password file format i.e. passwd(5)
    man 2 syscalls – list all the system calls

    Other than the man command there are:
    whatis – display one-line manual page descriptions
    whatis find
    apropos – search the manual pages
    apropos find
    info – read the additional info documents
    info find
    --help
    find --help
    note the double dash “--”
    may not be any though - dependent on command
    Use stackoverflow.com
    someone will have asked already what you want to know...

    1.7.4 The Concept of a Process

    A process is defined as an executing image of a program including its data and register values and stack.
    more about this in later lectures but it is also a useful concept in understanding the shell

    A shell like bash is a C program which starts as a process when you create a terminal window.
    to emphasise : it’s just a compiled C program itself as are all the other programs like find and file etc.
    remember that some commonly used functionality, such as the builtins, is implemented in the shell program itself; builtins do not fork a new process (forking is described in the next slides), which speeds up execution

    When a command is executed the shell forks a child process to run the new command.
    the shell parent process is, by default, suspended until the child exits
    this behaviour is therefore synchronous.
    the child process could be a command like find, your own compiled C program or a script which will be executed with an interpreter
    the script could typically be a Python or Perl script or could be a shell script program e.g. a bash script etc.
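
The parent/child relationship can be observed directly. In the sketch below the shell prints its own process ID, asks a child sh for the child's ID, and then runs sleep first in the foreground (the shell waits) and then in the background with & (the shell carries on and collects the child later with wait).

```shell
#!/usr/bin/env bash
parent_pid=$$                      # PID of this shell
child_pid=$(sh -c 'echo $$')       # PID reported by a separate child shell
echo "shell PID: $parent_pid, child PID: $child_pid"

# Foreground: the shell is suspended until the child exits (synchronous).
sleep 1
echo "foreground child finished; the shell was waiting"

# Background: & lets the shell continue straight away (asynchronous).
sleep 1 &
bg_pid=$!                          # PID of the most recent background child
echo "shell carries on while child $bg_pid runs"
wait "$bg_pid"                     # now block until that child exits
echo "background child finished"
```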

    The child process exchanges information with the shell in only a few ways:
    values are passed into the child process via command line arguments
    or picked up from environment variables that have been marked with export
    new variables and changed values are lost when the child process exits
    the child process can also access standard files and special files connected to the terminal (standard output etc.)
    the child process will return an exit status value to the parent process on termination which can be set and picked up programmatically
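
These points can be checked with a few one-line child shells. In the sketch below the variable names demo_local and DEMO_SHARED are invented for the demo; sh -c starts a child process that runs the quoted commands.

```shell
#!/usr/bin/env bash
demo_local="not exported"            # ordinary shell variable
export DEMO_SHARED="exported"        # marked for export to child processes

# Only the exported variable is visible inside the child.
seen_local=$(sh -c 'echo "${demo_local:-unset}"')
seen_shared=$(sh -c 'echo "$DEMO_SHARED"')
echo "child sees demo_local=$seen_local, DEMO_SHARED=$seen_shared"

# A change made in the child is lost when the child exits.
sh -c 'DEMO_SHARED="changed in child"'
echo "parent still has DEMO_SHARED=$DEMO_SHARED"

# Command line arguments: "child_name" becomes $0 and "hello" becomes $1.
arg_echo=$(sh -c 'echo "$1"' child_name "hello")
echo "child received argument: $arg_echo"

# The exit status of the child is available to the parent as $?.
status=0
sh -c 'exit 3' || status=$?
echo "child exit status: $status"
```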