GCU Maths Computing Worksheet

for incoming Cyber Security & Software Development students

Summer 2021

Learning to write like a computer

Binary, ASCII and how computers represent data

Computers and electronic systems store information in a variety of formats, but they all share one fundamental construction. At the most basic level, data is stored as the on/off state of a very large number of individual locations in physical memory. In a Solid State Drive (SSD), for example, each on/off state is represented by the state of a single tiny transistor.

We normally label these two states 0 and 1, and a single piece of data of this kind has a technical name: it is called a bit.

When you learned to count as a child you memorized the numbers and words for the integers in order:

\[ \text{Zero, One, Two, Three, Four, Five, Six, Seven, Eight, Nine, Ten, Eleven, Twelve,} \ldots \]

\[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, \ldots \]

You’ve probably already forgotten how strange it was to you as a child that after NINE there was no new digit to represent TEN; instead we re-use the digits 1 and 0. This is because we use what is known as Base Ten (also called Decimal), where a 1 shifted left by one place represents TEN. Interestingly, Base Ten uses ten different digits (0, 1, 2, 3, 4, 5, 6, 7, 8 and 9), but none of them actually represents ten itself!
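As a worked example of this place-value idea (the number 347 is simply our own choice for illustration), each digit of a Base Ten number counts how many of a particular power of ten are present:

\[ 347 = 3 \times 10^2 + 4 \times 10^1 + 7 \times 10^0 \]

and the number ten itself is written as one ten and no units:

\[ 10 = 1 \times 10^1 + 0 \times 10^0 \]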

Since computers work fundamentally by storing and changing 0s and 1s, it makes enormous sense, for efficiency, for them to work in Base Two (also known as Binary).
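If you would like to see Base Two in action, here is a minimal Python sketch of one common way to convert a whole number into binary: divide by two repeatedly and collect the remainders. The function name `to_binary` is our own choice for illustration; it is not something you need to know for the tasks.

```python
def to_binary(n):
    """Return the Base Two digits of a non-negative whole number as a string."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next digit, working right to left
        n //= 2                    # move on to the next place value
    return "".join(reversed(digits))


print(to_binary(13))  # 1101, i.e. 8 + 4 + 0 + 1
```

Python’s built-in `bin(13)` gives the same digits with a `0b` prefix in front, so you can use it to check your own answers.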


Task One:

  • Investigate how to count to ten in binary. You may find this webpage useful (wikiHow link) or the first six minutes of this video (Intro to Binary), or find one yourself online!
  • In Decimal the numbers \(1, 11, 111, 1111,\) etc. differ by ten, one hundred, one thousand, etc. What do the binary numbers \(1, 11, 111, 1111,\) etc. differ by?
  • What happens when you add 1 to 999999 in Base Ten? What happens when you add 1 to 111111 in Base Two?

Divisibility is an interesting topic, especially in computing because when you ask a computer to divide 4 by 3 you can quickly get into difficulty with how to even store the answer! (Can you see why?)
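As a small illustration of the storage problem, the Python sketch below divides 4 by 3 in three different ways. It uses only Python’s standard library, and the variable names are our own.

```python
from fractions import Fraction

# Floating-point division keeps only a limited number of digits,
# so the stored value is an approximation of 4/3.
approx = 4 / 3
print(approx)

# Integer division stores the answer as a whole-number quotient and a remainder.
quotient, remainder = divmod(4, 3)
print(quotient, remainder)  # 1 1

# A Fraction stores the numerator and denominator separately, so nothing is lost.
exact = Fraction(4, 3)
print(exact)            # 4/3
print(exact * 3 == 4)   # True
print(approx == exact)  # False: the floating-point value is only close to 4/3
```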

When doing arithmetic and calculations it’s often very important to be able to identify whether a number can be divided exactly by 2, 3, 4, 5, etc., i.e. whether it’s even, a multiple of 3, a multiple of 4, and so on.
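In most programming languages this kind of check is done with the remainder (or “modulo”) operator. The short Python sketch below shows the idea; the helper name `is_multiple` and the test numbers 60 and 62 are our own choices for illustration.

```python
def is_multiple(number, divisor):
    """Return True when number divides by divisor with nothing left over."""
    return number % divisor == 0


# Compare two nearby numbers against several divisors.
for divisor in (2, 3, 4, 5):
    print(divisor, is_multiple(60, divisor), is_multiple(62, divisor))
```

Here 60 passes every check, while 62 is even but is not a multiple of 3, 4 or 5.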


Task Two:

  • How can you tell if a number in Base Ten is even? i.e. is it in the 2-times-table?
  • How can you tell if a number in Base Ten is exactly divisible by four? i.e. is it in the 4-times-table? (Search online for a good rule if you don’t know one)

Now for Base Two:

Try to answer the same two questions above, but in binary rather than Base Ten. Along the way, see if you can answer the two questions below.

  • Is even-ness harder or easier to determine than in Base Ten?
  • What about for divisibility by four?

Now that you’ve seen the basics of binary, we shall learn a little about how computers store more than just numbers. In order to store letters, words, and symbols on computers, all that is needed is an agreed code which matches numbers to letters. The best-known code for doing this is ASCII (pronounced ass-kee). In order to store the word Hello on a computer, all you need to do is first tell the computer that you are about to provide ASCII codes, and then provide the five numbers which represent H, then e, then l, then l, then o, in that order.

For clarity, files on computers contain some initial information which tells the computer what format/code (e.g. ASCII) is used for the upcoming data. They then contain a long list of numbers, which must be converted according to the named format/code in order to turn the long string of numbers into something readable. To encode the word Cat you go to the ASCII table and discover that C, a and t are represented by 67, 97 and 116 respectively, and then you store these three numbers. Since computers only store binary values, however, these numbers must first be converted into binary before storage.
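The Cat example can be sketched in a few lines of Python. This is only an illustration using Python’s built-in `ord`, `chr` and `format` functions; the variable names are our own.

```python
word = "Cat"

# ord() gives the code number of a character; for these letters it matches the ASCII table.
codes = [ord(ch) for ch in word]
print(codes)  # [67, 97, 116]

# Write each code as an eight-digit binary number, padding with leading zeros.
bits = [format(code, "08b") for code in codes]
print(bits)  # ['01000011', '01100001', '01110100']

# Decoding reverses the steps: binary -> number -> character.
decoded = "".join(chr(int(b, 2)) for b in bits)
print(decoded)  # Cat
```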


Task Three:

  • Use the table of ASCII values (below) to convert the word Hello into ASCII. What five decimal numbers are needed (and in what order)? Hint: the important columns are the ‘Decimal’ and ‘Char’ (Character) columns. You need to go beyond row 60 to reach the letter characters.
  • Use the binary column of the ASCII table to convert these five numbers into binary. You should have five eight-digit binary numbers, for a total of 40 bits of storage space required.

Finally, here are a few deeper questions which you might have some ideas about.

Task Four:

  • Look carefully at the location of the letters A, B, C, …, Z in ASCII. What’s special about these numbers when written in binary?
  • What about the same question for the letters a,b,c,…,z?
  • Can you see how letters and their capital versions are related? Is the relationship easier to spot in decimal or binary?
  • Look carefully at the ASCII table and notice that : ; < = > ? and @ are placed after 9 but before A. Then [ \ ] ^ _ and ` are inserted after Z but before a. Can you guess why?

Extension:

As the Internet developed, standards were needed to allow for a much wider range of characters than just the default English letters (properly called Latin characters). Eight binary digits are clearly not enough to cope with the letters and symbols of all the alphabets of the world.

Investigate online what role Unicode had as an extension to ASCII, and then how UTF-8 was introduced to encode this much wider range of characters while still preserving the underlying ASCII format.
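As a starting point for your investigation, the Python sketch below shows how many bytes UTF-8 uses for three sample characters. The characters are our own choice for illustration, and `str.encode` is Python’s built-in way of turning text into bytes.

```python
# Three sample characters: a plain English letter, an accented letter, the euro sign.
samples = ["A", "é", "€"]

for ch in samples:
    encoded = ch.encode("utf-8")  # the bytes that would actually be stored
    as_bits = [format(byte, "08b") for byte in encoded]
    print(ch, ord(ch), len(encoded), as_bits)

# The letter A is stored as the single byte 01000001, the same as its ASCII code,
# while the other two characters need two and three bytes respectively.
```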


ASCII Lookup Table


Decimal Hex Binary HTML Char Description
0 00 00000000 &#0; NUL Null
1 01 00000001 &#1; SOH Start of Header
2 02 00000010 &#2; STX Start of Text
3 03 00000011 &#3; ETX End of Text
4 04 00000100 &#4; EOT End of Transmission
5 05 00000101 &#5; ENQ Enquiry
6 06 00000110 &#6; ACK Acknowledge
7 07 00000111 &#7; BEL Bell
8 08 00001000 &#8; BS Backspace
9 09 00001001 &#9; HT Horizontal Tab
10 0A 00001010 &#10; LF Line Feed
11 0B 00001011 &#11; VT Vertical Tab
12 0C 00001100 &#12; FF Form Feed
13 0D 00001101 &#13; CR Carriage Return
14 0E 00001110 &#14; SO Shift Out
15 0F 00001111 &#15; SI Shift In
16 10 00010000 &#16; DLE Data Link Escape
17 11 00010001 &#17; DC1 Device Control 1
18 12 00010010 &#18; DC2 Device Control 2
19 13 00010011 &#19; DC3 Device Control 3
20 14 00010100 &#20; DC4 Device Control 4
21 15 00010101 &#21; NAK Negative Acknowledge
22 16 00010110 &#22; SYN Synchronize
23 17 00010111 &#23; ETB End of Transmission Block
24 18 00011000 &#24; CAN Cancel
25 19 00011001 &#25; EM End of Medium
26 1A 00011010 &#26; SUB Substitute
27 1B 00011011 &#27; ESC Escape
28 1C 00011100 &#28; FS File Separator
29 1D 00011101 &#29; GS Group Separator
30 1E 00011110 &#30; RS Record Separator
31 1F 00011111 &#31; US Unit Separator
32 20 00100000 &#32; space Space
33 21 00100001 &#33; ! exclamation mark
34 22 00100010 &#34; " double quote
35 23 00100011 &#35; # number
36 24 00100100 &#36; $ dollar
37 25 00100101 &#37; % percent
38 26 00100110 &#38; & ampersand
39 27 00100111 &#39; ' single quote
40 28 00101000 &#40; ( left parenthesis
41 29 00101001 &#41; ) right parenthesis
42 2A 00101010 &#42; * asterisk
43 2B 00101011 &#43; + plus
44 2C 00101100 &#44; , comma
45 2D 00101101 &#45; - minus
46 2E 00101110 &#46; . period
47 2F 00101111 &#47; / slash
48 30 00110000 &#48; 0 zero
49 31 00110001 &#49; 1 one
50 32 00110010 &#50; 2 two
51 33 00110011 &#51; 3 three
52 34 00110100 &#52; 4 four
53 35 00110101 &#53; 5 five
54 36 00110110 &#54; 6 six
55 37 00110111 &#55; 7 seven
56 38 00111000 &#56; 8 eight
57 39 00111001 &#57; 9 nine
58 3A 00111010 &#58; : colon
59 3B 00111011 &#59; ; semicolon
60 3C 00111100 &#60; < less than
61 3D 00111101 &#61; = equality sign
62 3E 00111110 &#62; > greater than
63 3F 00111111 &#63; ? question mark
64 40 01000000 &#64; @ at sign
65 41 01000001 &#65; A  
66 42 01000010 &#66; B  
67 43 01000011 &#67; C  
68 44 01000100 &#68; D  
69 45 01000101 &#69; E  
70 46 01000110 &#70; F  
71 47 01000111 &#71; G  
72 48 01001000 &#72; H  
73 49 01001001 &#73; I  
74 4A 01001010 &#74; J  
75 4B 01001011 &#75; K  
76 4C 01001100 &#76; L  
77 4D 01001101 &#77; M  
78 4E 01001110 &#78; N  
79 4F 01001111 &#79; O  
80 50 01010000 &#80; P  
81 51 01010001 &#81; Q  
82 52 01010010 &#82; R  
83 53 01010011 &#83; S  
84 54 01010100 &#84; T  
85 55 01010101 &#85; U  
86 56 01010110 &#86; V  
87 57 01010111 &#87; W  
88 58 01011000 &#88; X  
89 59 01011001 &#89; Y  
90 5A 01011010 &#90; Z  
91 5B 01011011 &#91; [ left square bracket
92 5C 01011100 &#92; \ backslash
93 5D 01011101 &#93; ] right square bracket
94 5E 01011110 &#94; ^ caret / circumflex
95 5F 01011111 &#95; _ underscore
96 60 01100000 &#96; ` grave / accent
97 61 01100001 &#97; a  
98 62 01100010 &#98; b  
99 63 01100011 &#99; c  
100 64 01100100 &#100; d  
101 65 01100101 &#101; e  
102 66 01100110 &#102; f  
103 67 01100111 &#103; g  
104 68 01101000 &#104; h  
105 69 01101001 &#105; i  
106 6A 01101010 &#106; j  
107 6B 01101011 &#107; k  
108 6C 01101100 &#108; l  
109 6D 01101101 &#109; m  
110 6E 01101110 &#110; n  
111 6F 01101111 &#111; o  
112 70 01110000 &#112; p  
113 71 01110001 &#113; q  
114 72 01110010 &#114; r  
115 73 01110011 &#115; s  
116 74 01110100 &#116; t  
117 75 01110101 &#117; u  
118 76 01110110 &#118; v  
119 77 01110111 &#119; w  
120 78 01111000 &#120; x  
121 79 01111001 &#121; y  
122 7A 01111010 &#122; z  
123 7B 01111011 &#123; { left curly bracket
124 7C 01111100 &#124; | vertical bar
125 7D 01111101 &#125; } right curly bracket
126 7E 01111110 &#126; ~ tilde
127 7F 01111111 &#127; DEL delete