A table which maps each character
with a unique binary number.
ASCII
American Standard Code for
Information Interchange
7 bits, so holds
128 characters.
More recently extended ASCII was
released, 8 bits, 256 characters.
Only contains basic English characters
and punctuation.
Not enough: advent of World
Wide Web created need for
universal system.
Unicode
Was brought in to replace Unicode and
include most characters from across
the world.
Note: this includes Emojis!
Includes the characters for
over 20 countries.
16 bit so can code for around 6.5 thousand characters.
Constantly updated and maintained
by the Unicode Consortium.
Error Checking
Parity Bit
Where an additional bit is added to the end of a binary number to
show whether there are an even or an odd number of 1 digits.
Unreliable as if an even number of bits flip or
the parity bit flips, it doesn't detect the error.
Majority Voting
Data is sent multiple (typically 3) times and if there is a
discrepancy, the computer goes with the majority.
Very reliable and can correct data
without retransmitting, however,
uses a lot of bandwidth.
Check Digit
An additional digit is added to
the end of a number which is
derived from the number.
A simple method is by simply adding up all
the digits, and adding up the digits on the
sum, repeat this until you are left with a
single number.
This is unreliable as if the order
changes, the check digit remains the
same.
A more advanced method would
multiply each number by a weight, so
as to ensure that order is needed. An
example of this is the modulo-11
method.
Graphics
Bit Maps
A bit map graphic is a 2D array of pixels.
Each pixel holds a colour value, stored as a
binary number, many thousands of these
pixels typically make up an image.
A problem with bit maps is that
quality degrades when you
zoom in, additionally they take
up a large amount of space.
Each pixel will have a colour depth, which is the
number of bits allocated to represent the colour of a
pixel, the higher the colour depth, the more colours
that can be expressed.
Vector Graphics
A vector graphic is an image generated from a set of instructions
and mathematical formulae. This generates the image consisting
of geometric shapes, relative to a point of origin.
These are not suitable for photos as
they simplify things significantly,
however, they are good for diagrams
and CAD/CAM images.
A vector graphic typically takes up little
space and can be easily scaled up
without losing quality.
Audio
Audio is converted to digital data by
sampling the sound wave many times per
second.
The higher the number of samples per second (measured in
Hz), the higher the quality of the sound and the truer it is to
the original.
The computer then synthesises a wave by extrapolating the data sampled, this is usually
indistinguishable from the original for a human ear.
Nyquist's Theorem states that to faithfully
recreate sound, you must record at least twice
the highest frequency.
Humans hear 20Hz to 20kHz.
The resolution of sound is the number
of bits allocated to each sample, hence,
the more bits allocated, the more
pressure levels each sample can
represent.
Compression
Lossy compression
Some data is discard, to reduce file size,
however, the new file is usually indistinguishable
from the original.
An example would be replacing the hundreds of different
shades of blue in an image of the sky, with just a few shades.
A human would probably not notice the difference.
JPEG files use this.
It can be useful when transmitting images across
the internet, as it reduces the bandwidth
significantly. This is especially important for
people on slow connections.
Lossless
compression
Compression of data in such a way that the original
data can be reacquired in it's totality, after
decompressing. i.e. no data is discarded.
Examples would be run length encoding, where for example, if in an image, there are many pixels of the same colour, the computer just stores this as x*"blue" pixels. Rather
than storing "blue pixel", "blue pixel"... This doesn't lose accuracy, it just encodes repeated data more simply. Another example is dictionary-based encoding, where
commonly occurring strings in a text file are coded for in a simpler way, perhaps by assigning some token. A simple way to understand this could be replacing every "and"
with an "&" this takes up 67% less space than writing "and". Again, no loss of accuracy as the original data can be easily reacquired.
Dictionary based encoding can be used on binary data also, if it is considered as a string of 1s and 0s.