Lab 5. Arrays

How to submit your code

Each program (the source code .cpp file) should be submitted through Blackboard (Course Materials > Lab).

You can submit all your programs at the end of the lab session in one submission. This way, we can hopefully avoid the situation when you are quickly writing your program, immediately uploading it to Blackboard, but then, say 10 minutes later, realizing that there is a bug in it.

Basically, submit when you are sure that it will be your final version.

Each program should start with a comment that contains your name and a short program description, for example:

/*
  Author: Your Name 
  Description: Lab 1. Task 1. Hello world
*/

You can submit incomplete programs, but their “incomplete” status must be clearly mentioned in the comment to the program. In this case, also briefly describe what is implemented, and what is not.

Lab 5. Arrays

Task 1. Counting frequencies of integers read from a file

Write a program that reads from the standard input (using the method we employed in the Lab 2). However, this time the goal is to count how often different integers occures in the file. Luckily, we know that the file contains only integers in the range 0 <= x < 1000.

So, for example, if we declare a global constant

const int N = 1000;

Then we can use an array to count how many times each integer in the range 0 <= x < N occures in the file. As we know, such an array can be initialized as follows (all elements of the array are set equal to zero):

int count [N] = {0};

When reading from the file, the program should do two things:

  1. For every integer value i read from the file, increment the corresponding counter count[i].
  2. Keep track of the maximum integer max that the file contains.

After the entire file is read, please print a table

i     counter
-------------
0     1
1     16
2     4
3     10
4     14
5     23
...
max   10    <- the table should go all the way to the max value,
               but not further than that (all numbers beyond this value
               are zeroes anyway, so they are not interesting to us)

It’s suggested to use the tab character (denoted by '\t' in C++) to make two columns line up:

cout << i << '\t' << count[i] << endl;

Example.

If we take the following date file 1.dat as an input for our program, it should print the following table:

i   count
-------------
0   7
1   7
2   11
3   6
4   7
5   10
6   6
7   5
8   5
9   7
10  9
11  4
12  3
13  1
14  3
15  4
16  3
17  0
18  2

Please feel free to create your own data files to test your code.

Task 2. Histogram

Adapt the program from the previous task to print a bar chart (a histogram), instead of simply printing the values of the counter.

Example (for the data file 1.dat):

i   count
-------------
0   ||||||| 7
1   ||||||| 7
2   ||||||||||| 11
3   |||||| 6
4   ||||||| 7
5   |||||||||| 10
6   |||||| 6
7   ||||| 5
8   ||||| 5
9   ||||||| 7
10  ||||||||| 9
11  |||| 4
12  ||| 3
13  | 1
14  ||| 3
15  |||| 4
16  ||| 3
17   0
18  || 2

Task 3. Histogram printed in the descending order from the most frequent to the least frequent.

Adapt the previous program to print the numbers in the descending order.

Although the task can be accomplished using a sorting algorithm, one can use an a bit easier technique instead.

It’s sufficient to simply allocate an array of booleans that tells you if a particular number was printed or not:

bool printed[N] = {false};

When initialized like this, all the elements are set equal to false (to be precise, they all are technically equal to zero, which is interpreted as false).

When printing the histogram, the program can look for an integer i with the largest count[i] value that has not been printed yet (its printed[i] == false). And once this value i is printed, its flag printed[i] should be set to true. Thus we will not print the same value twice.

Example (for the data file 1.dat):

i   count
-------------
2   ||||||||||| 11
5   |||||||||| 10
10  ||||||||| 9
0   ||||||| 7
1   ||||||| 7
4   ||||||| 7
9   ||||||| 7
3   |||||| 6
6   |||||| 6
7   ||||| 5
8   ||||| 5
11  |||| 4
15  |||| 4
12  ||| 3
14  ||| 3
16  ||| 3
18  || 2
13  | 1
17   0

What else can be done using this (or a similar) program?

This section is not a task, but supposed to give you a somewhat bigger perspective on the program we have written.

We can take a large plain text file (for example, some classic public domain book), and count the frequencies of the letters in the text.

The problem is related to the old encription method called substitution cipher, when to break an encoded message, you need to count the frequencies of the letters in this encrypted message. Read the wikipedia article about the technique, if you want to learn more.