File I-O, formatted I-O

From Programming In C

File input and output

So far, any communication with a running program has been via scanf(), for input from the keyboard, and printf(), for output to the screen. As programs become larger and more complex it is common to write the output of a series of calculations to a file rather than to the screen. Similarly, if a large amount of data is to be processed, it quickly becomes impractical to enter it all by hand at the keyboard and so input from a separate file is necessary.

Output to a file

In order to write output to a file we need to use three functions:

  1. fopen(), which identifies the file to be written to and opens it;
  2. fprintf(), which is the file equivalent of printf() and does the actual writing of information to the output file;
  3. fclose(), which should be called after all information has been written to the output file. It writes out any output still held in memory and closes the file.

Note that it is essential to have the include file stdio.h referenced at the top of your program in order to use any of these functions.

The fopen() function is used in the following way:

#include<stdio.h>
FILE *ofile;
ofile = fopen("filename","mode");

Note that ofile is the name (chosen by you) of a pointer to an object of type FILE. This same name should be used whenever the file is referenced later in the program. The actual filename is given as the first argument of the fopen() function and should be contained within double quotes. The second argument, known as the mode, is also a string in double quotes and determines how the file is opened. Valid fopen() modes for output include:

Mode   Meaning
w      Open text file for writing
a      Open text file for appending
wb     Open binary file for writing
ab     Open binary file for appending

Note that opening a file for writing causes it to be created if it doesn't already exist and overwritten (its previous contents are lost) if it does. Opening a file for appending also causes it to be created if it doesn't already exist; if it does exist, all writing occurs at the end of the file and no overwriting takes place.

The format of the fprintf() function is almost identical to that of the printf() function, except that it has an additional first parameter: the file pointer returned by the fopen() function call. Its syntax is therefore:

fprintf(file_pointer, control_string, list_of_variables_to_write_out);

Finally, the fclose() function should be called. It takes one argument only, namely the file pointer used in the fopen() function call.

The example below takes the program which calculates all the perfect numbers between 400 and 10000 as developed in Topic 4 and writes the output to a file called perfect.txt rather than to the screen.

#include<stdlib.h>
#include<stdio.h>

int main (void)
{
    FILE *outfile;        // this is the pointer to the FILE object
    const int istart = 400;
    const int ifinish = 10000;
    int i,j,sum;
    outfile=fopen("perfect.txt","w"); // opens a file called perfect.txt for writing to

    for (i=istart; i<=ifinish; i++)
    {
        sum=1;                     // 1 divides every number
        for (j=2;j<=(i/2);j++)     // no proper divisor of i can exceed i/2
        {
            if (!(i%j)) sum+=j;    // j divides i exactly, so add it to the sum
        }
        if ( sum == i ) fprintf(outfile, "%d is a perfect number\n",i);
    }
    fclose(outfile);
    system("pause");               // Windows only: keeps the console window open
    return 0;
}

Input from a file

To read data from a file instead of data entered at the keyboard, a similar set of three functions is required:

  1. fopen(), which identifies the file to be read from and opens it;
  2. fscanf(), which is the file equivalent of scanf() and does the actual reading of data from the input file;
  3. fclose(), which should be called after all information has been read from the input file. It closes the file.

Here the fopen() and fclose() functions are exactly the same as described above for use when writing data to an output file. The fopen() function is used in the same way as before except this time mode can take the following values:

Mode   Meaning
r      Open text file for reading
rb     Open binary file for reading
r+     Open text file for reading and writing
w+     Open text file for writing and reading (any existing contents are discarded)

Use of the final two options requires additional calls to functions such as fflush(), fseek(), fsetpos() and rewind(), which are beyond the scope of this course, and so these modes should be avoided.

The difference between fscanf() and scanf() is essentially the same as that described above for fprintf() and printf(), i.e. fscanf() has an additional first argument which is the file pointer name used originally in the fopen() function.

The usage of fopen(), fscanf() and fclose() for data input is illustrated in the example below, which assumes that the file vectors.txt exists and consists of 100 lines, each with 3 real numbers representing the x, y and z components of a vector. For each vector the function dotprod() takes the dot product of the vector with itself and returns its square root, i.e. the magnitude of the vector, which is written to the screen.

#include<stdio.h>
#include<stdlib.h>
#include<math.h>
 
double dotprod (double x, double y, double z);
int main(void)
{
   FILE *infile;
   int i;
   double xvec, yvec, zvec;
    
   infile=fopen("vectors.txt","r");
   for (i = 0; i <100; i++) // Loop 100 times
   {  
      fscanf(infile, "%lf %lf %lf", &xvec, &yvec, &zvec );
      printf(" Dot product is %lf \n", dotprod(xvec, yvec, zvec) );  
   }
   fclose(infile);
   system("pause");
   return 0;  
}
 
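// Returns the square root of the dot product of the vector with itself,
// i.e. the magnitude of the vector
double dotprod (double x, double y, double z)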
{
   double dot = sqrt(x*x + y*y + z*z);
   return dot;
}

In the example above we knew exactly how many lines there were in the input file vectors.txt and so were able to loop over exactly that number. What if we don't know how many lines there are and just want to loop until we have exhausted the data in the file? This is easy to do, since fscanf() returns the number of successful conversions it has made, or the negative value EOF if the end of the file is reached before any conversion. If the returned value is zero or negative then there are no more usable data. Comparing the returned value with the number of items requested (3 in the example above) is stricter still, since it also stops on an incomplete line. It is therefore possible both to read in data and to check whether there are still valid data with a while loop which looks like:

while(fscanf(....) > 0)
{
}

The program above could thus be rewritten to work for a file vectors.txt which has any number of lines in the following way:

#include<stdio.h>
#include<stdlib.h>
#include<math.h>
 
double dotprod (double x, double y, double z);
int main(void)
{
   FILE *infile;   
   double xvec, yvec, zvec;
    
   infile=fopen("vectors.txt","r");
   // 3 means all three values on a line were read; anything else ends the loop
   while(fscanf(infile, "%lf %lf %lf", &xvec, &yvec, &zvec ) == 3)
      printf(" Dot product is %lf \n", dotprod(xvec, yvec, zvec) );  
   fclose(infile);
   system("pause");
   return 0;  
}
 
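// Returns the square root of the dot product of the vector with itself,
// i.e. the magnitude of the vector
double dotprod (double x, double y, double z)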
{
   double dot = sqrt(x*x + y*y + z*z);
   return dot;
}

Reminder: since there is only a single statement in the body of the while loop, there is no need to define a code block via { and }.

File checking

Before using an input/output file it is worth checking that the file has been correctly opened first. A call to fopen() may result in an error due to a number of reasons including:

  • A file opened for reading does not exist;
  • A file opened for reading is read protected;
  • A file is being opened for writing in a folder or directory where you do not have write access.

It is clearly desirable to "trap" any such error when the file is opened rather than attempting to read from or write to a non-existent file. To do this it should be noted that the fopen() function returns a value which can be tested. If the value is NULL then the attempted file operation has been unsuccessful. So, for example, the call to the fopen() function in the program immediately above could be modified to read:

   if ((infile=fopen("vectors.txt","r")) == NULL)
   {
      printf("Error: input file cannot be opened \n");
      system("pause");
      return 1;
   }

The call to the fopen() function for the opening of an output file should be modified in the same way.

Use of multiple input and output files

The examples above illustrate using a single input or output file within a program. It is of course possible to use multiple input and output files simultaneously within the same program. An excellent example of this is included in the notes for Topic 7, where there are 2 input files of vectors called vector1.txt and vector2.txt and an output file called cross.txt, to which the vector cross product of the vectors in the 2 input files is written.

Formatted output

In Session 2 we were introduced to the so-called conversion characters that we use with the printf() and fprintf() functions to control the output of data to the terminal or to a file. On their own, these conversion characters produce output in a default format. For example, writing out a variable of type float or double with the conversion character %f results in six digits to the right of the decimal point, padded with zeros where necessary.

More control over the format of output is possible using one or more optional characters that appear between the % and the conversion character. When several are combined, the minus sign, plus sign and zero come first, followed by the width and then the precision. The characters that may be used are:

  • a (positive) integer that specifies the minimum field width, i.e. the minimum number of character positions taken up by the output. If this integer is larger than what is to be printed out then the extra space is filled (or "padded") with blanks. If the integer is smaller than what is to be printed out, the field is automatically extended to allow printing to take place without truncation. (Applies to all conversion characters);
  • a precision, which is denoted by a full stop followed by a non-negative integer:
    • For integer (%d) output this precision specifies the minimum number of digits to be printed;
    • For string (%s) output this precision specifies the maximum number of characters to be printed (truncation may occur);
    • For %e and %f conversions the precision specifies the number of digits to the right of the decimal point;
    • For %g conversions the precision specifies the maximum number of significant digits;
  • a minus sign, which denotes that this particular output is to be written out left-adjusted (the default, where there is no minus sign, is to format the output as right-adjusted). (Applies to all conversion characters);
  • a plus sign, which forces a "+" character to be written out before all non-negative numbers. (Applies to %d, %e, %f and %g);
  • a zero, which causes padding to take place with zeros instead of spaces. (Applies to %d, %e, %f and %g).

The following illustrates the use of these printf() and fprintf() format modifiers:

Suppose a fragment of code includes the following initialisations:

int a = 78;
double b = 209.1067;
char c = 'C';
char *s = "Sheffield";

The table below shows how different printf() formatting works:

Statement              Outputs as         Comment
printf("%d",a);        "78"               field width defaults to the number of digits in the integer
printf("%05d",a);      "00078"            5 wide, padded with zeros
printf("%-+4d",a);     "+78 "             left-aligned, minimum 4 wide, print the "+", pad with spaces
printf("%f",b);        "209.106700"       default (6 decimal places, padded with zeros)
printf("%-10.5f",b);   "209.10670 "       left-aligned, 10 wide, 5 decimal places, padded with spaces
printf("%6.3f",b);     "209.107"          minimum 6 wide (extended to 7), 3 decimal places
printf("%6.3e",b);     "2.091e+02"        exp format, minimum 6 wide (extended to 9), 3 decimal places
printf("%-9.5e",b);    "2.09107e+02"      exp format, left-aligned, minimum 9 wide (extended to 11), 5 decimal places
printf("%c",c);        "C"                default width is a single character
printf("%4c",c);       "   C"             minimum of 4 spaces, right-aligned, padded with spaces
printf("%-2c",c);      "C "               minimum of 2 spaces, left-aligned, padded with spaces
printf("%s",s);        "Sheffield"        default width is the length of the string
printf("%-15s",s);     "Sheffield      "  minimum of 15 spaces, left-aligned, padded with spaces
printf("%-.6s",s);     "Sheffi"           left-aligned, 6 character precision, note truncation
printf("%11.3s",s);    "        She"      right-aligned, 11 spaces minimum, 3 character precision, note truncation

Formatted input

The scanf() function takes 2 parameters: the so-called control string and then a list of one or more variables to be read in. fscanf() takes an additional first parameter, which is the pointer to the input file. In Topic 2 we were introduced to a number of conversion characters which are used in the control string to dictate what sort of variable type is to be read in. The control string may also contain ordinary text which is to be matched against what is being read in. As an example of this, imagine you have a file which is full of dates and times of the form:

25-Nov-2005 03:45:16
28-Oct-2002 12:37:01

This could be read in using the following code:

int date, year, hour, min, sec;
char month[4];   // 3 characters plus the terminating '\0'
FILE *infile;
 
// some code here for file handling, including opening infile
 
fscanf(infile, "%d-%3s-%d %d:%d:%d", &date, month, &year, &hour, &min, &sec);

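It is important to note here, as mentioned earlier in the course, that when reading in a string you do not refer to the string as &string_name but simply as string_name since, in this case, the array name month is itself converted to a pointer to its first element. The array must also be large enough to hold the characters read plus the terminating null character, so a three-letter month needs an array of at least 4 characters.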

When reading in strings note the following behaviours:

  • %s skips white space until it finds a non-white space character and starts reading in until it comes to either another white space or the end of the file;
  • %6s for example skips white space until it finds a non-white space character and starts reading in until it comes to either another white space or the end of the file or it has finished reading 6 characters (whichever comes first).

Two further possibilities exist, namely

  • an optional integer before the conversion character which determines the maximum width to be scanned (as used above for reading in month);
  • an optional * character after the % which means what is matched does not appear in the variable list and so is thrown away.

These two options may be combined, as is illustrated below. Here the input is exactly the same as that immediately above, but in this case we only want to read in date, hour, min and sec. The fscanf() statement could then be changed to read, for example:

fscanf(infile, "%d%*9s%d:%d:%d", &date, &hour, &min, &sec);

Note that the 9s above could equally be replaced by 10s: the skipped text "-Nov-2005" is 9 characters long, and %s stops at the following white space in any case.

In the examples above the conversion specifications are usually written with a space between them, e.g. "%lf %lf %lf". For numeric conversions this could equally be written without the spaces, i.e. "%lf%lf%lf", because leading white space is skipped automatically. However, experience shows that where the individual items of data to be read in are separated by spaces it is good practice to include those spaces in the control string.

Further Reading

A Book on C, Kelley and Pohl:

  • pages 503-516 (file input and output),
  • pages 493-498 (more on printf),
  • pages 499-503 (more on scanf)
