Writing more advanced code in C/C++ is not always a trivial task, so you may have to run an application several times before you manage to eliminate all of its problems. Most applications require at least some input. The quantity of required input can turn out to be quite large, so typing it into the console on every run is a sure way to waste your precious time.

However, programming in C++ should be about looking past the details and seeing the big picture, not about handling trivial tasks like typing in the same input repeatedly. The way to avoid this is to use input that is not lost when the console closes. Of course, we are talking about files, preferably in text format.

Throughout this article, you will learn how to achieve good data flow in both directions (in and out) with the help of file streams. These belong to the larger family of streams in the part of the standard library known as Iostream.

I strongly recommend reading my introductory articles about streams that appeared here on the Developer Shed Network under the names Introduction to Streams and Iostream Library and Basic IO in C++ because these will provide you with a better understanding of streams if you are not yet familiar with them. However, it is your call if you want to read them or not. The articles in this series should be just as useful independent of each other.

If you do want to know more about these topics, however, I advise you to search for my article about streams in general called Introduction to Streams or, for a little more detail about the structure of the library, see the article titled Iostream Library and Basic IO in C++ (this also treats the cin/cout streams).

The header where all of this is declared is named <fstream>, so be sure to include it each time you want to use file streams. Adding a using namespace std; directive saves typing and can make short examples like the ones here easier to read. As with anything in C++, before you can use a stream you need to declare it with its type.

Basic Tasks

The Iostream library provides three types of file streams: the input only stream is declared as “ifstream,” the output only stream as “ofstream,” and a combination of the two goes under the name of “fstream.”

You have two different methods of opening/binding files to these streams. The first and the most convenient place to do so is in the constructor. Here you can simply pass the file name that also contains the extension type, and let the rest be resolved internally.

ofstream out("In.txt");

Of course, the file will be created in the current working directory (usually the directory from which the application was started), but you can also pass a full path to specify exactly where to create the file. For an ifstream, the same argument tells the stream where to find the file it should try to open. Note that the constructor also takes an optional second argument through which you can pass open mode flags, but we will talk about this later.

The second way you can bind a file to the stream is by calling the open member function. This has the same capabilities as the constructor and acts just like it. You provide the file name, and the library takes care of everything needed to open the file for reading or writing.

ofstream out;

out.open("In.txt"); // this has the same effect as the code above

Here I should note that before C++11 both the constructor and open() accept only a “C style” string (a const char*), mainly for historical reasons; since C++11 they also accept a std::string directly. If your path is stored in an STL string and you need the C-style form, simply ask the string for a const char* pointer using the “c_str()” member function as follows:

string name("C:\\In.txt"); // note that \\ stands for a single \ inside a string literal

ofstream out(name.c_str());

Sometimes the file cannot be created (not enough space) or just cannot be opened to be written to because it is read only. Checking to see if any errors occurred in the opening process can be done easily by checking to see if the stream is valid.

if(!out)

{

// here we react to the invalid file open

}
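The pieces above (opening through the constructor, passing a path with c_str(), and testing the stream) can be put together in a short sketch. The helper name write_lines and the file names used with it are made up for illustration; they are not part of the library.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes every string on its own line.
// Returns false if the file could not be opened or a write failed.
bool write_lines(const std::string& path, const std::vector<std::string>& lines)
{
    std::ofstream out(path.c_str());   // the constructor tries to open the file
    if (!out)                          // the stream is invalid: opening failed
        return false;

    for (std::vector<std::string>::const_iterator it = lines.begin();
         it != lines.end(); ++it)
    {
        out << *it << '\n';
    }

    out.close();                       // flush and close explicitly
    return !out.fail();                // false if a write or the close failed
}
```

Calling write_lines("In.txt", lines) returns true on success; a path inside a directory that does not exist should make it return false, because the stream never becomes valid.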

Whenever an error occurs during any operation on a stream (including a file stream), an error flag such as the fail flag is set inside the stream object. Until this flag is cleared, every further operation on the stream fails without doing anything, just as if the stream had nothing more to offer. You can reset the error state by calling the clear function:

out.clear(); // reset the error state of the stream
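To see the error state and clear() at work on the input side, here is a minimal sketch. It assumes a text file of whitespace-separated tokens, some of which are not numbers; the function name read_ints_skipping_garbage is hypothetical.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every integer token found in a text file
// and skips tokens that cannot be parsed as a number.
std::vector<int> read_ints_skipping_garbage(const std::string& path)
{
    std::vector<int> values;
    std::ifstream in(path.c_str());
    int value;

    while (true)
    {
        if (in >> value)               // extraction succeeded
        {
            values.push_back(value);
            continue;
        }
        if (in.eof() || in.bad())      // nothing left, or an unrecoverable error
            break;

        in.clear();                    // reset the fail flag set by the bad token
        std::string junk;
        if (!(in >> junk))             // throw away the offending token
            break;
    }
    return values;
}
```

Without the call to clear(), the extraction of the junk token would fail as well, because a stream in an error state refuses every further operation.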

Generally, closing a stream explicitly is not required, because the destructor of the stream closes the file automatically when the stream object is destroyed (at the latest when the program finishes). It is advisable, however, to close it explicitly as soon as you no longer need the file. This lets others use the file and avoids clumsy errors that could appear if you decide later to reuse the stream for a different file.

Yes, you read that right! You can re-use a declared stream once you close it and call the clear function in case any errors have popped up earlier. Let’s assume we have a couple of file names inside a vector of strings and we want to read all of their data using a single stream and a single string variable. Proceed as follows.

std::vector<string> files;

// fill up the vector

std::vector<string>::iterator it = files.begin();

std::vector<string>::iterator end = files.end();

ifstream in;

string temp;

while( it != end)

{

in.open (it->c_str());

if(in)

while(in >> temp /* = in*/)

{// do whatever you want with the input

}

in.close();

in.clear();

++it; // move on to the next file name, otherwise the loop never ends

}

You can observe that after we read in a value from the stream, the while statement checks not the input data but the input stream. To comprehend this, you should know that the expression is really a call to an operator function: operator>>(in, temp).

This function returns the stream itself, which is what allows the insertion and extraction operators to be chained. Therefore, the “while” ends up testing the stream, and that test succeeds only as long as no error flag is set.
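For reference, the same idea can be wrapped into a self-contained function. This is only a sketch: the name read_words is made up, and collecting the words in a vector is just one possible choice for "do whatever you want with the input".

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads every whitespace-separated word from each file
// in turn, reusing one ifstream object for all of them.
std::vector<std::string> read_words(const std::vector<std::string>& files)
{
    std::vector<std::string> words;
    std::ifstream in;
    std::string temp;

    for (std::vector<std::string>::const_iterator it = files.begin();
         it != files.end(); ++it)
    {
        in.open(it->c_str());
        if (in)
        {
            while (in >> temp)         // the stream itself is the loop condition
                words.push_back(temp);
        }
        in.close();                    // release the file
        in.clear();                    // forget eof/fail before the next open
    }
    return words;
}
```

A file that cannot be opened is simply skipped, because the clear() at the end of each pass wipes the error state before the next open().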

The way C++ tracks what you have written to or read from a file is as follows: when a file is opened, besides associating the stream with one existing file on the disk and opening it in the appropriate way, the library also creates a file position indicator (FPI) inside the stream object. This helps the stream track where you are inside the file and what you have written or read most recently.

Ignoring some input characters is possible by using the ignore() member function, which takes as an argument the number of characters to ignore. This is for those cases when you may have some invalid input and you want to just pass over it instead of getting the ugly error state every time. Moving through the file is possible with seekg() and seekp(), while tellg() and tellp() report the current position for reading and writing respectively.
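A small sketch of these member functions working together follows. The struct SeekDemo and the function seek_demo are hypothetical names, and the file is opened in binary mode on the assumption that positions should then correspond to plain byte offsets.

```cpp
#include <fstream>
#include <string>

// Hypothetical result type for the demonstration below.
struct SeekDemo
{
    std::streamoff position_after_ignore; // -1 if the file could not be opened
    std::string word_after_ignore;
    std::string first_word;
};

// Skips `skip` characters, reports the position, reads one word,
// then jumps back to the beginning and reads the first word.
SeekDemo seek_demo(const std::string& path, std::streamsize skip)
{
    SeekDemo result;
    result.position_after_ignore = -1;

    std::ifstream in(path.c_str(), std::ios_base::in | std::ios_base::binary);
    if (!in)
        return result;

    in.ignore(skip);                           // pass over the first characters
    result.position_after_ignore = in.tellg(); // where the next read will start
    in >> result.word_after_ignore;

    in.clear();                                // reading the last word may set eof
    in.seekg(0, std::ios_base::beg);           // move back to the start of the file
    in >> result.first_word;
    return result;
}
```

For a file that contains "hello world", seek_demo(path, 6) should report position 6, the word "world" after the skip, and "hello" after rewinding.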

The Mode Flags

Each time you open a file via a stream, the open procedure and how the file behaves are controlled by a set of mode flags. Even when you specify nothing explicitly, default values are used. The file modes are essentially bitmask values, so we can combine them with the bitwise OR operator (|).

Here is a list of what’s available and their effects on the file stream:

=> in – open for input

=> out – open for output

=> app – seek to the end before every write

=> ate – seek to the end immediately after the open

=> trunc – truncate an existing file when opening it

=> binary – do the I/O procedures in binary mode (the bytes are passed through as they are, without text-mode translations such as newline conversion)

Of course, not all of them have the same effect on the three types of file streams, and some combinations do not even make sense. The trunc flag deletes the previous data inside the file. Opening an ofstream with just the out flag works essentially as if you had specified the trunc flag also, as it clears anything that had been there before.

The solution, if you only want to append to the file, is to use the app flag with the ofstream. In the case of the fstream, trunc is not part of the default mode (in | out), so all of the previous data inside the file will remain.
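The difference between the default ofstream mode and the app flag can be sketched as follows. The three helper names are made up for this illustration.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: replaces the whole file with `text`.
void overwrite_file(const std::string& path, const std::string& text)
{
    std::ofstream out(path.c_str());   // default ofstream mode: old contents are lost
    out << text;
}

// Hypothetical helper: keeps the old contents and adds `text` at the end.
void append_to_file(const std::string& path, const std::string& text)
{
    std::ofstream out(path.c_str(), std::ios_base::out | std::ios_base::app);
    out << text;
}

// Hypothetical helper: returns the complete contents of the file.
std::string read_whole_file(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios_base::in | std::ios_base::binary);
    std::ostringstream contents;
    contents << in.rdbuf();            // copy everything the file buffer holds
    return contents.str();
}
```

Writing "old" with overwrite_file and then " more" with append_to_file should leave "old more" in the file, while a second overwrite_file call with "new" leaves only "new".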

The Binary Flag

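The binary flag may seem a little unnecessary at first. Its main purpose is to switch off the character translations that text mode performs; on Windows, for example, text mode converts each newline into a carriage return/line feed pair on output and back again on input. In binary mode the stream does not have to worry about such characters and simply passes a sequence of bytes through as stored, so there is less effort invested, which can also make the I/O faster. Everyone is happy.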

Here is a prime example of this. How do you find out how many characters are inside a file? You need nothing else, just the number of characters, and you need it really fast. The complete solution is below. I used all that we have learned until now, and in addition, the read function.

This is a low-level function that lets us read chunks of bytes instead of proper strings. However, it uses C arrays instead of strings, so I had to construct one that will serve as a buffer for the application:

#include <iostream>

#include <fstream>

#include <iomanip>

#include <vector>

#include <iterator>

#include <string>

#include <cstdlib> // rand, srand, calloc, free

#include <ctime> // time

using namespace std;

#define BUFFER_SIZE 100000 // how many bytes to read at once

#define ADD_ITEM 100000 // how many additional items to add

int main()

{

// open the file

fstream inputFile("C:\\In.txt",

std::ios_base::out | std::ios_base::app);

// seed the generator so that each run produces different random numbers

srand((unsigned)time(0));

 

// just append the new characters: convert ASCII codes (3 to 127) to char

for (unsigned long int i = 0; i < ADD_ITEM; ++i )

{

inputFile << (char) ((rand()%125)+3) ;

 

}

// prepare the file for opening a different file

// or the same just with different flags

inputFile.clear();

inputFile.close();

 

// now open for read

inputFile.open("C:\\In.txt",

std::ios_base::binary | std::ios_base::in);

inputFile >> noskipws;

string number; // this will hold the last input

// allocate space for the buffer

char* block = (char*) calloc(BUFFER_SIZE+1, sizeof(char));

 

int size = 0; // first just count the number of reads

while ( inputFile.read( block, BUFFER_SIZE) ) // read

{

block[BUFFER_SIZE] = '\0'; // make sure the buffer is null-terminated

number = block; // save the block

size++; // increase the read count

}

number.append( block, inputFile.gcount() ); // also add what remains at the end

if(size) //calculate the number of read characters

size = (size-1)*(BUFFER_SIZE )+ number.size() ;

else

size = number.size();

 

cout << size << endl; // print the result

inputFile.close(); // close the file

free(block); // release the buffer allocated with calloc

 

}

504049

Press any key to continue . . .

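Here is the file size with which I ended up after the completion of the program. As you can see here, the number of characters determines the size of the file on the disk. The result is not an exact multiple of 100000, most likely because the file was filled over several runs in text mode, where Windows expands every newline character into a carriage return/line feed pair. This is OS-dependent; if you build your application on this, it will not be portable.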

The problem may be absurd, or you may find better solutions (and if you do so, please post them on the blog!), but I think it illustrates where the binary flag comes in handy and shows a good way to read large text files into memory quickly.
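As one possible alternative, here is a sketch of two other ways to get the same count. Both function names are hypothetical. The first relies on the ate flag and tellg(); treating that position as a byte count is common practice for files opened in binary mode, but it is an assumption rather than a guarantee of the standard. The second sums up gcount() after each read(), so it needs no string at all.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: opens the file positioned at its end and asks
// for the current position. Returns -1 if the file cannot be opened.
std::streamoff file_size_by_seeking(const std::string& path)
{
    std::ifstream in(path.c_str(),
        std::ios_base::in | std::ios_base::binary | std::ios_base::ate);
    if (!in)
        return -1;
    return in.tellg();
}

// Hypothetical helper: reads the file chunk by chunk and adds up the
// number of bytes each read() delivered. Returns -1 on failure.
std::streamoff count_bytes_by_reading(const std::string& path, std::size_t chunk_size)
{
    if (chunk_size == 0)
        return -1;

    std::ifstream in(path.c_str(), std::ios_base::in | std::ios_base::binary);
    if (!in)
        return -1;

    std::vector<char> buffer(chunk_size);
    std::streamoff total = 0;
    while (in.read(&buffer[0], static_cast<std::streamsize>(buffer.size())))
        total += in.gcount();          // a full chunk was read
    total += in.gcount();              // the last, partial chunk (possibly empty)
    return total;
}
```

Both functions should agree for an ordinary file; the seeking version avoids touching the data at all, while the reading version shows how read() and gcount() cooperate.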