Files and File Modes in Linux C

Most resources on a *nix system can be accessed as a file, per the everything-is-a-file philosophy. Files come in a number of types, such as regular storage files, named and unnamed pipes, directories, devices, symbolic links and sockets.

Regular files hold ordinary data, pipes are data channels between processes, directories contain a list of the files stored in them, device files provide an interface to devices, symbolic links store a path to another file, and sockets are like pipes, except that they can also let processes on separate machines communicate.

Most files on a Linux system are either regular files or directories.

The stat() function can be used to access a file’s metadata. It accepts a pathname and stores the file’s metadata in a stat structure, whose address is supplied as the second argument. It returns 0 on success and -1 on failure. The prototype for stat() is declared in sys/stat.h, which also provides the definition of the stat structure (via bits/stat.h).

In practice, we include sys/stat.h, declare a struct stat, and call stat(), passing the pathname of the file we want metadata for as the first argument and the address of the struct stat variable as the second.

#include <stdio.h>
#include <sys/stat.h>


int main(void){

    char *szFileName = "test.txt";
    struct stat SMetaData;

    //stat() returns -1 on failure and sets errno
    if(stat(szFileName, &SMetaData) == -1){
        perror(szFileName);
        return 1;
    }

    //inode num is st_ino
    //of type ino_t; cast it to match %lu
    printf("%s \tinode: %lu\n", szFileName, (unsigned long)SMetaData.st_ino);

    return 0;
}

The file’s type and its permissions are encoded together in one field of the stat structure, the st_mode field. The meaningful part of the mode is 16 bits in length, with the four high-order bits representing the file’s type, and the remaining lower 12 bits representing access permissions and their modifiers (the set-user-ID, set-group-ID, and sticky bits).

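#include <stdio.h>
#include <sys/stat.h>

int main(void){

    char *szPath = "test.txt";
    mode_t sMode;

    struct stat SMeta;

    //stop if the metadata could not be read
    if(stat(szPath, &SMeta) == -1){
        perror(szPath);
        return 1;
    }

    sMode = SMeta.st_mode;

    //modes are conventionally written in octal
    printf("The mode for %s is %o\n", szPath, (unsigned int)sMode);

    //print the 16 mode bits, high-order bit first,
    //with a space between the type bits and the permission bits
    for(int i = 15; i >= 0; i--){
        if(sMode & (1u << i)){
            putchar('1');
        } else {
            putchar('0');
        }
        if(i==12){
            putchar(' ');
        }
    }

    putchar('\n');

    return 0;
}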

The bitmask for the file type bits is S_IFMT. By ANDing this value with the st_mode value, we can extract the file type information from the file’s mode field. Recall that ANDing is used to apply a mask to a binary value: the bitwise AND operator, &, yields a 1 in each position where both bits are on, and a 0 where either bit is off.

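#include <stdio.h>
#include <sys/stat.h>

int main(void){

    char *szPath = "test.txt";
    struct stat SBuff;

    //stop if the metadata could not be read
    if(stat(szPath, &SBuff) == -1){
        perror(szPath);
        return 1;
    }

    printf("inode: %lu \t", (unsigned long)SBuff.st_ino);
    printf("name: %s \t", szPath);
    printf("type: ");

    //mask off everything except the file type bits
    switch(SBuff.st_mode & S_IFMT){
        case S_IFREG:
            printf("Regular File\n");
            break;
        case S_IFDIR:
            printf("Directory\n");
            break;
        case S_IFBLK:
            printf("Block Device\n");
            break;
        case S_IFCHR:
            printf("Character Device\n");
            break;
        case S_IFIFO:
            printf("Named Pipe\n");
            break;
        case S_IFSOCK:
            printf("Socket\n");
            break;
        case S_IFLNK:
            //stat() follows symbolic links, so this case
            //is only reached with metadata from lstat()
            printf("Symbolic Link\n");
            break;
        default:
            printf("Unknown\n");
            break;
    }

    return 0;

}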

Again, the file type is encoded in the st_mode field of the stat structure. There is a set of macros that can help us decipher the file type from the st_mode field. Each macro evaluates to true if the mode describes that file type, and to false otherwise.

#include <stdio.h>
#include <sys/stat.h>

void printType(mode_t iMode);

int main(void){

    char *caArray[] = {"dir", "/run/avahi-daemon/socket", "test.txt", "/dev/sda", "/dev/tty0"};
    struct stat SBuffer;

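    for(int i = 0; i < 5; i++){
        printf("File Name: %s\t", caArray[i]);
        //skip entries whose metadata cannot be read
        if(stat(caArray[i], &SBuffer) == -1){
            printf("(could not stat)\n");
            continue;
        }
        printf("Inode: %lu\t", (unsigned long)SBuffer.st_ino);
        printf("Type: ");
        printType(SBuffer.st_mode);
        printf("\n");
    }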

    return 0;
}


void printType(mode_t iMode){
    if(S_ISREG(iMode)){
        printf("regular file");
        return;
    }
    if(S_ISDIR(iMode)){
        printf("directory");
        return;
    }
    if(S_ISBLK(iMode)){
        printf("block device");
        return;
    }
    if(S_ISCHR(iMode)){
        printf("character device");
        return;
    }
    if(S_ISSOCK(iMode)){
        printf("socket");
        return;
    }
    //none of the tested types matched
    printf("other");
}

One thing to be aware of is that stat() follows symbolic links; if we want metadata about the symbolic link itself, we should use the lstat() function. We will return to this at a later time.

 

 

 

Data and Memory in C

Computers keep track of memory by using addresses, and each address identifies a byte. A byte is 8 bits arranged in order in memory. Each bit represents a position whose value is 2 to the power of n, where n is the bit’s position, counting from 0. By convention, the leftmost bit is called the most significant bit, as it has the largest value (2^7), whereas the rightmost bit is called the least significant bit, as it has the value 2^0, which is 1.

We use bytes to form memory objects. A memory object, not to be confused with an object that is an instantiation of a class, is a region of memory made up of a contiguous collection of bytes that stores a value. Every object has an address, which is the address of the object’s first byte in memory. On most machines, the byte is the smallest addressable unit of memory.

A variable is thus a memory object that has a name and an amount of storage determined by its type. Variables are declared, whereby the compiler is told that a variable will be used and what its data type is. A variable can be assigned an initial value when it is declared. The C language has a number of types, such as char, which stores a character, int, which stores a whole number, unsigned int, which stores a non-negative whole number, and double, which stores a double-precision floating-point number.

The char type holds a single character, such as an ASCII character. Any ASCII character is guaranteed to have a non-negative value. In a declaration, the char keyword sets the type of the memory object, and the identifier that follows it is the variable’s name.

#include    <stdio.h>
#include    <stdlib.h>
/* ------ main function ------- */

int main ( int argc, char *argv[] )
{
    char cX = 'c';
    //100 is the ASCII code for 'd'
    char cY = 100;

    printf("%c \t %c\n", cX, cY);

    return EXIT_SUCCESS;
}

/* ----------  end of main function  ---------- */

The expression 'c' is called a character constant. It yields the numeric value that is stored to represent the character. Note that it uses a pair of single quotation marks, not double quotation marks. Double quotation marks are used to delimit a string literal. A string literal is stored as an array of characters. We can access it either via a char pointer or a char array.

#include    <stdio.h>
#include    <stdlib.h>
/* ------ main function ------- */

int main ( int argc, char *argv[] )
{
    char *szStringOne = "El Psy Congroo";
    char szStringTwo[] = "Nice Boat!";

    printf("%s\n", szStringOne);
    printf("%s\n", szStringTwo);

    return EXIT_SUCCESS;
}

/* ----------  end of main function  ---------- */

An equals sign in C, as in most languages, assigns the value of the expression on the right side to the memory object on the left side.

Formatting characters, such as newline or tab, are represented via escape sequences. For instance, you can’t put a newline in a string literal by pressing Enter, as that only puts a newline in the source code! The escape sequence for a tab is \t and the escape sequence for a newline is \n. Single and double quotation marks can also be represented via escape sequences, \' and \", respectively.

While the C standard does not fix the size of an int, it is typically 32 bits, which was the word size on 32-bit x86 machines. Note that on 64-bit x86 machines an int is still 32 bits. A long long is at least 64 bits in length, and on 64-bit Linux a long is 64 bits as well. Pointers are also 64 bits, at least on x64 machines.

#include    <stdio.h>
#include    <stdlib.h>
/* ------ main function ------- */

int main ( int argc, char *argv[] )
{
    int iValue;
    int *piValue;
    long long llValue;

    //sizeof yields a size_t, which is printed with %zu
    printf("size of an int = %zu\n", sizeof(iValue));
    printf("size of a pointer = %zu\n", sizeof(piValue));
    printf("Size of a long long = %zu\n", sizeof(llValue));

    return EXIT_SUCCESS;
}

/* ----------  end of main function  ---------- */

As we have seen, the printf() function can handle both fixed and variable parts. The function takes at least one argument, the format string that contains both literal text and conversion specifiers. The values stored in optional arguments sent after the format string literal are inserted in the places indicated by the conversion specifiers. The literal text is printed unchanged.

The %d conversion specifier is used for int values. The %c specifier is used for char values. The %u specifier is used for unsigned values. The %x specifier converts an unsigned integer to hexadecimal notation.

#include    <stdio.h>
#include    <stdlib.h>
/* ------ main function ------- */

int main ( int argc, char *argv[] )
{
    unsigned x;

    //x is unsigned, so print it with %u rather than %d
    for(x = 0; x < 100; x++){
        printf("%u \t %x \n", x, x);
    }

    return EXIT_SUCCESS;
}

/* ----------  end of main function  ---------- */

Floating-point numbers approximate real numbers; the floating-point types are float, double, and long double. These types differ in the precision with which a value is stored and expressed. Depending on the platform, a long double may or may not be more precise than a double, and float is used less often than double.

User-Defined Functions in C++

There are two types of user-defined functions in C++: value-returning functions, which have a return type, and void functions, which do not return a value.

To use a library function, we need to know the name of the header file that contains the function’s specification. We must include this header file in our program using the #include directive.

We can use the value returned by a value-returning function in one of three ways: by saving the value for further computation, by using the value immediately in a calculation, or by printing the value. This means that value-returning functions are typically used in assignment statements, in output statements, and as arguments in other function calls.

#include <iostream>
#include <cmath>

using namespace std;


int returnInt();

int main(void){
    
    int x = 7;
    int y = pow(x, 3);

    cout << "x = " << x << endl;
    cout << "x ^ 3 = " << y << endl;
    cout << "returnInt = " << returnInt() << endl;


    return 0;
}


int returnInt(){
    return 47;
}

A function definition contains a list of formal parameters, which are the variables declared in the function heading. As we have seen above, a function’s formal parameter list can be empty. The return type is also declared. The statements enclosed between curly braces make up the body of the function.

Once a value-returning function has processed its arguments, it returns the value via the return statement. The return statement can return a variable, a literal value, or even an expression.

#include <iostream>

using namespace std;

int returnLiteral();
double returnVariable(int x);
int returnExpression(int x, int y);

int main(void){

    cout << "returnLiteral() = " << returnLiteral() << endl;
    cout << "returnVariable(73) = " << returnVariable(73) << endl;
    cout << "returnExpression(2600, 5200) = " << returnExpression(2600, 5200) << endl;

    return 0;

}


int returnLiteral(){
    return 42;
}

double returnVariable(int x){
    double y = x + 1.491625;
    return y;
}

int returnExpression(int x, int y){
    return x + y;
}

Note that in C++, return is a reserved word.

As in the programs above, user-defined functions are typically placed after the main() function. However, you must declare a function before you can use it. To satisfy this rule, we place function prototypes before any function definition. A function prototype is simply the function heading followed by a semicolon, without the body of the function. Technically, we do not need to include the parameter names in the function prototype, but leaving them in makes the code more readable.

#include <iostream>
#include <cmath>

using namespace std;

int larger(double x, double y);

int main(void){

    double x, y;

    cout << "Enter value x: ";
    cin >> x;

    cout << "Enter value y: ";
    cin >> y;

    switch(larger(x, y)){
        case -1:
            cout << "x (" << x << ") is larger." << endl;
            break;
        case 1:
            cout << "y (" << y << ") is larger." << endl;
            break;
        case 0:
            cout << "x (" << x << ") and y (" << y << ") are equal." << endl;
            break;
    }

    return 0;

}


int larger(double x, double y){
    double epsilon = 0.00000001;
    
    if(fabs(x - y) < epsilon){
        return 0; //equal
    }

    if(x > y){
        return -1; //x is larger
    }

    return 1; //y is larger
}

Note that once a function has been defined, it can be used multiple times.

 

 

 

Data Structures and File Streams

NULL is defined in the stdio.h header file (and in several others, such as stddef.h and stdlib.h). NULL is defined as either 0 or (void*)0; either way, it evaluates to logical false, and comparing it with 0 yields true. Still, it’s not a good idea to initialize a pointer to the constant value 0, as NULL states the intent more clearly.

#include <stdio.h>
#include <stdlib.h>

/* ------ main function ------- */

int main ( int argc, char *argv[] )
{
    if(NULL){
        printf("NULL is true.\n");
    } else {
        printf("Null is false.\n");
    }

    if(0==NULL){
        printf("0 == NULL\n");
    } else {
        printf("0 != NULL\n");
    }


    return EXIT_SUCCESS;
}

/* ----------  end of main function  ---------- */

Remember, when using pointers, we ought to use NULL, and not 0.

When data is read as a character stream, the application requests each character sequentially, and each one is returned as an integer (not a char) until end-of-file is reached. When end-of-file is reached, the constant EOF, a negative value that cannot represent an ASCII character, is returned.

The main concept behind stream I/O is that files can be seen as ordered sequences of characters, which can likewise be processed in a sequential manner.

The getchar() and putchar() functions may be implemented as macros, which avoids the overhead of a function call.

Our next program will read a flat text file and use the information to populate an array of structures. The flat text file, named sales.txt, is as follows:

Atlanta 782038.46
Austin 299772.84
Chicago 816093.76
Dallas 751815.11
Denver 936772.82
Houston 759756.27
Indianapolis 574831.60
Kansas City 844308.93
Los Angeles 969770.37
Mexico City 510140.09
Minneapolis 787488.02
Montreal 316987.23
New York 657590.42
Pittsburgh 898253.56
St. Louis 677488.73
San Francisco 513126.19
Toronto 905377.23

Each line will correspond to a structure that contains two fields, the regional office name, a char array, and the amount of sales, a double value.

The following program will read through this list using the fgetc() function, which retrieves a single character from the stream at a time. We will also use the ungetc() function to return a character to the stream; we will do this when implementing a peek() function of our own design.

#include     <stdio.h>
#include    <stdlib.h>
#include    <string.h>

#define TRUE 1
#define FALSE 0
#define OFFICE_NAME_LENGTH 32
#define BUFFER_SIZE 256

struct sales
{
    char office[OFFICE_NAME_LENGTH+1];
    double revenue;
};



char buffer[BUFFER_SIZE];

char peek(FILE *fp);

/* ------ main function ------- */

int main ( int argc, char *argv[] )
{
    struct sales SalesInfo[20];
    double total = 0;
    int ch, ch2;
    int i = 0;
    int entry = 0;
    int word = 0;
    //flags
    int endOfWord = FALSE;
    int endOfLine = FALSE;

    FILE *fp = NULL;

    fp = fopen("sales.txt", "r");

    //let's do an explicit test
    if(fp==NULL){
        printf("Could not open file.\n");
        return EXIT_FAILURE;
    }


    while((ch=fgetc(fp))!=EOF){
        //item is finished when we
        //hit end of line
        //or space without a number after it
        if(ch=='\n'){
            endOfWord = TRUE;
            endOfLine = TRUE;
        }
        if(ch=='\r'){
            continue;
        }
        //check to see if space is just part of
        //city name or else if it is between
        //the name and number
        if(ch==' '){
            ch2 = peek(fp);
            if('0' <= ch2 && ch2 <= '9'){
                endOfWord = TRUE;
            }
        }

        if(endOfWord==FALSE){
            buffer[i++] = ch;
        } else {
            //put string termination character
            buffer[i] = '\0';
            if(word % 2 == 0){
                strcpy(SalesInfo[entry].office, buffer);
            } else {

                SalesInfo[entry].revenue = atof(buffer);
            }
            i = 0;
            word++;    
            //reset end of word flag
            endOfWord = FALSE;
        }
        //check for end of line
        if(endOfLine==TRUE){
            //new line, new entry
            entry++;
            //reset flag
            endOfLine=FALSE;
        }        
    }//end while loop

    for(i = 0; i < entry; i++){
        //format office field to be 32 chars in length
        //format double to display two digits to the right
        //of the decimal
        printf("%32s  $ %-32.2f\n", SalesInfo[i].office, SalesInfo[i].revenue);
    }

    fclose(fp);

    return EXIT_SUCCESS;
}



/* ----------  end of main function  ---------- */
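char peek(FILE *fp){
    //fgetc() returns an int so that EOF can be detected
    int c = fgetc(fp);
    //only push back a real character, never EOF
    if(c != EOF){
        ungetc(c, fp);
    }
    return c;
}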

When dealing with flat text files that store data, it’s best to have some sort of delimiter, that is to say a special character that separates the fields. Commas and colons are the two delimiters most often used. For our final program, we will use colons.

Atlanta:GA:172381.48:32854.08:Schwartz
Chicago:IL:451776.83:9516.12:Burton
Dallas:TX:286582.26:56528.12:Han
Denver:CO:381825.96:69108.36:Lopez
Houston:TX:588617.94:22489.38:Patel
Kansas City:MO:394415.29:25890.03:Johnson
Los Angeles:CA:578920:99141.30:Verma
Minneapolis:MN:377488.50:23511.77:Harris
Raleigh-Durham:NC:210149.24:81298.44:Kadivar
San Francisco:CA:751622.91:18520.31:Shen
St. Louis:MO:395114.47:68191.38:Foshay

We will name the above file sales2.txt. It consists of eleven lines, with each line being a record. Each line is divided into five fields. The first field contains the city, the second field contains the state, the third field contains the revenue, the fourth field contains the expenses, and the last field contains the last name of the manager.

#include    <stdio.h>
#include    <stdlib.h>
#include    <string.h>

int countRecords(FILE *pFile);

struct SRecord{
    char szCity[32];
    char szState[3]; //two letters plus the terminating null character
    double dRevenue;
    double dExpenses;
    char szManager[32];
};

char buffer[256];

/* ------ main function ------- */

int main ( int argc, char *argv[] )
{
    int iCounter = 0;
    int iNumRecords;
    int iField = 0;
    int iRecord = 0;
    int iChar = 0;

    FILE *pFile = fopen("sales2.txt", "r");

    if(pFile==NULL){
        printf("Problem opening file.\n");
        return EXIT_FAILURE;
    }

    iNumRecords = countRecords(pFile);

    printf("There are %d records.\n", iNumRecords);

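    struct SRecord *pSRecords = malloc(sizeof(struct SRecord)*iNumRecords);

    //malloc() returns NULL if the memory cannot be allocated
    if(pSRecords==NULL){
        printf("Problem allocating memory.\n");
        fclose(pFile);
        return EXIT_FAILURE;
    }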

    
    while((iChar=fgetc(pFile))!=EOF){
        if(iChar=='\n'|| iChar==':'){
            buffer[iCounter]='\0';
            switch(iField){
                case 0:
                    strcpy(pSRecords[iRecord].szCity, buffer);
                    break;
                case 1:
                    strcpy(pSRecords[iRecord].szState, buffer);
                    break;
                case 2:
                    pSRecords[iRecord].dRevenue = atof(buffer);
                    break;
                case 3:
                    pSRecords[iRecord].dExpenses = atof(buffer);
                    break;
                case 4:
                    strcpy(pSRecords[iRecord].szManager, buffer);
                    break;
            }
            //move to next record
            //reset field count
            if(iChar=='\n'){
                iField = 0;
                iRecord++;
            } else {
                //move to the next field
                iField++;
            }
            //reset buffer counter
            iCounter = 0;
        } else {
            buffer[iCounter++]=iChar;
        }
    }//end while loop


    for(iRecord = 0; iRecord < iNumRecords; iRecord++){
        printf("%32s %3s",pSRecords[iRecord].szCity, pSRecords[iRecord].szState);
        printf("%12.2f \t %12.2f\n", pSRecords[iRecord].dRevenue, pSRecords[iRecord].dExpenses);
    }

    //free the memory
    free(pSRecords);
    //close the file
    fclose(pFile);
            

    return EXIT_SUCCESS;
}

/* ----------  end of main function  ---------- */

//count the number of records
int countRecords(FILE *pFile){
    int iReturnVal = 0;
    int iTemp;
    //count number of newlines
    while((iTemp=fgetc(pFile))!=EOF){
        if(iTemp=='\n'){
            iReturnVal++;
        }    
    }//end while loop

    rewind(pFile);

    return iReturnVal;
}
// ---- end countRecords() -----

In the program above, we dynamically allocated the array of structs based on the number of lines we counted in the file. Note that we rewound the file pointer back to the beginning of the file after counting the number of lines.