------------------------------------------------
         "Understanding Linked Lists"
------------------------------------------------
  C/O :: arp of DynamicHell Development Team
------------------------------------------------
  http://dynamichell.org | irc.dynamichell.org
------------------------------------------------

This tutorial is aimed at users with a basic understanding of C and, more 
importantly, pointers.  This technique is portable and can be used on any
operating system with a standard C library and compiler.

There are many data structures available to a programmer.  Perhaps the most
common being the linked-list.  A linked-list is exactly what it's description
suggests: a collection of lists (data structures) all linked together.

Linked-lists are linked data structures, joined together by pointers to the 
next member of a list (or if a double linked-list) by pointers to the previous 
and the next node in the list. 

Linked-lists are reasonably fast to traverse as all a process must do is
access the pointer to the next node until the required node, or the
end is found (to add another node).  This is much faster than accessing an
array of structures, incrementing the index, checking the structure, and 
repeating the process.  Furthermore, linked-lists allow the programmer to 
forget the number of nodes in the data structure, unlike an array which is
statically allocated.

The following diagram shows a visual representation of a linked-list:

Node    :  === 
Pointer :  ---

ZERO---===---===---===---===---===---===---===---NULL

The root node is usually allocated but unused (possibly zeroed).  The end of
the list is indicated by a NULL pointer.  The NULL pointer  is important as 
it indicates to any program traversing the list its end; the program knows 
where the list ends and can act appropriately.

Implementing a Linked-list (C)
==============================

The traditional structure of a node (the instrument for holding our data) is as
follows:

struct node {
       char *data;
       struct node *next;  /* Pointer to next node */
};


Of course we need a means of allocating our nodes for our dynamic linked-list.
A simple function such as the following will suffice:

struct node *node_alloc(void)
{
	struct node *tmp = calloc(1, sizeof(struct node));
	if(tmp == NULL)
	       exit(EXIT_FAILURE);

	return tmp;
}


We also need to be able to add data to our newly created structs. A function
which calls the above node_alloc and adds data to the list follows:

void node_add_data(struct node *rootnode, char *data)
{
	struct node *tmp;

	for(tmp = rootnode; tmp->next != NULL; tmp = tmp->next);

	/* Traverse the list until node->next is NULL which we will
	   add to. When adding we do not traverse until node == NULL,
	   otherwise we are at the end of the list and unable to 
	   connect to a previous node. 
	 */ 

	 if(tmp->next == NULL) { /* Sanity Check */
                 tmp->next = node_alloc(); /* Connect to the list */
		 tmp = tmp->next;
		 tmp->data = strdup(data);
		 tmp->next = NULL;         /* Create the new end */
	 } else
		 exit(EXIT_FAILURE);
}


With functions for creating nodes and adding data (nodes) to the list, we are
now ready to write a function suitable for reading the data from the list.

void display_list(struct node *rootnode)
{
	struct node *tmp;


	for(tmp = rootnode; tmp != NULL; tmp = tmp->next) {

		if(tmp->data != 0) { /* For our root node, explanation later */
	               printf("Data is %s\n", tmp->data);
		}
	}
}


We are almost finished.  The only thing left to do is to clean up once we have
finished using our linked list.  The following function is a typical example of
how to free a linked list and its data.

void free_list(struct node *rootnode)
{
	struct node *tmp = rootnode;
	struct node *fwd;

	while(tmp != NULL) {
		fwd = tmp->next;
		free(tmp->data);
		free(tmp);	
		tmp = fwd;
	}
}

Notice that throughout all of these examples we never actually modify rootnode
directly.  If we did this then we would have no pointer to our root node, and 
thus, our linked-list would break.


Final Code
==========

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct node {
       char *data;
       struct node *next;  /* Pointer to next node */
};

void display_list(struct node *rootnode);
struct node *node_alloc(void);
void node_add_data(struct node *rootnode, char *data);
void free_list(struct node *rootnode);

int main(void)
{
	struct node *rootnode = node_alloc(); /* Root node, which remains 
						 zeroed, and is not used to 
						 store data.`         `   */

	node_add_data(rootnode, "This is our data");
	node_add_data(rootnode, "This is another data");

	display_list(rootnode);

	free_list(rootnode);

	return EXIT_SUCCESS;
}

struct node *node_alloc(void)
{
	struct node *tmp = calloc(1, sizeof(struct node));
	if(tmp == NULL) {
		fprintf(stderr, "calloc()\n");
	       exit(EXIT_FAILURE);
	}
	return tmp;
}

void node_add_data(struct node *rootnode, char *data)
{
	struct node *tmp;

	for(tmp = rootnode; tmp->next != NULL; tmp = tmp->next);

	/* Traverse the list until node->next is NULL which we will
	   add to. When adding we do not traverse until tmp == NULL,
	   otherwise we are at the end of the list and unable to 
	   connect to a previous node. 
	 */ 

	 if(tmp->next == NULL) { /* Sanity Check */
                 tmp->next = node_alloc(); /* Connect to the list */
		 tmp = tmp->next;
		 tmp->data = strdup(data);
		 tmp->next = NULL;         /* Create the new end */
	 } else {
		 fprintf(stderr, "Oh shit!\n");
		 exit(EXIT_FAILURE);
	 }
}	

void display_list(struct node *rootnode)
{
	struct node *tmp;

	for(tmp = rootnode; tmp != NULL; tmp = tmp->next) {

		if(tmp->data != 0) { /* For our root node, explanation later */
	               printf("Data is %s\n", tmp->data);
		}
	}
}

void free_list(struct node *rootnode)
{
	struct node *tmp;
	struct node *fwd;

	while(tmp != NULL) {
		fwd = tmp->next;
		free(tmp->data);
		free(tmp);	
		tmp = fwd;
	}
}


Summary
=======

Linked lists are relatively fast data structures.  They are easy to implement 
and are used throughout the software industry.  Examples include such high 
profile projects as the Linux Kernel.

The root node of the list is never directly accessed, only pointers to it, so 
not to damage the data structure.

The end of the list is indicated by a NULL pointer, and the programmer must 
remember to set it.

Rather than traversing to the end of the list when adding, the programmer must
move to the last node, so that the next node is still connected to the list.
Moving to the NULL pointer would change the address of the pointer which the 
previous last node pointer to; breaking the list.

However, when wanting to read all the nodes in the list it is perfectly 
acceptable to traverse the whole list, down to the NULL pointer.

When printing data from the linked list you must remember to ignore data from
the root node.  An earlier example which tested whether the data was zero is 
an acceptable method.



Copryright (C) 2006.  Alastair Poole.

Verbatim copying and distribution of this entire article are permitted 
worldwide, without royalty, in any medium, provided this notice, and the 
copyright notice, are preserved.