main-content

This textbook explains the concepts and techniques required to write programs that can handle large amounts of data efficiently. Project-oriented and classroom-tested, the book presents a number of important algorithms supported by examples that bring meaning to the problems faced by computer programmers. The idea of computational complexity is also introduced, demonstrating what can and cannot be computed efficiently so that the programmer can make informed judgements about the algorithms they use. Features: includes both introductory and advanced data structures and algorithms topics, with suggested chapter sequences for those respective courses provided in the preface; provides learning goals, review questions and programming exercises in each chapter, as well as numerous illustrative examples; offers downloadable programs and supplementary files at an associated website, with instructor materials available from the author; presents a primer on Python for those from a different language background.

### Chapter 1. Python Programming 101

Abstract
This first chapter introduces Python to readers that have never programmed in Python before. It also provides instructions for installing Python on your computer. The Wing Integrated Development Environment (IDE) is recommended for students learning to program. Wing IDE 101 is the free version of the IDE for educational use. The chapter goes on to introduce classes, objects, and object-oriented programming by developing a drawing application using the Tkinter graphical user interface library. An understanding of object-oriented programming concepts presented in this chapter is necessary before reading the rest of the text. The PyList datatype presented in this chapter is also used in subsequent chapters including Chap. 2. Even if the reader has worked through an introductory text or has a basic understanding of Python, there are concepts from this first chapter that are used throughout the text.
Kent D. Lee, Steve Hubbard

### Chapter 2. Computational Complexity

Abstract
In the last chapter we developed a drawing program. To hold the drawing commands we built the PyList container class which is a lot like the built-in Python list class, but helps illustrate our first data structure. When we added a drawing command to the sequence we called the append method. It turns out that this method is called a lot. In fact, the flower picture in the first chapter took around 700 commands to draw. You can imagine that a complex picture with lots of free-hand drawing could contain thousands of drawing commands. When creating a free-hand drawing we want to append the next drawing command to the sequence quickly because there are so many commands being appended. How long does it take to append a drawing command to the sequence? Can we make a guess? Should we care about the exact amount of time? In this chapter you’ll learn how to answer these questions and you’ll learn what questions are important for you as a computer programmer. First you’ll read about some principles of computer architecture to understand something about how long it takes a computer to do some simple operations. With that knowledge you’ll have the tools you’ll need to make informed decisions about how much time it might take to execute some code you have written.
Kent D. Lee, Steve Hubbard

### Chapter 3. Recursion

Abstract
Don’t think too hard! That’s one of the central themes of this chapter. It’s not often that you tell computer programmers not to think too hard, but this is one time when it is appropriate. You need to read this chapter if you have not written recursive functions before. Most computer science students start by learning to program in a style called imperative programming. This simply means that you are likely used to thinking about creating variables, storing values, and updating those values as a program proceeds. In this chapter you are going to begin learning a different style of programming called functional programming. When you program in the functional style, you think much more about the definition of what you are programming than how you are going to program it. Some say that writing recursive functions is a declarative approach rather than an imperative approach. You’ll start to learn what that means for you very soon. When you start to get good at writing recursive functions you’ll be surprised how easy it can be!
Kent D. Lee, Steve Hubbard

### Chapter 4. Sequences

Abstract
Computers are really good at dealing with large amounts of information. They can repeat a task over and over again without getting bored. When they repeat a task they are generally doing the same thing to similar data or objects. It is natural to want to organize those objects into some kind of structure so that our program can easily switch from one object to the next. How objects are added to a sequence or collection and how we move from one item to the next has some impact on how we might want to organize the collection of data in a program. In this chapter we look at different ways of organizing data into a sequence. We’ll also examine how to use Python to make working with sequences convenient. Operator overloading in Python lets us build sequences that we can manipulate with intuitive operations. Finally, we also examine how the organization of a sequence affects the computation complexity of operations on it.
Kent D. Lee, Steve Hubbard

### Chapter 5. Sets and Maps

Abstract
In the last chapter we studied sequences which are used to keep track of lists of things where duplicate values are allowed. For instance, there can be two sixes in a sequence or list of integers. In this chapter we look at sets where duplicate values are not allowed. After examining sets we’ll move on to talk about maps. Maps may also be called dictionaries or hash tables.The term hash table actually suggests an implementation of a set or map. The primary focus of this chapter is in understanding hashing. Hashing is a very important concept in Computer Science because it is a very efficient method of searching for a value. To begin the chapter we’ll motivate our interest in hashing, then we’ll develop a hashing algorithm for finding values in a set. We’ll also apply hashing to the building of sets and maps. Then we’ll look at an important technique that uses hashing called memoization and we’ll apply that technique to a couple of problems.
Kent D. Lee, Steve Hubbard

### Chapter 6. Trees

Abstract
When we see a tree in our everyday lives the roots are generally in the ground and the leaves are up in the air. The branches of a tree spread out from the roots in a more or less organized fashion. The word tree is used in Computer Science when talking about a way data may be organized. Trees have some similarilties to the linked list organization found in Chap. 4. In a tree there are nodes which have links to other nodes. In a linked list each node has one link, to the next node in the list. In a tree each node may have two or more links to other nodes. A tree is not a sequential data structure. It is organized like a tree, except the root is at the top of tree data structures and the leaves are at the bottom. A tree in computer science is usually drawn inverted when compared to the trees we see in nature. There are many uses for trees in computer science. In this chapter we’ll explore trees and when it makes sense to build and or use a tree in a program. Not every program will need a tree data structure. Nevertheless, trees are used in many types of programs. A knowledge of them is not only a necessity, proper use of them can greatly simplify some types of programs.
Kent D. Lee, Steve Hubbard

### Chapter 7. Graphs

Abstract
Many problems in Computer Science and Mathematics can be reduced to a set of states and a set of transitions between these states. A graph is a mathematical representation of problems like these. In the last chapter we saw that trees serve a variety of purposes in Computer Science. Trees are graphs. However, graphs are more general than trees. Abstracting away the details of a problem and studying it in its simplest form often leads to new insight. As a result, many algorithms have come out of the research in graph theory. Graph theory was first studied by mathematicians. Many of the algorithms in graph theory are named for the mathematician that developed or discovered them. Dijkstra and Kruskal are two such mathematicians and this chapter covers algorithms developed by them.Representing a graph can be done one of several different ways. The correct way to represent a graph depends on the algorithm being implemented. Graph theory problems include graph coloring, finding a path between two states or nodes in a graph, or finding a shortest path through a graph among many others. There are many algorithms that have come from the study of graphs. To understand the formulation of these problems it is good to learn a little graph notation which is presented in this chapter as well.
Kent D. Lee, Steve Hubbard

### Chapter 8. Membership Structures

Abstract
In Chap. 5 we covered data structures that support insertion, deletion, membership testing, and iteration. For some applications testing membership may be enough. Iteration and deletion may not be necessary. The classic example is that of a spell checker. Consider the job of a spell checker. A simple one may detect errors in spelling while a more advanced spell checker may suggest alternatives of correctly spelled words.
Clearly a spell checker is provided with a large dictionary of words. Using the list of words the spell checker determines whether a word you have is in the dictionary and therefore a correct word. If the word does not appear in the dictionary the word processor or editor may underline the word indicating it may be incorrectly spelled. In some cases the word processor may suggest an alternative, correctly spelled word. In some cases, the word processor may simply correct the misspelling. How do these spell checkers/correctors work? What kind of data structures do they use?
Kent D. Lee, Steve Hubbard

### Chapter 9. Heaps

Abstract
The word heap is used in a couple of different contexts in Computer Science. A heap sometimes refers to an area of memory used for dynamic (i.e. run-time) memory allocation. Another meaning, and the topic of this chapter, is a data structure that is conceptually a complete binary tree. Heaps are used in implementing priority queues, the heapsort algorithm, and some graph algorithms. Heaps are somewhat like binary search trees in that they maintain an ordering of the items within the tree. However, a heap does not maintain a complete ordering of its items. This has some implications for how a heap may be used.
Kent D. Lee, Steve Hubbard

### Chapter 10. Balanced Binary Search Trees

Abstract
In Chap. 6 binary search trees were defined along with a recursive insert algorithm. The discussion of binary search trees pointed out they have problems in some cases. Binary search trees can become unbalanced, actually quite often. When a tree is unbalanced the complexity of insert, delete, and lookup operations can get as bad as $$\Theta (n)$$. This problem with unbalanced binary search trees was the motivation for the development of height-balanced AVL trees by G.M. Adelson-Velskii and E.M. Landis, two Soviet computer scientists, in 1962. AVL trees were named for these two inventors. Their paper on AVL trees [AVL62] described the first algorithm for maintaining balanced binary search trees. The chapter goes on to discuss Splay Trees as another example of balanced binary search trees.
Kent D. Lee, Steve Hubbard

### Chapter 11. B-Trees

Abstract
This chapter covers one of the more important data structures of the last thirty years. B-Trees are primarily used by relational databases to efficiently implement an operation called join. B-Trees have other properties that are also useful for databases including ordering of rows within a table, fast delete capability, and sequential access. The chapter begins by introducing some terminology from relational databases to motivate the need for B-Trees. It goes on to introduce the B-Tree data structure and its implementation. By the end of this chapter you should have an understanding of B-Trees, their advantages over other data structures, and you should be able to demonstrate your understanding by implementing a B-Tree that can be used to efficiently process em joins in relational databases.
Kent D. Lee, Steve Hubbard

### Chapter 12. Heuristic Search

Abstract
This text has focused on the interaction of algorithms with data structures. Many of the algorithms presented in this text deal with search and how to organize data so searching can be done efficiently. Many problems involve searching for an answer among many possible solutions, not all of which are correct. Sometimes, there are so many possibilities, no algorithm can be written that will efficiently find a correct solution amongst all the possible solutions. In these cases, we may be able to use a rule of thumb, most often called a heuristic in computer science, to eliminate some of these possibilities from our search space. If the heuristic does not eliminate possible solutions, it may at least help us order the possible solutions so we look at better possible solutions first, whatever better might mean. In Chap. 7 depth first search of a graph was presented. Sometimes search spaces for graphs or other problems grow to such an enormous size, it is impossible to blindly search for a goal node. This is where a heuristic can come in handy. This chapter uses searching a maze, which is really just a type of graph, as an example to illustrate several search algorithms that are related to depth first or breadth first search. Several applications of these search algorithms are also presented or discussed. Heuristic search is often covered in texts on Artificial Intelligence [Cop04]. As problems in AI are better understood, algorithms arise that become more commonplace over time. The heuristic algorithms presented in this chapter are covered in more detail in an AI text, but as data sizes grow, heuristic search will become more and more necessary in all sorts of applications. AI techniques may be useful in many search problems and so are covered in this chapter to provide an introduction to search algorithms designed to deal with large or infinite search spaces.
Kent D. Lee, Steve Hubbard

### Chapter 13. Appendix A: Integer Operators

Abstract
This documentation was generated from the Python documentation available by typing help(int) in the Python shell.
Kent D. Lee, Steve Hubbard

### Chapter 14. Appendix B: Float Operators

Abstract
This documentation was generated from the Python documentation available by typing help(float) in the Python shell.
Kent D. Lee, Steve Hubbard

### Chapter 15. Appendix C: String Operators and Methods

Abstract
This documentation was generated from the Python documentation available by typing help(str) in the Python shell.
Kent D. Lee, Steve Hubbard

### Chapter 16. Appendix D: List Operators and Methods

Abstract
This documentation was generated from the Python documentation available by typing help(list) in the Python shell.
Kent D. Lee, Steve Hubbard

### Chapter 17. Appendix E: Dictionary Operators and Methods

Abstract
This documentation was generated from the Python documentation available by typing help(dict) in the Python shell.
Kent D. Lee, Steve Hubbard

### Chapter 18. Appendix F: Turtle Methods

Abstract
This documentation was generated from the Python documentation available by typing from turtle import * help (Turtle) in the Python shell.
Kent D. Lee, Steve Hubbard

### Chapter 19. Appendix G: TurtleScreen Methods

Abstract
This documentation was generated from the Python documentation available by typing from turtle import * help (TurtleScreen) in the Python shell.
Kent D. Lee, Steve Hubbard

### Chapter 20. Appendix H: Complete Programs

Abstract
This is the sample drawing application from the first chapter. It illustrates the use of the tkinter library including many widgets and mouse handling. This program can be downloaded from the text’s website.
Kent D. Lee, Steve Hubbard