A binary search tree is organized, as the name suggests, in a binary tree, as shown in Figure 13.1. Such a tree can be represented by a linked data structure in which each node is an object. In addition to a *key* field, each node contains fields *left, right,* and *p* that point to the nodes corresponding to its left child, its right child, and its parent, respectively. If a child or the parent is missing, the appropriate field contains the value NIL. The root node is the only node in the tree whose parent field is NIL.

The keys in a binary search tree are always stored in such a way as to satisfy the *binary-search-tree property*:

Let *x* be a node in a binary search tree. If *y* is a node in the left subtree of *x*, then *key*[*y*] ≤ *key*[*x*]. If *y* is a node in the right subtree of *x*, then *key*[*y*] ≥ *key*[*x*].

The binary-search-tree property allows us to print out all the keys in a binary search tree in sorted order by a simple recursive algorithm, called an **inorder tree walk**. This algorithm derives its name from the fact that the key of the root of a subtree is printed between the values in its left subtree and those in its right subtree. (Similarly, a *preorder tree walk* prints the root before the values in either subtree, and a *postorder tree walk* prints the root after the values in its subtrees.)

INORDER-TREE-WALK(x)
1  if x ≠ NIL
2     then INORDER-TREE-WALK(left[x])
3          print key[x]
4          INORDER-TREE-WALK(right[x])
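The walk can be sketched in runnable form. This is a minimal illustration, not the book's code: nodes are plain dicts, `None` plays the role of NIL, and the helper name `make_node` is invented here.

```python
# Illustrative sketch of INORDER-TREE-WALK; dict-based nodes, None = NIL.

def make_node(key, left=None, right=None):
    return {"key": key, "left": left, "right": right}

def inorder_tree_walk(x, visit=print):
    """Visit every key in the subtree rooted at x in sorted order."""
    if x is not None:                         # line 1: if x != NIL
        inorder_tree_walk(x["left"], visit)   # line 2: left subtree first
        visit(x["key"])                       # line 3: then the root's key
        inorder_tree_walk(x["right"], visit)  # line 4: right subtree last

# Example: the tree   5
#                    / \
#                   3   7
root = make_node(5, make_node(3), make_node(7))
out = []
inorder_tree_walk(root, out.append)
print(out)  # [3, 5, 7]
```

Because each node is visited exactly once, the walk takes Θ(n) time on an n-node tree.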

Draw binary search trees of height 2, 3, 4, 5, and 6 on the set of keys {1, 4, 5, 10, 16, 17, 21}.

What is the difference between the binary-search-tree property and the heap property (7.1)? Can the heap property be used to print out the keys of an *n*-node tree in sorted order in *O*(*n*) time? Explain how or why not.

Give a nonrecursive algorithm that performs an inorder tree walk. (*Hint:* There is an easy solution that uses a stack as an auxiliary data structure and a more complicated but elegant solution that uses no stack but assumes that two pointers can be tested for equality.)

Argue that since sorting *n* elements takes Ω(*n* lg *n*) time in the worst case in the comparison model, any comparison-based algorithm for constructing a binary search tree from an arbitrary list of *n* elements takes Ω(*n* lg *n*) time in the worst case.

The most common operation performed on a binary search tree is searching for a key stored in the tree. Besides the SEARCH operation, binary search trees can support such queries as MINIMUM, MAXIMUM, SUCCESSOR, and PREDECESSOR. In this section, we shall examine these operations and show that each can be supported in time *O*(*h*) on a binary search tree of height *h.*

We use the following procedure to search for a node with a given key in a binary search tree. Given a pointer to the root of the tree and a key *k,* TREE-SEARCH returns a pointer to a node with key *k* if one exists; otherwise, it returns NIL.

TREE-SEARCH(x, k)
1  if x = NIL or k = key[x]
2     then return x
3  if k < key[x]
4     then return TREE-SEARCH(left[x], k)
5     else return TREE-SEARCH(right[x], k)

ITERATIVE-TREE-SEARCH(x, k)
1  while x ≠ NIL and k ≠ key[x]
2     do if k < key[x]
3           then x ← left[x]
4           else x ← right[x]
5  return x
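The iterative version can be sketched as follows (an illustration, assuming the same dict-based nodes as before; `None` plays the role of NIL):

```python
# Hedged sketch of ITERATIVE-TREE-SEARCH on dict-based nodes.

def tree_search(x, k):
    """Return the node with key k in the subtree rooted at x, or None."""
    while x is not None and k != x["key"]:  # line 1
        if k < x["key"]:
            x = x["left"]                   # line 3: go left when k < key[x]
        else:
            x = x["right"]                  # line 4: otherwise go right
    return x                                # line 5

leaf3 = {"key": 3, "left": None, "right": None}
leaf7 = {"key": 7, "left": None, "right": None}
root = {"key": 5, "left": leaf3, "right": leaf7}
print(tree_search(root, 7) is leaf7)  # True
print(tree_search(root, 6))           # None: key 6 is absent
```

On most machines the iterative form is faster than the recursive one, since it unrolls the tail recursion into a loop; both follow one root-to-leaf path, so both run in O(h) time.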

An element in a binary search tree whose key is a minimum can always be found by following *left* child pointers from the root until a NIL is encountered, as shown in Figure 13.2. The following procedure returns a pointer to the minimum element in the subtree rooted at a given node *x.*

TREE-MINIMUM(x)
1  while left[x] ≠ NIL
2     do x ← left[x]
3  return x

The pseudocode for TREE-MAXIMUM is symmetric.

TREE-MAXIMUM(x)
1  while right[x] ≠ NIL
2     do x ← right[x]
3  return x
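Both procedures can be sketched together in runnable form (illustrative dict-based nodes; the helper `node` is not from the text):

```python
# Hedged sketch of TREE-MINIMUM and TREE-MAXIMUM; None plays the role of NIL.

def node(key, left=None, right=None):
    return {"key": key, "left": left, "right": right}

def tree_minimum(x):
    while x["left"] is not None:   # keep following left-child pointers
        x = x["left"]
    return x

def tree_maximum(x):
    while x["right"] is not None:  # symmetric: follow right-child pointers
        x = x["right"]
    return x

root = node(5, node(3, node(2)), node(8, node(7), node(9)))
print(tree_minimum(root)["key"])   # 2
print(tree_maximum(root)["key"])   # 9
```

Both loops trace a single downward path, so each runs in O(h) time on a tree of height h.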

Given a node in a binary search tree, it is sometimes important to be able to find its successor in the sorted order determined by an inorder tree walk. If all keys are distinct, the successor of a node *x* is the node with the smallest key greater than *key*[*x*]*.* The structure of a binary search tree allows us to determine the successor of a node without ever comparing keys. The following procedure returns the successor of a node *x* in a binary search tree if it exists, and NIL if *x* has the largest key in the tree.

TREE-SUCCESSOR(x)
1  if right[x] ≠ NIL
2     then return TREE-MINIMUM(right[x])
3  y ← p[x]
4  while y ≠ NIL and x = right[y]
5     do x ← y
6        y ← p[y]
7  return y

The running time of TREE-SUCCESSOR on a tree of height *h* is *O*(*h*), since we either follow a path up the tree or follow a path down the tree. The procedure TREE-PREDECESSOR, which is symmetric to TREE-SUCCESSOR, also runs in time *O*(*h*).
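The two cases (descend into the right subtree, or climb until we leave a left subtree) can be sketched in runnable form. This is an illustration only: nodes are dicts with a `"p"` parent field, and the helpers `node` and `link` are invented here.

```python
# Hedged sketch of TREE-SUCCESSOR on parent-linked dict nodes.

def node(key):
    return {"key": key, "left": None, "right": None, "p": None}

def link(parent, child, side):
    parent[side] = child
    child["p"] = parent

def tree_minimum(x):
    while x["left"] is not None:
        x = x["left"]
    return x

def tree_successor(x):
    if x["right"] is not None:            # case 1: leftmost node of right subtree
        return tree_minimum(x["right"])
    y = x["p"]                            # case 2: climb while x is a right child
    while y is not None and x is y["right"]:
        x, y = y, y["p"]
    return y                              # lowest ancestor whose left child is an ancestor of x

# The tree   3
#           / \
#          2   4
n2, n3, n4 = node(2), node(3), node(4)
link(n3, n2, "left"); link(n3, n4, "right")
print(tree_successor(n2)["key"])  # 3: no right child, so climb to the parent
print(tree_successor(n3)["key"])  # 4: minimum of the right subtree
print(tree_successor(n4))         # None: 4 holds the largest key
```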

In summary, we have proved the following theorem.

**Theorem 13.1**
The dynamic-set operations SEARCH, MINIMUM, MAXIMUM, SUCCESSOR, and PREDECESSOR can be made to run in *O*(*h*) time on a binary search tree of height *h*.

Suppose that we have numbers between 1 and 1000 in a binary search tree and wish to search for the number 363. Which of the following sequences could *not* be the sequence of nodes examined?

**a.** 2, 252, 401, 398, 330, 344, 397, 363.

**b.** 924, 220, 911, 244, 898, 258, 362, 363.

**c.** 925, 202, 911, 240, 912, 245, 363.

**d.** 2, 399, 387, 219, 266, 382, 381, 278, 363.

**e.** 935, 278, 347, 621, 299, 392, 358, 363.

Use the binary-search-tree property to prove rigorously that the code for TREE-SUCCESSOR is correct.

An inorder tree walk of an *n*-node binary search tree can be implemented by finding the minimum element in the tree with TREE-MINIMUM and then making *n* - 1 calls to TREE-SUCCESSOR. Prove that this algorithm runs in Θ(*n*) time.

The operations of insertion and deletion cause the dynamic set represented by a binary search tree to change. The data structure must be modified to reflect this change, but in such a way that the binary-search-tree property continues to hold. As we shall see, modifying the tree to insert a new element is relatively straightforward, but handling deletion is somewhat more intricate.

TREE-INSERT(T, z)
 1  y ← NIL
 2  x ← root[T]
 3  while x ≠ NIL
 4     do y ← x
 5        if key[z] < key[x]
 6           then x ← left[x]
 7           else x ← right[x]
 8  p[z] ← y
 9  if y = NIL
10     then root[T] ← z
11     else if key[z] < key[y]
12             then left[y] ← z
13             else right[y] ← z
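The insertion procedure walks down from the root with a trailing pointer y (the parent of x) and hangs the new node off y when x falls off the tree. A runnable sketch, with the same illustrative dict-based nodes used throughout (the tree T is a dict holding the root):

```python
# Hedged sketch of TREE-INSERT; T = {"root": ...}, None plays the role of NIL.

def node(key):
    return {"key": key, "left": None, "right": None, "p": None}

def tree_insert(T, z):
    y = None                      # y trails one step behind x
    x = T["root"]
    while x is not None:          # lines 3-7: descend to a NIL position
        y = x
        x = x["left"] if z["key"] < x["key"] else x["right"]
    z["p"] = y                    # line 8
    if y is None:                 # lines 9-10: tree was empty
        T["root"] = z
    elif z["key"] < y["key"]:     # lines 11-13: attach z on the proper side
        y["left"] = z
    else:
        y["right"] = z

T = {"root": None}
for k in [5, 3, 7, 4]:
    tree_insert(T, node(k))
print(T["root"]["left"]["right"]["key"])  # 4: inserted as the right child of 3
```

Like the search procedures, TREE-INSERT follows one root-to-leaf path and therefore runs in O(h) time.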

The procedure for deleting a given node *z* from a binary search tree takes as an argument a pointer to *z*. The procedure considers the three cases shown in Figure 13.4. If *z* has no children, we modify its parent *p*[*z*] to replace *z* with NIL as its child. If *z* has only a single child, we "splice out" *z* by making a new link between its child and its parent. Finally, if *z* has two children, we splice out *z*'s successor *y*, which has no left child (see Exercise 13.3-4), and replace the contents of *z* with the contents of *y*.

The code for TREE-DELETE organizes these three cases a little differently.

TREE-DELETE(T, z)
 1  if left[z] = NIL or right[z] = NIL
 2     then y ← z
 3     else y ← TREE-SUCCESSOR(z)
 4  if left[y] ≠ NIL
 5     then x ← left[y]
 6     else x ← right[y]
 7  if x ≠ NIL
 8     then p[x] ← p[y]
 9  if p[y] = NIL
10     then root[T] ← x
11     else if y = left[p[y]]
12             then left[p[y]] ← x
13             else right[p[y]] ← x
14  if y ≠ z
15     then key[z] ← key[y]
16          ▸ If y has other fields, copy them, too.
17  return y
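A self-contained, runnable sketch of the deletion procedure follows (an illustration, not the book's code): y is the node actually spliced out, x is y's only child (or `None`), and when y ≠ z we copy y's key into z rather than move pointers.

```python
# Hedged sketch of TREE-DELETE with the helpers it needs; dict-based nodes
# with a "p" parent field, T = {"root": ...}, None plays the role of NIL.

def node(key):
    return {"key": key, "left": None, "right": None, "p": None}

def tree_insert(T, z):
    y, x = None, T["root"]
    while x is not None:
        y = x
        x = x["left"] if z["key"] < x["key"] else x["right"]
    z["p"] = y
    if y is None:
        T["root"] = z
    elif z["key"] < y["key"]:
        y["left"] = z
    else:
        y["right"] = z

def tree_minimum(x):
    while x["left"] is not None:
        x = x["left"]
    return x

def tree_successor(x):
    if x["right"] is not None:
        return tree_minimum(x["right"])
    y = x["p"]
    while y is not None and x is y["right"]:
        x, y = y, y["p"]
    return y

def tree_delete(T, z):
    # Lines 1-3: y is spliced out -- z itself, or z's successor, which
    # has at most one child.
    y = z if z["left"] is None or z["right"] is None else tree_successor(z)
    x = y["left"] if y["left"] is not None else y["right"]  # lines 4-6
    if x is not None:                 # lines 7-8: x bypasses y
        x["p"] = y["p"]
    if y["p"] is None:                # lines 9-13: repair y's parent link
        T["root"] = x
    elif y is y["p"]["left"]:
        y["p"]["left"] = x
    else:
        y["p"]["right"] = x
    if y is not z:                    # lines 14-16: move y's data into z
        z["key"] = y["key"]
    return y

def inorder(x, out):
    if x is not None:
        inorder(x["left"], out)
        out.append(x["key"])
        inorder(x["right"], out)

T = {"root": None}
nodes = {k: node(k) for k in [5, 3, 8, 2, 4, 7, 9]}
for k in [5, 3, 8, 2, 4, 7, 9]:
    tree_insert(T, nodes[k])
tree_delete(T, nodes[5])   # two children: successor 7 is spliced out
out = []
inorder(T["root"], out)
print(out)  # [2, 3, 4, 7, 8, 9]
```

Deleting the root exercises the two-child case: the successor 7 has no left child, so it is spliced out and its key overwrites the root's.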

In summary, we have proved the following theorem.

**Theorem 13.2**
The dynamic-set operations INSERT and DELETE can be made to run in *O*(*h*) time on a binary search tree of height *h*.

Give a recursive version of the TREE-INSERT procedure.

We can sort a given set of *n* numbers by first building a binary search tree containing these numbers (using TREE-INSERT repeatedly to insert the numbers one by one) and then printing the numbers by an inorder tree walk. What are the worst-case and best-case running times for this sorting algorithm?

We have shown that all the basic operations on a binary search tree run in *O*(*h*) time, where *h* is the height of the tree. The height of a binary search tree varies, however, as items are inserted and deleted. In order to analyze the behavior of binary search trees in practice, it is reasonable to make statistical assumptions about the distribution of keys and the sequence of insertions and deletions.

Unfortunately, little is known about the average height of a binary search tree when both insertion and deletion are used to create it. When the tree is created by insertion alone, the analysis becomes more tractable. Let us therefore define a *randomly built binary search tree* on *n* keys as one that arises from inserting the keys in random order into an initially empty tree, where each of the *n*! orderings of the input keys is equally likely.

We begin by investigating the structure of binary search trees that are built by insertion alone.
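Before the formal analysis, a quick empirical sketch (illustrative, not from the text) makes the claim plausible: trees built from random insertion orders stay shallow, with heights a small multiple of lg n rather than anywhere near n - 1.

```python
# Hedged sketch: measure heights of randomly built binary search trees.
import random

def insert(root, k):
    """Plain BST insertion on dict-based nodes; returns the subtree root."""
    if root is None:
        return {"key": k, "left": None, "right": None}
    side = "left" if k < root["key"] else "right"
    root[side] = insert(root[side], k)
    return root

def height(x):
    """Height in edges; an empty tree has height -1."""
    if x is None:
        return -1
    return 1 + max(height(x["left"]), height(x["right"]))

random.seed(1)          # fixed seed so the experiment is repeatable
n = 1024                # lg n = 10, so a perfectly balanced tree has height 10
heights = []
for _ in range(20):
    keys = list(range(n))
    random.shuffle(keys)
    root = None
    for k in keys:
        root = insert(root, k)
    heights.append(height(root))
print(max(heights))     # a small multiple of lg n, far below n - 1 = 1023
```

Every observed height sits between the information-theoretic floor of 10 and a few multiples of lg n, in line with the O(lg n) average the following analysis establishes.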

k_i = min {k_l : 1 ≤ l ≤ i and k_l > k_j}

or

k_i = max {k_l : 1 ≤ l ≤ i and k_l < k_j} .

G_j = {k_i : 1 ≤ i < j and k_l > k_i > k_j for all l < i such that k_l > k_j}

and

L_j = {k_i : 1 ≤ i < j and k_l < k_i < k_j for all l < i such that k_l < k_j} .

Then

d(k_j, T) = |G_j| + |L_j| ,

where H_n = ln n + O(1) is the *n*th harmonic number.

S = {k_i : 1 ≤ i ≤ n and k_i > k_l for all l < i} .

*Proof* We can view the cardinality of the set

which follows from the definition of *. *

We now have the tools to bound the height of a randomly built binary search tree.

The average height of a randomly built binary search tree on *n* distinct keys is *O*(lg *n*).

Pr{d(k_j, T) ≥ t} ≤ Pr{|G_j| ≥ t/2} + Pr{|L_j| ≥ t/2} .

Let us examine Pr{|G_j| ≥ t/2} first. We have

Pr{|G_j| ≥ t/2}
  = Pr{|{k_i : 1 ≤ i < j and k_l > k_i > k_j for all l < i such that k_l > k_j}| ≥ t/2}
  ≤ Pr{|{k_i : 1 ≤ i ≤ n and k_i > k_l for all l < i}| ≥ t/2}
  = Pr{|S| ≥ t/2} ,

Using a symmetric argument, we can prove that

Pr{|L_j| ≥ t/2} ≤ Pr{|S| ≥ t/2} ,

and thus, by inequality (13.2), we obtain

Pr{d(k_j, T) ≥ t} ≤ 2 Pr{|S| ≥ t/2} .

Pr{d(k_j, T) ≥ 2(β + 1)H_n} ≤ 2 Pr{|S| ≥ (β + 1)H_n}

≤ 2/n² .
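The key quantity in the proof is |S|, the number of "left-to-right maxima" of the insertion sequence. Since the i-th key inserted is a running maximum with probability exactly 1/i, the expectation of |S| is the harmonic number H_n. The following sketch (an illustration, not from the text) verifies this exactly for a small n by enumerating all insertion orders:

```python
# Illustrative check that E|S| = H_n for n = 5, by exact enumeration.
from fractions import Fraction
from itertools import permutations
from math import factorial

def num_records(seq):
    """|S| for one insertion order: count the left-to-right maxima."""
    best = None
    count = 0
    for k in seq:
        if best is None or k > best:
            best = k
            count += 1
    return count

n = 5
total = sum(num_records(p) for p in permutations(range(n)))
average = Fraction(total, factorial(n))                 # exact E|S|
H_n = sum(Fraction(1, i) for i in range(1, n + 1))      # 1 + 1/2 + ... + 1/5
print(average == H_n)  # True: the average over all 120 orders is 137/60
```

The concentration bound quoted above then says that |S| rarely exceeds a constant multiple of this H_n expectation.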

Consider RANDOMIZED-QUICKSORT operating on a sequence of *n* input numbers. Prove that for any constant *k* > 0, all but *O*(1/*n^k*) of the *n*! input permutations yield an *O*(*n* lg *n*) running time.

13-1 Binary search trees with equal keys

Equal keys pose a problem for the implementation of binary search trees.

**c.** Keep a list of nodes with equal keys at *x*, and insert *z* into the list.

Given two strings *a* = *a*_0*a*_1 ... *a_p* and *b* = *b*_0*b*_1 ... *b_q*, where each *a_i* and each *b_j* is in some ordered set of characters, we say that string *a* is *lexicographically less than* string *b* if either

1. there exists an integer *j*, 0 ≤ *j* ≤ min(*p*, *q*), such that *a_i* = *b_i* for all *i* = 0, 1, ..., *j* - 1 and *a_j* < *b_j*, or

2. *p* < *q* and *a_i* = *b_i* for all *i* = 0, 1, ..., *p*.

13-3 Average node depth in a randomly built binary search tree

In this problem, we prove that the average depth of a node in a randomly built binary search tree with *n* nodes is *O(*lg *n)*. Although this result is weaker than that of Theorem 13.6, the technique we shall use reveals a surprising similarity between the building of a binary search tree and the running of RANDOMIZED-QUICKSORT from Section 8.3.

*a.* Argue that the average depth of a node in *T* is (1/*n*) Σ_{x∈T} d(x, T) = (1/*n*) P(*T*), where P(*T*) is the total path length of *T*: the sum, over all nodes *x* in *T*, of the depth of *x*.

Thus, we wish to show that the expected value of *P*(*T*) is *O*(*n* lg *n*).

P(T) = P(T_L) + P(T_R) + n - 1 .

*d.* Show that

*e.* Recalling the analysis of the randomized version of quicksort, conclude that
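The recurrence P(T) = P(T_L) + P(T_R) + n - 1 holds because each of the n - 1 non-root nodes is exactly one level deeper in T than in its own subtree. A small numeric check (illustrative dict-based nodes, not from the text):

```python
# Illustrative check of P(T) = P(T_L) + P(T_R) + n - 1 on a 7-node tree.

def insert(root, k):
    if root is None:
        return {"key": k, "left": None, "right": None}
    side = "left" if k < root["key"] else "right"
    root[side] = insert(root[side], k)
    return root

def path_length(x, depth=0):
    """P(T): sum of the depths of all nodes in the subtree rooted at x."""
    if x is None:
        return 0
    return depth + path_length(x["left"], depth + 1) + path_length(x["right"], depth + 1)

keys = [5, 3, 8, 2, 4, 7, 9]
root = None
for k in keys:
    root = insert(root, k)

P = path_length(root)                 # depths measured from the whole tree's root
P_L = path_length(root["left"])       # depths measured within the left subtree
P_R = path_length(root["right"])      # depths measured within the right subtree
print(P == P_L + P_R + len(keys) - 1)  # True (10 == 2 + 2 + 6)
```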

13-4 Number of different binary trees

Let *b_n* denote the number of different binary trees with *n* nodes.

*a.* Show that b_0 = 1 and that, for n ≥ 1,

b_n = Σ_{k=0}^{n-1} b_k b_{n-1-k} .

*b.* Let B(x) = Σ_{n≥0} b_n x^n be the generating function for the number of binary trees. Show that B(x) = x B(x)² + 1, and hence

B(x) = (1 − √(1 − 4x)) / (2x) .

The *Taylor expansion* of *f*(*x*) around the point *x* = *a* is given by

f(x) = Σ_{k=0}^{∞} (f^(k)(a) / k!) (x − a)^k ,

where_{ }*f*^{(k)} (*x*) is the *k*th derivative of *f* evaluated at *x*.

*c.* Show that b_n = (1/(n+1)) (2n choose n) (the *n*th *Catalan number*) by using the Taylor expansion of √(1 − 4x) around x = 0.