Brilliant Sheep
6Nov/11

Serializing and Deserializing a Binary Tree in Java

I've noticed that are couple of places that talk about the serialization and deserialization (i.e. storing and restoring) of binary trees (not necessarily binary search trees), however, I wasn't able to find a place that provides an actual implementation in Java. The closest thing that I found was a C implementation which was limited to a specific case (i.e. one-character nodes) and it still had some issues. Thus, finding myself cornered I've decided to bite the bullet and write a Java implementation that I could learn from and share with the rest of yinz ;) That said, my implementation ended up being as follows:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
import java.io.*;
 
public class BinaryTreeUtils {
 
    /**
     * Takes the root node of one tree, and the root node of another tree and
     * determines if they're structurally the same (i.e. do they have the same
     * structure and same values in each of the nodes).
     */
    public static boolean isSame( Node a, Node b ) {
        // Both trees are empty, and thus equal.
        if( a == null && b == null ) {
            return true;
        }
        // Both trees are not empty, so that's check in detail if they match.
        else if( a != null && b != null ) {
            // Check first if the current nodes have the same data, before
            // checking recursively their left subtree and/or right subtree.
            return a.data.equals( b.data ) && 
                isSame( a.left, b.left ) && isSame( a.right, b.right );
        }
        // One of the trees is, so no need to check in detail.
        else {
            return false;
        }
    }
 
    /** Serializes the tree pointed out by the node */
    public static void serialize( final OutputStream out, Node node ) throws IOException {
        if( node == null ) {
            out.write( "()".getBytes( ) );
        } else {
            out.write( "(".getBytes( ) );
            out.write( ( node.data + "" ).getBytes( ) );
            serialize( out, node.left );
            serialize( out, node.right );
            out.write( ")".getBytes( ) );
        }
    }
 
    /** Deserializes the tree and returns its root node */
    public static Node deserialize( final BufferedInputStream reader ) throws IOException {
 
        if( reader.available( ) > 0 ) {
            char c = ( char ) reader.read( );
 
            if( c == '(' ) {
                if( reader.available( ) > 0 ) {
 
                    c = ( char ) reader.read( );
                    if( c == ')' ) {
                        return null;
                    }
 
                    StringBuffer sb = new StringBuffer( );
                    while( c != ')' && c != '(' ) {
                        sb.append( c );
 
                        if( reader.available( ) > 0 ) {
                            reader.mark( 1 );
                            c = ( char ) reader.read( );
                        } else {
                            break;
                        }
                    }
 
                    Node newNode = new Node( sb.toString( ) );
 
                    if( c == '(' ) {
                        reader.reset( );
                        newNode.left = deserialize( reader );
                        newNode.right = deserialize( reader );
                    }
 
                    if( reader.available( ) > 0 ) {
                        reader.mark( 1 );
                        c = ( char ) reader.read( );
                        if( c != ')' ) {
                            reader.reset( );
                        }
                    }
 
                    return newNode;
                }
            }
        }
 
        // There's nothing to process.
        return null;
    }
 
    /** Test Method */
    public static void main( String[ ] args ) {
 
        BinarySearchTree bst = new BinarySearchTree( );
        String[ ] input = 
            { "mlb", "cpe", "cma", "wno", "btu", "keh", "gaj", "yft", "lcu", "sgy" };
 
        // Populate the tree
        for( String inputString : input ) {
            bst.insert( inputString );
        }
 
        // Serialize the tree, and print the output to the screen.
        try {
            System.out.println( "Serialization Output:" );
            BinaryTreeUtils.serialize( System.out, bst.root );
        } catch( IOException e ) {
            e.printStackTrace( );
        }
 
        // Deserialize the tree located in c:/java/tree.txt and check if it's the same. 
        try {
            BinarySearchTree bst2 = new BinarySearchTree( );
            bst2.root = BinaryTreeUtils.deserialize( 
                new BufferedInputStream( new FileInputStream( "c:/java/tree.txt" ) ) );
 
            System.out.println( "\n\nDeserialization Confirmation:" );
            System.out.println( "bst2 is the same as bst1? " + 
                ( BinaryTreeUtils.isSame( bst.root, bst2.root ) ? "Yes!" : "No." ) );
 
        } catch( FileNotFoundException e ) {
            e.printStackTrace( );
        } catch( IOException e ) {
            e.printStackTrace( );
        }
    }
}

This implementation uses nodes that hold variable-length strings; however, be sure NOT to store strings that contain the following braces, i.e. ( and ) as the application will throw up :P

In case you're missing a binary [search] tree implementation for testing, feel free to use the following:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
/** Node of a Binary Search Tree */
class Node {
    Node parent, left, right;
    String data;
    public Node( String data ) {
        this.data = data;
    }
}
 
/** Binary Search Tree */
public class BinarySearchTree {
 
    /** Root node of the binary search tree. */
    public Node root;
 
    /**
     * Inserts a new node that has the specified data. The insert starts from
     * the root of the tree and goes all the way down until it finds a spot to
     * place the new node.
     */
    public void insert( String data ) {
        root = insert( root, data );
    }
 
    /**
     * Inserts a new node that has the specified data. The insert starts from
     * the specified node of the tree and goes all the way down until it finds 
     * a spot to place the new node.
     */
    public Node insert( Node node, String data ) {
        if( node == null ) {
            // Found a spot to insert new node.
            node = new Node( data );
        } else if( data.compareTo( node.data ) < 0) {
            // Insert in left subtree
            node.left = insert( node.left, data );
            node.left.parent = node;
        } else {
            // Insert right subtree.
            node.right = insert( node.right, data );
            node.right.parent = node;
        }
        // Done!
        return node;
    }
 
    // ...
}

Finally, the contents of tree.txt are the same as the ones produced by the serialization in the test code, i.e.:

(mlb(cpe(cma(btu()())())(keh(gaj()())(lcu()())))(wno(sgy()())(yft()())))

That's all. Enjoy :)

Comments (3) Trackbacks (0)
  1. I was asked this question in Amazon interview. I gave a similar answer for my algorithm. The interviewee did not like it. This can only serialize and deserialize a binary tree, and you have to develop a different parsing logic to read this. The way serialization generally works in the programming language libraries is just by joining the data of all variables with a separator in between. That is what was expected. Then by identifying which variables are null, you can identify the leaf nodes as well. That kind of logic would work for almost any class or data structure.

  2. I had to implement this in prep for another interview… so here u go:

    public String serialize(){
    		StringBuilder serial = new StringBuilder();
    		serialize(root, serial);
    		return serial.toString();
    	}
    	private void serialize(Node n, StringBuilder serial){
    		if(serial.length()&gt;0){
    			serial.append(".");
    		}
    		if(n!=null){			
    			serial.append(n.data);
    			serialize(n.left,serial);
    			serialize(n.right,serial);			
    		}
    		else{
    			serial.append("null");
    		}
    	}
    	public void deserialize(String serial){
    		Scanner s=new Scanner(serial);	
    		s.useDelimiter("\\.");
    		root=deserialize(s);
    	}
    	private Node deserialize(Scanner serial){	
    		if(!serial.hasNext()){
    			return null;
    		}
    		else if(serial.hasNext("null")){
    			serial.next("null");
    			return null;
    		}
    		else if(serial.hasNextInt()){
    			Node n=new Node();			
    			n.data=serial.nextInt();
    			n.left=deserialize(serial);
    			n.right=deserialize(serial);
    			return n;
    		}
    		return null;
    	}
  3. @yossee thanks for stopping by. I definitely will take a look at your approach for having a generic serialization/deserialization process that would be applicable to several data structures.

    Good luck with your interviewing :)


Leave a comment

(required)

No trackbacks yet.