|
|
The implementation of a visualizing tool for tree edit distance problems 指導老師:石維寬 教授 研究生:吳政忠
Outline Motivation and Goal Tree Edit Distance Shasha & Zhang Algorithm Klein Algorithm Tool Open Source Tool VGJ My TED Tool Result of Algorithms on My Tool Experiment Conclusion 2009/7/6 Thesis 2
Motivation Tree edit distance problem can be utilized in many fields XML(Extensible Markup Language) 、RNA secondary structure、compiler optimization、image analysis、 natural language processing Need an useful tool to help with the study of related problems 2009/7/6 Thesis 3
Goal Create a tool to visualize the process and result of tree edit distance algorithm Rooted ordered tree and uni-cost 2009/7/6 Thesis 4
Tree Edit Distance 2009/7/6 Thesis 5
Mapping 2009/7/6 Thesis 6 After match nodes then F delete F[4] and insert G[3], F is transformed into G
Operation 2009/7/6 Thesis 7 Insert some node in one tree equals to delete that node in another tree Therefore, edit operations become (1) delete a node from F (2) delete a node from G (3) match a a a G b 1 2 3 4 F a a a 1 2 3
Shasha and Zhang Algorithm Kaizhong Zhang, Dennis Shasha Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems.SIAM Journal of Computing.1989,Vol. 18,6,pp.1245-1262. Time : Space : Postorder + Right Operation 2009/7/6 Thesis 8
Input 2009/7/6 Thesis 9
Result of Algorithm 2009/7/6 Thesis 10 a a a a a a a a F b b b b b b b b G b a a a a a a a F b b b b b b b b G 1 2 3 4 5 6 7 8 1 2 4 3 5 6 7 8 1 2 3 4 5 6 7 8 1 2 4 3 5 6 7 8 match match
Result of Algorithm : (2) 2009/7/6 Thesis 11 a a b a a a a a F b b b b b b b b G b a b a a a a a F b b b b b b b G 1 2 3 4 5 6 7 8 1 2 4 3 5 6 7 8 1 2 3 4 5 6 7 8 1 2 4 3 5 6 7 8 delete G[6] match
Result of Algorithm : (3) 2009/7/6 Thesis 12 b a b a a a b a F b b b b b b b G b a b a a b b a F b b b b b b b G 1 2 3 4 5 6 7 8 1 2 4 3 5 6 7 8 1 2 3 4 5 6 7 8 1 2 4 3 5 6 7 8 match match
Result of Algorithm : (4) 2009/7/6 Thesis 13 b b b a a b b a F b b b b b b b G b b b a a b b F b b b b b b b G 1 2 3 4 5 6 7 8 1 2 4 3 5 6 7 8 1 2 3 4 5 6 7 8 1 2 4 3 5 6 7 8 delete F[3] match
Result of Algorithm : (5) 2009/7/6 Thesis 14 b b b a b b b F b b b b b b b G b b b b b b b F b b b b b b b G TED = 9 1 2 3 4 5 6 7 8 1 2 4 3 5 6 7 8 1 2 3 4 5 6 7 8 1 2 4 3 5 6 7 8 match
Klein Algorithm Philip N. Klein Computing the Edit-distance Between Unrooted Ordered Trees.6th annual European Symposium on Algorithms (ESA).1998,pp.91-102. Time : Space : Heavy Path Order + Left or Right Operation 2009/7/6 Thesis 15
Heavy Path Decomposition 2009/7/6 Thesis 16
Input 2009/7/6 Thesis 17
Result of Algorithm 2009/7/6 Thesis 18 a a a a a a a F b b b b b b b b G a 4 1 3 6 5 7 8 2 b a a a a a a b b b b b b b b a 4 1 3 6 5 7 8 2 match match
Result of Algorithm : (2) 2009/7/6 Thesis 19 b a b a a a a F b b b b b b b G a 4 1 3 6 5 7 8 2 b a b a a a a b b b b b b a 4 1 3 6 5 7 8 2 b b delete delete
Result of Algorithm : (3) 2009/7/6 Thesis 20 b a b a a a a F b b b b b b G a 4 1 3 6 5 7 8 2 b a b a a b a b b b b b b a 4 1 3 6 5 7 8 2 match match
Result of Algorithm : (4) 2009/7/6 Thesis 21 b b b a b a F b b b b b b G a 4 1 3 6 5 7 8 2 b b b a b a b b b b b b a 4 1 3 6 5 7 8 2 a delete match
Result of Algorithm : (5) 2009/7/6 Thesis 22 b b b b a F b b b b b b G b 4 1 3 6 5 7 8 2 b b b b a b b b b b b b 4 1 3 6 5 7 8 2 a delete match
Result of Algorithm : (6) 2009/7/6 Thesis 23 b b b b b b b b b b b b 4 1 3 6 5 7 8 2 TED = 10
VGJ VGJ(Visualizing Graphs with Java) is a tool for graph drawing and graph layout The current version is 1.03, released on 1998/4/20 Open source GNU GPL v.2 2009/7/6 Thesis 24
My TED Tool It can create two trees in some methods in order to implement algorithms of tree edit distance It can choose an specific algorithms to do according to items selected It can present the process and result of tree edit distance algorithms by visualizing and text methods 2009/7/6 Thesis 25
View of My TED Tool 2009/7/6 Thesis 26
General Tree Parameters 2009/7/6 Thesis 27
Parent of postordering 2009/7/6 Thesis 28
Parent of postordering : (2) 2009/7/6 Thesis 29
Algorithm Panel Ordering Method (1) General Ordering (2) Heavypath Ordering Operating Direction (a) Left Operation (b) Right Operation (c) Left or Right Algorithm Cost Value Delete Relabel 2009/7/6 Thesis 30
Input 2009/7/6 Thesis 31 4,4,4,8,7,7,8 3,3,8,7,7,7,8
Process 2009/7/6 Thesis 32
Process : (2) 2009/7/6 Thesis 33
Process : (3) 2009/7/6 Thesis 34
Process : (4) 2009/7/6 Thesis 35
Process : (5) 2009/7/6 Thesis 36
Process : (6) 2009/7/6 Thesis 37
Process : (7) 2009/7/6 Thesis 38
Process : (8) 2009/7/6 Thesis 39
Process : (9) 2009/7/6 Thesis 40
Experiment At the same number of nodes in two trees How about its performance to use Shasha and Zhang (1989) algorithm at each tree height 2009/7/6 Thesis 41
Shasha and Zhang Algorithm Kaizhong Zhang, Dennis Shasha Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems.SIAM Journal of Computing.1989,Vol. 18,6,pp.1245-1262. Time : Space : Postorder + Right Operation 2009/7/6 Thesis 42
25 Nodes 10000 Times 2009/7/6 Thesis 43
50 Nodes 2500 Times 2009/7/6 Thesis 44
75 Nodes 1000 Times 2009/7/6 Thesis 45
100 Nodes 1000 Times 2009/7/6 Thesis 46
Result of Experiment About height of (nodes/2) has largest number of subproblems With small height or large height, number of subproblems will decrease very fast If we know height of two trees is very small or very large, performance of Shasha and Zhang algorithm will be good If we know height of two trees is around (nodes/2), maybe we should choose another algorithm to solve tree edit distance 2009/7/6 Thesis 47
Conclusion Establish a visualizing tool for tree edit distance problems 2009/7/6 Thesis 48
Reference 2009/7/6 Thesis 49 [1]Bruce A. Shapiro, Kaizhong Zhang Comparing multiple RNA secondary structures.CABIOS.1990,Vol. 6,4,pp.309-318. [2]Dov Harel, Robert Endre Tarjan Fast algorithms for finding nearest common ancestors.SIAM Journal on Computing.5 1984,Vol. 13,2,pp.338-355. [3]Erik D. Demaine et al. An optimal decomposition algorithm for tree edit distance.ACM Transactions on Algorithms.2008. [4]Kaizhong Zhang Algorithms for the constrained editing distance between ordered labeled trees and related problems.Pattern Recognition.3 1995,Vol. 28,3,pp.463-474. [5]Kaizhong Zhang, Dennis Shasha Simple fast algorithms for the editing distance between trees and related problems.SIAM Journal of Computing.1989,Vol. 18,6,pp.1245-1262. [6]Kuo-Chung Tai The tree-to-tree correction problem.Journal of the ACM (JACM).7 1979,Vol. 26,3,pp.422-433.
Reference : (2) 2009/7/6 Thesis 50 [7]Larry Barowski Drawing Graphs with VGJ.[Online]1998.[Cited: 8 6 2009.]http://www.eng.auburn.edu/department/cse/research/graph_drawing/graph_drawing.html.—.Visualizing Graphs with Java (VGJ).[Online]1998.[Cited: 8 6 2009.]http://www.eng.auburn.edu/csse/research/research_groups/graph_drawing/manual/vgj_manual.html. [8]Philip N. Klein Computing the edit-distance between unrooted ordered trees.6th annual European Symposium on Algorithms (ESA).1998,pp.91-102. [9]Philip Bille A survey on tree edit distance and related problems.Theoretical Computer Science.2005,Vols. 1-3,337,pp.217-239 [10]Sebastian Maneth, Nikolay Mihaylov, Sherif Sakr XML tree structure compression.2008,pp.243-247.
Reference : (3) 2009/7/6 Thesis 51 [11]Serge Dulucq, Helene Touzet Analysis of tree edit distance algorithms.Combinatorial Pattern Matching, 14th Annual Symposium, CPM 2003.2003,Vol. 2676,pp.83-95. [12]Tatsuya Akutsu A relation between edit distance for ordered trees and edit distance for Euler strings.Information Processing Letters.15 11 2006,Vol. 100,3,pp.105-109. [13]Wikipedia Wikipedia.[Online]6 6 2009.[Cited: 8 6 2009.]http://en.wikipedia.org/wiki/Tree_traversal. [14]許伯鈞 一個新的樹編輯距離之演算法.s.l.,交通大學,2007.
Summary: 中華民國97學年度第二學期國立清華大學資訊系統與應用所碩士班吳政忠畢業論文:The implementation of a visualizing tool for tree edit distance problems
| URL: |
No comments posted yet
Comments