RESEARCH AND APPLICATION OF AUTOMATICAL ANNOTATION ON CHINESE PATENTS



Deng Na

Hubei University of Technology

Copyright © 2017 by Cayley Nielson Press, Inc.

ISBN: 978-0-9992443-1-9

Cayley Nielson Press Scholarly Monograph Series Book Code No.: 140-1-2

US$96.75

Preface


As society develops, people are increasingly aware of the tremendous changes that innovation brings to our lives. Patents, one of the most important means of protecting innovation, are receiving more and more attention. Because the number of patent applications increases year by year, ever more patents accumulate worldwide. Because patents contain rich technical, economic and legal information, patent analysis and mining has become an important research topic in the field of data mining. However, the annotation of Chinese patents currently still relies on human work. This book focuses on the research and application of automatic annotation of Chinese patents.

The author was supported financially by the Research Foundation for Advanced Talents of Hubei University of Technology (No. BSQD12131), the Fundamental Research Funds for the Young Teachers' Innovation project of Zhongnan University of Economics and Law (No. 2014147), the National Natural Science Foundation of China (No. 61201250), the Natural Science Foundation of Anhui Province (No. 1308085QF103), the Guangxi Natural Science Foundation (No. 2012GXNSFBA053174), and the Guangxi University Key Lab of Cloud Computing and Complex System Found.

As co-instructors, Professor Chunzhi Wang and Professor Zhiwei Ye devoted a great deal of effort to this work. Because my circumstances and expertise are limited, the selection and evaluation of the methods may be inappropriate or even wrong in places, and I would be grateful to readers for their criticism and corrections.

 

Contents


Preface
1 Automatically generation and evaluation of stop words list for Chinese Patents
1.1 Introduction
1.2 The stop word list for general Chinese texts
1.3 Word Segmentation of Chinese
1.4 Two methodologies of generating stop words lists for Chinese patents
1.4.1 Methodology One: based on the most simple and general strategy
1.4.2 Methodology Two: based on statistics
1.4.3 Modification and adjustment of SAT
1.5 Analysis and Evaluation of Experiment
1.5.1 Dataset
1.5.2 Using Methodology One to generate stop words list under corpuses with different scales
1.5.3 The accuracies of stop words list under corpuses with different scales using Methodology One
1.5.4 Methodology One’s list compared with the list for general texts
1.5.5 The accuracies of stop words list under corpuses with different scales using Methodology Two
1.6 Conclusion
2 Keywords extraction for Chinese patents based on Part-of-speech tagging and stop words elimination
2.1 Introduction
2.2 Related work
2.3 Our methodology
2.4 Word Segmentation and Part-of-speech tagging
2.5 Elimination of stop words in patents
2.6 Algorithm
2.7 Experiment
2.7.1 Dataset
2.7.2 Extraction result
2.7.3 The accuracy of keywords extraction and its influence factors
2.8 Conclusion
3 A functional clauses extraction method of Chinese patents in specific domain based on templates
3.1 Introduction
3.2 Concept
3.2.1 Function clause
3.2.2 Template
3.3 The collection and classification of templates
3.3.1 The collection of templates
3.3.2 The classification of templates
3.4 The extraction method of function clauses
3.5 Experiments
3.6 Conclusions
4 The construction method of clue words thesaurus in Chinese patents based on iteration and self-filtering
4.1 Introduction
4.2 Related work
4.3 Clue words
4.4 Algorithm
4.4.1 Self-filtering
4.4.2 Locating candidate effect statements
4.5 Experiments
4.5.1 Collection of initial clue words
4.5.2 Iteration
4.6 Conclusion and future work
5 PaEffExtr: A Method to Extract Effect Statements Automatically from Patents
5.1 Introduction
5.2 Related Work
5.3 Characteristics of Patent Abstracts
5.4 Multi-features fused scoring algorithm
5.4.1 Calculation of distribution score
5.4.2 Calculation of morphological score
5.4.3 Algorithm PaEffExtr
5.4.4 Evaluation of algorithm
5.5 Experiments
5.5.1 Clue words
5.5.2 Comparative experiments
5.5.3 Runtime
5.6 Conclusion and future work
6 Intelligent Recommendation of Chinese Traditional Medicine Patents Supporting New Medicine’s R&D
6.1 Introduction
6.2 Related Work
6.3 Chinese traditional medicine patents’ intelligent recommendation
6.3.1 Architecture
6.3.2 Construction of Alias Database
6.3.3 Technology Annotation Automatically in Chinese Traditional Medicine Patents
6.3.4 The Similarity of Patents
6.4 Experiments
6.4.1 Recalling optimization
6.4.2 Patent recommendation
6.5 Conclusion and Future Work
References


 

Readership


This book should be useful for students, researchers, engineers and professionals working in the areas of patent analysis and mining, data mining, and Chinese text processing.

 
