Text Categorization Based on Bayesian Classification Approach using Class-Specific Features

A. POORNIMA; A. SHARMILBEGAM; G. SRIDEVI; M. RAMASAMY

Issue Abstract

Abstract
The wide availability of web documents in electronic form requires an automatic technique for labeling documents with a predefined set of topics, a task known as automatic Text Categorization (TC). Over the past decades, a large number of advanced machine learning algorithms have been proposed to address this challenging task. The generated presentation slides can be used as drafts to help presenters prepare their formal slides more quickly. A novel system called PPSGen is proposed to address this task. Documents are usually represented by the "bag-of-words" model: each word or phrase that occurs one or more times in the documents is considered a feature. PPSGen first employs a regression method to learn the importance scores of the sentences in an academic paper, and then exploits the integer linear programming (ILP) method to generate well-structured slides by selecting and aligning key phrases and sentences. This paper proposes a novel system called PPSGen to generate presentation slides from academic papers. We train a sentence scoring model based on SVR and use the ILP method to align and extract key phrases and sentences for generating the slides. Experimental results show that our method can generate much better slides than traditional methods.
Keywords: Text Categorization (TC); Machine Learning Algorithms; SVR; PPS Generate; Integer Linear Programming
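
To make the bag-of-words representation and the Bayesian labeling of documents concrete, the following is a minimal sketch only. It assumes a standard multinomial naive Bayes classifier with add-one smoothing as a stand-in, since the abstract does not specify the class-specific feature method named in the title. The function names, the two topics, and the training sentences are all hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(text):
    """Map a document to word counts; each distinct word is one feature."""
    return Counter(text.lower().split())


def train_naive_bayes(labeled_docs):
    """Collect per-topic document counts, per-topic word counts, and the vocabulary."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, topic in labeled_docs:
        bow = bag_of_words(text)
        doc_counts[topic] += 1
        word_counts[topic].update(bow)
        vocabulary.update(bow)
    return doc_counts, word_counts, vocabulary


def classify(text, model):
    """Return the topic with the highest log posterior (add-one smoothing)."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_topic, best_score = None, -math.inf
    for topic in doc_counts:
        # Log prior: fraction of training documents labeled with this topic.
        score = math.log(doc_counts[topic] / total_docs)
        topic_total = sum(word_counts[topic].values())
        for word, count in bag_of_words(text).items():
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            likelihood = (word_counts[topic][word] + 1) / (topic_total + len(vocabulary))
            score += count * math.log(likelihood)
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic


# Hypothetical toy corpus (not from the paper).
TRAINING_DOCS = [
    ("the team won the match", "sports"),
    ("the player scored a goal", "sports"),
    ("the market fell on weak earnings", "finance"),
    ("investors sold shares as the market dropped", "finance"),
]

model = train_naive_bayes(TRAINING_DOCS)
print(classify("the player won", model))
print(classify("market shares dropped", model))
```

Working in log space avoids numeric underflow when many word likelihoods are multiplied, and add-one smoothing keeps a word that is unseen for one topic from forcing that topic's probability to zero.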

Author Information

A. POORNIMA

Issue No

2

Volume No

3

Issue Publish Date

05 Feb 2017

Issue Pages

56-61

