Application of probabilistic methods to Chinese

Xiangji Huang (Centre for Interactive Systems Research, Department of Information Science, City University, Northampton Square, London EC1V 0HB)
S.E. Robertson (Centre for Interactive Systems Research, Department of Information Science, City University, Northampton Square, London EC1V 0HB)

Journal of Documentation

ISSN: 0022-0418

Article publication date: 1 March 1997

Abstract

This paper discusses the use of text retrieval methods based on the probabilistic model with Chinese-language material. Since Chinese text has no natural word boundaries, we must either apply a dictionary-based word segmentation method to the text, or index and search in terms of single Chinese characters. In either case, it becomes important to have a good way of dealing with phrases or contiguous strings of characters; the probabilistic model does not at present have such a facility. Some ad hoc modifications of the probabilistic weighting function and matching method are proposed for this purpose.
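
The two indexing options named in the abstract can be illustrated with a minimal sketch. The abstract does not say which segmentation algorithm is used, so forward maximum matching is assumed here purely for illustration; the function names, the `max_len` parameter, and the toy dictionary are all hypothetical.

```python
def segment_fmm(text, dictionary, max_len=4):
    """Dictionary-based segmentation by forward maximum matching (an assumed
    technique): at each position, take the longest dictionary entry that
    matches, falling back to a single character when nothing matches."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for n in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + n]
            if n == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += n
                break
    return tokens


def segment_chars(text):
    """Single-character indexing: every character is its own index term."""
    return list(text)


# Hypothetical toy dictionary and sample text ("information retrieval system").
toy_dictionary = {"信息", "检索", "信息检索", "系统"}
text = "信息检索系统"

words = segment_fmm(text, toy_dictionary)
chars = segment_chars(text)
print(words)
print(chars)
```

With character indexing, a query for 检索 ("retrieval") can only find that string if the matcher checks that the two characters are adjacent, which is why handling contiguous strings matters under either option.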

Citation

Huang, X. and Robertson, S.E. (1997), "Application of probabilistic methods to Chinese", Journal of Documentation, Vol. 53 No. 1, pp. 74-79. https://doi.org/10.1108/EUM0000000007193

Publisher: MCB UP Ltd

Copyright © 1997, MCB UP Limited
