Inference algorithms for pattern-based CRFs on sequence data (Journal Article)


Author(s): Kolmogorov, Vladimir N; Takhanov, Rustem S
Article Title: Inference algorithms for pattern-based CRFs on sequence data
Affiliation: IST Austria
Abstract: We consider conditional random fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) (Formula presented.) is the sum of terms over intervals [i, j], where each term is non-zero only if the substring (Formula presented.) equals a prespecified pattern w. Such CRFs can be naturally applied to many sequence tagging problems. We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) the MAP. Their complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.), where L is the combined length of input patterns, (Formula presented.) is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of Ye et al. (NIPS, 2009), whose complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.), where (Formula presented.) is the number of input patterns. In addition, we give an efficient algorithm for sampling, and revisit the case of MAP with non-positive weights.
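The model in the abstract can be illustrated with a small brute-force sketch. It is not one of the paper's algorithms: it enumerates every string, so it is exponential in the string length and only useful for checking tiny cases. Several things here are assumptions made for illustration: the names `energy` and `brute_force_inference`, pattern costs that do not depend on the position of the interval, and the sign convention that the partition function sums exp(-energy) while the MAP labeling minimizes the energy.

```python
import itertools
import math


def energy(x, costs):
    """Energy of string x: sum of costs[w] over every occurrence of pattern w.

    Assumption: each pattern's cost is the same at every position.
    """
    total = 0.0
    for w, c in costs.items():
        # Slide over all intervals [i, i + len(w) - 1] and test for a match.
        for i in range(len(x) - len(w) + 1):
            if x[i:i + len(w)] == w:
                total += c
    return total


def brute_force_inference(n, alphabet, costs):
    """Return (Z, (map_string, map_energy)) by enumerating all strings.

    Assumption: Z = sum over x of exp(-energy(x)), MAP = argmin energy(x).
    """
    z = 0.0
    best = None
    for chars in itertools.product(alphabet, repeat=n):
        x = "".join(chars)
        e = energy(x, costs)
        z += math.exp(-e)
        if best is None or e < best[1]:
            best = (x, e)
    return z, best


# Hypothetical toy instance: two patterns over the alphabet {a, b}.
costs = {"ab": -1.0, "b": 0.5}
z, (map_string, map_energy) = brute_force_inference(2, "ab", costs)
```

In this toy instance the string "ab" matches both patterns once, so its energy is -1.0 + 0.5 = -0.5, which is the lowest of the four strings of length 2.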
Keywords: Conditional random fields; Sequence tagging; String algorithms
Journal Title: Algorithmica
Volume: 76
Issue: 1
ISSN: 1432-0541
Publisher: Springer
Date Published: 2016-09-01
Start Page: 17
End Page: 46
Sponsor: This work has been partially supported by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no. 616160.
URL:
DOI: 10.1007/s00453-015-0017-7
Open access: yes (repository)