Process extraction from educational texts

Intellectual Systems and Technologies

Business process modeling plays an important role in the analysis and optimization of organizational processes. Automating process modeling is particularly important in domains where processes consist mostly of intellectual activity that is not properly documented. Software development is one such domain. Here, educational materials such as instructions and guides, blog posts, and conference talks are an important source of information about the processes. Known process extraction algorithms impose strict requirements on the input text. In this paper, we propose an approach to process extraction from complex sources that contain descriptions of multiple processes as well as text blocks unrelated to the process model. To account for these aspects, the method considers not only the standard lexical and syntactic properties of the text but also its structure and markup.