Text mining

An IT consulting company has a collection of several thousand project status reports from completed software development projects. Each report is a Word document in which the project manager records the current status of the project (ahead of, on, or behind schedule; under, on, or over budget) and the reasons for that status. The company would like to use text mining to analyze this collection of reports and determine the factors that cause projects to fall behind schedule or run over budget. List the three steps in the text-mining process the company could use, and discuss what would be done in each step for this project.
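As a starting point for the discussion, here is a minimal Python sketch of one common way to frame the three steps: establish the corpus, build a term-document matrix, and extract knowledge from it. The question does not name the steps, so this framing is an assumption. The sample reports, the stopword list, and every function name are hypothetical, and the scoring rule is a deliberately crude stand-in for a real classification or association analysis. In practice, step 1 would extract the text and the status fields from the Word documents rather than use hard-coded strings.

```python
import re
from collections import Counter

# Step 1 (establish the corpus): hypothetical stand-in data. Each entry is
# (schedule status, free-text reasons) as it might be pulled from a report.
REPORTS = [
    ("behind", "Requirements changed late and key staff left the team."),
    ("behind", "Scope creep and requirements changed after design approval."),
    ("on", "Stable requirements and an experienced team kept work on track."),
    ("ahead", "Experienced team reused components and finished early."),
]

# Tiny illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"and", "the", "an", "a", "on", "after", "kept", "left"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def build_term_document_matrix(reports):
    """Step 2: one row per report, one column per term, cells hold counts."""
    counts = [Counter(tokenize(text)) for _, text in reports]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


def terms_linked_to_status(reports, vocabulary, matrix, status):
    """Step 3: rank terms by how much more often they appear in reports
    with the given status than in all other reports."""
    scores = {}
    for j, term in enumerate(vocabulary):
        in_status = sum(
            1 for (s, _), row in zip(reports, matrix) if s == status and row[j] > 0
        )
        elsewhere = sum(
            1 for (s, _), row in zip(reports, matrix) if s != status and row[j] > 0
        )
        scores[term] = in_status - elsewhere
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))


vocabulary, matrix = build_term_document_matrix(REPORTS)
ranked = terms_linked_to_status(REPORTS, vocabulary, matrix, "behind")
print(ranked[:3])
```

The same ranking could be run with a budget status in place of the schedule status. On several thousand real reports, the raw counts would typically be weighted or reduced before any mining step, and the candidate factors would need to be reviewed by someone who knows the projects.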