I have to use some awful CMS to display our work files. You can extract text from popular file formats, preprocess raw text, extract individual words, convert text into numerical representations, and build statistical models. Both