You're better off having students upload Word documents if formatting, or even the existence of paragraphs, is important. Now I just don't trust the formatting of online text at all. It's so unpredictable that it's only good for reading the words themselves. I'm not sure how students have ended up with these, but when I started I was deducting marks for writing too many topics in one big paragraph, till I realised there really were paragraphs.
For the Word documents themselves: there is an option in Word (the paragraph symbol in the main menu; the symbol looks like ¶) that shows marks such as the tab symbol to help you see the formatting in your document. Word provides indent markers that allow you to indent paragraphs to the location you want. Be consistent with spacing between lines and between paragraphs.
I also get random characters sprinkled around the documents. Some files have paragraphs starting with Â Â Â Â (4 accented letter A's). Others have ’ (a-hat, euro symbol, TM) in place of apostrophes. Seems like a character encoding problem, especially since the HTML files don't contain any encoding information.
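The stray characters described above are consistent with UTF-8 text being read as Windows-1252: a curly apostrophe (U+2019) is stored as three bytes that display as a-hat, euro symbol, TM, and a non-breaking space (U+00A0) displays as an accented A followed by a space. The Python sketch below shows one way to reverse that, assuming this is really what happened to the files. The function name `repair_mojibake` is made up for illustration and is not part of any tool mentioned here.

```python
def repair_mojibake(text: str) -> str:
    """Undo UTF-8 text that was mistakenly decoded as Windows-1252.

    Assumption: the garbling came from exactly one wrong decode step.
    If the text does not fit that pattern, it is returned unchanged.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# a-hat, euro symbol, TM in place of an apostrophe
garbled_apostrophe = "student\u00e2\u20ac\u2122s"
# four "accented A + non-breaking space" pairs at the start of a paragraph
garbled_indent = "\u00c2\u00a0" * 4 + "First paragraph"

print(repr(repair_mojibake(garbled_apostrophe)))
print(repr(repair_mojibake(garbled_indent)))
```

If this is the cause, the four accented A's are simply a four-space indent made of non-breaking spaces. Declaring the encoding in the HTML file, for example with `<meta charset="utf-8">`, would probably stop the browser from guessing wrong in the first place.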
So maybe just grade them by reading the plain HTML source? I get some online texts which have a first-line indent and line breaks between paragraphs if I open them in Notepad, but which look like one unformatted paragraph in a web browser. The saved file does seem to preserve extra spaces, even if they don't show up when the HTML is displayed.
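The difference between Notepad and the browser is expected: a browser collapses runs of spaces and line breaks in HTML unless the markup says otherwise. As a rough sketch, assuming the submission is available as plain text with blank lines between paragraphs, something like the following Python could wrap each paragraph in its own `<p>` tag and keep the indents visible. The function name `text_to_html_paragraphs` is hypothetical.

```python
import html
import re


def text_to_html_paragraphs(text: str) -> str:
    """Wrap blank-line-separated paragraphs in <p> tags.

    The CSS rule white-space: pre-wrap asks the browser to keep
    leading spaces and single line breaks instead of collapsing them.
    """
    # Notepad saves Windows line endings; normalise them first.
    normalised = text.replace("\r\n", "\n").strip("\n")
    blocks = re.split(r"\n\s*\n", normalised)
    parts = []
    for block in blocks:
        escaped = html.escape(block)  # protect <, > and & in student text
        parts.append(f'<p style="white-space: pre-wrap">{escaped}</p>')
    return "\n".join(parts)


sample = "    First paragraph, indented.\r\n\r\n    Second paragraph, a < b."
print(text_to_html_paragraphs(sample))
```

Opening the resulting HTML in a browser should show two separate paragraphs with their first-line indents, which is closer to what Notepad shows.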