Scoop -- the swiss army chainsaw of content management
Formatting for the HTML challenged. Feature Requests
By fsterman, Section Wishlist
Posted on Sun Jun 02, 2002 at 12:00:00 PM PST
For some, writing a story is rather difficult: people who have no coding experience can be left out in the cold when it comes to formatting a submission. In this "story" I point out a semi-elegant solution for the HTML inhibited.

Many people have written in with cries of hatred toward HTML. These are usually people who have no coding experience but still wish to format their submissions. It is understandable; writing in HTML can be clunky for those not used to coding. It is frustrating because, to get the desired emphasis, one must type shift+comma, b, shift+period, WORD, shift+comma, slash, b, shift+period, making a submitter perform up to 9 more tasks than they usually do just to make one word bold. We have tried to shorten the process by making auto-formatting an option, using other special characters for formatting. In doing this we have reduced the required number of actions from 11 to 4 (shift+8, WORD, shift+8), reduced the number of characters we can use, and still have not eliminated the need to learn a new way of typing. This method is technically still programming, just a less clunky form of it.
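To make the shift+8 style concrete, here is a minimal sketch of that kind of auto-format rule. It is not Scoop's actual routine (Scoop is written in Perl). The function name, the exact pattern, and the choice to escape raw HTML first are all assumptions made for illustration.

```python
import html
import re

# Hypothetical rule: *word* becomes <b>word</b>. The character right inside
# each asterisk must be a non-space, non-asterisk character, so arithmetic
# such as "2 * 3 * 4" is left alone.
BOLD = re.compile(r"\*([^\s*](?:[^*\n]*[^\s*])?)\*")


def autoformat_bold(text: str) -> str:
    """Escape raw HTML (an assumption about plain-text mode), then bold *spans*."""
    escaped = html.escape(text, quote=False)
    return BOLD.sub(r"<b>\1</b>", escaped)


print(autoformat_bold("make *one word* bold"))
```

The trade-off described above shows up directly: the asterisk stops being an ordinary character, and the submitter still has to learn the rule.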

My solution is to eliminate the problem: instead of K5 inventing a new way of typing, tell people how to make their old way work with K5's submission method. The old way in this case is using any word processor and just exporting what they write as HTML. This solution does take more work in exporting, copying, and pasting, but for lengthy, heavily formatted submissions it is much easier. One could have more freedom with tables, citing sources, and maybe pictures if K5 or any of the other Scoop sites decide to implement them. From what I know of Perl and what I assume about the Scoop engine, it should be rather easy to have the engine just display whatever characters the submission contains, leaving the web browser the work of formatting all of the information. This would lessen the number of submissions with bad grammar or spelling (neither of which I am near perfect at) and let the grammar-obsessed sleep easier at night.

There is no limit on who could use this way of submitting. Most every platform has Mozilla, which would be the simplest solution. Just make a new Composer page, write your story, and copy/paste the information from the HTML source tab. Don't like Mozilla? Then use Open Office, Abiword, or any one of the bazillion free HTML WYSIWYG programs. Both office suites have builds for most any platform, are free, and have hit the 1.0 mark. Macintosh versions without X11 are just around the corner, and of course there are ways to get a free office suite. We could make a How To guide for Word, Open Office, Abiword, Mozilla, and any other program we felt enough people used. The guides, if accompanied by pictures, wouldn't be long, at most 2 or 3 paragraphs. All they would need to say is how to export the document, how to open it in a web browser, and where to copy and paste.

While this operation could take more work to teach people, the concepts are much easier and require much less practice and memorization. Without doing a full GOMS keystroke analysis, I can guess that on less complicated projects HTML would be faster, but only by a little. Complicated projects would be much easier to manage in a word processor; tables, footnotes, lists, margins, and lots of other things take longer in straight HTML and get broken easily.

I believe we should get rid of auto-formatting or let people choose whether they want it. If what I say is taken into K5, we should probably post a poll to see how many people actually use the auto-formatting (or track it from the Scoop engine, if that is possible) and run from there. Maybe we could make it so that it would format the copied-and-pasted HTML. Snip off body tags and other non-essentials, make links out of bare URLs, and maybe make people use something like angled bracket, URL, angled bracket, angled bracket, text meant to be a link, angled bracket. Make the auto-format recognize anything with an <> (not the curly or square brackets, to keep the character set at its largest) and "www" or "http" in it and change it to an href link. The option to turn auto-format off should be available to those who wish to write a piece with actual HTML examples in it. In any case we would need Scoop to be able either to pass unknown HTML straight to the browser or to recognize all of the weird HTML Mozilla and other programs can make.
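The clean-up pass described above could look something like the sketch below. It is only an illustration under stated assumptions, not Scoop code. The function names and patterns are made up here, regular expressions stand in for a real HTML parser, and a URL without a scheme is assumed to mean http://.

```python
import re

# Keep only what sits between <body ...> and </body>, if such a wrapper exists.
BODY = re.compile(r"<body\b[^>]*>(.*?)</body\s*>", re.IGNORECASE | re.DOTALL)

# A URL, for this sketch, is anything starting with http(s):// or www.
URL = r"(?:https?://|www\.)[^\s<>\"']+"
TOKEN = re.compile(
    rf"<(?P<url>{URL})>(?:<(?P<text>[^<>]+)>)?"  # <URL> or <URL><link text>
    r"|(?P<anchor><a\b[^>]*>.*?</a\s*>)"         # existing links: leave alone
    r"|(?P<tag><[^<>]*>)"                        # any other tag: leave alone
    rf"|(?P<bare>\b{URL})",                      # bare URL in running text
    re.IGNORECASE | re.DOTALL,
)


def _href(url: str) -> str:
    # Assumption: a URL written without a scheme means http://
    if url.lower().startswith(("http://", "https://")):
        return url
    return "http://" + url


def _replace(match: re.Match) -> str:
    if match.group("url"):
        url = match.group("url")
        text = match.group("text") or url
        return f'<a href="{_href(url)}">{text}</a>'
    if match.group("bare"):
        url = match.group("bare")
        trimmed = url.rstrip(".,!?)")  # keep sentence punctuation outside the link
        return f'<a href="{_href(trimmed)}">{trimmed}</a>{url[len(trimmed):]}'
    return match.group(0)  # existing anchors and other tags pass through unchanged


def extract_body(document: str) -> str:
    found = BODY.search(document)
    return (found.group(1) if found else document).strip()


def clean_pasted_html(document: str) -> str:
    """Snip off the body wrapper, then turn bracketed and bare URLs into links."""
    return TOKEN.sub(_replace, extract_body(document))


page = ('<html><body bgcolor="white"><p>See <http://example.com><the example> '
        'or www.example.org.</p></body></html>')
print(clean_pasted_html(page))
```

One wrinkle the sketch exposes: a real tag that directly follows a bracketed URL, as in `<http://example.com><b>`, is indistinguishable from link text, so the bracket syntax would need a rule for that case.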

Editor's note: I probably should rewrite this and do tests to make sure everything would work as I think it will. This is just an idea, though, and if some things are wrong, you people are much more capable of correcting me and applying the basic principles. And of course I am tired and don't want to write anymore :).

Poll
Do I sound like a rambling drunk
· Yes, put away that Everclear. 66%
· No, just a Pot Head. 33%
· No, more like my lord and savior oh holy one. 0%
· Do you mind taking some writing classess ignoraint Grammically inccorect bastard! 0%
· No, I think you are right. 0%

Votes: 3


Formatting for the HTML challenged. | 7 comments (7 topical, 0 hidden)
The problem.. (4.00 / 1) (#1)
by hurstdog on Sun Jun 02, 2002 at 11:00:26 AM PST

Is that HTML from Word, Netscape, etc. is all pretty horrible, from what I've seen. Also, the site would then have to allow tables and some other HTML we don't like to allow, in order to prevent page-widening posts and the like. It's entirely possible to do what you want right now, with no changes other than the larger amount of allowed HTML. And that is the problem, as I see it...



-hurstdog


Gah! (3.00 / 1) (#3)
by em on Thu Jun 06, 2002 at 04:12:31 AM PST

As somebody who has had to edit Word-generated HTML user submissions to Adequacy, I have to second hurst's opinion.

--em
Editor, Adequacy.org
Are you Adequate?



Two decent options for Word-generated HTML (4.00 / 1) (#4)
by hillct on Sat Jun 08, 2002 at 09:22:52 AM PST

You could either borrow code from the Demoroniser or run it through HTML::FormatText to get plain text then apply our existing autoformat routines.
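As a rough illustration of the second option, the sketch below flattens word-processor HTML to plain text so an existing autoformat routine could take over. It uses Python's standard html.parser as a stand-in for Perl's HTML::FormatText. The class name, function name, and tag lists are assumptions chosen for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text, marking paragraph breaks at block-level tags."""

    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3"}
    SKIP_TAGS = {"head", "style", "script"}  # their contents are dropped

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []        # strings, with None marking a paragraph break
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append(None)

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append(None)

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(markup: str) -> str:
    """Drop all tags and return paragraphs separated by blank lines."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    paragraphs, current = [], []
    for part in parser.parts + [None]:
        if part is None:
            # Collapse runs of whitespace (including &nbsp;) inside a paragraph.
            paragraph = " ".join("".join(current).split())
            if paragraph:
                paragraphs.append(paragraph)
            current = []
        else:
            current.append(part)
    return "\n\n".join(paragraphs)


word_html = ('<html><head><style>p.MsoNormal{margin:0}</style></head><body>'
             '<p class=MsoNormal><span style="font-size:12pt">Hello,\n<b>world</b>!'
             '</span></p><p>Second&nbsp;para.</p></body></html>')
print(html_to_text(word_html))
```

Note that this throws away bold and italics along with the bloat, so the author would have to reapply emphasis using the autoformat characters.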

--CTH



--
ScoopHost.com - Premier Scoop Hosting and custom development from the lead developers.


Cross-platform alternative (none / 0) (#6)
by juahonen on Tue Jun 11, 2002 at 01:36:42 AM PST

Instead of parsing HTML generated by word processors, there should be an option that is simpler both to implement in Scoop and to use when authoring a story. Here's my suggestion.

Using Rich Text Format
Most modern word processors can output Rich Text Format documents. It is really simple for the author to save the document as RTF and then upload it via an upload form. Rich Text Format is a standard; it has none of the lax formatting schemes that plague HTML. Further, the text styles allowed in Scoop postings are relatively easy to parse compared to all the alternative methods in HTML. Getting the formatting of bold and italics right in HTML would require a full-featured HTML engine, and it would still be unable to strip unnecessary "bloat" tags added by, for example, Microsoft Word.

Rich Text Format would allow all Windows, Macintosh and Linux users to write their stories in word processors, using spell checking and all the tweaks they have installed. Exporting to RTF would also ensure the data is in a uniform format, which simplifies the requirements (and therefore the effort) for parsing and converting the data to acceptable HTML.

Caveats
I don't know if there are any usable Open Source RTF parsers available for Linux for the job. Searching Freshmeat found these programs, some of which sound good, like the RTF to HTML converter and rtf-converter. I haven't used any of the software, so I don't actually know if they're useful. Unfortunately there is no other file format common to all word processors on all three of the platforms mentioned. If RTF cannot be parsed and we still want users to be able to write their stories in word processors, it would require complex programs and possibly support for multiple input formats.

Allowing file uploads can make the web server more vulnerable to DoS attacks if no precautions are taken to protect the server.






All trademarks and copyrights on this page are owned by their respective companies. Comments are owned by the Poster. The Rest © 1999 The Management