The Space Between Two Characters

If you’re claustrophobic, you’re afraid of confined spaces. If you’re a software developer, you can be afraid of non-existing space.

When it comes to editing text, we usually don’t think about the space between the characters. There simply isn’t any. When you write a text editor, things start to look different. You have a caret or cursor which goes between the characters, and that space between two characters can suddenly become uncomfortably tight.

Fire up your favorite text editor, Word, Writer, whatever. It has to support character formatting, though. Now enter this:

Hello, world!

In case the formatting is not visible: “Hello” is bold and “world!” is italic. The bold range ends before the comma; the italic range starts with the “w” and ends with the “!”, both included.

Now move to the “e” and type “x”. What do you get?

Hxello, world!

Piece of cake.

Now move to the “H” and type “x”. What do you get? Is the new x bold or not? Do you get “xHxello” with a bold x in front or with a plain one? How about typing “x” after the “o” of our abused “Hello”? Is that new x bold or not? If it is, what is the simplest way to make it non-bold? Do you have to delete the comma, do you have to go through menus or toolbars, or is there a simple, consistent way to add a character inside and outside of a formatted range of text?

Let’s go one step further. Add a character after the “!”. Is it italic? If not, you’re lucky. If it is … what’s the simplest way to get rid of the italic? If you press Return now, will the italic leak to the next line? If not, how can you make it leak? If that italic is the last thing in your text, can you add non-italic text beyond it without fumbling with the formatting options?

There is no space between two characters, and when you write a text editor, that non-existent space bites you. And that is the actual problem: there is no consistent way to move in and out of a formatted range of characters.

The naive attempt would be to say “depending on the side you came from, you’re inside or outside.” So, if we have this (| is the cursor or caret): “Hello |world” and you type something, the question is: How did the caret end up there? Did it come from “w|o” and move one position to the left? Or from “o| ” and move one position to the right?
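
As a rough sketch of that rule: the caret remembers which neighbor it is leaning on, based on the last movement. The `Range` and `Caret` names and the `style_for_typing` helper are made up for this illustration and are not taken from any real editor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Hypothetical style range covering text[start:end]."""
    start: int
    end: int
    style: str


@dataclass
class Caret:
    pos: int = 0
    # "left": lean on the character before the caret,
    # "right": lean on the character after the caret.
    lean: str = "left"

    def move_left(self) -> None:
        if self.pos > 0:
            self.pos -= 1
            # The character we just crossed is now on our right.
            self.lean = "right"

    def move_right(self, text_length: int) -> None:
        if self.pos < text_length:
            self.pos += 1
            # The character we just crossed is now on our left.
            self.lean = "left"


def style_for_typing(caret: Caret, ranges: list[Range]) -> str:
    """Return the style of the character the caret leans on."""
    index = caret.pos - 1 if caret.lean == "left" else caret.pos
    for r in ranges:
        if r.start <= index < r.end:
            return r.style
    # No neighbor on that side (start or end of text) or plain text.
    return "normal"
```

With “Hello, world!” (bold on 0..5, italic on 7..13), a caret that arrives at position 7 from the right types italic, and one that arrives from the left types normal. A caret at position 0 that leans left has no neighbor to ask and falls back to normal, even though the “H” next to it is bold.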

That works somewhat, but it fails at the beginning and the end of the text, plus you’re in trouble when deleting text. What should happen after the last character of “Hello” has been deleted? Should that also delete the character range, or should an empty, invisible bold range be left behind so that, when you type something now, the bold appears again? If you keep the empty invisible range, when do you drop it? Do you keep it as long as the user stays “in” it? Or until the document is saved? Loaded again from disk?
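
To make the question concrete, here is a minimal sketch of just one of those policies: keep an empty range alive as long as the caret sits in it. The `Range` type and both helper functions are assumptions for illustration only; this is not a claim about how Word or Writer handle it.

```python
from dataclasses import dataclass


@dataclass
class Range:
    """Hypothetical style range covering text[start:end]."""
    start: int
    end: int
    style: str


def backspace(text: str, ranges: list[Range], pos: int):
    """Delete the character before pos; shrink or shift the ranges."""
    if pos == 0:
        return text, ranges, pos
    i = pos - 1  # index of the deleted character
    adjusted = [
        Range(
            r.start - 1 if r.start > i else r.start,
            r.end - 1 if r.end > i else r.end,
            r.style,
        )
        for r in ranges
    ]
    return text[:i] + text[pos:], adjusted, i


def drop_abandoned(ranges: list[Range], caret_pos: int) -> list[Range]:
    """Assumed policy: an empty range survives only while the caret is in it."""
    return [r for r in ranges if r.end > r.start or r.start == caret_pos]
```

Deleting all five characters of “Hello” leaves an empty bold range at position 0. Calling `drop_abandoned` with the caret still at 0 keeps it, so typing there could be bold again; calling it after the caret has moved elsewhere drops it.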

It’s a mess, and there is a reason why neither Word nor OpenOffice gets it right: You can’t. There is information in the head of the user (what she wants) but no way for her to tell the computer. Duh.

That is, unless you start to give the user a visual cue about what is going on. The problem we have is that there is no simple, obvious way for the user to say “I want …” because there is no space on the screen reserved for this. We barely manage to squeeze a caret between the characters. There is just not enough room.

Well, there could be. A simple solution might be to add a little hint to the cursor to show which way it is leaning right now. Right. How about “A|B”, with a bold A and an italic B? Here, you have three options: bold, italic and normal.

In HTML, this is simple. I’m editing this text in Firefox using the standard text area. What looks fancy to you looks like this to me: “<b>A</b>|<i>B</i>”

And this is the solution: I need to add a visual cue for the start and end of the format ranges. Maybe a simple U-shape which underlines the text for which the character format applies. Or an image (> and < in this example): “>A<|>B<”. And suddenly, it’s completely obvious on which side of the range start and end you are and what you want. You can delete the text in the range without losing the range, or you can delete both, and you can move in and out of the range at will.
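
Here is a minimal sketch of how such markers could work, assuming non-empty, non-overlapping ranges given as hypothetical `(start, end, style)` tuples. The trick is that the caret indexes a token list in which the markers take up slots of their own, instead of indexing the raw text.

```python
def tokenize(text, ranges):
    """Interleave characters with open/close markers for each range."""
    tokens = []
    for offset in range(len(text) + 1):
        # Ranges ending here close before new ranges open.
        tokens += [("close", style) for (s, e, style) in ranges if e == offset]
        tokens += [("open", style) for (s, e, style) in ranges if s == offset]
        if offset < len(text):
            tokens.append(("char", text[offset]))
    return tokens


GLYPH = {"open": ">", "close": "<"}


def render(tokens, caret):
    """Draw the tokens with '|' at token index `caret`."""
    out = []
    for i, (kind, value) in enumerate(tokens):
        if i == caret:
            out.append("|")
        out.append(value if kind == "char" else GLYPH[kind])
    if caret == len(tokens):
        out.append("|")
    return "".join(out)


def style_at(tokens, caret):
    """Style that typing at `caret` would get (no nesting assumed)."""
    active = "normal"
    for kind, value in tokens[:caret]:
        if kind == "open":
            active = value
        elif kind == "close":
            active = "normal"
    return active
```

For the text “AB” with bold on 0..1 and italic on 1..2, the single text offset between A and B turns into three caret slots: token index 2 is still inside the bold range, index 3 is between the two ranges (rendered as “>A<|>B<”) and index 4 is already inside the italic range.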

The drawback is that you need to keep that information somewhere. It adds a pretty big cost at the boundaries of every format range. I’ll have to try and see how much that is and if I can get away with less by cleverly using the information I already have.

Also, it clearly violates WYSIWYG. On the other hand, we get WYSIWYW, which is probably better for the user.

8 Responses to The Space Between Two Characters

  1. […] Should offer a side-by-side editor (source and preview because WYSIWYG is impossible to get right) […]

  2. hasenj says:

    I’m pretty sure it can be done right. For instance, the range could be shown as a transparent, non-obtrusive marker *only* when the cursor is inside the range, i.e. a context-sensitive range-marker display.

    • digulla says:

      Sounds like a good idea. I wonder why there is no editor (yet) which has implemented this. Most try to be smart about the style buttons.

      One exception is HTML editors, where the cursor has two positions (one inside the element and one outside). CKEditor, for example, will behave differently when you move the cursor into or out of a range. That said, if several elements end at a certain position, you can only exit them all at once. So it’s not perfect, either.

  3. […] at 20:15 | In Software | Leave a Comment Tags: WYSIWYG, OO, AOP, Performance In my old post “The Space Between Two Characters“, I wrote about some flaws of WYSIWYG. Since then, I got some […]

  4. At least for italic it works well in Word and RichEdit, where the cursor itself will be italic too. This helps determine what style will be applied when typing there.

    Maybe one can even take this further by putting little overlays around the cursor that show the formatting at that point. However, the question is (at least for word processors) whether that should follow the formatting or rather the style, as no one in their right mind would try formatting larger documents with physical formatting.

    • digulla says:

      Software like TeX and Wikis show that you can get away with a few simple hints about what a piece of text means and let the software decide how to render it. TeX especially is built on the fact that only a few people know how to make text look good and that everyone else shouldn’t be allowed to format it.

      Just think of italics. How does this look: “Hello!” Often, the “!” touches the ending quote. I’ve seen text editors where the exclamation mark was actually rendered over the quote so it looks as if I had started with double quotes and ended with a single quote! Or when I used single quotes, they didn’t show up at all.

      In word processors, when you mark italic text, you get either a blue rectangle (so the text leaks out of it) or a tilted box and ugly gaps when you mark text surrounding the italic words.

      If something so simple doesn’t work, that tells you something. It says: There is a lot of room for improvement.

      • The TeX argument isn’t a very good one, though. At least there are many points that simply nag me. First of all, no one with no idea of how good writing looks can lay out a good-looking document. Not even with TeX. From what I’ve seen so far, many people make similar and similarly stupid mistakes:

        • "Hello!" instead of ``Hello!'' renders as ”Hello!” – fail (note the incorrect opening quote).

        • Then punctuation and formatting. Punctuation characters are formatted the same as the adjacent text but I’ve seen things like “{\em Hello}!” or “{\em Hello!}” numerous times which will render as “Hello!” or “Hello!”, respectively. The latter of which looks really ugly and yes, there the exclamation mark actually touches the closing quotation mark (because the quotation marks weren’t made italic).

        • Thin space for abbreviations, e. g. “e. g.”, “i. e.”, &c. How many people do you know who know that there needs to be a \, after the first full-stop?

        Besides, every non-trivial document (i. e. everything more than a few paragraphs) needs custom formatting to get right. LaTeX can take you maybe 90 % of the way but no further. And documents lacking the final 10 percent do look as horrible as documents lacking every sense of good formatting. But I digress.

        As for selection in word processors: Why should the selection follow the format, i. e. be tilted for italic text? The user doesn’t gain much that way since they already see that the selected text is italic. Also it’s pretty much the only style that actually alters the shape of the characters (or their bounding box). For a selection it’s usually very much clearer what styles are used than for a single caret between two characters. On that topic Jef Raskin wrote a few pages in his book The Humane Interface (p. 136 ff.) although I don’t particularly agree with him on that part. Plus he touches only text insertion and deletion (and a little bit of selection), yet no formatting.

        But formatting is an interesting matter. A while ago I helped a friend (a medical student) format a part of her thesis and I noticed a very interesting, but rather disturbing thing: She didn’t think of a header in terms of “Hey, this is a heading, how can I format it as such?”, which would have actually been easy since Word 2007 made the styles much more accessible than the physical formatting tools. Instead she thought “Hey, this is a heading, how can I make it large and bold?”. And that somewhat hurt. Large documents are never formatted directly with bold, italic and font sizes. You use styles. Be that Word, Writer or LaTeX, you have some facilities for telling the application “This is a heading” or “This should be emphasized”.¹ But when you then see people who write dozens of pages and still just think of the word processor as a slightly more colorful typewriter, then you actually ask yourself whether what we think of as the right way of writing documents is actually right.

        _______________________

        ¹ Though with LaTeX it’s really just a bunch of allegedly semantic macros around a bunch of physical formatting macros around a bunch of physical formatting commands. As a user it’s sometimes hard to figure out what is semantic in nature and what’s not, and there goes a big part of LaTeX’s advantage in this respect, in my eyes.

  5. digulla says:

    I agree with the shortcomings of (La)TeX. It’s a pity that Knuth stopped evolving the system further. Today, I’d expect TeX to recognize things like “i.e.” and add the nice tiny space. It’s not rocket science but probably a bit hard to do with the TeX parser.

    So again, the limitations of the software force us to come up with tedious workarounds. The same goes for user styles: Word processors shouldn’t offer buttons for bold/italics/underline. They should only offer styles. If people had to go through the style dialog every time to make something bold, they’d adopt using styles in no time (and refrain from using 16 different fonts per page just because there is a selection box in the toolbar). 🙂
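
Outside of TeX, the recognition itself might look roughly like the sketch below: a preprocessor that rewrites known abbreviations in the source before TeX sees it. The `ABBREVIATIONS` list and the `add_thin_spaces` name are assumptions for illustration; a real tool would also have to skip verbatim and math material.

```python
import re

# Assumption: a tiny illustrative list, not a complete one.
ABBREVIATIONS = ("e.g.", "i.e.")


def add_thin_spaces(source: str) -> str:
    r"""Rewrite 'e.g.' or 'e. g.' as 'e.\,g.' (thin space after the first full stop)."""
    for abbr in ABBREVIATIONS:
        first, second = abbr[:-1].split(".")  # "e.g." -> "e", "g"
        pattern = rf"\b{first}\.\s?{second}\."
        replacement = f"{first}.\\,{second}."
        # A function replacement avoids backslash processing by re.sub.
        source = re.sub(pattern, lambda m, rep=replacement: rep, source)
    return source
```

Text that already contains the `\,` is left alone, so running the function twice gives the same result as running it once.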
