The Space Between Two Characters

31. August, 2008

If you’re claustrophobic, you’re afraid of confined spaces. If you’re a software developer, you can be afraid of non-existing space.

When it comes to editing text, we usually don’t think about the space between the characters. There simply isn’t any. When you write a text editor, things start to look different. Now you have a caret or cursor which sits between the characters, and that space between two characters can suddenly become uncomfortably tight.

Fire up your favorite text editor, Word, Writer, whatever. It has to support character formatting, though. Now enter this:

Hello, world!

If in doubt: “Hello” is bold, with the bold ending before the comma, and the italic part runs from the “w” through the “!”, inclusive.

Now move to the “e” and type “x”. What do you get?

Hxello, world!

Piece of cake.

Now move to the “H” and type “x”. What do you get? Is the new x bold or not? Do you get “xHxello” with the leading x in bold, or with the leading x in plain text? How about typing “x” after the “o” of our abused “Hello”? Is that new x bold or not? If it is, what is the simplest way to make it non-bold? Do you have to delete the comma, do you have to go through menus or toolbars, or is there a simple, consistent way to add a character inside and outside of a formatted range of text?

Let’s go one step further. Add a character after the “!”. Is it italic? If not, you’re lucky. If it is … what’s the simplest way to get rid of the italic? If you press Return now, will the italic leak to the next line? If not, how can you make it leak? If that italic is the last thing in your text, can you add non-italic text beyond it without fumbling with the formatting options?

There is no space between two characters, and when you write a text editor, that non-existing space comes back to bite you. This is the actual problem: There is no consistent way to move in and out of a formatted range of characters.
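To make the ambiguity concrete, here is a minimal Python sketch. It assumes the text is stored as a list of styled runs, which is just one possible model. The names RUNS, styles_around and is_ambiguous are made up for this illustration and do not come from any real editor.

```python
from typing import FrozenSet, List, Optional, Tuple

Style = FrozenSet[str]
Run = Tuple[str, Style]

PLAIN: Style = frozenset()
BOLD: Style = frozenset({"bold"})
ITALIC: Style = frozenset({"italic"})

# The example from above: bold "Hello", plain ", ", italic "world!".
RUNS: List[Run] = [("Hello", BOLD), (", ", PLAIN), ("world!", ITALIC)]


def styles_around(runs: List[Run], offset: int) -> Tuple[Optional[Style], Optional[Style]]:
    """Return the style of the character left and right of the caret.

    None means there is no character on that side (start or end of text).
    """
    left: Optional[Style] = None
    right: Optional[Style] = None
    pos = 0
    for text, style in runs:
        start, end = pos, pos + len(text)
        if start < offset <= end:
            left = style   # the caret sits after a character of this run
        if start <= offset < end:
            right = style  # the caret sits before a character of this run
        pos = end
    return left, right


def is_ambiguous(runs: List[Run], offset: int) -> bool:
    """True if the two neighbours disagree about the style of new text."""
    left, right = styles_around(runs, offset)
    return left != right
```

With this model, offset 1 (between “H” and “e”) is harmless because both neighbours are bold. Offsets 0, 5, 7 and 13 are the trouble spots: the plain character offset simply does not carry enough information to pick a style.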

The naive attempt would be to say “depending on the side you came from, you’re inside or outside.” So, if we have this (| is the cursor or caret): “Hello |world” and you type something, the question is: How did the caret end up there? Did it come from “w|o” and move one position to the left? Or from “o| ” and move one position to the right?
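Here is a rough Python sketch of that naive rule. It assumes one style name per character and that “world” is the formatted (italic) part of the example. Caret, came_from, STYLES and style_for_insert are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

TEXT = "Hello world"
# Assumption for this sketch: "Hello " is plain and "world" is italic.
STYLES: List[str] = ["plain"] * 6 + ["italic"] * 5


@dataclass
class Caret:
    offset: int
    came_from: Optional[str] = None  # "left", "right", or None if placed directly

    def move_left(self) -> None:
        if self.offset > 0:
            self.offset -= 1
            self.came_from = "right"

    def move_right(self, length: int) -> None:
        if self.offset < length:
            self.offset += 1
            self.came_from = "left"


def style_for_insert(styles: List[str], caret: Caret) -> Optional[str]:
    """Pick the style for typed text based on the caret's last movement."""
    left = styles[caret.offset - 1] if caret.offset > 0 else None
    right = styles[caret.offset] if caret.offset < len(styles) else None
    if caret.came_from == "left":
        # Moved right across the left neighbour, so stay in its style.
        return left if left is not None else right
    if caret.came_from == "right":
        # Moved left across the right neighbour, so stay in its style.
        return right if right is not None else left
    return None  # placed by mouse click or similar: the rule has no answer
```

Coming from “w|o” and moving left yields italic, coming from “o| ” and moving right yields plain. A caret placed directly at that position gets no answer at all, which is the first hint that the rule is incomplete.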

That works somewhat, but it fails at the beginning and the end of the text, and you’re in trouble when deleting text. What should happen after the last character of “Hello” has been deleted? Should that also delete the formatted range, or should an empty, invisible bold range be left behind so that whatever you type next is bold again? If you keep the empty invisible range, when do you drop it? Do you keep it as long as the user stays “in” it? Or until the document is saved? Or until it is loaded again from disk?

It’s a mess and there is a reason why neither Word nor OpenOffice gets it right: You can’t. There is information in the head of the user (what she wants) but no way for her to tell the computer. Duh.

That is, unless you start to give the user a visual cue about what is going on. The problem we have is that there is no simple, obvious way for the user to say “I want …” because there is no space on the screen reserved for this. We barely manage to squeeze a caret between the characters. There is just not enough room.

Well, there could be. A simple solution might be to add a little hint to the cursor to show which way it is leaning right now. Right. How about “A|B”, where A is bold and B is italic? Here, you have three options: add bold, italic or normal text.

In HTML, this is simple. I’m editing this text in Firefox using the standard text area. What looks fancy to you looks like this to me: “<b>A</b>|<i>B</i>”

And this is the solution: I need to add a visual cue for the start and end of the format ranges. Maybe a simple U-shape which underlines the text for which the character format applies. Or an image (> and < in this example): “>A<|>B<”. And suddenly, it’s completely obvious on which side of a range start or end you are and what you want. You can delete the text in the range without losing the range itself, or you can delete both, and you can move in and out of the range at will.
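A minimal Python sketch of that idea, assuming the range boundaries are stored as tokens in the same sequence as the characters, so the caret can stop on either side of them. The token layout and the names tokens_for, active_styles, render and delete_char_before are invented for this illustration, and it only handles non-nested ranges.

```python
from typing import FrozenSet, List, Optional, Tuple

Token = Tuple[str, str]  # ("open", style), ("close", style) or ("char", c)


def tokens_for(runs: List[Tuple[str, Optional[str]]]) -> List[Token]:
    """Turn styled runs into characters plus explicit boundary markers."""
    tokens: List[Token] = []
    for text, style in runs:
        if style:
            tokens.append(("open", style))
        tokens.extend(("char", c) for c in text)
        if style:
            tokens.append(("close", style))
    return tokens


def active_styles(tokens: List[Token], caret: int) -> FrozenSet[str]:
    """Styles that apply to text typed at this caret position."""
    active: List[str] = []
    for kind, value in tokens[:caret]:
        if kind == "open":
            active.append(value)
        elif kind == "close":
            active.remove(value)
    return frozenset(active)


def render(tokens: List[Token], caret: int) -> str:
    """Draw the tokens with > and < as markers and | as the caret."""
    out: List[str] = []
    for i, (kind, value) in enumerate(tokens):
        if i == caret:
            out.append("|")
        out.append(">" if kind == "open" else "<" if kind == "close" else value)
    if caret == len(tokens):
        out.append("|")
    return "".join(out)


def delete_char_before(tokens: List[Token], caret: int) -> Tuple[List[Token], int]:
    """Delete the character before the caret; markers are left alone."""
    if caret > 0 and tokens[caret - 1][0] == "char":
        return tokens[:caret - 1] + tokens[caret:], caret - 1
    return tokens, caret
```

For the runs [("A", "bold"), ("B", "italic")], caret positions 2, 3 and 4 render as “>A|<>B<”, “>A<|>B<” and “>A<>|B<” and give bold, normal and italic respectively. Deleting the “A” leaves “>|<>B<”, an empty range that is still there and still bold.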

The drawback is that you need to keep that information somewhere. It adds a pretty huge cost to the boundaries of a format range. I’ll have to try and see how much that is and if I can get away with less by cleverly using the information I already have.

Also, it clearly violates WYSIWYG. On the other hand, we get WYSIWYW which is probably better for the user.


DecentXML 1.2

31. August, 2008

DecentXML 1.2, my own XML 1.1-compliant parser, is now available.