Archive for opera

Part III: wandering through unicode, legacy fonts, and browsers

Unicode — Supplementary Planes

Back to Unicode. Even Unicode needs room to expand. The Basic Multilingual Plane (BMP, also known as Plane 0) has code points 0 through FFFF hex (0 through 65,535) and contains, ignoring various special mathematical characters, historical quirks, and just plain oddities, most of the alphabets and symbols in use today, including the vast array of Chinese characters (hanzi, or kanji as they are called in Japanese). But of course more room is needed once historical alphabets and other special character sets are also considered. Supplementary planes such as Plane 1 are simply further sets of 65,536 code points following Plane 0 (some of us refer to these as astral planes :-) ). Using hexadecimal notation, it’s easy to spot Plane 1: all Plane 0 code points fit in four hexadecimal digits, while all Plane 1 code points need five and start with a 1 (10000 through 1FFFF).
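Here is a quick sketch of that arithmetic in JavaScript, since that is what the browsers run anyway. The function names (planeOf, formatCodePoint, toCharRef) are my own inventions for illustration, not part of any library; the only built-ins used are codePointAt, toString, and padStart.

```javascript
// Return the Unicode plane number (0 = BMP, 1 = first supplementary plane, ...).
// Each plane holds 0x10000 (65,536) code points, so shifting right by 16 bits
// is the same as dividing by 0x10000 and rounding down.
function planeOf(codePoint) {
  if (!Number.isInteger(codePoint) || codePoint < 0 || codePoint > 0x10FFFF) {
    throw new RangeError('not a Unicode code point: ' + codePoint);
  }
  return codePoint >> 16;
}

// Format a code point in the usual U+XXXX style (at least four hex digits).
function formatCodePoint(codePoint) {
  return 'U+' + codePoint.toString(16).toUpperCase().padStart(4, '0');
}

// Build the HTML numeric character reference for the first character of a string.
function toCharRef(text) {
  return '&#x' + text.codePointAt(0).toString(16).toUpperCase() + ';';
}

const frakturG = '\u{1D50A}'; // MATHEMATICAL FRAKTUR CAPITAL G, a Plane 1 character
console.log(formatCodePoint(frakturG.codePointAt(0)), 'is in plane', planeOf(frakturG.codePointAt(0)));
console.log(toCharRef(frakturG));
```

One side effect worth knowing: JavaScript strings are UTF-16, so a Plane 1 character such as the one above has a length of 2 even though it is a single code point.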

Here is an example of a Plane 1 character: &#x1D50A; which renders as 𝔊 (marks a Septuagint reference). In case anyone viewing this is having trouble, here’s an image:

Setting things up to view this can be an interesting process. First of all, it’s necessary to find a Unicode font that supports Plane 1 characters. Not all such fonts will actually show the above. For example, Code2001, an otherwise excellent Unicode font, does not include historical Greek musical notation; Cardo, however, works nicely. Just because a font is Unicode based does not mean it will be “complete” (such a font would be staggeringly large). In particular, when choosing among Unicode fonts, pay attention to which version of Unicode is being supported, and what the target languages are. Plane 1 characters start to appear in Unicode versions 3.1 and up.

Second, the operating system might need slight adjusting to see Plane 1. I’m talking, of course, about Windows, XP and earlier versions. This page discusses how to do the registry edits necessary both for the operating system and for IE6 8-|.

Linux/*nix, Mac, and Vista are all presently able to handle Plane 1 without modifications. My understanding is that for Windows Me and anything prior to NT, it’s hopeless.

Third, applications in general and browsers in particular may need slight tweaking to see Plane 1 characters. As usual, IE6 is the worst offender in this regard, requiring not just setup but also a registry edit. In general, once a browser is set up with a Plane 1 aware font, it’s good to go. Firefox, Safari, Opera, and Konqueror all fall under this category.

A few notes: IE6 needs to be set to user-defined encoding plus the extended font needs to be listed under User Defined (rather than any of the specific regions listed, which refer back to non-unicode encodings as discussed in part 1). Opera actually reverses from the general MO of other browsers: under Tools->Preferences->Advanced->International Fonts, it’s possible to set a particular unicode font to a particular language (going by Unicode’s code blocks). This is the direction other browsers should no doubt take in the future.


fun with browsers and javascript

Since javascript runs on the client side (meaning that it is the browser, and not the host site as with php/perl/other cgi scripting, that actually runs the script), I can have endless fun finding out the many ways in which javascript gets completely hosed up.

I mean, take a pretty simple bit of javascripting:
document.getElementById('cleanslate').innerHTML = s;
where s is a nice little list of selectable (checkbox) items in a list to be output.
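The post doesn’t show how s gets put together, so the following is only a guess at its shape: a small sketch that builds a checkbox list as a string. The names buildCheckboxList and escapeHtml, the field name 'pick', and the item values are all hypothetical.

```javascript
// Escape text so it can sit safely inside HTML markup or attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the markup for a list of checkboxes. `name` is the form field name and
// `items` is an array of labels; both are assumptions made for illustration.
function buildCheckboxList(name, items) {
  const rows = items.map(function (item) {
    return '<li><label><input type="checkbox" name="' + escapeHtml(name) +
      '" value="' + escapeHtml(item) + '"> ' + escapeHtml(item) + '</label></li>';
  });
  return '<ul>' + rows.join('') + '</ul>';
}

const s = buildCheckboxList('pick', ['first item', 'second item']);
// In a browser, s would then be handed to the element as in the line above:
// document.getElementById('cleanslate').innerHTML = s;
```

Note that a string built this way contains a block-level list, which matters for where in the page it is allowed to land.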

And then of course, in the html we have:
<span id = "cleanslate"></span>

How much simpler does it get?

Well. It worked beautifully in Firefox.
Didn’t do a single thing in IE6.

And it’s hard to figure out where to even start. In no amount of googling have I found a book, or a faq, or any kind of documentation that extensively details the ways in which Javascript differs from browser to browser. I would think this would be a huge niche. I certainly found many many questions on this stuff. Even if the differences are under constant change, I would still expect at least some online resources to be out there. (Anyone know of any?)

I did find another site, Chameleon, that demonstrated a very similar problem, but it offered no solution, only the illustration.

After much experimenting, I finally figured out that the issue was really this:


<p>
<span id = "cleanslate"></span>

There is something about the unclosed <p> that IE6 does not like. This actually kind of makes sense: since it’s not closed (yes, I know, slap my hands already), the question is what the actual DOM is. Firefox obviously defaults to something reasonable to handle it; IE6 takes the novel approach of displaying nothing at all. Well, the “Sun Java Console” (now there’s a misnomer if there ever was one) does come up with the ever helpful:
"Unknown run time error on line..."

As an aside: I just gotta love this “Sun Java Console” for its utter uselessness. It’s not even possible to copy and paste the message somewhere for future reference. I can’t make use of IE6 until I close the window. It offers no means of running anything, debugging, tracing, etc. There’s an extension available at Microsoft but even after I jumped through the WGA hoops to download it, when I ran it to install it, it said “No Installation Data available.” It’s particularly annoying after using so many lovely Firefox extensions that enable extensive Javascript debugging.

But here’s the odd part. Removing the <p> altogether fixed IE6. Closing it before the opening <span> worked as well. But enclosing the span within the paragraph did not work. Firefox worked throughout all these variations.

Opera, unfortunately, is an entirely different story…! It looks fine, on the first pass. But it’s clearly not actually getting the checkboxed items into its form when the form button on that page is pressed. Ugh.


Part II: wandering through unicode, legacy fonts, and browsers

Precomposed versus Combining

In the course of putting together the encodings (called code points) in Unicode, a number of decisions had to be made regarding existing encodings, particularly the well known and/or well established ones. In some cases, even though the Unicode Consortium has a particular policy regarding some encode vs render issues, there are inconsistent inclusions due to this grandfathering of previously established encodings and (to be quite honest) outright mistakes on the part of the Consortium. The question of precomposed characters versus combining characters is a classic one.
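To make the distinction concrete, here is a small JavaScript sketch using é as the example (my choice, not one taken from the post). It can be encoded either as the single precomposed code point U+00E9 or as a plain e followed by the combining acute accent U+0301. The helper name sameText is made up; String.prototype.normalize is the standard built-in.

```javascript
const precomposed = '\u00E9'; // é as one precomposed code point
const combining = 'e\u0301';  // e followed by COMBINING ACUTE ACCENT

// Compare two strings after bringing both to the same normalization form (NFC),
// so that precomposed and combining spellings of the same text compare equal.
function sameText(a, b) {
  return a.normalize('NFC') === b.normalize('NFC');
}

console.log(precomposed === combining);                   // raw comparison of code units
console.log(sameText(precomposed, combining));            // comparison after normalizing
console.log(precomposed.normalize('NFD') === combining);  // decomposing the precomposed form
```

The two spellings should look identical on screen but compare as different strings until they are normalized, which is exactly the kind of trouble the precomposed versus combining question causes in practice.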


Part I: wandering through unicode, legacy fonts, and browsers

In the beginning was ASCII, at seven bits. And it was good, until someone noticed a few missing characters. In this way, ASCII with eight bits was born. But alas! There were even more characters to be represented. And thus began the exodus in search of ways to show these missing characters.

And the way split and commingled between encoding and rendering. After all, since no encoding existed for some characters, those attempting to render them on the screen made up their own encodings as they went. And things continued in this sorry mess for years, until Unicode was born…

Of course, at this point various programs such as mailers, forums, word processors, internet browsers and even programming languages are all in the process of being updated to understand Unicode. It’s a giant bear of a mess along many fronts. I do believe Unicode is the way forward; the essential problem is that it’s been introduced at this point instead of at, say, the point ASCII originally came on the scene. But no matter, things will sort themselves out. Over the years and years, but hey.

So what, exactly, am I talking about? In order to represent the alphabet, two things are needed. First is a kind of representation that says which character we’re talking about. So for example (decimal) 65 is used to represent the letter ‘A’. The second kind of thing is whatever is used to render ‘A’: the same letter could be drawn in a serif face, a sans-serif face, or a script face. These two things are the encoding and the rendering of a given character. Conceptually these two properties of a character should be distinct. In practice, of course, it’s not always been that cleanly handled, and there are some issues where the lines are legitimately blurred.
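A tiny JavaScript sketch of the two halves, for the curious. In ASCII (and therefore in Unicode) decimal 65 is uppercase A and 97 is lowercase a. The function renderIn and the font names are purely illustrative; charCodeAt and String.fromCharCode are standard built-ins.

```javascript
// Encoding: which number stands for which character.
const codeForA = 'A'.charCodeAt(0);           // the number that encodes 'A'
const backAgain = String.fromCharCode(65);    // the character that 65 encodes

// Rendering: the same encoded character, wrapped in markup asking for
// different typefaces. The code stays the same; only the font changes.
function renderIn(fontFamily, code) {
  return '<span style="font-family: ' + fontFamily + '">&#' + code + ';</span>';
}

const renderings = ['serif', 'sans-serif', 'cursive'].map(function (font) {
  return renderIn(font, codeForA);
});
console.log(codeForA, backAgain, renderings);
```

Every one of the three spans carries the identical character reference, which is the point: the encoding says what the character is, and the font decides how it looks.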

The problem was, of course, that the original encoding table (ASCII) was much too limited to handle languages other than English. To address this, a number of ISO 8859’s were developed to cover additional characters such as ß or ñ and other marks and symbols such as © and £. However, since rendering (or typography) was not considered in these representations, a number of legacy fonts developed that used additional proprietary (and conflicting) encodings for additional information not covered in the standards. And in all of this, languages that did not even use the Latin alphabet (such as Greek, Russian, Arabic, Hebrew, Japanese, Chinese, and so on and on) were definitely ill-represented overall. In most cases there are several possible representations and standards to use, which results in a nightmare for anyone trying to represent extended or other character sets in programs that make use of them. (Which potentially includes any program that ever tries to communicate with its user in anything other than international symbols, but I digress.)
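To see the conflict in action, here is a hedged JavaScript sketch that decodes the very same byte under three of the ISO 8859 parts. It assumes an environment where TextDecoder knows these legacy labels (current browsers do, and so does Node.js when built with full ICU); the helper name decodeByte and the choice of byte 0xE1 are mine.

```javascript
// Decode a single byte under a given legacy encoding label.
// Note: the Encoding Standard treats the label 'iso-8859-1' as windows-1252,
// which agrees with Latin-1 for the byte used here.
function decodeByte(byte, encodingLabel) {
  return new TextDecoder(encodingLabel).decode(new Uint8Array([byte]));
}

const labels = ['iso-8859-1', 'iso-8859-5', 'iso-8859-7'];
const readings = labels.map(function (label) {
  return decodeByte(0xE1, label);
});

// One byte, three different characters: Latin, Cyrillic, and Greek.
labels.forEach(function (label, i) {
  console.log(label, '->', readings[i]);
});
```

Without knowing which table the author had in mind, a program has no way to tell which of the three characters the byte was supposed to be.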

Unicode

The idea behind Unicode is to create one giant encoding standard for all of this (leaving the typography alone and up to whatever rendering a particular program wishes to use, or whatever font set the user has installed). Sounds simple enough, although even this idea is fraught with complexity and inconsistencies. For example, Unicode absorbed many of the original encoding standards in order to ensure backward compatibility, and has made various inconsistent decisions on the inclusion of other characters. However, the underlying concept is sound, and if it takes another twenty years to refine it, the end result should still be better than the cacophony there is now.

Enough with the background. I want to discuss most of this in the context of browser rendition, since this is a good deal of what I work on anyhow.


rendering engines revisited

No! Don’t groan! I can be long winded, yes, but this one’s short and sweet. There was a point I had wanted to add to the last post but, as happens, it got left out. But it is a good one… why, exactly, do I care so much about which browsers use which rendering engines? Or even about their general history?

When I’m checking to see if my web pages work across various platforms and various browsers, I don’t have to start out with an all encompassing, exhausting review of how they appear in each and every one. If I hit the four main engines in the order of their percentage of use (Trident, Gecko, Presto (Opera), and KHTML), I’m in a very good position. Granted, for more troublesome nit picky detail, it is possible for two different Gecko-based browsers to have different problems with some bit of HTML or CSS or JS or other, but in the main, I’ve likely shaken out all the major glaring problems in the most efficient manner.

Knowing that Safari has the lion’s share of the Mac OS lets me ensure that I’ve taken care of the majority of Mac users by checking with that one. If there’s some issue with MacIE, I can pretty much leave it alone or let it degrade. Or if I’m pressed for time, I can choose Safari over MacIE or Camino or Mac Firefox and get pretty good Mac coverage.

Speaking of which, I’m aware that this blog doesn’t display well in Safari. I’m hoping to have some time on a borrowed Mac soon to figure out why.
