Archive for konqueror

arrr! walk the plank ye scurvey dogs of the media!

(Happy Talk Like A Pirate Day.)

Well, I played around with EasyTags for a while, but two things came up that caused me to set it aside (and I may or may not return to using it). First, I realized that SoundJuicer (and for that matter audio-convert) still needed tweaking to properly handle the ID3 tags; I was still back at the general issue of — even though I converted the FLAC files to MP3– having to hand edit all of the tags. Erk.

Read the rest of this entry »

del.icio.us:arrr! walk the plank ye scurvey dogs of the media!  digg:arrr! walk the plank ye scurvey dogs of the media!

Comments

Part II: wandering through unicode, legacy fonts, and browsers

Precomposed versus Combining

In the course of putting together the encodings (called code points) in Unicode, a number of decisions had to be made regarding the current existing encodings, particularly well known and/or well established ones. In some cases, even though the Unicode Consortium has a particular policy regarding some encode vs render issues, there are inconsistent inclusions due to this grandfathering of prior established encodings and (to be quite honest) outright mistakes on the part of the Consortium. The question of precomposed characters versus combining characters is a classic one.

Read the rest of this entry »

del.icio.us:Part II: wandering through unicode, legacy fonts, and browsers  digg:Part II: wandering through unicode, legacy fonts, and browsers

Comments

Part I: wandering through unicode, legacy fonts, and browsers

In the beginning was ASCII, at seven bits. And it was good, until someone noticed a few missing characters. In this way, ASCII with eight bits was born. But alas! There were even more characters to be respresented. And thus began the exodus in search of ways to show these missing characters.

And the way split and conmingled between encoding and rendering. After all, since no encoding existed for some characters, those attempting to render them on the screen made up their own encodings as they went. And things continued in this sorry mess for years, until Unicode was born…

Of course at this point, now various different programs such as mailers, forums, word processors, internet browsers and even programming languages all are in the process of being updated to understand unicode. It’s a giant bear of a mess along many fronts, although I do believe Unicode is the way forward, the essential problem is that it’s been introduced at this point instead of at, say, the point ASCII originally came on the scene. But no matter, things will sort themselves out. Over the years and years, but hey.

So what, exactly, am I talking about? In order to represent the alphabet, two things are needed. First is a kind of representation that says which character we’re talking about. So for example (decimal) 65 is used to represent the letter ‘a’. The second kind of thing is whatever is used to render ‘a’. It could be a, or a or a. These two things are the encoding and the rendering of a given character. Conceptually these two properties of a character should be distinct. In practice, of course, it’s not always been that cleanly handled, and there are some issues where the lines are legitimately blurred.

The problem was, of course, that the original encoding table (ASCII) was much too limited to handle languages other than English. To address this, a number of ISO 8859’s were developed to cover additional characters such as ß or ñ and other marks and symbols such as © and £. However, since rendering (or typography) was not considered in these representations. a number of legacy fonts developed that used additional proprietary (and conflicting) encodings for additional information not covered in the standards. And in all of this, languages that did not even use the Latin alphabet (such as Greek, Russian, Arabic, Hebrew, Japanese, Chinese, and so on and on) were definitely ill-represented overall. In most cases there are several possible representations and standards to use, which results in a nightmare for anyone trying to represent extended or other character sets in programs that make use of them. (Which basically potentially includes any program which ever tries to communicate with its user in anything other than international symbols, but I digress.)

Unicode

The idea behind Unicode is to create one giant encoding standard for all of this (leaving the typography alone and up to whatever rendering a particular program wishes to use, or whatever font set the user has installed). Sounds simple enough although even this idea is fraught with complexity and inconsistencies. For example Unicode absorbed many of the original encoding standards in order to ensure backward compatibility; and has made various inconsistent decisions on the inclusion of other characters in different ways. However. the underlying concept is sound, and if it takes another twenty years to refine it, the end result should still be better than the cacaphony there is now.

Enough with the background. I want to discuss most of this in the context of browser rendition, since this is a good deal of what I work on anyhow.

Read the rest of this entry »

del.icio.us:Part I: wandering through unicode, legacy fonts, and browsers  digg:Part I: wandering through unicode, legacy fonts, and browsers

Comments (1)

traversing the browser family tree

One thing I find fascinating is the way in which browsers have gone forth and multiplied over this world. It’s not immediately obvious, but some browsers are actually quite closely related. What follows is a rough chronology of the more well known browsers. There were (and are) many more, of course, and a browse through the sources listed at the end may be of interest.

Read the rest of this entry »

del.icio.us:traversing the browser family tree  digg:traversing the browser family tree

Comments

Bad Behavior has blocked 443 access attempts in the last 7 days.