DotBazaar Virtual Keyboard

DotBazaar Virtual Keyboard supports 100 characters. This range of characters covers many languages that use Arabic script or its variants as a writing system, including Arabic, Azeri, Jawi (Malay), Kurdish (Sorani), Persian and Urdu. The characters are distributed on the DotBazaar Keyboard across three levels: Normal, Shift and Lock. The following Q & A covers the keyboard and its functions.

The Virtual Keyboard has several uses including:

  1. If the user’s local keyboard does not contain all the characters needed for a desired domain name, the user can type the domain name on the Virtual Keyboard, copy the result to the clipboard and paste it into the application, e.g., the control panel of a registrar.

  2. The Virtual Keyboard provides other useful information about the domain such as the length of the A-labels (see Question 4) and the variant domains (Question 13).

The drop-down gives a choice of General, ANAJ and AKPU. General is the default state, where all 100 characters may be used. ANAJ limits the characters to those generally used in the Arabic language (including characters used mainly in North Africa or for foreign words in Arabic) and Jawi. AKPU contains the characters for Azeri, Kurdish, Persian and Urdu. There is no language restriction for dotBazaar domains, but some users may find it more convenient to use the less crowded ANAJ or AKPU version.

You can look up all 100 characters in the DotBazaar Character Table, which also displays up to four possible forms of each character, depending on its position in a label. Other information in that table includes the Unicode code point of each character (see Question 5), the joining type of each character and the languages in which a particular character is commonly used.

Since the Domain Name System (DNS) only supports US-ASCII characters, the domain name needs to be encoded. The standard algorithm, called "Punycode", produces a unique string called the ASCII-Label, or A-label for short (sometimes simply called the Punycode string). Such a string is identified by the initial prefix xn--. The A-label display also gives the count of A-label characters. For a domain to be acceptable in the domain name system, the A-label count must not exceed 63.
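As a rough illustration of the encoding and the 63-character limit, the following Python sketch builds an A-label with the standard library's raw "punycode" codec. The helper names to_a_label and a_label_fits are hypothetical, and the sketch performs none of the validity checks the Virtual Keyboard applies; it only shows the xn-- prefix and the length count.

```python
MAX_A_LABEL_LENGTH = 63  # limit stated in the text above

def to_a_label(u_label: str) -> str:
    """Return the A-label for one label (raw Punycode, no validation)."""
    if u_label.isascii():
        # Pure ASCII labels need no encoding.
        return u_label
    return "xn--" + u_label.encode("punycode").decode("ascii")

def a_label_fits(u_label: str) -> bool:
    """True if the encoded label stays within the 63-character limit."""
    return len(to_a_label(u_label)) <= MAX_A_LABEL_LENGTH

label = "\u0628\u0627\u0632\u0627\u0631"  # Beh, Alef, Zain, Alef, Reh
a_label = to_a_label(label)
print(a_label, len(a_label), a_label_fits(label))
```

Decoding works the same way in reverse: strip the xn-- prefix and decode the remainder with the "punycode" codec.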

The U-label, short for Unicode-Label, is how a string appears to the end-user. The universal Unicode coding system provides a unique code point for each character. The code points for the characters of the domain you type in will appear at the bottom left of the display under Code Points. The A-label or Punycode string discussed above is the compressed form of the totality of the code points for the domain.
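The relationship between a U-label and its code points can be sketched in a few lines of Python. The function name code_points is hypothetical; the U+XXXX notation is the usual way of writing Unicode code points.

```python
def code_points(label: str) -> list:
    """List the Unicode code point of each character in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in label]

# Beh, Alef, Zain, Alef, Reh written with explicit escapes
label = "\u0628\u0627\u0632\u0627\u0631"
print(code_points(label))
# → ['U+0628', 'U+0627', 'U+0632', 'U+0627', 'U+0631']
```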

There are a few exceptional cases where, due to font problems, the U-label display and the display on top of the Virtual Keyboard will not be entirely correct. The Reference View always presents the correct display; in most cases, it will be the same as the other two.

DotBazaar Virtual Keyboard provides a convenient means for checking the validity of a domain before you submit it for registration. Simply type in the desired string; if the string is not valid as a domain, a sentence such as <The domain name in not valid> or <Domain name contains invalid characters> will be returned under the keyboard display. There are syntactic restrictions that make certain strings invalid as dotBazaar domains. These rules may be found in the User Guide.

First note that there are three sets of digits on the DotBazaar Virtual Keyboard. The Normal display shows ASCII digits (also known as Western Arabic), the Shift display shows Eastern Arabic digits, usually used in conjunction with the Arabic language, and the Lock display gives the Extended Arabic digits, mainly used with Persian and Urdu. These digits carry distinct Unicode code points even when they look identical (Eastern Arabic and Extended Arabic digits are identical except for the digits 4, 5 and 6). Two rules limit the use of digits. One is that no domain may begin with a digit. The other is that all digits used in one label must come from a single one of the three digit sets.
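The two digit rules can be sketched as a small check. This is an illustration only, not the Virtual Keyboard's own code: the names DIGIT_SETS, digit_set_of and digits_ok are hypothetical, and the code point ranges are the standard Unicode blocks for ASCII digits (U+0030 to U+0039), Arabic-Indic digits (U+0660 to U+0669) and Extended Arabic-Indic digits (U+06F0 to U+06F9).

```python
# Hypothetical names; ranges are the standard Unicode digit blocks.
DIGIT_SETS = {
    "ascii": range(0x0030, 0x003A),
    "eastern_arabic": range(0x0660, 0x066A),
    "extended_arabic": range(0x06F0, 0x06FA),
}

def digit_set_of(ch: str):
    """Return the name of the digit set a character belongs to, or None."""
    cp = ord(ch)
    for name, cp_range in DIGIT_SETS.items():
        if cp in cp_range:
            return name
    return None

def digits_ok(label: str) -> bool:
    """Check the two digit rules: no leading digit, one digit set per label."""
    if label and digit_set_of(label[0]) is not None:
        return False  # rule 1: must not begin with a digit
    used = {digit_set_of(ch) for ch in label} - {None}
    return len(used) <= 1  # rule 2: all digits from a single set

print(digits_ok("abc123"))       # ASCII digits only, not leading
print(digits_ok("a1\u0661"))     # mixes ASCII and Eastern Arabic digits
```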

A domain label may not begin or end with a hyphen. Consecutive hyphens are not allowed.
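A minimal sketch of the hyphen rules, assuming the check is applied to the label as typed (the U-label) and not to the encoded A-label, whose xn-- prefix necessarily contains two hyphens. The function name hyphens_ok is hypothetical.

```python
def hyphens_ok(label: str) -> bool:
    """True if the label neither begins nor ends with a hyphen
    and contains no two consecutive hyphens."""
    return not (
        label.startswith("-")
        or label.endswith("-")
        or "--" in label
    )

print(hyphens_ok("my-shop"))   # single interior hyphen
print(hyphens_ok("my--shop"))  # consecutive hyphens
```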

ZWNJ stands for Zero-Width Non-Joiner. It is a blank half-space that is commonly used in many writing systems, including those of Kurdish, Persian and Urdu. On the DotBazaar Virtual Keyboard, it is placed in the upper right corner on the same key as the hyphen and is reachable in the Lock mode.

As in the case of the hyphen, a label cannot begin or end with ZWNJ, nor can there be consecutive ZWNJs. There are further limitations described in the User Guide. ZWNJ can be used only after (to the left of) a dual-joining letter character, i.e., a letter character that can be joined to the next character on the left or on the right, and it must be followed (on the left) by a right-joining or dual-joining letter character. There are three exceptions, namely the dual-joining letter characters U+0637 (ط), U+0638 (ظ) and U+06BE (Urdu letter Heh Doachashmi ھ), which may NOT be followed (on the left) by ZWNJ.
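The ZWNJ rules stated here can be sketched as follows. The joining-type table below is a small, deliberately incomplete illustration covering a dozen letters; the full data is in the DotBazaar Character Table, and the names JOINING, NO_ZWNJ_AFTER and zwnj_ok are hypothetical. The sketch covers only the rules given in this answer, not any further limitations from the User Guide.

```python
ZWNJ = "\u200c"

# Illustrative subset only: D = dual-joining, R = right-joining.
JOINING = {
    "\u0628": "D",  # Beh
    "\u0645": "D",  # Meem
    "\u0646": "D",  # Noon
    "\u0647": "D",  # Heh
    "\u06cc": "D",  # Farsi Yeh
    "\u0637": "D",  # Tah
    "\u0638": "D",  # Zah
    "\u06be": "D",  # Heh Doachashmi
    "\u0627": "R",  # Alef
    "\u0631": "R",  # Reh
    "\u0632": "R",  # Zain
    "\u0648": "R",  # Waw
}

# Dual-joining letters that may not be followed by ZWNJ.
NO_ZWNJ_AFTER = {"\u0637", "\u0638", "\u06be"}

def zwnj_ok(label: str) -> bool:
    """Check every ZWNJ in the label against the rules above."""
    for i, ch in enumerate(label):
        if ch != ZWNJ:
            continue
        if i == 0 or i == len(label) - 1:
            return False  # may not begin or end the label
        prev, nxt = label[i - 1], label[i + 1]
        # Must come after a dual-joining letter that is not an exception.
        # A preceding ZWNJ is not in the table, so doubles fail here too.
        if JOINING.get(prev) != "D" or prev in NO_ZWNJ_AFTER:
            return False
        # Must be followed by a right-joining or dual-joining letter.
        if JOINING.get(nxt) not in ("D", "R"):
            return False
    return True

# Meem, Farsi Yeh, ZWNJ, Reh, Waw, Meem
print(zwnj_ok("\u0645\u06cc\u200c\u0631\u0648\u0645"))
```

Because the table is partial, any letter missing from it is treated as not permitting ZWNJ, so a real check would need the complete joining data.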

Due to possible confusion with other characters, this character is not directly included in the DotBazaar Character Table. However, the character can be realized as follows: if this character is the last character of a domain or if it is followed by a non-joining letter character, simply use the identical-looking U+0647; otherwise, use U+0647 followed by ZWNJ.

If you use Shift together with Arabic Kaf (located on the same key as ASCII K) you obtain the Farsi/Urdu Keh.
The Yeh located on the key for ASCII letter I is for the Arabic language and the one located on the key for ASCII letter Y is for Farsi/Urdu. Alef Maksura is obtained by Shift+(Arabic Yeh). Other variations on this sound are obtained from the same two keys using Shift or Caps Lock.

There is no universally applicable definition of variant, but generally speaking, domains with different A-labels but identical or confusingly similar U-labels are considered variants. For dotBazaar domains, variants have a well-defined meaning based on the characters occurring in the label. Characters are considered variants of one another in the following two cases:

  1. Digits from different digit sets denoting the same numeral. For example: 6, ٦ and ۶ are considered variants.

  2. Certain letter characters are considered variants depending on their joining position in the label. The joining position is important because a letter character often takes on different shapes depending on the position. Example: Farsi Yeh (U+06CC) will be a variant of Arabic Yeh (U+064A) if it is followed by another letter (on the left), but not if it is the last character of a label or is not followed by a letter character. In the latter case, the Farsi and Arabic Yeh differ visually. On the other hand, Farsi Yeh will be a variant of Arabic Alef Maksura (U+0649) when it is not followed by another letter character.

Labels that are identical except for the occurrence of variant characters in one or more corresponding positions are considered variants.
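The first case (digit variants) is simple enough to sketch in code. The example below folds the Eastern Arabic and Extended Arabic digits onto ASCII digits and compares the results; it deliberately ignores the second, position-dependent case for letters. The names fold_digits and are_digit_variants are hypothetical.

```python
def fold_digits(label: str) -> str:
    """Map Eastern Arabic (U+0660..U+0669) and Extended Arabic
    (U+06F0..U+06F9) digits onto the ASCII digits 0-9."""
    out = []
    for ch in label:
        cp = ord(ch)
        if 0x0660 <= cp <= 0x0669:
            out.append(chr(cp - 0x0660 + 0x30))
        elif 0x06F0 <= cp <= 0x06F9:
            out.append(chr(cp - 0x06F0 + 0x30))
        else:
            out.append(ch)
    return "".join(out)

def are_digit_variants(a: str, b: str) -> bool:
    """True if two distinct labels differ only in the digit set used
    for the same numerals (case 1 above; letter variants not handled)."""
    return a != b and fold_digits(a) == fold_digits(b)

# ASCII 6 versus Eastern Arabic 6 in the same position
print(are_digit_variants("shop6", "shop\u0666"))
```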

Upon the registration of a domain, all its possible variants are blocked for registration by other applicants. This is to prevent possible confusion or abuse. Only the original applicant may register one or more of the blocked variants and activate them.

The document DotBazaar Variants provides a complete guide, but you can simply use DotBazaar Virtual Keyboard to verify. When you type in a label, all possible variant characters will be listed by position under Variants. Example: The label <بازار> shows no variants, but typing the label <پیرانکوه> will cause the display of 2, 2, 2, 2 and 3 variant characters respectively in positions 1, 2, 5, 6 and 8. This means that there are a total of 2x2x2x2x3=48 variants in this set, which can be obtained by choosing one character from each set of variant characters.
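The counting argument is a plain product over the per-position variant sets, and the variants themselves are the Cartesian product of those sets. The sketch below assumes the per-position sets have already been read off the Variants display; the function names are hypothetical. The small worked input uses the Arabic Kaf (U+0643) and Keheh (U+06A9) pair and the Arabic Yeh (U+064A) and Farsi Yeh (U+06CC) pair in a non-final position, followed by Feh (U+0641), which has no variant here.

```python
from itertools import product
from math import prod

def count_variants(set_sizes) -> int:
    """Total number of variant labels, given the number of variant
    characters available at each position."""
    return prod(set_sizes)

def enumerate_variants(position_sets) -> list:
    """Build every variant label by choosing one character per position."""
    return ["".join(choice) for choice in product(*position_sets)]

# Sizes from the example above: positions 1, 2, 5, 6 and 8
print(count_variants([2, 2, 2, 2, 3]))  # → 48

# Assumed per-position sets for a three-letter label
kaf_yeh_feh = [
    ["\u0643", "\u06a9"],  # Arabic Kaf / Keheh
    ["\u064a", "\u06cc"],  # Arabic Yeh / Farsi Yeh (non-final position)
    ["\u0641"],            # Feh, no variant
]
for variant in enumerate_variants(kaf_yeh_feh):
    print(variant)
```

Positions without variants contribute a factor of 1, so they can be included as single-character sets without changing the count.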

This depends entirely on the purpose for which the domain has been registered and the intended target audience of the associated business. For businesses tied to a limited locality, there is generally no need to register variants of the business name with characters not in use in that locality. On the other hand, for businesses targeting areas with different languages and keyboards, it is wise to register the variants that users are likely to type. One may then redirect all variants to the same site. In any case, certain variant combinations are of no practical value. Take the word <کیف>. Theoretically, there are four variants here because there are two choices for the first letter (Arabic Kaf or Farsi Keheh) and two choices for the second (Arabic or Farsi Yeh). But only two of the four variants can be of use: either both letters Arabic (for Arabic-speaking regions) or both letters Farsi (for Farsi- and Urdu-speaking regions). A combination of one Arabic and one Farsi letter will probably not find any practical use.