Latin-character Character URLs

AKA Geek talk - discuss technology in general; this may or may not relate to Japanese

Postby 血まみれ剣術師 » Thu 10.29.2009 2:47 pm

I'm sure many of you have already heard this news, but I'm posting it for those who haven't. The inevitable is coming... URLs that no longer require Latin-based characters. "When?" Sometime in the middle of next year, URLs in other languages' scripts will become available, and Japanese will be one of them. This technology isn't new; it has been in the works for several years. "...The English dominance over the internet is about to come to an end..." :?

What do you think?
血まみれ剣術師
 
Posts: 95
Joined: Wed 09.16.2009 8:26 pm
Location: North Carolina
Native language: English

Re: Latin-character Character URLs

Postby phreadom » Thu 10.29.2009 4:37 pm

http://www.w3.org/International/

?

Can you point me to more information on this? I'm curious what prompted you to mention this today.
猿も木から落ちる
phreadom
Site Admin
 
Posts: 1761
Joined: Sun 01.29.2006 8:43 pm
Location: Michigan, USA
Native language: U.S. English (米語)
Gender: Male

Re: Latin-character Character URLs

Postby 血まみれ剣術師 » Thu 10.29.2009 4:49 pm

Re: Latin-character Character URLs

Postby phreadom » Thu 10.29.2009 5:53 pm

That's what I wanted. Thanks. :)

Re: Latin-character Character URLs

Postby jimbreen » Thu 10.29.2009 8:31 pm

血まみれ剣術師 wrote:I'm sure many of you have already heard this news, but I'm posting it for those who haven't. The inevitable is coming... URLs that no longer require Latin-based characters. "When?" Sometime in the middle of next year, URLs in other languages' scripts will become available, and Japanese will be one of them. This technology isn't new; it has been in the works for several years. "...The English dominance over the internet is about to come to an end..." :?


I'll say it's not new. I wrote a paper mentioning it over 5 years ago. The .jp registry has allowed domain names in Japanese for about 6 years.

In fact, non-Latin URLs have been MUCH slower to take off than people expected. A lot of that has been the fault of browser developers, who haven't been that friendly in their implementations. Another factor is that once you have a domain name in, e.g., Chinese, it's very difficult for anyone outside China to access it unless they know how to key in Chinese. The experience in Japan was that a lot of companies, etc. dashed in and got Japanese domains, and then on reflection decided not to use them.

Jim
jimbreen
 
Posts: 164
Joined: Tue 06.27.2006 2:09 am
Location: Melbourne, Australia

Re: Latin-character Character URLs

Postby furrykef » Thu 10.29.2009 9:21 pm

Let's not confuse URLs with domain names. URLs with Japanese characters are already available and probably have been for quite a long time (though in some browsers, the characters will be converted to unreadable numeric codes after you actually enter the URL). The Japanese Wikipedia, for example, uses Japanese characters in all of its article titles and therefore the URLs of all of its articles.

Allowing non-Latin characters in domain names opens a can of worms beyond just accessibility, because some non-Latin characters have the same letterforms as Latin characters, allowing a form of domain name spoofing where a domain name looks the same as another domain name. For example, Greek capital letter alpha looks exactly the same as the capital letter A, but is encoded differently since it belongs to a different alphabet. I think some browsers already have defense mechanisms against this, but I still don't like giving potential ammunition to scammers for little practical advantage.
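To make the look-alike problem concrete, here is a minimal Python sketch that compares Latin capital A with Greek capital alpha and then flags a label that mixes scripts. The `scripts_in` helper and the "PΑYPAL" label are hypothetical illustrations, and guessing the script from the first word of a character's Unicode name is only a rough heuristic, not how any particular browser actually defends against spoofing.

```python
import unicodedata

latin_a = "A"            # U+0041
greek_alpha = "\u0391"   # U+0391, renders like a Latin "A" in most fonts


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def scripts_in(label: str) -> set:
    """Rough script guess: the first word of each letter's Unicode name.

    Hypothetical helper for illustration only.
    """
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


print(describe(latin_a))
print(describe(greek_alpha))
print(latin_a == greek_alpha)   # the two strings are different despite looking alike

# A label where the second letter is Greek alpha instead of Latin A.
spoof = "P\u0391YPAL"
print(scripts_in(spoof))        # more than one script in a single label is suspicious
```

A label that comes back with more than one script is a candidate for extra scrutiny, though plenty of legitimate names (Japanese ones mixing kana and kanji, for instance) mix scripts too, so a real check would need to be smarter than this.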

My guess is that most people the world over are perfectly comfortable with Latin domain names as it is. It'd probably suck if you're in, say, Greece or Russia, though, since unlike China (which has pinyin) and Japan (which has romaji), I think they simply don't use the Latin alphabet in their day-to-day lives.

- Kef
Last edited by furrykef on Fri 10.30.2009 10:35 am, edited 1 time in total.
Founder of Learning Languages Through Video Games.
Also see my lang-8 journal, where you can help me practice Japanese (and Spanish, and Italian!)
furrykef
 
Posts: 1572
Joined: Thu 01.10.2008 9:20 pm
Native language: Eggo (ワッフル語の方言)
Gender: Male

Re: Latin-character Character URLs

Postby phreadom » Fri 10.30.2009 12:57 am

jimbreen wrote:I'll say it's not new. I wrote a paper mentioning it over 5 years ago. The .jp registry has allowed domain names in Japanese for about 6 years.

In fact, non-Latin URLs have been MUCH slower to take off than people expected. A lot of that has been the fault of browser developers, who haven't been that friendly in their implementations. Another factor is that once you have a domain name in, e.g., Chinese, it's very difficult for anyone outside China to access it unless they know how to key in Chinese. The experience in Japan was that a lot of companies, etc. dashed in and got Japanese domains, and then on reflection decided not to use them.

Jim


Do you know of any examples, Jim? I'd be curious to try some out. :)

We were discussing this in the chat today... and I got the feeling that such domain names really wouldn't be used outside of that specific country or at least language group, thus really limiting their use and appeal.

I can see some countries doing it for nationalistic purposes (China, North Korea spring to mind, but I'm sure many others would do it for the same reason)... but the reality seems to be that this would only limit exposure and splinter the global nature of the web with a Babel effect.

Like you said, most people outside of a given country or language group aren't going to know how to key in different alphabets, syllabaries, logographies, etc., and thus wouldn't touch those websites.

(Perhaps they would simply be best used as aliases to existing sites, so that you could reach the site by different domain names in different languages but still get the same content... or perhaps you might get your own localized content by going to a website using your local version of its name... it will be interesting to see if/when it takes off.)

Re: Latin-character Character URLs

Postby jimbreen » Fri 10.30.2009 3:19 am

phreadom wrote:
jimbreen wrote:I'll say it's not new. I wrote a paper mentioning it over 5 years ago. The .jp registry has allowed domain names in Japanese for about 6 years.


Do you know any examples Jim? I'd be curious to try some out. :)


Try: http://えび田.jp/ - assuming your browser codes it OK.

Also try XN--MNQ89HQW2B.JP, which is the actual coding used in domain names. That site no longer exists, but you should be told the domain name (by Firefox, anyway - dunno about IE).
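For anyone curious how a Japanese name turns into one of those "xn--" strings, here is a minimal sketch using Python's built-in `idna` codec, which implements the older IDNA 2003 rules. It converts the えび田.jp example to its ASCII form and back again. This only shows the mechanism; it makes no claim that the result matches the XN-- string quoted above, which may belong to a different name.

```python
# Convert an internationalized domain name to its ASCII ("xn--") form and back.
# Python's built-in "idna" codec follows IDNA 2003 (RFC 3490) and uses
# Punycode (RFC 3492) for each non-ASCII label.

unicode_name = "えび田.jp"

# Each dot-separated label is encoded separately; plain ASCII labels like "jp"
# pass through unchanged.
ascii_name = unicode_name.encode("idna").decode("ascii")

# Decoding reverses the process and recovers the original Japanese label.
round_trip = ascii_name.encode("ascii").decode("idna")

print(ascii_name)
print(round_trip == unicode_name)
```

The ASCII form is what actually goes into DNS; the browser is responsible for showing the Japanese form to the user.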

Jim

Re: Latin-character Character URLs

Postby phreadom » Fri 10.30.2009 3:36 am

jimbreen wrote:Try: http://えび田.jp/ - assuming your browser codes it OK.

Also try XN--MNQ89HQW2B.JP, which is the actual coding used in domain names. That site no longer exists, but you should be told the domain name (by Firefox, anyway - dunno about IE).

Jim


Yep, the example you gave (in kana/kanji) worked fine. :) Thanks! (the other one didn't)

http://www.csse.monash.edu.au/~jwb/jwww.html

Looks like you already had an article up that touched on this as well... just stumbled across it while searching for some other examples. :) (unfortunately it seems most of the examples in this don't work either)

On a side note, this raises another issue here on the forum... the URL code in the forum software doesn't handle Japanese script, so you can't make actual links out of URLs such as http://えび田.jp/, or even Japanese Wikipedia links, without first encoding them...

So http://ja.wikipedia.org/wiki/%E3%83%AD%E3%83%9C%E3%83%83%E3%83%88 works... but http://ja.wikipedia.org/wiki/ロボット doesn't. :( (and you can even see right there how the auto-link-creation code creates a broken link because it doesn't handle the katakana portion of the link and thus just cuts it off)
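The encoding in question is ordinary percent-encoding of the UTF-8 bytes. As a minimal sketch (not the forum software's actual code, which isn't shown here), Python's standard `urllib.parse` can produce the working form of the link from the readable one and turn it back again:

```python
from urllib.parse import quote, unquote

base = "http://ja.wikipedia.org/wiki/"
title = "ロボット"

# quote() encodes the title as UTF-8 and writes each byte as %XX.
# Only the path segment is quoted, so the scheme and host stay untouched.
encoded = base + quote(title)
print(encoded)

# unquote() reverses the step, giving back the human-readable URL.
print(unquote(encoded))
```

Each katakana character takes three bytes in UTF-8, which is why the four-character title expands to twelve %XX groups.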

It's something on my TODO list to tackle... as it's kind of an important issue on a forum such as this. ;)

Re: Latin-character Character URLs

Postby hyperconjugated » Fri 10.30.2009 4:48 am

phreadom wrote:We were discussing this in the chat today... and I got the feeling that such domain names really wouldn't be used outside of that specific country or at least language group, thus really limiting their use and appeal.

I can see some countries doing it for nationalistic purposes (China, North Korea spring to mind, but I'm sure many others would do it for the same reason)... but the reality seems to be that this would only limit exposure and splinter the global nature of the web with a Babel effect.

Marshall Unger had an interesting view, expressing a certain irony in the situation:
"Even today, the vast majority of those who use Japanese script on computers input data in romanization; to that extent, even though they may refuse to read data in romanized form, they already, in a psychologically fundamental way, make use of an alphabetic representation of Japanese words and phrases."
Irgendwann fällt jede Mauer
hyperconjugated
 
Posts: 636
Joined: Fri 05.06.2005 5:12 pm
Location: Finland
Native language: Finnish

Re: Latin-character Character URLs

Postby keatonatron » Fri 10.30.2009 6:38 am

I made a post about this on this very site a good 3-4 years back.

I can see how it would be very helpful in Japan, because there are so many different ways to Romanize Japanese.

If you advertise your site as "しゅうしょく ドット コム", people wouldn't know if it's shuushoku.com, shyuushyoku.com, shūshoku.com, or something else. Shi/si, tsu/tu, zu/du would probably create the most problems. And if you have to explain how to romanize it, there's no real advantage to naming your site something easy to remember like 就職.com (which is now [er, has been] a valid URL).

I don't think the accessibility argument is that big of an issue. It's generally understood that Japanese URLs would only be used for sites that are targeted specifically at Japanese natives (i.e. the people with the hardest time remembering non-Japanese spelling!), and if an international site were needed, a second domain could/would be set up (kind of like the companies that register both a .co.jp and a .com name).

I do think the spoofing problems Furry mentioned would be a problem though... I hadn't thought of that.
keatonatron
 
Posts: 4838
Joined: Sat 02.04.2006 3:31 am
Location: Tokyo (Via Seattle)
Native language: English
Gender: Male

Re: Latin-character Character URLs

Postby jimbreen » Fri 10.30.2009 6:52 am

hyperconjugated wrote:Marshall Unger had an interesting view, expressing a certain irony in the situation:
"Even today, the vast majority of those who use Japanese script on computers input data in romanization; to that extent, even though they may refuse to read data in romanized form, they already, in a psychologically fundamental way, make use of an alphabetic representation of Japanese words and phrases."


Apply a lot of salt to Jim Unger's outpourings, especially on romanization.

Jim

Re: Latin-character Character URLs

Postby Yudan Taiteki » Fri 10.30.2009 8:34 am

That seems fairly non-controversial, though -- most Japanese people do use romaji input with the IME, as far as I'm aware.
-Chris Kern
Yudan Taiteki
 
Posts: 5609
Joined: Wed 11.01.2006 11:32 pm
Native language: English

Re: Latin-character Character URLs

Postby 血まみれ剣術師 » Fri 10.30.2009 2:52 pm

You're somewhat correct about the URLs, but the URLs to be released are entirely in non-Latin characters. The major difference is that you can do everything without Latin characters, and that's what makes this slightly different. People will no longer need Latin characters to use email or surf the web. Many thought this was impossible, but it's going to happen next year. :D

Domain names available today:
  Full Latin URL: domainname.TLD
  Semi-Latin URL: 실례.kr
Domain names coming:
  Localized: 실례.테스트

Re: Latin-character Character URLs

Postby 血まみれ剣術師 » Fri 10.30.2009 3:07 pm

Not really. A person will probably have the option of typing all three ways sooner or later. Completely localizing the characters will help the rest of the world connect. I highly doubt that minor obstacle will become a barrier.

Examples (Remember that the actual TLDs have not been released yet)
mofa.go.jp
外務省.go.jp
外務省.試験.日本
がいむしょう.しけん.にほん (probably)

hyperconjugated wrote:
phreadom wrote:We were discussing this in the chat today... and I got the feeling that such domain names really wouldn't be used outside of that specific country or at least language group, thus really limiting their use and appeal.

I can see some countries doing it for nationalistic purposes (China, North Korea spring to mind, but I'm sure many others would do it for the same reason)... but the reality seems to be that this would only limit exposure and splinter the global nature of the web with a Babel effect.

Marshall Unger had an interesting view, expressing a certain irony in the situation:
"Even today, the vast majority of those who use Japanese script on computers input data in romanization; to that extent, even though they may refuse to read data in romanized form, they already, in a psychologically fundamental way, make use of an alphabetic representation of Japanese words and phrases."

