
Basic Reading Material and Vocabulary

Have a textbook or grammar book that you find particularly helpful? What about a learning tip to share with others?

Postby xhilononi234 » Fri 12.04.2009 2:34 am

Does anyone know of any good (free) online sources for reading Japanese stories besides TJP? I have a pretty small vocabulary, and lists don't really work well for me when I'm studying on my own. (I'm in a 101 class right now and already know pretty much everything, because the vocabulary is mostly thematic and I use it a lot in class, so it stays in my head. Also, the fact that I studied through a 101-level book on my own didn't hurt.)

Most of the new words I come across are from songs, which aren't really accessible, at least legally, to people in America. Also, the best way for me to memorize words is to remember a context they appear in. For example, in 井上ジョーの (Joe Inoue's) CLOSER, I remembered 最近, 体験, 一体, and 恵む because of the verse 「あなたが最近体験した幸せは一体何ですか。恵まれ過ぎていて思い出せないかも。」. Normally I'll only remember part of a line and maybe one or two words. I remembered so many from this one because some of the kanji had just been sent to me in the TJP kanji mailing list.

But I digress. Sometimes, when all else fails, I try to find lists, but most of them cover very basic vocabulary that I already know. Does anyone know of sites with good vocabulary lists that go beyond basic expressions, house and restaurant words, etc.?

Thank you!
xhilononi234
 
Posts: 44
Joined: Fri 12.15.2006 8:44 pm

Re: Basic Reading Material and Vocabulary

Postby wccrawford » Fri 12.04.2009 7:31 am

You mentioned learning from music. Have you seen jpopasia.com? It has music videos, and people post the lyrics in romaji and kanji along with an English translation.
wccrawford
 
Posts: 41
Joined: Mon 05.12.2008 8:53 pm

Re: Basic Reading Material and Vocabulary

Postby Darkseed74 » Fri 12.04.2009 10:49 am

xhilononi234 wrote: Does anyone know of any good (free) online sources to read Japanese stories online besides TJP? I have a pretty small vocabulary and lists don't really work well for me when I'm studying on my own!

Perhaps something like Old Stories of Japan?
Darkseed74
 
Posts: 28
Joined: Sun 08.31.2008 8:47 am
Native language: Italian
Gender: Male

Re: Basic Reading Material and Vocabulary

Postby xhilononi234 » Fri 12.04.2009 1:40 pm

Thanks, everyone!

Darkseed, do you know if they are in Japanese? I checked out the site, but everything was in English.
xhilononi234
 
Posts: 44
Joined: Fri 12.15.2006 8:44 pm

Re: Basic Reading Material and Vocabulary

Postby Darkseed74 » Fri 12.04.2009 1:51 pm

Yes, you have to click on "Japanese" at the top of each story (near the banner with the title). :)
Darkseed74
 
Posts: 28
Joined: Sun 08.31.2008 8:47 am
Native language: Italian
Gender: Male

Re: Basic Reading Material and Vocabulary

Postby clay » Fri 12.04.2009 2:17 pm

I know you said besides TJP, but have you seen these?
http://thejapanesepage.com/ebooks

Hana and Yuki may be difficult, but the other three are targeting beginner / upper beginner.
TheJapanShop.com- Japanese language learning materials
Checkout our iPhone apps: TheJapanesePage.com/iPhone
clay
Site Admin
 
Posts: 2809
Joined: Fri 01.21.2005 9:39 am
Location: Florida

Re: Basic Reading Material and Vocabulary

Postby xhilononi234 » Sat 12.05.2009 6:16 pm

Thank you, everyone. I just need to figure out how to get Old Stories of Japan to work right. I clicked on Japanese and the characters aren't displayed properly. I'll try messing with things! (I changed my browser. It worked with Internet Explorer, but not with Mozilla.)
xhilononi234
 
Posts: 44
Joined: Fri 12.15.2006 8:44 pm

Re: Basic Reading Material and Vocabulary

Postby yukamina » Sun 12.06.2009 4:14 pm

xhilononi234 wrote: Thank you, everyone. I just need to figure out how to get Old Stories of Japan to work right. I clicked on Japanese and the characters aren't displayed properly. I'll try messing with things! (I changed my browser. It worked with Internet Explorer, but not with Mozilla.)

Did you try changing the encoding in Mozilla?
yukamina
 
Posts: 288
Joined: Tue 06.05.2007 1:41 am

Re: Basic Reading Material and Vocabulary

Postby phreadom » Sun 12.06.2009 4:42 pm

xhilononi234 wrote: Thank you, everyone. I just need to figure out how to get Old Stories of Japan to work right. I clicked on Japanese and the characters aren't displayed properly. I'll try messing with things! (I changed my browser. It worked with Internet Explorer, but not with Mozilla.)


The encoding isn't detected correctly (and should be specified by the people who made the pages technically), so you need to go to "View" → "Character Encoding" → "Japanese (EUC-JP)". If you don't see it listed there, go to "View" → "Character Encoding" → "More Encodings" → "East Asian" → "Japanese (EUC-JP)" instead, and then the page will work fine.

("View" being up in the menu at the top of your browser... File, Edit, View, History, etc.)

I just tested it out here. :)
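To illustrate why the manual override helps, here is a minimal Python sketch (nothing from the site itself): the same bytes read correctly when decoded as EUC-JP and turn into garbage when a Western single-byte encoding is assumed. The sample string and the choice of Latin-1 as the "wrong guess" are assumptions for illustration only.

```python
# Sample text (hypothetical; any Japanese string would do).
text = "昔話"

# The bytes a server would send if the page was saved as EUC-JP.
raw = text.encode("euc_jp")

# Decoding with the right encoding recovers the original text.
correct = raw.decode("euc_jp")

# Decoding with a wrong single-byte encoding does not fail,
# it just maps every byte to an unrelated character (mojibake).
garbled = raw.decode("latin-1")

print(correct)
print(repr(garbled))
```

Picking "Japanese (EUC-JP)" in the browser menu is roughly the equivalent of switching from the second decode call to the first.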

I'd really recommend sticking with Firefox, that way you can also use Rikaichan! :D
猿も木から落ちる
phreadom
Site Admin
 
Posts: 1761
Joined: Sun 01.29.2006 8:43 pm
Location: Michigan, USA
Native language: U.S. English (米語)
Gender: Male

Re: Basic Reading Material and Vocabulary

Postby furrykef » Sun 12.06.2009 7:01 pm

The encoding isn't detected correctly (and should be specified by the people who made the pages technically)


"Technically", nothing. If your web page uses anything other than plain ASCII text, it should declare the encoding in the HTTP header, period. Any webmaster who doesn't is clueless and asking for trouble.

Sadly, Japanese websites are often designed by such clueless people. It really isn't that hard to configure a web server to declare the proper encoding (unless you're on a service such as, oh, geocities.co.jp, but in that case it's GeoCities who's clueless).
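As a rough idea of what "declaring the encoding in the HTTP header" amounts to, here is a small sketch using Python's standard `http.server` module. It is an illustration under assumptions, not how any particular host does it: the handler name `EucJpHandler`, the `PAGE` body, the `run` helper, and the port are all made up for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page body, stored as EUC-JP bytes.
PAGE = "<html><body><p>むかしむかし</p></body></html>".encode("euc_jp")

# The header value that tells the browser how to decode those bytes.
CONTENT_TYPE = "text/html; charset=EUC-JP"


class EucJpHandler(BaseHTTPRequestHandler):
    """Serve one fixed page and declare its encoding in the HTTP header."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", CONTENT_TYPE)
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)


def run(port: int = 8000) -> None:
    """Start the sketch server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), EucJpHandler).serve_forever()
```

Calling `run()` and opening http://127.0.0.1:8000/ should show the Content-Type header with its charset in the browser's network tools. On a real host the same effect would normally come from server configuration rather than custom code.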

- Kef
Founder of Learning Languages Through Video Games.
Also see my lang-8 journal, where you can help me practice Japanese (and Spanish, and Italian!)
furrykef
 
Posts: 1572
Joined: Thu 01.10.2008 9:20 pm
Native language: Eggo (ワッフル語の方言)
Gender: Male

Re: Basic Reading Material and Vocabulary

Postby phreadom » Sun 12.06.2009 8:45 pm

Well, I was trying to be nice about it. ;) Yes, it's actually a mandatory part of the HTML specs that you specify these things.

The real issue seems to be that while the authors did specify an encoding, they used an invalid one.

Code:
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=X-EUC">


So the page is invalid and hence doesn't work. You have to manually override this with a valid encoding for it to work. IE only works because IE was made to let people get away with horribly broken and sloppy invalid code... which is not a good thing in reality.
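A quick way to see that the declared name is the problem is to pull the charset out of the tag and ask whether anything recognises it. The sketch below does that in Python. Note the assumptions: Python's codec registry is only a stand-in for a browser's own list of encoding labels (the two are not the same list), and the helper names `declared_charset` and `is_known` are made up for the example.

```python
import codecs
import re
from typing import Optional

# The declaration quoted above, plus a corrected variant for comparison.
BROKEN = '<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=X-EUC">'
FIXED = '<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=EUC-JP">'


def declared_charset(tag: str) -> Optional[str]:
    """Return the charset named in a meta tag, or None if there is none."""
    match = re.search(r'charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', tag, re.IGNORECASE)
    return match.group(1) if match else None


def is_known(label: str) -> bool:
    """True if Python's codec registry recognises the label."""
    try:
        codecs.lookup(label)
        return True
    except LookupError:
        return False


for tag in (BROKEN, FIXED):
    name = declared_charset(tag)
    print(name, is_known(name))
```

Here "X-EUC" is extracted but not recognised, while "EUC-JP" is, which matches the behaviour described: the declaration is present but names something the decoder does not know.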

Anyway... maybe they meant "x-euc-jp" which might have been a valid experimental encoding years ago... I really don't know. The page is invalid for a number of reasons aside from that, so it's kind of irrelevant to discuss it further. Enough thread hijacking. ;)
phreadom
Site Admin
 
Posts: 1761
Joined: Sun 01.29.2006 8:43 pm
Location: Michigan, USA
Native language: U.S. English (米語)
Gender: Male

Re: Basic Reading Material and Vocabulary

Postby furrykef » Mon 12.07.2009 2:45 am

Actually, the browser isn't supposed to use the meta tag to guess the encoding anyway if the document was transmitted by HTTP (as opposed to, say, being opened on your hard drive). Some browsers probably will, but you're supposed to specify the encoding in the HTTP header, too.
furrykef
 
Posts: 1572
Joined: Thu 01.10.2008 9:20 pm
Native language: Eggo (ワッフル語の方言)
Gender: Male

Re: Basic Reading Material and Vocabulary

Postby phreadom » Mon 12.07.2009 5:50 am

While yes, it should be set at the server level, people don't always have access to their hosts today, so specifying it in the document is also recommended. That way, regardless of the server or medium you're loading the page from, the browser will always know which encoding the document is using.

From http://diveintohtml5.org/semantics.html:

So, how does your browser actually determine the character encoding of the stream of bytes that a web server sends? I’m glad you asked. If you’re familiar with HTTP headers, you may have seen a header like this:

Code:
 Content-Type: text/html; charset="utf-8"


Briefly, this says that the web server thinks it’s sending you an HTML document, and that it thinks the document uses the UTF-8 character encoding. Unfortunately, in the whole magnificent soup of the world wide web, very few authors actually have control over their HTTP server. Think Blogger: the content is provided by individuals, but the servers are run by Google. So HTML 4 provided a way to specify the character encoding in the HTML document itself. You’ve probably seen this too:

Code:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


Briefly, this says that the web author thinks they have authored an HTML document using the UTF-8 character encoding.

Both of these techniques still work in HTML5. The HTTP header is the preferred method, and it overrides the <meta> tag if present. But not everyone can set HTTP headers, so the <meta> tag is still around. In fact, it got a little easier in HTML5. Now it looks like this:

Code:
 <meta charset="utf-8">


This works in all browsers.


And....

Q: I never use funny characters. Do I still need to declare my character encoding?

A: Yes! You should always specify a character encoding on every HTML page you serve. Not specifying an encoding can lead to security vulnerabilities.


And more specifically, straight from the World Wide Web Consortium's official HTML 4.01 spec...

To address server or configuration limitations, HTML documents may include explicit information about the document's character encoding; the META element can be used to provide user agents with this information.

For example, to specify that the character encoding of the current document is "EUC-JP", a document should include the following META declaration:

Code:
<META http-equiv="Content-Type" content="text/html; charset=EUC-JP">


The META declaration must only be used when the character encoding is organized such that ASCII-valued bytes stand for ASCII characters (at least until the META element is parsed). META declarations should appear as early as possible in the HEAD element.

For cases where neither the HTTP protocol nor the META element provides information about the character encoding of a document, HTML also provides the charset attribute on several elements. By combining these mechanisms, an author can greatly improve the chances that, when the user retrieves a resource, the user agent will recognize the character encoding.

To sum up, conforming user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):

  1. An HTTP "charset" parameter in a "Content-Type" field.
  2. A META declaration with "http-equiv" set to "Content-Type" and a value set for "charset".
  3. The charset attribute set on an element that designates an external resource.

In addition to this list of priorities, the user agent may use heuristics and user settings. For example, many user agents use a heuristic to distinguish the various encodings used for Japanese text. Also, user agents typically have a user-definable, local default character encoding which they apply in the absence of other indicators.

User agents may provide a mechanism that allows users to override incorrect "charset" information. However, if a user agent offers such a mechanism, it should only offer it for browsing and not for editing, to avoid the creation of Web pages marked with an incorrect "charset" parameter.


And further from the XHTML 1.0 (second edition) spec...

Historically, the character encoding of an HTML document is either specified by a web server via the charset parameter of the HTTP Content-Type header, or via a meta element in the document itself. In an XML document, the character encoding of the document is specified on the XML declaration (e.g., <?xml version="1.0" encoding="EUC-JP"?>). In order to portably present documents with specific character encodings, the best approach is to ensure that the web server provides the correct headers. If this is not possible, a document that wants to set its character encoding explicitly must include both the XML declaration an encoding declaration and a meta http-equiv statement (e.g., <meta http-equiv="Content-type" content="text/html; charset=EUC-JP" />). In XHTML-conforming user agents, the value of the encoding declaration of the XML declaration takes precedence.


So in short, for what appears to be all modern versions of HTML, XHTML, and XML for web content, the encoding should be declared correctly within the document itself as a best practice, both to cover cases where the server either isn't configured correctly or doesn't support sending content type headers, and to avoid security issues. The order in which the encoding is determined is:

  1. HTTP Content-Type (headers)
  2. then (if applicable) xml declaration
  3. then look for a meta
  4. then fall back to utf-8
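
The order in the list above can be sketched as a small Python function. This is a simplified illustration, not what any browser actually runs: the function names are made up, the regular expressions only scan the first 1024 bytes and would miss unusual markup, and the final utf-8 fallback follows the summary above (the HTML 4.01 text quoted earlier mentions heuristics and user defaults instead).

```python
import re
from typing import Optional

# Patterns for the two in-document declarations (simplified on purpose).
XML_DECL = re.compile(rb'^\s*<\?xml[^>]*encoding\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)
META = re.compile(rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)


def charset_from_content_type(value: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from an HTTP Content-Type value."""
    if not value:
        return None
    match = re.search(r'charset\s*=\s*"?([^\s";]+)', value, re.IGNORECASE)
    return match.group(1).lower() if match else None


def resolve_charset(http_content_type: Optional[str], body: bytes) -> str:
    """Pick an encoding: HTTP header, then XML declaration, then meta, then utf-8."""
    from_header = charset_from_content_type(http_content_type)
    if from_header:
        return from_header
    head = body[:1024]  # declarations are expected near the start of the document
    for pattern in (XML_DECL, META):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii", "replace").lower()
    return "utf-8"


print(resolve_charset(None, b'<META http-equiv="Content-Type" content="text/html; charset=EUC-JP">'))
```

Because the header is checked first, a page served with a correct charset in Content-Type is decoded properly even if its meta tag is wrong, and a correct meta tag rescues a page whose server sends no charset at all.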

Hopefully that sums it up well enough? ;)
phreadom
Site Admin
 
Posts: 1761
Joined: Sun 01.29.2006 8:43 pm
Location: Michigan, USA
Native language: U.S. English (米語)
Gender: Male

Re: Basic Reading Material and Vocabulary

Postby furrykef » Mon 12.07.2009 11:01 am

phreadom wrote:people don't always have access to their hosts today


And that's the real problem. Hosts should provide an option for setting that sort of thing. (The best way would be .htaccess, but other solutions should be possible in an environment that doesn't allow .htaccess.) The impression I have is that hosts that don't allow it are basically just too lazy to implement it.

I guess they just feel that they have other priorities, but you'd think that a reasonably correct implementation of HTTP, which is the backbone of the entire web, would have at least a little priority...

And, of course, the very fact that a web browser, even today, mangled somebody's page demonstrates that it's a real problem, not just an imaginary problem that nobody ever actually has. (Heck, I run into this sort of problem from time to time myself, even in Firefox.)

- Kef
furrykef
 
Posts: 1572
Joined: Thu 01.10.2008 9:20 pm
Native language: Eggo (ワッフル語の方言)
Gender: Male

Re: Basic Reading Material and Vocabulary

Postby phreadom » Mon 12.07.2009 12:22 pm

furrykef wrote:
phreadom wrote:people don't always have access to their hosts today


And that's the real problem. Hosts should provide an option for setting that sort of thing. (The best way would be .htaccess, but other solutions should be possible in an environment that doesn't allow .htaccess.) The impression I have is that hosts that don't allow it are basically just too lazy to implement it.

I guess they just feel that they have other priorities, but you'd think that a reasonably correct implementation of HTTP, which is the backbone of the entire web, would have at least a little priority...

And, of course, the very fact that a web browser, even today, mangled somebody's page demonstrates that it's a real problem, not just an imaginary problem that nobody ever actually has. (Heck, I run into this sort of problem from time to time myself, even in Firefox.)

- Kef


I wouldn't say the browser itself mangled the page, as though it were the browser's fault. The server didn't provide a content type, and the web page itself provided an invalid type. So the browser probably tried to honor the content type provided, and that ended up breaking things. We could perhaps say that if an invalid content type is given, the browser should try sniffing the content type from the page (which is itself in EUC-JP). But that's not the standard. The standard is to provide the content type in the HTTP headers, or to provide it in the code (in the XML declaration, or in the meta tag, in that order), or to fall back to utf-8. So even if the browser had done exactly that, it would still have broken, because the page wasn't in utf-8 either. Anything beyond that is outside the scope of the standards.

I don't think the standards people make such a big deal about the content type being delivered by the server (and they don't as far as I can see).

Even though that's listed as the most correct way to do it as a recommendation, they're pretty clear about providing it in the code yourself as a fallback, in case you're loading it from your hard drive, or viewing it by some other means that doesn't support content typing... or perhaps it's old server software that doesn't support configuring the content type, etc.

You're making much too big a deal about setting it on the server. The very reason the ability to set it in the code exists in the first place is to avoid making that such a big deal. Set it correctly in your code and the problem is solved. If the server sets it (correctly), then great. If not, it doesn't matter at all, because you set it correctly in your code as well (as you should; otherwise your code would be invalid if you checked the code itself alone, for example through the direct input method of the W3C validator).

As a matter of fact, it's not mandatory per the HTTP spec that Content-type be set on the server either. It recommends it (SHOULD), but does not require it (MUST), and there is a real difference there.... so you don't exactly have the grounds to argue that people aren't correctly implementing the spec or following it in their servers either. It's best practice today if they do, and allow you the means of configuring it yourself, but even when they started supporting sending the content-type they didn't always offer you the ability to configure it yourself etc...

The key is that they don't have to, and they're not violating the spec if they don't.

However, if you allow your page to be sent without a content type because you refuse to set it, or set it incorrectly, or think that the server should have it configured when it's technically not required to do so... the fault rests squarely on your shoulders. So set the content type yourself on every document, and set it correctly. (And there ARE specs for which content types are valid, so if you break that spec in setting your content type, the fault is still yours.)
phreadom
Site Admin
 
Posts: 1761
Joined: Sun 01.29.2006 8:43 pm
Location: Michigan, USA
Native language: U.S. English (米語)
Gender: Male
