ApolloHoax.net

Off Topic => Tech Support => Topic started by: Bob B. on January 16, 2013, 08:54:58 AM

Title: ASCII Characters
Post by: Bob B. on January 16, 2013, 08:54:58 AM
Since many scientific variables are represented by Greek letters, I've often tried to use ASCII codes to display these when posting to the forum.  I'm almost certain that at one time this worked, but lately no.  (Perhaps the last software update disabled something.)  Doubly frustrating is that often the characters seem to display OK in a preview, but when I post, they're replaced by a ? symbol.  Do you know of any way to use or allow special characters?

(edit)  Oops!  I actually meant to post this in The Space Race Forum but the same question might apply here because I believe we're using the same software.
Title: Re: ASCII Characters
Post by: Bob B. on January 16, 2013, 09:13:08 AM
The following is a test

∞ ÷ Δ ω

(edit)  OK, never mind.  It looks like all the characters displayed correctly.  The problem I'm having at The Space Race forum is apparently not happening here.  That's odd because I believe Lunar Orbit is using the same software at both forums.
Title: ASCII Characters
Post by: LunarOrbit on January 16, 2013, 11:30:34 AM
Hmmm. Yeah, it's the same software. I'll have to look into it when I get home.
Title: Re: ASCII Characters
Post by: Not Myself on January 27, 2013, 12:27:16 PM
I've often tried to use ASCII codes to display these when posting to the forum.

May I ask what exactly this method is?  Did you enter some kind of code for characters in the 128-255 range?  Is that the way you entered the characters in the later post, or was that a cut-and-paste job?

Title: Re: ASCII Characters
Post by: grmcdorman on January 27, 2013, 04:25:37 PM
By the way, just to be pedantic: those characters aren't ASCII. ASCII is only the first 128 characters (ordinals 0 through 127; the printable ones, 32 through 126, are basically the characters on a US keyboard: letters, digits, and punctuation). Characters above 127 vary; the three most common 8-bit character sets are ISO LATIN-1 (ISO-8859-1), the Windows character set (Windows-1252), and UTF-8. Given that Bob B. is trying to enter Greek characters, he probably expects Latin 1 or UTF-8.


My Web developer tools in Firefox report this site as UTF-8, by the way.

/pedant
Title: Re: ASCII Characters
Post by: ka9q on January 27, 2013, 10:12:23 PM
To contribute to the pedanticism, UTF-8 is not an 8-bit character set. It is a variable length (1-4 byte) encoding of the (very large) Unicode character set, designed such that the first 128 entries have the same encoding as 8-bit ASCII. (ASCII is actually a 7-bit code, so 8-bit ASCII has a '0' in the most significant bit.)

When other Unicode characters are needed, UTF-8 always encodes them into two or more bytes.
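A minimal Python sketch of the variable-length point above. The sample characters are chosen only for illustration, and `utf8_length` is a hypothetical helper name, not anything from the forum software.

```python
# Sketch: how many bytes UTF-8 uses for a few sample characters.
# The samples are illustrative; any characters could be substituted.
samples = ["A", "Δ", "∞", "紅", "\U0001F600"]


def utf8_length(ch: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of one character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"{b:02X}" for b in encoded)
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s): {hex_bytes}")
```

Only the ASCII letter fits in a single byte; every other sample needs two, three, or four.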

So how do we enter Greek or other non-ASCII characters?
Title: Re: ASCII Characters
Post by: Bob B. on January 27, 2013, 10:28:44 PM
If you have a table of codes like this one,

http://www.asciitable.com/

you can display the character by typing in the number of the character while holding down the ALT key.  For example, if I press and hold ALT while typing 234 235 236 237 238, I get Ωδ∞φε.  You can also copy and paste characters from other sources.
Title: Re: ASCII Characters
Post by: Not Myself on January 28, 2013, 01:14:11 AM
If you have a table of codes like this one,

http://www.asciitable.com/

you can display the character by typing in the number of the character while holding down the ALT key.  For example, if I press and hold ALT while typing 234 235 236 237 238, I get Ωδ∞φε.  You can also copy and paste characters from other sources.

Ah I see.  Then I would say the problem is almost certainly one of encoding, and specifically what grmcdorman cites.

By the way, just to be pedantic: those characters aren't ASCII. ASCII is only the first 128 characters (ordinals 0 through 127; the printable ones, 32 through 126, are basically the characters on a US keyboard: letters, digits, and punctuation). Characters above 127 vary; the three most common 8-bit character sets are ISO LATIN-1 (ISO-8859-1), the Windows character set (Windows-1252), and UTF-8. Given that Bob B. is trying to enter Greek characters, he probably expects Latin 1 or UTF-8.


My Web developer tools in Firefox report this site as UTF-8, by the way.

/pedant

Getting in the spirit of things:

[pedant]Pretty much all the common encodings (including UTF-8) agree on the meanings of up to 127; from 128 to 255 is encoding-specific.  Many old systems used the eighth bit for parity or similar purposes.  When that stopped being cool and trendy, the eighth bit became available to convey non-redundant information, and was frequently used to encode characters commonly used in non-English languages, which were not included in the lower 128.  But there are too many Greek, Russian, Hebrew, etc. letters, to fit in the range 128-255.  As a result, a computer in Israel may well use the range 128-255 for different characters than a computer in Russia, and opening a document produced in one country on a computer in another may result in the display of gibberish, if the software on the target computer is not capable of identifying (or being told) and using the correct encoding.

So I would think that what is happening on the other board is, you are entering characters using one encoding, and it displays them using a different encoding, the result of which will be perfectly OK for characters up to 127, but gibberish after that.  So options are

a) change the default encoding on the other board to what you want - would have to be something the board software can do, and the board administrator would agree to.

b) use the existing default encoding on the other board - the default encoding would have to support the characters you want, and you would have to cope with entering the same character one way on that board, and a different way on other boards.

c) change the encoding on a message-by-message basis - I don't know whether this is possible, but maybe there is a bbcode or something of the like that does it.

Seems to me the world is moving towards UTF-8 (as per above, a 1-4 byte encoding, in which certain bytes indicate that this character is continued into the next byte, but coincides with ASCII up to 127), but maybe there are good reasons to stick with other encodings.
[/pedant]

Title: Re: ASCII Characters
Post by: Not Myself on January 28, 2013, 01:24:57 AM
Decided to register at the other place to see if I could work out what the encoding was, but got

Quote
The user Oxyartes with Email <email address> (IP <IP address>) is a Spam, please contact forum administrator.

with no immediately obvious way of contacting the aforementioned administrator.

This board can even handle things like 紅毛, I'd be surprised if it weren't UTF-8.
Title: Re: ASCII Characters
Post by: Not Myself on January 28, 2013, 01:42:17 AM
If you have a table of codes like this one,

http://www.asciitable.com/

you can display the character by typing in the number of the character while holding down the ALT key.  For example, if I press and hold ALT while typing 234 235 236 237 238, I get Ωδ∞φε.  You can also copy and paste characters from other sources.

Looks to me like you are using Code Page 437.

http://en.wikipedia.org/wiki/Code_page_437

Given that this works for you here at this board, which I think must be using UTF-8, I'm actually scratching my head a bit wondering how this is actually working.

So on your computer, you enter the code for Greek letters, based on Code Page 437.  Your browser queues these up, displaying them properly (I assume), and then when you click "post", transfers the whole lot to this board, where I can view what you typed properly.

So one possibility is, this board knows that you are using Code Page 437, and interprets your post using this Code Page whenever it is displayed.  But if that's the case, then I shouldn't be able to do this: 紅毛, since these characters aren't in the CP-437 character set.

Another possibility is, the board stores everything in UTF-8, but you are entering things in CP-437 (which does not match UTF-8 on 128-255), so some piece of software does the mapping from CP-437 to UTF-8.  Which piece of software that is, and whether it is on your computer, or the computer the board is served from, I do not know.

Title: Re: ASCII Characters
Post by: LunarOrbit on January 28, 2013, 04:00:29 AM
Decided to register at the other place to see if I could work out what the encoding was, but got

Quote
The user Oxyartes with Email <email address> (IP <IP address>) is a Spam, please contact forum administrator.

with no immediately obvious way of contacting the aforementioned administrator.

This board can even handle things like 紅毛, I'd be surprised if it weren't UTF-8.

I'm the admin of the other forum. Send me a PM here with the email address you used and I'll create an account for you.
Title: Re: ASCII Characters
Post by: LunarOrbit on January 28, 2013, 04:19:00 AM
Never mind, I got the email address from the error log. I tried setting up an account for you and got an error saying that address was being used by another account already.

Quote
This board can even handle things like 紅毛...

Yeah, I meant to talk to you about that. I would like you to change your display name back to what it was when you registered because people can't easily refer to you by name if they can't figure out how to type the characters. If you don't want to use that name for some reason then another name using a-z and 0-9 characters is okay too. Thanks.
Title: Re: ASCII Characters
Post by: Not Myself on January 28, 2013, 05:31:59 AM
In the opposite order:

Yeah, I meant to talk to you about that. I would like you to change your display name back to what it was when you registered because people can't easily refer to you by name if they can't figure out how to type the characters. If you don't want to use that name for some reason then another name using a-z and 0-9 characters is okay too. Thanks.

It is now changed to something that should be easily typed on western keyboards.

Never mind, I got the email address from the error log. I tried setting up an account for you and got an error saying that address was being used by another account already.

 :-[  I guess I forgot.

I'll go have a look and try to locate my old ID.
Title: Re: ASCII Characters
Post by: Not Myself on January 28, 2013, 05:35:47 AM
Ah yes.  It was an ID I hadn't used in quite a long time  :-[
Title: Re: ASCII Characters
Post by: LunarOrbit on January 28, 2013, 07:43:23 AM
It is now changed to something that should be easily typed on western keyboards.

Thanks.
Title: Re: ASCII Characters
Post by: Not Myself on January 28, 2013, 09:35:29 AM
If you have a table of codes like this one,

http://www.asciitable.com/

you can display the character by typing in the number of the character while holding down the ALT key.  For example, if I press and hold ALT while typing 234 235 236 237 238, I get Ωδ∞φε.  You can also copy and paste characters from other sources.

My best guess is that a conversion is taking place, and it is on your computer.  The same non-ASCII codes entered by me produce êëìíî.  The "ALT" method doesn't work for me, as I'm on a different OS - I wrote a quicky C program to output these character codes to a text file, then cut and paste into the browser window.
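The difference between the two results can be reproduced with a short Python sketch. It interprets the same five byte values once as Code Page 437 and once as ISO-8859-1 (Latin-1). Latin-1 is an assumption here: the thread does not say which encoding the second system used, only that it produced êëìíî.

```python
# Sketch: the same byte values mean different characters under
# different 8-bit encodings. Latin-1 is an assumed stand-in for
# whatever the non-Windows system actually used.
codes = [234, 235, 236, 237, 238]
raw = bytes(codes)

as_cp437 = raw.decode("cp437")     # IBM PC / DOS code page
as_latin1 = raw.decode("latin-1")  # ISO-8859-1

print("CP-437 :", as_cp437)
print("Latin-1:", as_latin1)

# Converting the CP-437 reading to UTF-8 gives the bytes a UTF-8 board would store.
print("UTF-8  :", " ".join(f"{b:02X}" for b in as_cp437.encode("utf-8")))
```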

Title: Re: ASCII Characters
Post by: cjameshuff on January 28, 2013, 10:28:19 AM
That "extended ASCII" table is one of multiple 8-bit encodings that add to ASCII, not compatible with UTF-8 and not itself part of ASCII. In particular, that looks like IBM code page 437 (http://en.wikipedia.org/wiki/Code_page_437), which at this point I would only expect to work on MS operating systems. Some browsers may assume you intend to send the exact characters you type, others may convert to Unicode.

You might be better off copying characters from a site like: http://unicodelookup.com/
Title: Re: ASCII Characters
Post by: Bob B. on January 28, 2013, 10:32:03 AM
I've noticed that I can't get all the codes to work here.  For instance, the entire Greek alphabet can be displayed using numbers in the 900-range, as seen here:

http://chemistry.about.com/od/chartstables/a/htmlgreek.htm

Using the ALT+number method in the forum yields different characters than indicated in the above web page.  For instance ALT+916 yields ö instead of the letter delta.  What I've done in the past is to type the code into Word and then copy and paste the letter into the forum, thus I get Δ.

I noticed last night for the first time that when I type the codes into Word on my home computer I get the same characters as displayed in the forum, but when I do it on my work computer I get the Greek letters.  My work computer has a newer version of Word, so maybe that's the reason.  (Of course neither version is exactly "new".  I use Office 2003 at work and Office 2000 at home.)

Unfortunately, nothing I do at TheSpaceRace works.  I can't get that forum to display special characters no matter what I do.
Title: Re: ASCII Characters
Post by: Bob B. on January 28, 2013, 10:55:24 AM
Just for reference, below is the entire Greek alphabet - typed into Word using the 900-series codes and then copied and pasted here:

Α α Β β Γ γ Δ δ Ε ε Ζ ζ Η η Θ θ Ι ι Κ κ Λ λ Μ μ Ν ν Ξ ξ Ο ο Π π Ρ ρ Σ σ ς Τ τ Υ υ Φ φ Χ χ Ψ ψ Ω ω
Title: Re: ASCII Characters
Post by: Bob B. on January 28, 2013, 11:00:56 AM
Another solution is to add a Symbol font to the forum.  Is that possible?
Title: Re: ASCII Characters
Post by: cjameshuff on January 28, 2013, 11:40:27 AM
Another solution is to add a Symbol font to the forum.  Is that possible?

There's already LaTeX support.

[jstex] \alpha \theta o \tau \beta \vartheta \pi \upsilon \gamma \iota \varpi \phi \delta \kappa \rho \varphi \epsilon \lambda \varrho \chi \varepsilon \mu \sigma \psi \zeta \nu \varsigma \omega \eta \xi

 \Gamma \Lambda \Sigma \Psi \Delta \Xi \Upsilon \Omega \Theta \Pi \Phi[/jstex]

...though the broken preview makes it a bit of a pain to use...
Title: Re: ASCII Characters
Post by: Not Myself on January 28, 2013, 11:44:27 AM
Just for reference, below is the entire Greek alphabet - typed into Word using the 900-series codes and then copied and pasted here:

Α α Β β Γ γ Δ δ Ε ε Ζ ζ Η η Θ θ Ι ι Κ κ Λ λ Μ μ Ν ν Ξ ξ Ο ο Π π Ρ ρ Σ σ ς Τ τ Υ υ Φ φ Χ χ Ψ ψ Ω ω

The 900-series codes are for UTF-8, and when I cut the above text here, paste it into a text editor on my computer, and then write a quick C program to print out the decimal values of the bytes, I get that each Greek letter above is two bytes, beginning with either 206 or 207 decimal.  Haven't specifically checked, but I'm pretty confident this is UTF-8.
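The byte check described above can be sketched in Python rather than C. The letters are generated from their Unicode code points instead of being pasted, so the sketch does not depend on how this page itself was saved.

```python
# Sketch: confirm that each Greek letter is two bytes in UTF-8 and that
# the first byte is always 206 (0xCE) or 207 (0xCF).
# Capitals are U+0391..U+03A9 (U+03A2 is unassigned); lowercase letters
# are U+03B1..U+03C9, which includes the final sigma.
upper = [chr(cp) for cp in range(0x391, 0x3AA) if cp != 0x3A2]
lower = [chr(cp) for cp in range(0x3B1, 0x3CA)]
letters = upper + lower

encoded = {ch: ch.encode("utf-8") for ch in letters}
lead_bytes = sorted({data[0] for data in encoded.values()})
lengths = sorted({len(data) for data in encoded.values()})

print("lead bytes:", lead_bytes)
print("lengths   :", lengths)
```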

The early codes you linked were not UTF-8 encoding, but CP-437, and I'm surprised they worked at all.  I suspect, as cjameshuff proposed, that your browser was intelligent about it and converted the CP-437 encodings to UTF-8 encodings when shipping them off to the board.

If I may ask for a confirmation - the trouble you have at this board occurs when you try to enter the 900-series codes into a browser window, then post?

I think I understand what is happening.  When you enter the 916 code, which is supposed to specify the UTF-8 encoding for a Greek delta, you are getting an ö.  In the CP-437 encoding, this letter has the eight-bit code 148 (decimal).  In hexadecimal, 916 is 0x394, and 148 is 0x94.  I suspect that is not coincidence.  So you are entering the code for the UTF-8 encoding of a Greek delta, but your browser thinks you want CP-437, and misinterprets your input (also throwing away the "3" digit, since CP-437 codes are all two hexadecimal digits).

So I would have a look around all the browser settings, to see how you have "encoding" or something similar set.  I suspect it is set for "CP-437", "US", or something like that.  If so, and you have a UTF-8 option, try changing to that, and see if it works any better.

Title: Re: ASCII Characters
Post by: Not Myself on January 28, 2013, 11:46:19 AM
Let's see if this works.

[jstex]\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}}e^{-\frac{u^{2}}{2}}d u=1[/jstex]
Title: Re: ASCII Characters
Post by: Bob B. on January 28, 2013, 12:00:16 PM
If I may ask for a confirmation - the trouble you have at this board occurs when you try to enter the 900-series codes into a browser window, then post?

Correct.  It's when I'm in a reply text box in the Browser.

Quote
So I would have a look around all the browser settings, to see how you have "encoding" or something similar set.  I suspect it is set for "CP-437", "US", or something like that.  If so, and you have a UTF-8 option, try changing to that, and see if it works any better.

Thanks.  I'll look into that when I get an opportunity.
Title: Re: ASCII Characters
Post by: grmcdorman on January 28, 2013, 01:46:30 PM
In Firefox, the default encoding is in Options|Content, in the dialog box shown by Advanced... under Fonts & Colors.

However, the server can also specify, in the headers, the character set it is using. From a quick web search, user agents (that is, browsers) can choose to return POST content using the same value. It is also possible to explicitly specify the character set to be used in a FORM (i.e. the type-in boxes for posting messages).

Inspecting the content of this page, the Quick Reply box does just that; see the following. Note the accept-charset="UTF-8". That means that, at least for that box, the browser must send UTF-8 to the server.
Code:
<form action="http://www.apollohoax.net/forum/index.php?action=quickmod2;topic=336.15" method="post" accept-charset="UTF-8" name="quickModForm" id="quickModForm" style="margin: 0;" onsubmit="return oQuickModify.bInEditMode ? oQuickModify.modifySave('ff2e0a69ff2041e702688f31cd533f0f', 'adc929d5') : false">
For reference, here are the headers supplied by this server; note that the character set is UTF-8.
Code:
HTTP/1.1 200 OK
Date: Mon, 28 Jan 2013 18:39:48 GMT
Server: Apache/2.2.22 (Unix) mod_ssl/2.2.22 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635 mod_fcgid/2.3.5
X-Powered-By: PHP/5.3.15
Pragma: no-cache
Cache-Control: private
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Set-Cookie: PHPSESSID=a1caf8341533381e6b1f79998bce9669; path=/
Last-Modified: Mon, 28 Jan 2013 18:39:48 GMT
Connection: close
Content-Type: text/html; charset=UTF-8
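For anyone who wants to check the declared character set programmatically, here is a small Python sketch using only the standard library. `charset_from_content_type` is a hypothetical helper name; the header value is the one shown above.

```python
# Sketch: extract the charset parameter from a Content-Type header value.
from email.message import Message
from typing import Optional


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the lower-cased charset parameter, or None if absent."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

The second call shows the case where the server declares no charset at all, which leaves the browser to guess.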

ETA: Bob B., what browser are you using? (product, e.g. IE, and version, e.g. 9).
Title: Re: ASCII Characters
Post by: Bob B. on January 28, 2013, 02:51:33 PM
Bob B., what browser are you using? (product, e.g. IE, and version, e.g. 9).

At work I'm using IE8.  I don't remember what I'm using at home, but probably the same.
Title: Re: ASCII Characters
Post by: Bob B. on January 28, 2013, 02:59:36 PM
So I would have a look around all the browser settings, to see how you have "encoding" or something similar set.  I suspect it is set for "CP-437", "US", or something like that.  If so, and you have a UTF-8 option, try changing to that, and see if it works any better.

I just found "Encoding" listed under the "View" menu.  It is currently set to "Unicode (UTF-8)".
Title: Re: ASCII Characters
Post by: Not Myself on January 28, 2013, 09:30:01 PM
Very strange.

I think there are two distinct phenomena at work here.  One is an issue with the way the other board is set up (this is a conjecture, not proved).  The other has to do with the way your computer is set up.

This page

http://en.wikipedia.org/wiki/Alt_code

suggests that a registry hack is needed in Windows to get the Unicode (the 900-series) codes to work.  It also explains how the CP-437 codes (which you seem able to use successfully here, but not at the other board) get converted to UTF-8 - this seems to have been a deliberate measure by Microsoft to maintain compatibility with what had become a popular input method.

So I think that explains pretty much all the behaviour you see at this board, except that you are able to use the 900-series codes in Word.  I wonder if specific applications have the ability to access/override the normal key handling methods, and the new version of Word has chosen to do this.

Regarding the other board, I have been able to replicate the behaviour you describe in PMs to myself - the Greek letters (entered in Unicode) look fine in the preview, then become question marks in the final version.  I suspect (no proof whatsoever) that the final "post" fails to specify (or specifies incorrectly) the encoding, so the nice Unicode characters in the preview are forced to be converted to CP-437 or Windows-1252 or something like that, and they just get killed instead.

So if you are willing to futz around with your registry, you could enable the Unicode alt-codes (like the 900-series).  It sounds like you could still use the CP-437 codes (the ones from 128-255), just pick whichever is most convenient in a given situation.  I think this would probably eliminate the need to go through Microsoft Word, and would get the 900-series codes working on your other computer as well, at least at this board.

I suspect the issue at the other board is beyond your control.  I'll see if I can have a look at the page source, and work out what the encoding is.
Title: Re: ASCII Characters
Post by: LunarOrbit on January 28, 2013, 09:49:47 PM
I think there are two distinct phenomena at work here.  One is an issue with the way the other board is set up (this is a conjecture, not proved).

This is probably true, but I can't figure out what the difference is. Both forums are using the same software, on the same server. TheSpaceRace.com is over a decade old now, and has gone through many software updates, so I'm thinking the database might be corrupt. ApolloHoax.net, on the other hand, is less than a year old (in its current incarnation) and has a "fresher" database. It's only had a few minor updates installed.

The weird thing is that I can copy & paste Bob's Greek characters into the forum, hit preview and see the characters fine... it's only after I have saved the post that the characters get converted into question marks. So I think the MySQL database is doing the character conversion, not the forum software.
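A hedged Python sketch of how a storage-side conversion could produce exactly this symptom. It assumes the save step converts the text to Latin-1; the thread never establishes what character set the TheSpaceRace database actually uses, so treat this as an illustration of the mechanism, not a diagnosis.

```python
# Sketch: characters that do not exist in the target character set are
# replaced with "?" during a lossy conversion. Latin-1 is an assumption.
text = "∞ ÷ Δ ω"

# The preview never leaves UTF-8, so it round-trips intact.
preview = text.encode("utf-8").decode("utf-8")

# The save converts to a single-byte character set; unmappable
# characters are replaced instead of raising an error.
stored = text.encode("latin-1", errors="replace")
shown = stored.decode("latin-1")

print("preview:", preview)
print("saved  :", shown)
```

Note that ÷ survives because it exists in Latin-1, while the infinity sign and the Greek letters do not.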

Quote
The other has to do with the way your computer is set up.

Maybe, but like I said, I can reproduce the same problem that Bob is experiencing, so I don't think it's related to his computer.

Quote
I'll see if I can have a look at the page source, and work out what the encoding is.

The forum software is configured to use UTF-8 and the template also has UTF-8 declared in the header:

Code:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
That is another strange thing about the problem. Both forums are not only using the same software, they're both using modified versions of the same template.
Title: Re: ASCII Characters
Post by: Not Myself on January 28, 2013, 10:02:58 PM
Quote from: LunarOrbit
Quote
The other has to do with the way your computer is set up.

Maybe, but like I said, I can reproduce the same problem that Bob is experiencing, so I don't think it's related to his computer.

Yes, that's right, I did not mean to suggest there was something idiosyncratic about his computer - it appears to be the default way Windows computers work, not to accept the Alt-key codes for UTF-8.

The really weird thing is that he reports these codes do work in MS Word.  I guess that software must have its own key-handling code that overrides the system.
Title: Re: ASCII Characters
Post by: Bob B. on January 28, 2013, 10:23:22 PM
The really weird thing is that he reports these codes do work in MS Word.  I guess that software must have its own key-handling code that overrides the system.

It works in some cases.  It doesn't work on my home computer, which uses Office 2000.  However, it does work on my work computer, which uses Office 2003.  I also tried a computer at work using Office 2007 and it also worked.  I'm going to see if I can find anything different in the setup between my work computer and my home computer other than the software version.
Title: Re: ASCII Characters
Post by: Not Myself on January 29, 2013, 01:35:35 AM
ΩöΔ

This is with Firefox, on a Windows PC, after I used the registry edit hack linked to in my earlier post.

First character: ALT 234
Second character: ALT 916
Third character: ALT +394

The "+" was on the numeric keypad, and 394 is hex for 916 decimal.  I don't know a way to do it in decimal, at least not yet.  Also, the third character came out as something different before I implemented the registry hack.

So do the registry hack, and you can probably do the old-fashioned CP-437 codes by using ALT followed by the decimal number, and UTF-8 codes by using ALT followed by the "+" on the numeric keypad, followed by the hexadecimal code.

Uh oh, just discovered something :(  I can't do all the hexadecimal codes, because ALT followed by "b" is intercepted by my browser and interpreted as a command.  Maybe there is a way around this.
Title: Re: ASCII Characters
Post by: Not Myself on January 29, 2013, 02:05:59 AM
Wow, it just keeps getting better and better.

Found one report that the method I used does not work in IE.  Didn't test it myself.

This method might be helpful also.  A free component of Windows that allows you to type in the characters using a pop-up.

http://blogs.msdn.com/b/michkap/archive/2005/05/18/419117.aspx

I suspect none of this will change the situation at the other board - as per LO's speculation, it might be an issue with the SQL database.
Title: Re: ASCII Characters
Post by: ka9q on January 29, 2013, 03:57:54 AM
My best guess is that a conversion is taking place, and it is on your computer.  The same non-ASCII codes entered by me produce êëìíî.  The "ALT" method doesn't work for me, as I'm on a different OS
Agreed. I have the same problem here, and I'm running Linux on a Sony Vaio laptop. Some way to enter the hex for an arbitrary Unicode character and have it turned into UTF-8 would be nice. (The Unicode tables I see only show the code points, not their UTF-8 encodings.)


Title: Re: ASCII Characters
Post by: cjameshuff on January 29, 2013, 10:16:23 AM
Agreed. I have the same problem here, and I'm running Linux on a Sony Vaio laptop. Some way to enter the hex for an arbitrary Unicode character and have it turned into UTF-8 would be nice. (The Unicode tables I see only show the code points, not their UTF-8 encodings.)

If you've got Ruby installed, run interactive Ruby (irb or pry) and use .chr():
Code:
pry(main)> 0x3A9.chr('UTF-8')
=> "Ω"
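A rough Python equivalent, for those without Ruby, that also prints the UTF-8 bytes the Unicode tables leave out. `utf8_bytes` is a hypothetical helper name.

```python
# Sketch: turn a Unicode code point into its character and UTF-8 bytes.
def utf8_bytes(code_point: int) -> bytes:
    """Return the UTF-8 encoding of a single Unicode code point."""
    return chr(code_point).encode("utf-8")


for cp in (0x3A9, 0x394):
    data = utf8_bytes(cp)
    hex_bytes = " ".join(f"{b:02X}" for b in data)
    print(f"U+{cp:04X} {chr(cp)} -> {hex_bytes}")
```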
Title: Re: ASCII Characters
Post by: grmcdorman on January 29, 2013, 10:24:44 AM
There are two applications you can try that may help: AllChars and FreeCompose.
Both let you enter characters via mnemonic sequences instead of those blasted numeric values. For example, æ (ligature) would be <key>, A, E in AllChars; Δ is <key>, g, D (where <key> is the configured compose key, e.g. the Menu key).

AllChars seems to mess up upper case when the compose sequence contains upper case, but fiddling with Shift and/or Caps Lock afterwards fixes it.

Here are some characters entered by AllChars:

Ω - Menu, g, W
ö - Menu, ", o
Δ - Menu, g, D

Unfortunately, both look pretty quiescent; AllChars was last updated in 2009, and FreeCompose in 2011.

On Linux/Unix, the compose sequences are standard, although they may need to be enabled via configuration.

Bob B., can you try a different browser? IE, in particular, is notorious for doing its own thing with standards.
Title: Re: ASCII Characters
Post by: Bob B. on January 29, 2013, 10:41:15 AM
Bob B., can you try a different browser?

I fear change.
Title: Re: ASCII Characters
Post by: grmcdorman on January 29, 2013, 11:07:53 AM
I wouldn't suggest necessarily switching to a different browser; rather, to get an additional data point, see what Chrome or Firefox does. If they behave and IE doesn't, then I would blame IE.
Title: Re: ASCII Characters
Post by: grmcdorman on January 29, 2013, 11:08:42 AM
Let me try AllChars from IE 9: (9.0.8112)

AllChars:

ö
Ω
Δ

Alt+NumPad:
234: Û
235: Ù
Title: Re: ASCII Characters
Post by: Bob B. on January 29, 2013, 12:29:59 PM
I didn't realize it, but my work computer already has Chrome installed on it.

Using Chrome I get the exact same results as with Internet Explorer, that is, characters in the 1-255 range display properly, but those in the 900-series display different characters than the Greek letters I want.  For instance, 945, 946 & 947 should be lower case alpha, beta, gamma, but I get ▒ ▓ │
Title: Re: ASCII Characters
Post by: Not Myself on January 29, 2013, 12:48:21 PM
I didn't realize it, but my work computer already has Chrome installed on it.

Using Chrome I get the exact same results as with Internet Explorer, that is, characters in the 1-255 range display properly, but those in the 900-series display different characters than the Greek letters I want.  For instance, 945, 946 & 947 should be lower case alpha, beta, gamma, but I get ▒ ▓ │

If you take 945, 946, and 947 modulo 256, you get 177, 178, and 179.  And those three characters on Code Page 437 are what you got above, even though they have been translated to the corresponding encodings in UTF-8.

From the sources I've found, it looks like you either need some supplemental software, or the registry hack I linked to earlier, which will allow you to use both the CP-437 and the UTF-8 codes.
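The modulo-256 observation can be checked with a few lines of Python. This is only a model of the behaviour seen in the thread (keep the low eight bits, then look the result up in Code Page 437); `alt_code_fallback` is a hypothetical name, and how Windows arrives at this result internally is not confirmed here.

```python
# Sketch: model the observed behaviour for decimal Alt codes above 255.
def alt_code_fallback(code: int) -> str:
    """Keep the low 8 bits of the code and decode them as CP-437."""
    return bytes([code % 256]).decode("cp437")


for code in (916, 945, 946, 947):
    print(f"ALT+{code} -> {code % 256} -> {alt_code_fallback(code)}")
```

The same rule accounts for the ö that ALT+916 produced earlier in the thread, since 916 modulo 256 is 148.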
Title: Re: ASCII Characters
Post by: Echnaton on January 29, 2013, 01:19:13 PM
▒±▒▒▓│
I didn't realize it, but my work computer already has Chrome installed on it.

Using Chrome I get the exact same results as with Internet Explorer, that is, characters in the 1-255 range display properly, but those in the 900-series display different characters than the Greek letters I want.  For instance, 945, 946 & 947 should be lower case alpha, beta, gamma, but I get ▒ ▓ │

I get the same characters in Firefox and in MS Word.
Title: Re: ASCII Characters
Post by: Echnaton on January 29, 2013, 01:24:04 PM
For Firefox and Word I get

Alt 224 = α
Alt 225 = ß
Alt 228  = Σ
Alt 226 = Γ

ETA This is on Win XP.
Title: Re: ASCII Characters
Post by: grmcdorman on January 29, 2013, 01:39:12 PM
Windows 7, default settings, Firefox:

Alt 224 = Ó
Alt 225 = ß
Alt 228 = õ
Alt 226 = Ô
Alt 945 = ▒
Alt 946 = ▓
Alt 947 = │

I don't think that using the Alt-codes is a reliable (or easy-to-remember) way to enter these characters, especially as it depends on the system and browser character set.

I would suggest trying AllChars or FreeCompose.

To add: All of the above characters show up that way in the Quick Reply window.

I suspect that what may be happening for Bob B. is that something along the line is translating to Code Page 437, and then to UTF-8.
Title: Re: ASCII Characters
Post by: grmcdorman on January 29, 2013, 01:54:30 PM
Hmm. Wiki page on the Alt code has good information:

http://en.wikipedia.org/wiki/Alt_code

It appears that, for three digits, the expected result is Code Page 437 if your system is using English, and Code Page 850 otherwise.

For four digits, with a leading 0 (e.g. 0161) the result is a character in the Windows-1252 character set.

The characters Bob B. wants do not exist in any of the above three character sets; all three are eight-bit character sets.

None of the references I can find mention anything about codes in the 900-range.
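The rules summarized above can be sketched in Python, which also reproduces the differing results reported in this thread for the same Alt codes. `alt_code_char` is a hypothetical helper, and treating the rules this literally is an assumption based on the Wikipedia summary, not on Windows documentation.

```python
# Sketch of the Alt-code rules as summarized above:
#   leading zero    -> Windows-1252
#   no leading zero -> the system's OEM code page (CP-437 or CP-850)
def alt_code_char(digits: str, oem_code_page: str = "cp437") -> str:
    """Return the character an Alt code would give under these rules."""
    value = int(digits)
    if not 0 <= value <= 255:
        raise ValueError("only codes 0-255 are covered by these rules")
    encoding = "cp1252" if digits.startswith("0") else oem_code_page
    # Note: a few Windows-1252 byte values are undefined and will raise
    # UnicodeDecodeError in Python.
    return bytes([value]).decode(encoding)


for digits in ("224", "225", "228", "226"):
    print(digits, alt_code_char(digits, "cp437"), alt_code_char(digits, "cp850"))
print("0161", alt_code_char("0161"))
```

The CP-437 column matches the Win XP results and the CP-850 column matches the Windows 7 results posted earlier.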
Title: Re: ASCII Characters
Post by: Bob B. on January 29, 2013, 03:06:24 PM
I've just been playing around in Word and have discovered a few things.  For any character in Word, if you place the cursor to the right of the symbol and press ALT+X, the symbol is replaced by its Unicode value in hexadecimal.  The reverse is also true.  If you type the hexadecimal Unicode, place the cursor to the right of the code, and press ALT+X, the code value changes into the symbol.

There appears to be something in Word that converts the 900-series decimal values to hexadecimal.  As stated before, when I press and hold ALT while typing 945 I get a lower case alpha.  However, when I place the cursor to the right of the alpha and press ALT+X, the symbol changes to 03B1 - the hexadecimal form of 945.  When I press ALT+X again it changes back to the alpha symbol.  However, when I type 945, place the cursor to the right and press ALT+X, I get the symbol whose hexadecimal value is 945.
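The decimal/hexadecimal mix-up described here is easy to see in a short Python sketch. `to_hex_code` and `from_hex_code` are hypothetical names that imitate the two directions of the ALT+X toggle; they are not Word's implementation.

```python
# Sketch: imitate the two directions of Word's ALT+X toggle.
import unicodedata


def to_hex_code(ch: str) -> str:
    """Character -> four-digit hexadecimal Unicode code point."""
    return f"{ord(ch):04X}"


def from_hex_code(code: str) -> str:
    """Hexadecimal Unicode code point -> character."""
    return chr(int(code, 16))


print(to_hex_code("α"))       # the alpha typed with ALT+945 (decimal)
print(int("03B1", 16))        # its code point back in decimal
mistaken = from_hex_code("945")  # "945" read as hex is a different code point
print(f"U+{ord(mistaken):04X}", unicodedata.name(mistaken, "(no name)"))
```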

I think the reason my Word at home doesn't work is because of the older version, though I'm going to experiment with it further.  I'm interested to see if I can get Greek symbols to work using the ALT+X method with the hexadecimal Unicodes.
Title: Re: ASCII Characters
Post by: grmcdorman on January 29, 2013, 05:42:22 PM
Bob, what Word does is not necessarily related to what your browser does. Word provides considerably more sophisticated user-input mechanisms, including the Alt-X mechanism you describe. In Word 2007, this is bound to ToggleCharacterCode, which (with the horrible Ribbon UI) is under 'Commands not in the Ribbon' in the keyboard shortcuts.

In general, as I posted, it seems the Alt+9xx sequence isn't documented, so there is no guarantee it will work - as you have found. Take a look at the Wikipedia article I linked to; that describes the "usual" Alt+ combinations.
Title: Re: ASCII Characters
Post by: Bob B. on January 29, 2013, 05:59:41 PM
I don't have a problem with this forum; I'm perfectly fine with it just as it is.  When I have a long reply I typically type it into Word first and then copy and paste it into the Browser, so I'm far more interested in getting Word to work for me than the Browser.  My complaint was never about Apollohoax, it was about TheSpaceRace.  The problem over there is that the symbols won't display no matter what I do.

I'm sure now that my problem with Word is the older version.  None of the ALT+X functions work with the version I have at home.  The only way I can use symbols is to insert them from the Symbols template.  That's probably easier than remembering the hex codes anyway.  Of course I'm way past due for an upgrade.
Title: Re: ASCII Characters
Post by: Not Myself on January 29, 2013, 11:35:05 PM
Hmm. Wiki page on the Alt code has good information:

http://en.wikipedia.org/wiki/Alt_code

What a great page.  I wonder why no one posted this link before?  ;)

None of the references I can find mention anything about codes in the 900-range.

I have had limited success using the registry hack described in the wikipedia.org link.  The problem is that these codes need to be entered in hex, and (at least in Firefox), ALT-b is interpreted as a browser command, not part of a character code.  So if the hex code for a character has a "b" in it, I can't do it :(

I suspect that what may be happening for Bob B. is that something along the line is translating to Code Page 437, and then to UTF-8.

I believe that is exactly what is happening, as described in some of the earlier posts.  The wikipedia.org link claims that this is a deliberate action by Microsoft to maintain compatibility with codes people had already memorised before the transition to Unicode.  So when entering a 900-series code, it is misinterpreted as a CP-437 code.  Because CP-437 codes only run from 0 to 255, the code entered is taken modulo 256 (so 945 becomes 177).  The result is a CP-437 character, which is then promptly translated back to the UTF-8 equivalent character (I guess CP-437 doesn't have anything that isn't included in Unicode) for posting at the web page.
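
For anyone who wants to see the chain spelled out, here is a minimal Python sketch. It assumes the wrap-around is modulo 256 (the size of an eight-bit code page), and the function name `misread_alt_code` is made up for the illustration.

```python
def misread_alt_code(decimal_code: int) -> str:
    """Model a large Alt code being squeezed into a CP-437 slot."""
    low_byte = decimal_code % 256          # keep only the low eight bits
    return bytes([low_byte]).decode("cp437")


ch = misread_alt_code(945)
utf8_bytes = ch.encode("utf-8")            # what would be sent to the web page
print(945 % 256, ch, utf8_bytes)
```

Under that assumption 945 wraps to 177, which CP-437 maps to the shaded block ▒ rather than to alpha, and that block is what ends up in the post as UTF-8.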
Title: Re: ASCII Characters
Post by: Not Myself on January 29, 2013, 11:41:32 PM
I don't have a problem with this forum; I'm perfectly fine with it just as it is.  When I have a long reply I typically type it into Word first and then copy and paste it into the Browser, so I'm far more interested in getting Word to work for me than the Browser.  My complaint was never about Apollohoax, it was about TheSpaceRace.  The problem over there is that the symbols won't display no matter what I do.

I'm sure now that my problem with Word is the older version.  None of the ALT+X functions work with the version I have at home.  The only way I can use symbols is to insert them from the Symbols template.  That's probably easier than remembering the hex codes anyway.  Of course I'm way past due for an upgrade.

Per some of my posts, there's a way to be able to do the 900-codes (sort of) with a registry hack.  However (1) you seem happy with the way things work at this board, and (2) I don't think this will fix the problem at the other board, so this hack may be of limited value to you :)

Inspired by cjameshuff, I tried LaTeX at the other board.  Didn't work.
Title: Re: ASCII Characters
Post by: ka9q on January 30, 2013, 04:00:21 AM
So how do the Chinese handle this problem? Their alphabet is far larger than any reasonable number of keyboard keys, so they must have to compose them with sequences of keystrokes.

I know the Japanese work something like this. When personal computers were new they typed in "Romaji", Japanese transliterated into Roman characters, but I think they now spell a word phonetically, e.g., in kana, and the keyboard handler turns it into kanji. But I don't speak the language.
 
Title: Re: ASCII Characters
Post by: Not Myself on January 30, 2013, 10:48:19 AM
There appear to be many systems for entering Chinese, including spelling the characters in Pinyin.  As there are often many characters with the same Pinyin spelling, you then have to choose which one from a menu.
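
A toy version of the Pinyin-plus-menu idea might look like the Python sketch below. The tiny `CANDIDATES` table and both function names are invented for illustration; a real input method uses a large dictionary and ranks the candidates by frequency and context.

```python
# Toy candidate table: one toneless Pinyin spelling, several characters.
CANDIDATES = {
    "ma": ["妈", "麻", "马", "骂", "吗"],
    "shi": ["是", "十", "时", "事", "市"],
}


def candidates_for(pinyin: str) -> list[str]:
    """Return the menu of characters offered for a Pinyin spelling."""
    return CANDIDATES.get(pinyin.lower(), [])


def pick(pinyin: str, choice: int) -> str:
    """Select an entry from the menu, numbered from 1 as on screen."""
    options = candidates_for(pinyin)
    if not 1 <= choice <= len(options):
        raise ValueError("no such menu entry")
    return options[choice - 1]


print(candidates_for("ma"))
print(pick("ma", 3))
```

The extra menu step is the cost of so many characters sharing one spelling.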

Other systems (which are supposedly faster for a skilled user) have one compose the character from key combinations which designate the strokes in the character.

My phone allows me to draw Chinese characters - either my drawing or its recognition is a little bit spotty, though.