An email to the Beta group went as follows:

From: toadbeta@yahoogroups.com [mailto:toadbeta@yahoogroups.com] On Behalf Of Brad Boddicker
Sent: Thursday, April 30, 2009 10:22 AM
To: toadbeta@yahoogroups.com
Subject: [toadbeta] Beta 9.8.0.29 released

Unicode support added 

Mark Lerch almost immediately replied with the following:

Emphasis in various places by Norm

This is huge.  The most joyous email I've ever seen posted on the boards.  I'm going to save it.  Brad's email was intentionally subdued, as though we just "slipped it in."  I love it.  We're all very happy here these days regarding this.

Way back in May of 1999 Vinny (Quest owner & CEO at the time) tasked us with putting Unicode support into Toad, as he saw the Asian markets as critical to Toad's future.  Of course we had to wait this long for Borland to put support for it into Delphi before we could effectively do it.  But it has certainly been an albatross around our neck for a long time. ("Toad International" was basically a one-off hack, a port done by an outside company we hired.)  And it [Unicode Support] doesn't become official until the commercial release of Toad 10 later this fall. [2009] But the great majority of it should all be working very well.

I'll anticipate the most common problem right up front:  

  • You must make sure the NLS_LANG value on your client (registry or environment variable) matches your client character set. 

This has nothing to do with the database you are connecting to.  There is a new warning in Toad Advisor telling you if NLS_LANG does not properly match your character set.  (A connection is required only so an active Home is set, which is used to pull the NLS_LANG for that Home.)

NLS_LANG on your client computer is how Oracle knows what your client character set is, so it knows if it needs to transform characters going back & forth to your client.  This seems to be one of the most misunderstood things out there related to getting Unicode properly working in Oracle.  It has nothing to do with NLS_CHARACTERSET, which is the database character set.  If NLS_LANG matches the database character set, the Oracle client will not do any transformation.  But that’s not our concern anyway.
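
To make the NLS_LANG check concrete, here is a minimal Python sketch of comparing the character set part of NLS_LANG with a Windows code page. It relies on the documented LANGUAGE_TERRITORY.CHARACTERSET layout of NLS_LANG. The lookup table is a small illustrative subset, and the function names are hypothetical; this is not how Toad Advisor performs its check.

```python
import os

# Illustrative subset only: Oracle character set names and the Windows
# code pages they correspond to. Extend it for your own environment.
ORACLE_TO_WINDOWS_CODEPAGE = {
    "WE8MSWIN1252": 1252,   # Western European
    "EE8MSWIN1250": 1250,   # Central European
    "CL8MSWIN1251": 1251,   # Cyrillic
    "TR8MSWIN1254": 1254,   # Turkish
    "JA16SJIS": 932,        # Japanese
    "ZHS16GBK": 936,        # Simplified Chinese
    "ZHT16MSWIN950": 950,   # Traditional Chinese
}


def parse_nls_lang(value):
    """Split LANGUAGE_TERRITORY.CHARACTERSET into its three parts."""
    lang_terr, _, charset = value.partition(".")
    language, _, territory = lang_terr.partition("_")
    return language, territory, charset


def nls_lang_matches_codepage(value, codepage):
    """Return True if the NLS_LANG character set maps to the given code page."""
    _, _, charset = parse_nls_lang(value)
    return ORACLE_TO_WINDOWS_CODEPAGE.get(charset.upper()) == codepage


if __name__ == "__main__":
    # Hypothetical usage: 1252 stands in for the client's actual code page.
    current = os.environ.get("NLS_LANG", "")
    print(current, nls_lang_matches_codepage(current, 1252))
```

On Windows, the active ANSI code page can be read with `ctypes.windll.kernel32.GetACP()`. NLS_LANG may also live in the registry rather than in the environment, which this sketch does not cover.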

Toad itself is now a fully Unicode application.  However, the various tools Toad calls – Oracle Utilities, SQL Plus and so forth – are not.  This means they require NLS_LANG to be properly set and, in most cases, require the data they work with to be available in the client character set.  For example, if you have NLS_LANG set to a Big5/Double Byte character set with Chinese (which means your Windows “Language for non-Unicode Programs” – XP’s silly name for your character set – is set to Chinese), you can “Select Chinese Stuff from dual” and run that in SQL Plus.  You cannot do “Select Turkish stuff from dual” because the Turkish characters won’t be available in your client character set.  SQL Plus will complain.  This *will* work in Toad, however, because the OCI has a Unicode switch flipped on via Toad, and so all data is Unicode-encoded on the client, regardless of the various encodings of the database.
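
The Chinese versus Turkish example can be reproduced outside Oracle. The following Python sketch uses Python's own `big5` and `cp1254` codecs as stand-ins for a Chinese and a Turkish client character set; the helper name `representable` is hypothetical, and nothing here touches SQL Plus or the OCI.

```python
def representable(text, codec):
    """Return True if every character in text exists in the given code page."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


chinese = "中文"
turkish = "ğş"

# A Big5 client can hold the Chinese text but not the Turkish letters.
print(representable(chinese, "big5"), representable(turkish, "big5"))

# A Turkish (Windows-1254) client is the mirror image.
print(representable(chinese, "cp1254"), representable(turkish, "cp1254"))

# A Unicode encoding such as UTF-8 can hold both at once.
print(representable(chinese + turkish, "utf-8"))
```

A tool bound to a single client code page fails on whichever text falls outside it, while a Unicode client has no such restriction.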

Clear so far?  Good.  : )

For most external files Toad creates, we chose UTF-8 as the encoding to use.  Some files can potentially contain Unicode data, so this ensures they can be properly saved if your Windows character set is different from the character data in the file.  Therefore, external files will contain the UTF-8 BOM (Byte Order Mark) of EF BB BF (hex).  These are non-visible bytes which Unicode applications use to know what encoding the file is in.  If you are using Toad-created files in other applications and they have a hiccup somehow, it is possible they don't know what to do with that BOM.  I can't think of any cases, perhaps beyond Editor files, where this could potentially be an issue for your other applications.  But notice that the Editor "Save As" dialog has a new drop-down – 'Encoding' – where you can choose an encoding to use.  If you use the typical choice – ANSI – of course no header/BOM will be used and things operate as they always have.
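
If another application trips over the BOM, it helps to see exactly what those three bytes are. This is a minimal Python sketch of detecting and removing a UTF-8 BOM; the helper names are hypothetical, and the sample SQL text is just a placeholder for the contents of a Toad-created file.

```python
import codecs


def has_utf8_bom(data):
    """Return True if the byte string starts with EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data):
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


# "utf-8-sig" writes the BOM on encode and removes it on decode.
with_bom = "select 1 from dual".encode("utf-8-sig")
print(with_bom[:3].hex())              # the three BOM bytes
print(has_utf8_bom(with_bom))
print(with_bom.decode("utf-8-sig"))    # BOM removed
print(repr(with_bom.decode("utf-8")))  # plain "utf-8" keeps it as U+FEFF
```

Reading with `utf-8-sig` is safe for files with or without a BOM, which makes it a reasonable default when consuming such files from a script.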

If you see squares somewhere in a window, you are not using a Unicode-friendly font.  Try a Unicode font.  In general, squares mean the data is fine, the font just can't show it.  Question marks generally mean the data has been corrupted, for example, one byte of a double byte character was lost, so the character is no longer known.
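
The question-mark case can be illustrated in Python. This sketch shows two ways data can be damaged: a lossy conversion into a code page that lacks the character, and a double-byte character that loses a byte. It uses Python's codecs only and is an assumption-laden illustration, not a description of how Toad or the Oracle client converts data.

```python
def lossy_round_trip(text, codec):
    """Encode into a code page, replacing anything it cannot hold, then decode."""
    return text.encode(codec, errors="replace").decode(codec)


def drop_last_byte(text, codec):
    """Simulate losing the final byte of an encoded string before decoding it."""
    data = text.encode(codec)
    return data[:-1].decode(codec, errors="replace")


# The Turkish letter does not exist in Windows-1252, so it becomes "?".
print(lossy_round_trip("ğ", "cp1252"))

# A Big5 character is two bytes; with one byte lost, the original
# character cannot be recovered and a replacement marker appears instead.
print(len("中".encode("big5")))
print(repr(drop_last_byte("中", "big5")))
```

In both cases the original character is gone from the data itself, so no change of font will bring it back.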

We have a very unofficial list of issues we have found, things not supported.  Qualifications, additions, subtractions are likely – I simply did a quick snag of various comments we put into some to-do list files to pull these out for the sake of sharing:

  • Export File Browser (this may never end up supporting it, not sure yet)
  • ASM Manager
  • DB Link names – Oracle doesn't support
  • CodeXpert – some items unsupported – waiting on work from others
  • DataPump – Job names cannot be in unicode
  • SQL Monitor
  • SQL Optimizer (object names)
  • Java Manager – Java itself doesn't allow unicode class names or file names
  • Benchmark Factory (this is beyond our need to note anyway)
  • KnowledgeXpert – shows only ???? when you put unicode chars in for comments
  • Team Coding – Objects w/Unicode content are supported, objects w/Unicode names aren't.
    • Supported VCPs do not support Unicode names
  • TNSPing
  • Connections to a TNS entry with a Unicode character
  • Oracle Wrap Utility

I think that's enough for one email, if anyone has lasted this long.

About the Author

Steve Hilker

Steve Hilker was a Product Manager for Quest Software. Steve has over 35 years technical experience spanning application development, system administration, database management and various management positions at several software companies. Steve was the founder of RevealNet, best known for its desktop knowledge bases and unique database tools such as PL/Formatter. RevealNet was acquired by Quest Software in 2001. He's had the pleasure of being the product manager for many of Quest's database tools.
