Sunday, May 08, 2011

Excel on Mac: Goodbye VBA, Hello AppleScript

Among my myriad headaches in switching back to the Mac as my primary machine, here is another doozy.

So, I have this cool social media news research job I do in the mornings for a client.  It involves collecting a lot of buzz from various corners of the web, putting it into an Excel file, and spitting out a report in the form of a text file.  The formatting has to be a certain way, so I put together a VBA script (macro) to do this.

I grab soundbites from all over the world, so needless to say the text file I output needs to be in UTF-8 format.  Incredibly, this doesn't come naturally to Excel.

In VBA on Windows Excel, there is the ADODB.Stream class.  You can instantiate a stream object from it and set the output encoding through its Charset property, as I do in the example VBA snippet below.

    ' Create a text stream that writes UTF-8
    Set fs = CreateObject("ADODB.Stream")
    fs.Type = 2              ' 2 = adTypeText
    fs.Charset = "utf-8"
    fs.Open
   
    ' Loop thru your cells
    fs.WriteText myCell & Chr(10)

    fs.SaveToFile "OutputSheet.txt", 2   ' 2 = adSaveCreateOverWrite
    fs.Close

Anywhoo, after moving to the Mac I realized to my dismay that ADODB.Stream is not available.  Which makes sense, as it is an ActiveX component.

I spent weeks Googling around for a VBA solution to this problem which, for the hell of it, I'll restate: The ability to output multilingual text from worksheet cells in UTF-8 format to a text file, in Excel on the Mac.  But to no avail.  Am I really the only person in the world struggling with this?  Or do I just suck as a Googler?  Could it be the martinis?

To make a long story short, I took the bold step of ditching VBA and adopting AppleScript as my language of choice on MS Office for the Mac.  When in Rome... and all that.

I have attached an AppleScript file, and a corresponding Excel test file to illustrate.  To test it out you will need to open the Excel file, open the AppleScript file with the AppleScript Editor, and hit Run from the menu (or Cmd-R).

Here is the entire AppleScript.

    tell application "Microsoft Excel"
        activate
       
        set outFile to (path of active workbook)
        set outFile to (outFile & ":OutputUTF8.txt")
        set openFile to open for access file outFile with write permission
        set eof openFile to 0
       
        set title to (name of active workbook) & return
        write title to openFile as «class utf8»
       
        -- rowNum is incremented before the first read, so output starts at row 2
        set rowNum to 1
        repeat
            set rowNum to (rowNum + 1)
            set langStr to (value of cell rowNum of column 1 of active sheet)
           
            -- Stop at the first empty cell in column 1
            if (langStr = "") then
                exit repeat
            else
                set textStr to (value of cell rowNum of column 2 of active sheet)
                set outStr to langStr & ":" & textStr & return
                write outStr to openFile as «class utf8»
            end if
           
        end repeat
        close access openFile
       
    end tell

Here is the salient line of code that allows me to specify the encoding.  The encoding doesn't seem to be set at the file level, but at the level of each write statement.

    write outStr to openFile as «class utf8»
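
Since the encoding has to be repeated on every write, one option is to tuck it into a small handler.  The following is only a sketch, not part of the attached script: the handler name writeUTF8 and the demo file name are made up for illustration.  It also wraps the write in a try block so the file gets closed even if something goes wrong, and it uses linefeed (matching the Chr(10) in the VBA version) where the script above uses return.

```applescript
-- Hypothetical helper (not in the attached script): write theText to the
-- file at thePath (an HFS-style path string) as UTF-8, replacing old content.
on writeUTF8(theText, thePath)
    set fileRef to open for access file thePath with write permission
    try
        set eof fileRef to 0 -- truncate anything already in the file
        write theText to fileRef as «class utf8»
        close access fileRef
    on error errMsg number errNum
        close access fileRef -- don't leave the file open after a failure
        error errMsg number errNum -- pass the original error along
    end try
end writeUTF8

-- Example call; the file name is an assumption for illustration only.
set demoPath to (path to desktop as text) & "OutputUTF8-demo.txt"
writeUTF8("ja:こんにちは" & linefeed & "fr:déjà vu" & linefeed, demoPath)
```

With something like this in place, the loop over the cells could build up one long string and hand it to the handler once, instead of keeping the file open for the whole run.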


Download the AppleScript:
http://www.box.net/shared/rbjxybmuts 


Download the Excel file:
http://www.box.net/shared/bc047mefh2

6 comments:

Rohan Moore said...

You're not alone battling with encoding problems with Excel VBA on the Mac, since its otherwise welcome renaissance under Office 2011. In my case, I'm receiving a CSV file encoded with character set 949 (Korean), and I'm trying to import it into a worksheet that uses the data for other functions. Whilst I can use the text import wizard to set the origin of the text, via VBA you can only choose between the character sets Macintosh, Windows or MS-DOS. I'll investigate an AppleScript solution also!

Arka Roy said...

Hi Rohan. It is definitely sucky, and kind of strange, that you can import it when operating Excel and choose that encoding in the wizard, yet that functionality is not available in VBA.

Bryan said...

Indeed, you're not alone. Of course, the solution in Windows itself is a nightmare, as UTF-8 is the most common encoding out there and Excel uses Unicode internally, yet you have to resort to ActiveX just to get things working in Windows. It's infuriating how little spreadsheets have come along over the last decade.

The script you provided gave me a really good starting point, and within a few minutes I had exactly what I needed. Thanks!

Paulo said...

Excellent script for those who can't manage without Excel.

After extensive Googling, I have also found that NUMBERS (from Apple iWork) DOES have an option to EXPORT using UTF-8. Hit "Export", choose CSV and voilà! You are presented with an encoding option.

As much as Excel 2011 for the Mac is cool (with VBA and full Windows compatibility, at least in my experience so far!), one can always find a glitch.

Arka Roy said...

Kuma-san has provided this information in Japanese.
http://www.kuma-de.com/blog/2012-05-16/3471

Rohan Moore said...

This is the second time I've discovered—and commented upon—this script... This time, the script's precisely what I'm looking for! In my particular case, I've built an Excel VBA application that organises and builds a set of product data for me to upload to Lightspeed POS and BigCommerce. The app generates a UTF-8 text file for the actual upload.

For this case, I have the entire script pasted into a single cell of a specially added worksheet, and create a named range for that cell. I then add a VBA procedure to enable me to output the file as a UTF-8 text file from within Excel, without having to open and run the AppleScript independently. The VBA procedure contains the following line:

    MacScript Range("NamedRange")