Archive for January, 2005

Engrish Lord of the Rings Subtitles

Wednesday, January 19th, 2005

I always forget where to find this, and it’s great for some laughs.

Engrish LOTR:TT Captions - a mirror of the original, which was taken down due to a C&D.

NuSOAP for PHP and .NET Client

Tuesday, January 18th, 2005

I’m investigating the NuSOAP library for implementing web services on top of PHP APIs. NuSOAP was originally written by the same guy who brought us Foxylicious, the Firefox plugin for del.icio.us. It makes it easy to expose an existing API as a SOAP service, and it also makes creating a SOAP client fairly simple.

* The NuSOAP project page and the developer’s page.
* Scott Nichols’ SOAP / NuSOAP page. He’s the active developer currently working on NuSOAP.
* A note about an RPC method format required for integration with .NET(?).
* An article about using a NuSOAP client to consume a .NET based web service.
* W. Jason Gilmore explains how to publish web services for .NET with NuSOAP.
* A Dilbert a day with PHP and NuSOAP.
* Pocket SOAP TCPTrace, a useful tool for SOAP/HTTP debugging.

Various VS.NET consuming NuSOAP server references:

* http://www.vbdotnetforums.com/showthread.php?t=1093
* http://www.hardforum.com/printthread.php?t=812041
* Short mention of .NET using Document-style SOAP while NuSOAP uses RPC-style SOAP.
* Example VB code to call NuSOAP service.
* Adding a WebReference to VS.NET to consume NuSOAP wsdl.

Don’t name your wikis “Wiki”

Sunday, January 9th, 2005

I noticed while visiting the Mojavi Project that they have a navigation bar with a “Wiki” link. I was looking for a “Documentation” link and was confused for quite some time before realizing that they store all documentation on their wiki.

It’s fine to store documentation on a wiki. But because wikis are so neat, and they’re a new technology, nerds forget that almost nobody knows what a wiki is yet. I’d be shocked if there were more than 10% familiarity within the entire IT world. Documentation should still be “Documentation” or “Manual”, regardless of the format. You wouldn’t create a web-based documentation site and name it “HTML”, would you? Or, for that matter, make a manual and name it DOC or PDF.

Remember that technologies are neat, but just because we use wikis all the time doesn’t mean that the format is inherently more important than the content.

Parsing PDF-extracted text through Excel & VBA

Thursday, January 6th, 2005

Situation:

Extracting data as text from a PDF is never as simple as it first seems. For this project, the extracted data had no natural delimiters except for spaces. However, some of the columns contained city names, and a name made of more than one word gets split across several columns, which throws off the alignment when Excel imports the extracted text as space-delimited.
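To make the problem concrete, here is a small Python sketch of what a space-delimited split does to such rows. It is not part of the original Excel workflow, and the sample lines are invented for illustration, not taken from the real data set.

```python
# Hypothetical sample lines, invented for illustration only.
sample_lines = [
    "11111 San Example 1",
    "22222 Sampleville 2",
]


def split_on_spaces(line):
    """Split a line on whitespace, roughly as a space-delimited import would."""
    return line.split()


rows = [split_on_spaces(line) for line in sample_lines]

# A two-word city name produces an extra column, so rows end up uneven.
column_counts = [len(row) for row in rows]
```

With these made-up lines, the first row splits into four fields and the second into three, so the trailing numeric value no longer lines up in one column.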

Starting with a script I found to clear out non-numeric values in a single column with VBA, I came up with this:

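    Sub ClearNonNumeric()
        ' Scan the first 500 rows and 20 columns of the active sheet.
        EndCol = 20
        EndRow = 500
        For i = 1 To EndRow
            For j = 1 To EndCol
                ' IsNumeric treats an empty cell as numeric, which is what
                ' ends this loop once the row runs out of data.
                While Not IsNumeric(Cells(i, j))
                    'Cells(i, ColumnToLookIn).Value = Cells(i, ColumnToLookIn + 1).Value
                    ' Shift left
                    Call ShiftLeft(i, j)
                Wend
            Next
        Next
    End Sub

    Sub ShiftLeft(Row, StartCol)
        ' Overwrite each cell from StartCol to EndCol with its right-hand neighbour.
        EndCol = 20
        For i = StartCol To EndCol
            Cells(Row, i).Value = Cells(Row, i + 1).Value
        Next
    End Sub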

After importing space-delimited text, you will naturally be left with an Excel sheet that has unevenly distributed data. If the data you want to get to is purely numeric, then you can run this script.

It walks through each row and, whenever it hits a non-numeric cell, shifts the rest of the row one column to the left over it. When it finishes, each column holds only numeric values, packed to the left. If numeric data is all you need, this should do the trick.
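The same idea can be sketched outside Excel. The Python below is an illustrative equivalent, not the macro itself: `is_numeric` and `keep_numeric` are hypothetical helper names, and `float()` only approximates VBA's `IsNumeric` (for example, the two disagree on strings such as "$5" or "nan").

```python
def is_numeric(token):
    """Return True if the token parses as a number (approximation of IsNumeric)."""
    try:
        float(token)
    except ValueError:
        return False
    return True


def keep_numeric(row):
    """Drop non-numeric cells, which has the same effect as shifting the rest left."""
    return [cell for cell in row if is_numeric(cell)]


# Made-up row for illustration; values stay strings, so leading zeros survive.
cleaned = keep_numeric(["01111", "San", "Example", "1"])
```

Filtering each row this way before the import would leave the numeric values in consistent columns without needing the macro, at the cost of discarding the city names entirely.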

The data set I worked on was California Department of Public Health Service Planning Area by Zip Code data. I’m thinking of making a form for anyone to type in their zip code and get back their SPA, but I’m not sure if it would be in the public interest. If I get any comments that it’s desired, I’ll be happy to put it up.