Goodbye Magic Cards – Part 1: Reverse Engineering, Notepad Action and Spreadsheet Magic

I want to make a small program to get the value of my Magic Card collection. The script will automatically calls the information from a 3rd party website. Therefore I just had to sit down for approx. 2 hours and enter the card names into a spreadsheet application. As a miser I use OpenOffice.Calc for private. For my personal needs that’s more than enough even if its scope cannot compete with Microsoft Excel.

I enter all the card names and their edition in my table. The edition is necessary because it might have an impact on the card’s value.

OpenOffice Calc Tabellenblatt
OpenOffice Calc Table

Now I can start to search for a suitable website to get the price information. To keep it simple I search for a site that hands over the parameters in the site URL. Therefore I use the site’s search function so see how it is working.

In my case I just search for the first of my cards with the flowery name “Woodripper” out of “Nemesis” series. That’s the sites response

Testabfrage
Test query

Looking a the parameters after the “/?” I understand that the card’s name is handed over in clear text: “cardName=Woodripper”. The variable names are easy to understand so I see that idExpansion must be the series. Unfortunately there is no clear text, it’s coded in an indentifier.The other variables are search options that are not useful for me at the moment. I also see that the result is displayed in a nice HTML table and that will simplify parsing later on.

My next job is to find out how the idExpansion relates to the name of the series. The series names are shown in the website, therefore this information has to somewhere be in the HTML code. I have a look on the source code of the search page. This is pretty simple and works in all browsers similar. You have to right-click somewhere on the website and select the item called like “display source code”. Unfortunately I get the result almost unformatted. That’s not unusual but at the first look I cannot see anything interested. It would be possible to use an advanced HTML editor to beautify my code but I use my beloved Notepad++ instead.

I copy the complete code into a new file and search for the string “Nemesis”. I find the  search string embedded intags. Understanding the syntax of option tags I know that they will tell me what I have searched for: the ID and the clear text.

In my example the option tags look like this:

<option  value=”32″>Nemesis</option>

It is <option value=”idExpansion”>clear text</option>

I change the formatting of the code using the search and replace function of Notepad++. I add a line break (\r\n) after each </option> tag.

Umformatieren
Change the formatting

When I scroll down the document now I can easily find the data I’m searching for. Now I just delete everything before the firstand after the lasttag. Now the document contains the payload only. I now also delete the first row because the option ALL has no use for me.

Anfang meiner Nutzdaten
Beginning of the payload

Now I have to put it into a format that my spreadsheet application can work with. I use again the search and replace function and replace <option value =” with an empty string, “>with a tabulator (\t) and with an empty string. Now I have a nice tabstop separated text file.

Tabstopp getrennter Text
tabstop separated text file

Now it’s time for a bit of Excel-Magic (I know politically correct it should be called “Openoffice.Calc-Magic or even better “Spreadsheet-Application-Magic” but that sounds to weird to me).  I import the text file into a separate tab and add the series ID in my card list using VLOOKUP.

Importierte Zuordnungsliste
Imported series id lookup table
Liste mit Verweis
complete list

Now I can save my file in ODT format first and then I save it additionally in CSV format. Now I the date is prepared for the next step. Next time I will continue with the script.

Goodbye Magic Cards – The Script