XML is a format that some applications use to share data. For example I use a metronome called ‘Tempo’ on my iPhone. It allows me to share the tempos of all my songs by exporting an ‘XML’ file to an email.
This document can be read by the tempo app and it contains all the information about my tempos but it’s very hard to read and I would like to share the tempos with other people in a clearer format.
Here’s how to get the data out of such a file and into a spreadsheet. This should work for any XML file that you need to parse.
If I look at the exported ‘XML’ file it looks something like this:
I can see the data. It is obvious here that the tempo for “Amazing Grace” is “63”.
But is not a very user-friendly format!
You can ‘parse’ this data using Google sheets.
Copy the XML file to your Google Drive.
To do this you will need to have Google Drive set up. Just copy the XML file into your Google Drive folder.
2. Share the XML file with Google Drive
Just go to the file in Google Drive and select Link Sharing and ‘ON – Anyone with a link’
3. Get the web address of the Google Drive file.
To do this go into Google Drive in Chrome, right click on ‘get shareable link.’
This will copy something to your clipboard that looks something like this:
This is the address of the Google file but you do not need the entire address unit just need the filename. It is the selection in bold above.
Select the portion of the filename that is just the filename.
In this example it is: 1vBz2fr177V5-cjn6ZOusKn3sVLFTVyLt
4. Open a new Google Sheet and paste the following text into the first cell of the document:
(Replace ###FILEID### with the filename of your file.)
So in my case it looked like this:
5. This should import your XML file into the Google spreadsheet.
My results in Google Sheets looked like this:
As you can see it has extracted the information out of the XML file and pasted it into the Google sheet automatically!
I can now copy this information and paste it into another spreadsheet or word processor.
Note: You may need to play around with the second variable in the importxml command. (e.g. “//dict”)
It’s called an ‘xpath’.
I used “//dict” to extract the entire contents of the file but you can use this to select the exact contents that you want.
For more information on the syntax of this path command read this page on how to format an Xpath.