Monday 2 July 2012

Scraping HTML Table data into a Google Docs Spreadsheet



Tried something interesting today.

I needed to scrape the Root Zone Database from the Internet Assigned Numbers Authority (IANA) website at [http://www.iana.org/domains/root/db]. So I imported the data into a Google Docs spreadsheet and then exported it as a CSV file, as follows:

  1. Create a new Google Docs spreadsheet
  2. In the first cell (A1), import the data using the ImportHtml() function with the following syntax [=ImportHtml("http://www.iana.org/domains/root/db","table",0)]
  3. The spreadsheet should now fill in, from that cell downwards and to the right, with the table data from the web page.
  4. I can now download the spreadsheet as a CSV file ( File ⇒ Download as ⇒ Comma Separated Values )
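
For anyone who would rather script this than click through a spreadsheet, here is a rough Python sketch of the same idea: pull the tables out of a page and write one of them out as CSV. It is not what I actually did above, just an approximation using only the standard library. The names TableParser, html_table_to_csv and fetch_page are my own made-up helpers, the index here counts from 0 in Python's sense, and the parser assumes simple, non-nested tables, so treat it as a starting point rather than a robust scraper.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.request import urlopen


class TableParser(HTMLParser):
    """Collects the cell text of every <table> on a page, row by row.

    Hypothetical helper: assumes simple tables that are not nested.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def _close_cell(self):
        # Store the finished cell with runs of whitespace collapsed.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None and self.tables:
            self.tables[-1].append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._close_row()   # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td> or </th>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


def html_table_to_csv(html, index=0):
    """Return the table at position `index` (0-based) as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.tables[index])
    return out.getvalue()


def fetch_page(url):
    """Download a page and decode it as UTF-8 (assumed encoding)."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

To use it on the IANA page, something like html_table_to_csv(fetch_page("http://www.iana.org/domains/root/db")) should give the CSV text, which can then be written to a file. The spreadsheet route is still quicker for a one-off job; the script only pays off if the export needs repeating.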



