First, you will need Python 2.7.3. You can find instructions for installing and using Python in my previous blog entry here.
The source of a webpage can be extracted using the urllib2 module in Python. The program asks the user to enter the URL of a webpage. Note: the URL must be complete, i.e. it must start with “http://”. For example, the input must be http://www.example.com and not www.example.com, or else urllib2 will raise an error.
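Here is a small sketch of how you could check the URL before using it. The helper name is_complete_url is my own choice for illustration and is not part of the program in this post, and accepting “https://” as well as “http://” is an assumption on my part.

```python
def is_complete_url(url):
    # Hypothetical helper: True only if the URL starts with a scheme.
    # Accepting "https://" in addition to "http://" is an assumption.
    return url.startswith(("http://", "https://"))

print(is_complete_url("http://www.example.com"))  # True
print(is_complete_url("www.example.com"))         # False
```

You could call a check like this right after reading the URL and ask the user to type it again when it returns False.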
I have defined two functions: get_source(page), which retrieves the source of the page and returns it, and write_source(location), which saves the source to a user-specified location on your hard disk.
The program first asks the user to enter the URL of the webpage (for example, you could enter http://www.google.com). It then asks for the location on your hard disk where the source should be saved; you can enter any location you prefer. For example, you could enter C:/source.txt to save the source as a text file, or C:/source.html to save it as an HTML page.
# Source Extractor
# extr3metech.wordpress.com
import urllib2

def get_source(page):
    # Open the URL and return the page source as a string
    url = urllib2.urlopen(page)
    source = url.read()
    url.close()
    return source

def write_source(location):
    # Save the already downloaded source to the given location
    fob = open(location, "w")
    fob.write(source)
    fob.close()
    print "Source saved in", location

webpage = raw_input("Enter URL to get source : ")  # Example: http://www.google.com
path = raw_input("Enter location to save source : ")  # Example: C:/source.txt
source = get_source(webpage)  # download the page only once
print source
write_source(path)
raw_input("Press Enter to exit..")
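The program above stops with a traceback if the URL is incomplete or the page cannot be reached. Here is a minimal sketch of one way to handle that, assuming you are happy to get None back on failure. The name safe_get_source is hypothetical and not part of the original program, and the fallback import is only there so the same sketch also runs on Python 3, where these functions live in urllib.request and urllib.error.

```python
# Python 2.7 uses urllib2; the fallback import is for Python 3.
try:
    from urllib2 import urlopen, URLError
except ImportError:
    from urllib.request import urlopen
    from urllib.error import URLError

def safe_get_source(page):
    # Hypothetical wrapper: return the page source, or None on failure.
    try:
        response = urlopen(page)
    except ValueError:
        # Raised when the URL has no scheme, e.g. "www.example.com"
        print("Incomplete URL, it must start with http://")
        return None
    except URLError as error:
        # Raised when the server cannot be reached or returns an HTTP error
        print("Could not open the page: %s" % error)
        return None
    source = response.read()
    response.close()
    return source

# An incomplete URL is rejected before any network request is made
print(safe_get_source("www.example.com"))
```

Returning None keeps the sketch short; you could just as well ask the user for the URL again in a loop.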