Is there an open source web crawler project, written in Python, that I can study?
I've been researching this for some time, but I haven't found anything ready-made. My goal is to learn enough to create an open source project with the following features:
- Download the HTML of a specific link
- Get the content of specific tags, for example: <p>, <h1>
- Save the content to a MySQL database
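
To make the three features above concrete, here is a rough sketch of how they might fit together using only Python's standard library (`urllib.request` and `html.parser`). The names `TagTextExtractor`, `download_html`, `extract_tags`, and `save_results`, as well as the `page_content` table and its columns, are hypothetical and chosen only for illustration. The save step assumes a connection object from some MySQL driver that follows the Python DB-API and accepts `%s` placeholders; no particular driver is implied.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TagTextExtractor(HTMLParser):
    """Collects the text found inside the wanted tags (e.g. p, h1).

    Simplification: assumes each wanted tag is properly closed and
    that wanted tags are not nested inside each other.
    """

    def __init__(self, wanted_tags):
        super().__init__()
        self.wanted_tags = set(wanted_tags)
        self.results = []      # list of (tag, text) pairs
        self._current = None   # tag currently being captured, if any
        self._buffer = []      # text pieces seen inside that tag

    def handle_starttag(self, tag, attrs):
        # Start capturing only if we are not already inside a wanted tag.
        if self._current is None and tag in self.wanted_tags:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = "".join(self._buffer).strip()
            if text:
                self.results.append((tag, text))
            self._current = None


def download_html(url):
    """Feature 1: download the HTML of a specific link."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_tags(html, wanted_tags=("p", "h1")):
    """Feature 2: return (tag, text) pairs for the wanted tags."""
    parser = TagTextExtractor(wanted_tags)
    parser.feed(html)
    parser.close()
    return parser.results


def save_results(connection, url, results):
    """Feature 3: store the extracted text.

    Assumption: `connection` is a DB-API connection from a MySQL driver
    that uses %s placeholders, and a hypothetical table
    page_content(url, tag, content) already exists.
    """
    cursor = connection.cursor()
    cursor.executemany(
        "INSERT INTO page_content (url, tag, content) VALUES (%s, %s, %s)",
        [(url, tag, text) for tag, text in results],
    )
    connection.commit()
    cursor.close()
```

With these pieces, a run would look roughly like `save_results(conn, url, extract_tags(download_html(url)))`, where `conn` is whatever connection the chosen MySQL driver returns. Text inside nested tags such as `<b>` within a `<p>` is kept as part of the enclosing paragraph.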
So I would like a starting point for developing this in Python in a simple way. If you have an idea of how to do it (in code), please help!
Note: my knowledge of Python is currently basic.