Scrapit is an API for scraping webpages for keywords.
Using Scrapit, you can extract the important keywords from a webpage, that is, the ones most relevant to the page being scraped.
Scrapit is built on Python, since Python has some great libraries for HTML and text parsing.
Scrapit uses lxml along with BeautifulSoup for processing and parsing html.
Using lxml gives a significant increase in parsing speed.
It also uses Topia.termextract to extract keywords from the heaps of text on a webpage and to filter out stopwords.
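To make the idea concrete, here is a minimal sketch of the same pipeline (parse the HTML, pull out the text, drop stopwords, rank the remaining words). It is not Scrapit's actual code: it swaps lxml, BeautifulSoup and Topia.termextract for the Python standard library so that it runs without extra installs. The names TextExtractor, extract_keywords and STOPWORDS, and the tiny stopword list, are assumptions made for illustration only.

```python
from collections import Counter
from html.parser import HTMLParser
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "for", "it", "that", "with", "on"}


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def extract_keywords(html, repeated_only=False):
    """Return (word, count) pairs, most frequent first, with stopwords removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # Words of two or more characters, starting with a letter.
    words = re.findall(r"[a-z][a-z'-]+", text)
    counts = Counter(w for w in words if w not in STOPWORDS)
    if repeated_only:
        # Keep only words that occur more than once on the page.
        counts = Counter({w: n for w, n in counts.items() if n > 1})
    return counts.most_common()


if __name__ == "__main__":
    page = "<html><body><p>Python parsing and Python scraping</p></body></html>"
    print(extract_keywords(page))
```

Plain word frequency is a much cruder signal than a real term extractor, so treat this only as an outline of the steps, not as a substitute for the libraries named above.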
Created By: Virendra Rajput
Using the API:
You need to make calls to
q : (required) URL to be fetched
occurs : (optional) Returns only the words that occur more than once on the webpage. Set to '1' to enable it.
pretty : (optional) Pretty-prints the response. Set to '1' to enable it.
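The parameters above can be combined into a request URL roughly as follows. This is only a sketch: the endpoint is not given in this document, so API_ENDPOINT is a hypothetical placeholder that must be replaced with the real address, and build_request_url is an illustrative helper, not part of Scrapit. It also assumes the parameters are passed as a URL query string.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real endpoint is not stated in this document.
API_ENDPOINT = "http://example.invalid/scrapit"


def build_request_url(page_url, occurs=False, pretty=False, endpoint=API_ENDPOINT):
    """Build a request URL from the q, occurs and pretty parameters."""
    params = {"q": page_url}  # required: URL to be fetched
    if occurs:
        params["occurs"] = "1"  # optional: only words that occur more than once
    if pretty:
        params["pretty"] = "1"  # optional: pretty-print the response
    # urlencode percent-encodes the page URL so it is safe inside a query string.
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_request_url("http://example.com/page", occurs=True, pretty=True))
```

The optional parameters are left out entirely unless enabled, which matches the description of setting them to '1' only when you want them.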
(Please note that the API is still under development, so the results might not be as expected.)