Scrape, normalize and mine Google News with Python
If Google News had a Python library.
A Python wrapper of the Google News RSS feed.
Top stories, topic-related news feeds, geolocation news feed, and an extensive full-text search feed.
This work is more of a collection of everything we could find out about how Google News functions.
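As a rough illustration of what sits underneath the wrapper, here is a minimal sketch of how a Google News RSS URL might be assembled for top stories or a full-text search. The helper name `build_feed_url` is hypothetical and not part of this package, and the `hl`, `gl` and `ceid` query parameters reflect our understanding of the public feed URLs rather than a documented contract.

```python
from urllib.parse import urlencode

# Public RSS endpoint; the /search path is used for full-text queries.
BASE_URL = "https://news.google.com/rss"


def build_feed_url(lang="en", country="US", query=None):
    """Return an RSS URL for top stories, or for a search if query is given.

    Hypothetical helper for illustration only; parameter names hl, gl and
    ceid are assumptions about how the feed selects language and country.
    """
    params = {}
    path = ""
    if query is not None:
        path = "/search"
        params["q"] = query
    params["hl"] = lang                    # interface language
    params["gl"] = country                 # country edition
    params["ceid"] = f"{country}:{lang}"   # country:language pair
    # Keep ':' unescaped so the ceid value stays readable.
    return f"{BASE_URL}{path}?{urlencode(params, safe=':')}"


top_stories_url = build_feed_url()
search_url = build_feed_url(query="new product")
```

The resulting URLs can be handed to any RSS parser; the package itself takes care of this step, so the sketch is only meant to show where the language and country settings end up.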
- 1. Integrating a news feed into your platform/application/website
- 2. Collecting data by topic to train your own ML model
- 3. Searching for the latest mentions of your new product
- 4. Media monitoring of people/organizations (PR)
Before we start: if you want to integrate Google News data into a production system, I would advise you to use one of the 3 methods described below. Why? Because you do not want your server's IP address to be blocked by Google. Every function call sends an HTTPS request to Google's servers. Don't get me wrong, this Python package still works out of the box.
- 3. Your own proxy: already have a pool of proxies? Each function in this package has a proxies parameter (a Python dictionary) where you just paste your own proxies.
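A minimal sketch of what such a proxies dictionary could look like, assuming it follows the scheme-to-URL format used by the requests library. The helper `build_proxies` is hypothetical, the address is a placeholder, and the commented call at the end is our assumption about how the parameter is passed.

```python
def build_proxies(host, port, user=None, password=None):
    """Build a requests-style proxies dictionary for a single proxy server.

    Hypothetical helper for illustration; not part of this package.
    """
    # Embed credentials only when both parts are provided.
    credentials = f"{user}:{password}@" if user and password else ""
    proxy_url = f"http://{credentials}{host}:{port}"
    # In this sketch the same proxy carries both HTTP and HTTPS traffic.
    return {"http": proxy_url, "https": proxy_url}


# Placeholder address; replace with a proxy from your own pool.
proxies = build_proxies("10.10.1.10", 3128)

# Assumed usage, based on the proxies parameter described above:
# from pygooglenews import GoogleNews
# gn = GoogleNews()
# top = gn.top_news(proxies=proxies)
```

With a pool of proxies, you would build one dictionary per proxy and rotate between them across calls.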
v0.1.1 -- fixed language-country issues