Get More Than 10,000 Articles
Learn to fetch more than 10,000 articles
When you make an API call using the /v2/search endpoint, our news API returns some key information. For example, an API call about "Tesla" for the last week returns a response that includes a total_hits field.
The total_hits field tells you how many articles were found.
One API search can yield a maximum of 10,000 hits.
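As a minimal sketch of how you might detect that ceiling in your own code, the snippet below assembles the parameters for a /v2/search call and checks whether a response has reached the limit. The parameter names ("q", "from", "to"), the helper names, and the dates are assumptions for illustration, and the response is a hand-written stand-in rather than real API output.

```python
MAX_HITS = 10_000  # per-search ceiling described above


def build_search_params(query, from_date, to_date):
    """Assemble query parameters for a /v2/search request.

    The parameter names ("q", "from", "to") are assumed here;
    check the API reference for the exact names.
    """
    return {"q": query, "from": from_date, "to": to_date}


def is_capped(response):
    """Return True when total_hits has reached the 10,000 ceiling."""
    return response.get("total_hits", 0) >= MAX_HITS


# Hand-written stand-in for a response, not real API output.
example_response = {"total_hits": 10_000}
params = build_search_params("Tesla", "2024-01-01", "2024-01-08")
print(params, is_capped(example_response))
```

A capped response is a signal to split the query, not proof that results are missing: a query can match exactly 10,000 articles.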
When your API call returns 10,000 total_hits, chances are that there are more than 10,000 news articles for your query. The Tesla example API call above actually matches 25,290 articles. To get all of these results, you have two choices:
Manually divide your search query into smaller time periods and then combine the results.
Use the get_search_all_articles method from our Python SDK to automatically divide your query.
The idea is simple: break down your query into smaller time periods so that each sub-query stays within the 10,000-hit limit.
Continuing the Tesla search example, to get all 25,290 news articles you'll need to make a number of smaller search requests. Technically, three requests could cover them, since 25,290 articles at up to 10,000 per request rounds up to three. That's not advisable, though, because you don't know how many articles were published in a given time period without making additional API calls.
For instance, Day 4 of the example week has 7,841 articles, so combining it with either Day 3 or Day 5 would exceed the 10,000-hit limit. And you can't tell whether your own search query has a day like Day 4 without making a bunch of API calls.
The recommended approach for dividing your search query is to go down one level in the time scale. If your search query spans a year or multiple years, break it down into individual months. If it spans months, break it down into weeks, and so on.
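The "one level down" rule can be sketched as a small helper. The function name and the exact thresholds (365, 28, and 7 days) are assumptions chosen for illustration; the returned labels are descriptive and are not meant to be passed to the API or the SDK as-is.

```python
from datetime import datetime, timedelta


def suggest_granularity(start, end):
    """Pick a sub-query size one level below the span of the search.

    The cut-off values are illustrative assumptions, not documented limits.
    """
    span = end - start
    if span >= timedelta(days=365):
        return "month"  # a year or more: split into months
    if span >= timedelta(days=28):
        return "week"   # roughly a month or more: split into weeks
    if span >= timedelta(days=7):
        return "day"    # a week or more: split into days
    return "hour"       # anything shorter: split into hours


# The one-week Tesla example would be split into days.
print(suggest_granularity(datetime(2024, 1, 1), datetime(2024, 1, 8)))
```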
So our Tesla query, which covers the last week, would be divided into seven sub-queries, one for each day.
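The daily split can be sketched as follows. This is a minimal illustration using only the standard library; the function name and the example dates are hypothetical, and each resulting pair would become the start and end of one sub-query.

```python
from datetime import datetime, timedelta


def split_into_windows(start, end, step=timedelta(days=1)):
    """Split the range from start to end into consecutive (start, end) pairs."""
    windows = []
    cursor = start
    while cursor < end:
        # Clamp the last window so it never runs past the overall end.
        window_end = min(cursor + step, end)
        windows.append((cursor, window_end))
        cursor = window_end
    return windows


# Hypothetical one-week range, for illustration only.
week = split_into_windows(datetime(2024, 1, 1), datetime(2024, 1, 8))
for window_start, window_end in week:
    print(window_start.date(), "->", window_end.date())
```

Passing a different step, such as timedelta(hours=1), gives finer windows for a day that is still capped.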
Once you have the responses to all these sub-queries, you can combine the lists of articles to get all the news articles that match your search query.
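Combining the responses might look like the sketch below. It assumes each response is a dictionary with an "articles" list and that each article carries an "_id" field; both names are assumptions to verify against a real response. Skipping repeated identifiers is an optional precaution added here, in case adjacent time windows return the same article.

```python
def combine_articles(responses, key="_id"):
    """Concatenate the article lists from several responses, skipping repeats.

    The "articles" and "_id" field names are assumptions for illustration.
    """
    seen = set()
    combined = []
    for response in responses:
        for article in response.get("articles", []):
            identifier = article.get(key)
            if identifier is not None:
                if identifier in seen:
                    continue  # already collected from another sub-query
                seen.add(identifier)
            combined.append(article)
    return combined


# Hand-written stand-ins for two sub-query responses, not real API output.
day_1 = {"articles": [{"_id": "a"}, {"_id": "b"}]}
day_2 = {"articles": [{"_id": "b"}, {"_id": "c"}]}
print(len(combine_articles([day_1, day_2])))
```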
get_search_all_articles Method
You can also automate this process of dividing your queries with the get_search_all_articles method. It takes your query and, based on the by parameter, divides it into a bunch of sub-queries. It then returns the combination of all their results.
To get all the articles for the Tesla query, you can simply call get_search_all_articles with your query.
The by parameter accepts four values: 'hour', 'day', 'week', and 'months'. It is set to 'week' by default.
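A thin wrapper around the SDK call could look like the sketch below. The wrapper name is hypothetical, the q keyword is an assumption about the method's signature, and the tuple of allowed values simply mirrors the four values listed above. The client argument is expected to be an SDK client object that exposes get_search_all_articles; creating that client is not shown.

```python
VALID_BY = ("hour", "day", "week", "months")  # the four values listed above


def fetch_all_articles(client, query, by="week", **search_params):
    """Call get_search_all_articles after validating the by value.

    client is any object exposing get_search_all_articles; the q keyword
    is an assumption, and extra keyword arguments pass through unchanged.
    """
    if by not in VALID_BY:
        raise ValueError(f"by must be one of {VALID_BY}, got {by!r}")
    return client.get_search_all_articles(q=query, by=by, **search_params)
```

With a real SDK client, a call such as fetch_all_articles(client, "Tesla", by="day") would request the Tesla articles split into daily sub-queries.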