v3 Search News
Main endpoint that allows you to find news articles by keyword, date, language, country, and more.
Get News
GET
https://v3-api.newscatcherapi.com/api/search?q=Apple&from_=1 day ago&countries=CA&page_size=1
Query Parameters
Name | Type | Description |
---|---|---|
q* | string | Keyword/keywords you're searching for. This is the most important part of your query.
You can set this to |
lang | array | Specifies the languages of the search. For example, |
not_lang | array | Inverse to the |
search_in | string | By default, we search what you specified in the
|
countries | array | Countries where the news publisher is located.
Important: This parameter is not responsible for the countries mentioned in the news article.
One or multiple countries can be used in the search.
The only acceptable format is ISO 3166-1 alpha-2
For example, |
not_countries | array | The inverse of the |
sources | array | One or more news sources to narrow down your search. The format should be a domain url from your URL. Subdomains, like |
not_sources | array | One or more sources to be excluded from the search. Comma-separated string or a list/array. For example, |
predefined_sources | string | Use our top predefined sources per country. Later we are going to improve it and add more functionality, like top categories etc. The format should be strictly like this: - starting with word - put the number of desired sources top source - 2 letter country code ISO 3166-1 alpha-2 For example:
It is also possible to specify multiple countries, each with a custom number of top sources, separated by commas. For example:
|
source_name | array | NewsCatcher team does not suggest using source_name in Production. The best parameter to get data from a specific News Domain is sources. One or more news source names to narrow down your search. Comma-separated string or a list/array. For example:
|
parent_url | array | One or more categorical URL to filter your search. It should be the normal form of the URL,
For example, |
ranked_only | boolean | Default: |
from_rank | integer |
|
to_rank | integer |
|
sort_by | string |
|
page_size | integer |
|
page | integer | The page number. Use it to paginate through the results, because a single API response cannot return more than 1000 articles. |
to_ | string | Until which point in time to search for. The default timezone is UTC.
Available formats:
English phrases like |
from_ | string | From which point in time to start the search. Defaults to the past week.
Available formats:
|
published_date_precision | string | There are 3 types of date precision we define:
|
by_parse_date | boolean | When set to Be aware that a new variable parse_date will be added to the output list with each article. |
is_headline | boolean | When set to |
all_links | array | Search for a specific URL mentioned in the article. Please refer to the All Links And Domains Format page for more examples and explanations. |
not_author_name | array or string | List of author names that you want to exclude from your search. Can be a list of strings or a comma-separated string containing all the author names. Usually, you might want to exclude articles where one of the authors is Associated Press or PRNewswire. For example:
|
all_domain_links | array | Search for desired domain URL mentioned in the article. Please, refer to the All Links And Domains Format page for more examples and explanations. |
word_count_min | integer | Set a minimum number of words that an article must contain. Use it to avoid articles with very little content. |
word_count_max | integer | Set a maximum number of words that an article may contain. Use it to avoid articles with very long content. |
is_paid_content | boolean | [Still in development phase] When set to Some news publishers partially block content of their articles, so we get only several sentences from them. This filter will help you get full content. |
is_opinion | boolean | [Still in development phase]
When set to |
include_nlp_data | boolean | When set to Not available for all plans. Please contact us to enable it. |
has_nlp | boolean | [Available only if NLP is enabled for your API key] When set to |
theme | string | [Available only if NLP is enabled for your API key] A general topic of an article. Topic labelling is based on the actual content of an article. Accepted values:
Comma-separated string or a list/array. Multiple themes can be selected. For example:
|
not_theme | string | [Available only if NLP is enabled for your API key] Inverse of the |
title_sentiment_min | float | [Available only if NLP is enabled for your API key] Narrow down your search to only positive or negative news based on the article's title sentiment. The value can vary from |
title_sentiment_max | float | [Available only if NLP is enabled for your API key] Narrow down your search to only positive or negative news based on the article's title sentiment. The value can vary from |
content_sentiment_min | float | [Available only if NLP is enabled for your API key] Narrow down your search to only positive or negative news based on the article's content sentiment. The value can vary from |
content_sentiment_max | float | [Available only if NLP is enabled for your API key] Narrow your search to only positive or negative news based on the article's content sentiment. The value can vary from |
clustering_enabled | boolean | [Available only if NLP is enabled for your API key] When set to True, enables clustering on articles. Instead of a flat list of articles, you will be given a list of clusters that group similar articles together. Please refer to the Clustering News Articles page for more examples and explanations. |
clustering_threshold | float | [Available only if NLP is enabled for your API key] Set a threshold for an article to be similar. Default value: The value can vary from |
clustering_variable | string | [Available only if NLP is enabled for your API key] Select the data on which you want the similarity to be calculated. Accepted values:
Default value:
|
ORG_entity_name | string | [Available only if NLP is enabled for your API key] ORG stands for Organisation. We identify company names mentioned in articles and enable you to search for them. More information on Search By Entity |
PER_entity_name | string | [Available only if NLP is enabled for your API key] PER stands for Person. We identify people's names mentioned in articles and enable you to search for them. More information on Search By Entity |
LOC_entity_name | string | [Available only if NLP is enabled for your API key] LOC stands for Location. We identify geographical locations mentioned in articles and enable you to search for them. More information on Search By Entity |
MISC_entity_name | string | [Available only if NLP is enabled for your API key] MISC stands for Miscellaneous. We identify products and other names mentioned in articles and enable you to search for them. More information on Search By Entity |
iptc_tags | string | [Available only if tags are enabled for your API key] We label articles with IPTC tags based on the content and enable you to filter articles based on the tags. Only IPTC tag IDs can be used in this parameter. For example, |
not_iptc_tags | string | [Available only if tags are enabled for your API key] Inverse of the |
exclude_duplicates | boolean | [Available only for English-language articles]
If |
Headers
Name | Type | Description |
---|---|---|
x-api-token* | string | Your unique authentication token |
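
As a minimal sketch of the GET request described above, the following Python snippet builds the example URL and attaches the `x-api-token` header using only the standard library. The helper name `build_search_request` and the placeholder token are illustrative assumptions, and joining list values with commas is an assumption based on the "comma-separated string or a list/array" wording in the table. Nothing is sent over the network here.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://v3-api.newscatcherapi.com/api/search"


def build_search_request(api_token, **params):
    """Build (but do not send) a GET request for the search endpoint."""
    query = {}
    for name, value in params.items():
        if value is None:
            continue  # skip parameters that were not set
        if isinstance(value, (list, tuple)):
            # Assumption: array parameters are sent as a comma-separated string.
            value = ",".join(str(item) for item in value)
        query[name] = value
    url = f"{BASE_URL}?{urlencode(query)}"
    return Request(url, headers={"x-api-token": api_token}, method="GET")


request = build_search_request(
    "YOUR_API_TOKEN",  # placeholder, replace with your own token
    q="Apple",
    from_="1 day ago",
    countries=["CA"],
    page_size=1,
)
print(request.full_url)
```

To actually run the search, pass the request to `urllib.request.urlopen` and decode the JSON body of the response.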
Get News
POST
https://v3-api.newscatcherapi.com/api/search?
Headers
Name | Type | Description |
---|---|---|
x-api-token* | string | Your unique authentication token |
Request Body
Name | Type | Description |
---|---|---|
q* | string | Keyword/keywords you're searching for. This is the most important part of your query. Please, refer to the Advanced Query Parameter section for more examples and explanations. |
lang | array | Specifies the languages of the search. For example, |
not_lang | array | Inverse to the |
published_date_precision | string | There are 3 types of date precision we define:
|
search_in | string | By default, we search what you specified in the
|
countries | array | Countries where the news publisher is located.
Important: This parameter is not responsible for the countries mentioned in the news article.
One or multiple countries can be used in the search.
The only acceptable format is ISO 3166-1 alpha-2
For example, |
not_countries | array | The inverse of the |
sources | array | One or more news sources to narrow down your search. The format should be a domain url from your URL. Subdomains, like |
not_sources | array | One or more sources to be excluded from the search. Comma-separated string or a list/array. For example, |
ranked_only | boolean | Default: |
from_rank | integer |
|
to_rank | integer |
|
sort_by | string |
|
page_size | integer |
|
page | integer | The page number. Use it to paginate through the results, because a single API response cannot return more than 1000 articles. |
to_ | string | Until which point in time to search for. The default timezone is UTC.
Available formats:
English phrases like |
from_ | string | From which point in time to start the search. Defaults to the past week.
Available formats:
|
by_parse_date | boolean | When set to Be aware that a new variable parse_date will be added to the output list with each article. |
is_headline | boolean | When set to |
is_opinion | boolean | [Still in development phase]
When set to |
parent_url | array | One or more categorical URL to filter your search. It should be the normal form of the URL,
For example, |
all_links | array | Search for desired URL mentioned in the article. Please, refer to the All Links And Domains Format section for more examples and explanations. |
all_domain_links | array | Search for desired domain URL mentioned in the article. Please, refer to the All Links And Domains Format section for more examples and explanations. |
word_count_min | integer | Set a minimum number of words that an article must contain. Use it to avoid articles with very little content. |
word_count_max | integer | Set a maximum number of words that an article may contain. Use it to avoid articles with very long content. |
include_nlp_data | boolean | When set to Not available for all plans. Please contact us to enable it. |
theme | string | [Available only if NLP is enabled for your API key] A general topic of an article. Topic labelling is based on the actual content of an article. Accepted values:
Comma-separated string or a list/array. Multiple themes can be selected. For example:
|
ORG_entity_name | string | [Available only if NLP is enabled for your API key] ORG stands for Organisation. We identify company names mentioned in articles and enable you to search for them. More information on Search By Entity |
has_nlp | boolean | [Available only if NLP is enabled for your API key] When set to |
title_sentiment_min | float | [Available only if NLP is enabled for your API key] Narrow down your search to only positive or negative news based on the article's title sentiment. The value can vary from |
title_sentiment_max | float | [Available only if NLP is enabled for your API key] Narrow down your search to only positive or negative news based on the article's title sentiment. The value can vary from |
content_sentiment_min | float | [Available only if NLP is enabled for your API key] Narrow down your search to only positive or negative news based on the article's content sentiment. The value can vary from |
content_sentiment_max | float | [Available only if NLP is enabled for your API key] Narrow down your search to only positive or negative news based on the article's content sentiment. The value can vary from |
is_paid_content | boolean | [Still in development phase] When set to Some news publishers partially block content of their articles, so we get only several sentences from them. This filter will help you get full content. |
clustering_enabled | boolean | [Available only if NLP is enabled for your API key] When set to True, enables clustering on articles. Instead of a flat list of articles, you will be given a list of clusters that group similar articles together. Please refer to the Deduplicate Data With Clustering section for more examples and explanations. |
clustering_threshold | float | [Available only if NLP is enabled for your API key] Set a threshold for an article to be similar. Default value: The value can vary from |
clustering_variable | string | [Available only if NLP is enabled for your API key] Select the data on which you want the similarity to be calculated. Accepted values:
Default value:
|
PER_entity_name | string | [Available only if NLP is enabled for your API key] PER stands for Person. We identify people's names mentioned in articles and enable you to search for them. More information on Search By Entity |
LOC_entity_name | string | [Available only if NLP is enabled for your API key] LOC stands for Location. We identify geographical locations mentioned in articles and enable you to search for them. More information on Search By Entity |
MISC_entity_name | string | [Available only if NLP is enabled for your API key] MISC stands for Miscellaneous. We identify products and other names mentioned in articles and enable you to search for them. More information on Search By Entity |
predefined_sources | string | Use our top predefined sources per country. Later we are going to improve it and add more functionality, like top categories etc. The format should be strictly like this: - starting with word - put the number of desired sources top source - 2 letter country code ISO 3166-1 alpha-2 For example:
It is also possible to specify multiple countries, each with a custom number of top sources, separated by commas. For example:
|
not_iptc_tags | string | [Available only if tags are enabled for your API key] Inverse of the |
iptc_tags | string | [Available only if tags are enabled for your API key] We label articles with IPTC tags based on the content and enable you to filter articles based on the tags. Only IPTC tag IDs can be used in this parameter. For example, |
not_author_name | array | List of author names that you want to exclude from your search. Usually, you might want to exclude articles where one of the authors is Associated Press, or PRNewswire. For example:
|
not_theme | string | [Available only if NLP is enabled for your API key] Inverse of the |
source_name | array | NewsCatcher team does not suggest using source_name in Production. The best parameter to get data from a specific News Domain is sources. One or more news source names to narrow down your search. Comma-separated string or a list/array. For example:
|
exclude_duplicates | boolean | [Available only for English-language articles]
If |
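
The POST variant takes the same parameters in the request body instead of the query string. The sketch below assumes the body is JSON-encoded (the table above does not state the encoding), and the helper name `build_post_request` and the placeholder token are illustrative. As before, the request is built but not sent.

```python
import json
from urllib.request import Request

SEARCH_URL = "https://v3-api.newscatcherapi.com/api/search"


def build_post_request(api_token, body):
    """Build (but do not send) a POST request for the search endpoint."""
    # Drop unset parameters so only explicit filters reach the API.
    payload = {name: value for name, value in body.items() if value is not None}
    data = json.dumps(payload).encode("utf-8")
    headers = {
        "x-api-token": api_token,
        # Assumption: the endpoint expects a JSON-encoded body.
        "Content-Type": "application/json",
    }
    return Request(SEARCH_URL, data=data, headers=headers, method="POST")


post_request = build_post_request(
    "YOUR_API_TOKEN",  # placeholder, replace with your own token
    {
        "q": "Apple",
        "countries": ["CA"],
        "from_": "1 day ago",
        "page_size": 1,
        "lang": None,  # unset, so it is left out of the body
    },
)
print(post_request.data.decode("utf-8"))
```

With a JSON body, array parameters such as `countries` can stay as lists rather than being flattened into comma-separated strings.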
Successful Request Response
Return Body Fields
Object | Sub Object | Description |
| Returns Returns | |
| How many news articles match your search criteria. The maximum is 10,000. | |
| The page where you are at | |
| How many pages you can access given your page_size parameter | |
| How many news articles are in the returned JSON object | |
| News articles found. | |
| The title of the article | |
| The author of the article | |
| List of all author names | |
| Clean list of journalists. No news publication names, only people. | |
| Published date & time | |
| Accuracy of the There are 3 types of date precision we define:
| |
| Updated date & time | |
| Accuracy of the There are 3 types of date precision we define:
| |
| Full URL where the article was originally published | |
| The domain URL of the article's source | |
| The full domain URL with a subcategory of the article's source | |
| The common name of the News Source | |
| True when an article has been seen on the main page of the news source. | |
| The URL where an article was initially found | |
| The country of the publisher | |
| Copyright | |
| The page rank of the source website (which is given in the | |
| A link to a thumbnail image of the article | |
| The language of the article | |
| Short summary of the article provided by the publisher | |
| The full content of the article | |
| Number of words in the article's content | |
|
| |
| The Twitter account of the publisher | |
| All URL links embedded in the article's content HTML | |
| All domain URL embedded in the article's content HTML | |
| Depending on your plan, you can have:
- | |
|
| Newscatcher API's unique identifier for each news article |
| How well the article is matching your search criteria. | |
| The number of duplicates associated with the original article. | |
| A unique identifier for duplicates associated with the original article. | |
| An object that returns how the API saw your request. It shows you which parameters have been used to perform a search. Useful for debugging, especially to check if there is any problem with URL encoding |
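
Because a single response is capped and the response reports how many pages are available, collecting a full result set means looping over the `page` parameter. The sketch below shows one way to do that. The field names are not visible in the table above, so the keys `"articles"` and `"total_pages"` are assumptions, and `fetch_page` is a hypothetical callable that performs one search request and returns the decoded JSON. A fake fetcher stands in for the API so the example runs offline.

```python
def iter_articles(fetch_page, max_pages=None):
    """Yield articles page by page.

    `fetch_page(page)` is a hypothetical callable that performs one search
    request for the given page number and returns the decoded JSON response.
    The keys "articles" and "total_pages" are assumed field names.
    """
    page = 1
    while True:
        response = fetch_page(page)
        yield from response.get("articles", [])
        total_pages = response.get("total_pages", page)
        if page >= total_pages or (max_pages is not None and page >= max_pages):
            break
        page += 1


# Offline stand-in for the API: three pages holding five fake articles.
FAKE_ARTICLES = [{"title": f"Article {i}"} for i in range(1, 6)]


def fake_fetch(page, page_size=2):
    start = (page - 1) * page_size
    return {
        "page": page,
        "total_pages": 3,
        "articles": FAKE_ARTICLES[start:start + page_size],
    }


titles = [article["title"] for article in iter_articles(fake_fetch)]
print(titles)
```

The `max_pages` argument is a safety cap for large result sets, so a broad query does not consume more API calls than intended.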
Supercharge Your News Searches
A lot can be done using the search endpoint, and we are constantly working on new functionalities to make it more powerful and valuable. Here are the functionalities that we recommend:
Make More Precise Query
There's a lot you can do with simple keyword-based searches. Using exact matching and boolean operators, you can exclude a set of words or combine multiple queries related to a topic of interest into one query. There's also proximity-based searching for more sophisticated querying.
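
As a rough illustration of combining exact matching with boolean operators in the `q` parameter, the sketch below assembles a query string from small helpers. The operator spelling (`AND`, `OR`, `NOT`, double quotes for exact phrases, parentheses for grouping) is an assumption here, and the helper names are hypothetical. Check the Advanced Querying page for the syntax the API actually accepts.

```python
def exact(phrase):
    """Wrap a phrase in double quotes for exact matching (assumed syntax)."""
    return '"' + phrase.replace('"', "") + '"'


def any_of(*terms):
    """Group alternatives so that any one of them may match (assumed syntax)."""
    return "(" + " OR ".join(terms) + ")"


def build_query(required, excluded=()):
    """Join required terms with AND, then exclude unwanted terms with NOT."""
    query = " AND ".join(required)
    for term in excluded:
        query += f" NOT {term}"
    return query


q = build_query(
    required=[exact("Apple Inc"), any_of("iPhone", "MacBook")],
    excluded=["recipe"],
)
print(q)
```

The resulting string would be passed as the value of `q`, with URL encoding left to the HTTP client.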
Advanced Querying
Group Similar Articles
Most significant events and topics get massive coverage from tens of news publications. Some media conglomerates even republish the articles. You or your analysts don't need to sift through all of these articles with duplicate information; use the clustering functionality in our search endpoint to get groups of articles with distinct bits of information.
Filter Out Duplicates
Multiple sources often publish the same news stories, leading to duplicated content. Our deduplication feature filters out these redundant articles, ensuring you receive only unique and relevant news content for your analysis.
Deduplicate Articles
Apple Or apple?
Sometimes, it's necessary to specify whether the keyword you're looking for is an organization or a person. For instance, take the tech giant Apple. If you were looking for articles about a recent development at Apple, you wouldn't want to get articles about apple prices or orchards, would you?
Search By Entity
What Is This Linked To?
The links in an article can serve as valuable markers. Want to find all articles talking about a specific research paper or a press release? Which company named XYZ is this referring to?
Search By URL