
v3 Search News

The main endpoint, which lets you find news articles by keyword, date, language, country, etc.
get
https://v3-api.newscatcherapi.com
/api/search?q=Apple&from_=1 day ago&countries=CA&page_size=1
Get News
Parameters
Query
q*
string
The keyword or keywords you're searching for. This is the most important part of your query. You can set this to '*' if you don't want to look for specific keywords. Please refer to the Advanced Querying page for more examples and explanations.
search_in
string
By default, we search what you specified in the q parameter in both title and content of the article. However, you can choose between:
-title
-content
-summary (if enabled for your plan)
-title,summary
-content,summary
sources
array
One or more news sources to narrow down your search.
The format should be the domain of the source's URL. Subdomains, like finance.yahoo.com, are also accepted. Comma-separated string or a list/array. For example, nytimes.com,theguardian.com,finance.yahoo.com
not_sources
array
One or more sources to be excluded from the search. Comma-separated string or a list/array.
For example, cnn.com,wsj.com
predefined_sources
string
Use our top predefined sources per country.
Later we are going to improve it and add more functionality, like top categories etc.
The format should be strictly like this:
- start with the word top
- then the number of top sources you want
- then a 2-letter country code (ISO 3166-1 alpha-2)
For example:
top 100 US
top 33 AT
top 5 GB
You can also specify multiple countries, optionally with a custom number of top sources for each, separated by commas.
For example:
top 100 US, GB
top 33 AT, 55 IT
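The format rules above can be sketched as a small helper that builds a predefined_sources value. This is only an illustration: the function name build_predefined_sources and its validation rules are assumptions, not part of the API, and it covers only the form where each country carries its own count (top 33 AT, 55 IT), not the shared-count form (top 100 US, GB).

```python
import re


def build_predefined_sources(counts):
    """Build a value such as 'top 33 AT, 55 IT'.

    counts maps a 2-letter country code to the number of top sources wanted.
    Hypothetical helper: the API itself only sees the resulting string.
    """
    if not counts:
        raise ValueError("at least one country is required")
    parts = []
    for code, number in counts.items():
        if not re.fullmatch(r"[A-Z]{2}", code):
            raise ValueError(f"not a 2-letter country code: {code!r}")
        if not isinstance(number, int) or number < 1:
            raise ValueError(f"source count must be a positive integer: {number!r}")
        parts.append(f"{number} {code}")
    return "top " + ", ".join(parts)


print(build_predefined_sources({"US": 100}))
print(build_predefined_sources({"AT": 33, "IT": 55}))
```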
lang
array
Specifies the languages of the search. For example, en. The only accepted format is ISO 639-1 — 2 letter code. Refer to the language format section for more details.
not_lang
array
Inverse to the lang parameter
countries
array
Countries where the news publisher is located. Important: this parameter does not filter by the countries mentioned in the news article. One or multiple countries can be used in the search. The only acceptable format is ISO 3166-1 alpha-2. For example, US,CA,MX or just US
not_countries
array
The inverse of the countries parameter.
from_
string
From which point in time to start the search. Defaults to the past week. Available formats:
- YYYY/mm/dd
- YYYY/mm/dd HH:MM:SS
- English phrases like 1 day ago
to_
string
Until which point in time to search. The default timezone is UTC. Available formats:
- YYYY/mm/dd
- YYYY/mm/dd HH:MM:SS
- English phrases like 1 day ago
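As a rough sketch of the date formats above, the helper below turns a timezone-aware Python datetime into the YYYY/mm/dd HH:MM:SS form. The name format_api_date is hypothetical, and converting to UTC first is an assumption based on the note that UTC is the default timezone.

```python
from datetime import datetime, timedelta, timezone


def format_api_date(moment, with_time=True):
    """Format an aware datetime as 'YYYY/mm/dd HH:MM:SS' (or 'YYYY/mm/dd')."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    # Assumption: the API interprets values without a timezone as UTC.
    in_utc = moment.astimezone(timezone.utc)
    pattern = "%Y/%m/%d %H:%M:%S" if with_time else "%Y/%m/%d"
    return in_utc.strftime(pattern)


# Search window covering the last 24 hours.
now = datetime.now(timezone.utc)
window = {
    "from_": format_api_date(now - timedelta(days=1)),
    "to_": format_api_date(now),
}
print(window)
```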
published_date_precision
string
There are 3 types of date precision we define:
- full: the day and time of an article are correctly identified with the appropriate timezone
- timezone unknown: the day and time of an article are correctly identified without a timezone
- date: only the day is identified, without an exact time
by_parse_date
boolean
When set to True, your from_ and to_ parameters filter by parse_date instead of published_date.
Be aware that a new field, parse_date, will be added to each article in the output.
sort_by
string
- relevancy (default value): the most relevant results first
- date: the most recently published results first
- rank: the results from the highest-ranked sources first
ranked_only
boolean
Default: True. Limits the search to sources that are in the top 1 million online websites. Unranked sources are assigned a rank of 999999.
from_rank
integer
[0:999999] The lower boundary of the rank of a news website to filter by. Important: a lower rank means that a source is more popular.
to_rank
integer
[0:999999] The upper boundary of the rank of a news website to filter by.
is_headline
boolean
When set to True, only articles that were posted on the home page of a given news domain will be shown.
is_paid_content
boolean
[Still in development phase]
When set to False, only articles with fully publicly available content will be shown.
Some news publishers partially block the content of their articles, so we get only a few sentences from them. This filter helps you get full content.
parent_url
array
One or more category URLs to filter your search. Each should be in the normal form of the URL. For example, https://www.washingtonpost.com/politics,https://www.washingtonpost.com/technology,https://www.washingtonpost.com/business
all_links
array
Search for a specific URL mentioned in the article.
Please refer to the All Links And Domains Format page for more examples and explanations.
all_domain_links
array
Search for a specific domain mentioned in the article.
Please refer to the All Links And Domains Format page for more examples and explanations.
word_count_min
integer
Set a minimum number of words that an article must contain.
Use it to avoid articles with too little content.
word_count_max
integer
Set a maximum number of words that an article may contain.
Use it to avoid articles with too much content.
page_size
integer
[1:1000] How many articles to return per page.
page
integer
The page number. Use it to paginate through the results, because one API response cannot return more than 1000 articles.
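Pagination with page and page_size could look roughly like the loop below. It is a sketch under assumptions: iter_articles and fetch_page are hypothetical names, and fetch_page stands in for whatever function performs the HTTP call and returns the decoded JSON body with its page, total_pages, and articles fields.

```python
def iter_articles(fetch_page, page_size=100):
    """Yield every article across all pages.

    fetch_page(page=..., page_size=...) must return the decoded response body.
    """
    page = 1
    while True:
        body = fetch_page(page=page, page_size=page_size)
        yield from body.get("articles", [])
        # Stop once the last page reported by the API has been read.
        if page >= body.get("total_pages", 0):
            break
        page += 1


# Stand-in for a real HTTP call, so the loop can be tried offline.
FAKE_PAGES = {
    1: {"page": 1, "total_pages": 2, "articles": [{"id": "a"}, {"id": "b"}]},
    2: {"page": 2, "total_pages": 2, "articles": [{"id": "c"}]},
}


def fake_fetch(page, page_size):
    return FAKE_PAGES[page]


print([article["id"] for article in iter_articles(fake_fetch, page_size=2)])
```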
clustering_enabled
boolean
[Available only if NLP is enabled for your API key]
When set to True, enables clustering on articles. Instead of a flat list of articles, you get a list of clusters that group similar articles together.
Please refer to the Clustering News Articles page for more examples and explanations.
clustering_threshold
float
[Available only if NLP is enabled for your API key]
Set the similarity threshold used to decide whether articles are similar.
Default value: 0.6
The value can vary from 0 to 1.
clustering_variable
string
[Available only if NLP is enabled for your API key]
Select the data on which you want the similarity to be calculated.
Accepted values:
content, title, summary
Default value:
content
include_nlp_data
boolean
When set to True, adds an NLP layer to each article.
Not available for all plans. Please contact us to enable it.
has_nlp
boolean
[Available only if NLP is enabled for your API key]
When set to True, returns only articles that have an NLP layer.
theme
string
[Available only if NLP is enabled for your API key]
Accepted values:
Business, Economics, Entertainment, Finance, Health, Politics, Science, Sports, Tech, Crime, Lifestyle, Automotive, Travel, Weather, General
Comma-separated string or a list/array.
Multiple themes can be selected.
For example:
Business
Business, Finance
Topic labeling is based on the actual content of an article.
ORG_entity_name
string
[Available only if NLP is enabled for your API key]
ORG stands for Organisation.
We identify company names mentioned in articles and enable you to search for them.
More information on Search By Entity
PER_entity_name
string
[Available only if NLP is enabled for your API key]
PER stands for Person.
We identify people's names mentioned in articles and enable you to search for them.
More information on Search By Entity
LOC_entity_name
string
[Available only if NLP is enabled for your API key]
LOC stands for Location.
We identify geographical locations mentioned in articles and enable you to search for them.
More information on Search By Entity
MISC_entity_name
string
[Available only if NLP is enabled for your API key]
MISC stands for Miscellaneous.
We identify products and other names mentioned in articles and enable you to search for them.
More information on Search By Entity
title_sentiment_min
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's title sentiment.
The value can vary from -1 to 1.
title_sentiment_max
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's title sentiment.
The value can vary from -1 to 1.
content_sentiment_min
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's content sentiment.
The value can vary from -1 to 1.
content_sentiment_max
float
[Available only if NLP is enabled for your API key]
Narrow your search to only positive or negative news based on the article's content sentiment.
The value can vary from -1 to 1.
iptc_tags
string
[Available only if tags are enabled for your API key]
We label articles with IPTC tags based on the content and enable you to filter articles based on the tags.
Only IPTC tag IDs can be used in this parameter.
For example, 20000183,20000199,20000188 or just 20000188
not_iptc_tags
string
[Available only if tags are enabled for your API key]
Inverse of the iptc_tags parameter; it enables you to filter articles based on their IPTC tags.
Header
x-api-token*
string
Your unique authentication token
Responses
200
Success
403: Forbidden
Invalid API Key
406
Unsupported Parameters
408
Request Timeout
422: Unprocessable Entity
Parameter is not allowed
429
Too many API calls
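The GET request described above can be assembled with the Python standard library roughly as follows. This is a sketch, not an official client: build_search_request and ERROR_MEANINGS are hypothetical names, YOUR_API_TOKEN is a placeholder, and joining list values with commas is an assumption based on the "comma-separated string" wording in the parameter descriptions. Nothing is sent over the network here.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://v3-api.newscatcherapi.com/api/search"

# Meanings of the non-200 status codes listed above.
ERROR_MEANINGS = {
    403: "Invalid API Key",
    406: "Unsupported Parameters",
    408: "Request Timeout",
    422: "Parameter is not allowed",
    429: "Too many API calls",
}


def build_search_request(token, **params):
    """Return a GET request for the search endpoint without sending it."""
    flat = {
        # Lists become the comma-separated form the parameter table describes.
        key: ",".join(value) if isinstance(value, (list, tuple)) else value
        for key, value in params.items()
    }
    return Request(
        f"{BASE_URL}?{urlencode(flat)}",
        headers={"x-api-token": token},
        method="GET",
    )


request = build_search_request(
    "YOUR_API_TOKEN", q="Apple", from_="1 day ago", countries=["CA"], page_size=1
)
print(request.full_url)
```

To actually send it, pass the request to urllib.request.urlopen and catch urllib.error.HTTPError; its code attribute can be looked up in ERROR_MEANINGS.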
post
https://v3-api.newscatcherapi.com
/api/search
Get News
Parameters
Header
x-api-token*
string
Your unique authentication token
Body
q*
string
The keyword or keywords you're searching for. This is the most important part of your query. Please refer to the Advanced Query Parameter section for more examples and explanations.
search_in
string
By default, we search what you specified in the q parameter in both title and content of the article. However, you can choose between:
-title
-content
-summary (if enabled for your plan)
-title,summary
-content,summary
sources
array
One or more news sources to narrow down your search.
The format should be the domain of the source's URL. Subdomains, like finance.yahoo.com, are also accepted. Comma-separated string or a list/array. For example, nytimes.com,theguardian.com,finance.yahoo.com
not_sources
array
One or more sources to be excluded from the search. Comma-separated string or a list/array.
For example, cnn.com,wsj.com
predefined_sources
string
Use our top predefined sources per country.
Later we are going to improve it and add more functionality, like top categories etc.
The format should be strictly like this:
- start with the word top
- then the number of top sources you want
- then a 2-letter country code (ISO 3166-1 alpha-2)
For example:
top 100 US
top 33 AT
top 5 GB
You can also specify multiple countries, optionally with a custom number of top sources for each, separated by commas.
For example:
top 100 US, GB
top 33 AT, 55 IT
lang
array
Specifies the languages of the search. For example, en. The only accepted format is ISO 639-1 — 2 letter code. Refer to the language format section for more details.
not_lang
array
Inverse to the lang parameter
countries
array
Countries where the news publisher is located. Important: this parameter does not filter by the countries mentioned in the news article. One or multiple countries can be used in the search. The only acceptable format is ISO 3166-1 alpha-2. For example, US,CA,MX or just US
not_countries
array
The inverse of the countries parameter.
from_
string
From which point in time to start the search. Defaults to the past week. Available formats:
- YYYY/mm/dd
- YYYY/mm/dd HH:MM:SS
- English phrases like 1 day ago
to_
string
Until which point in time to search. The default timezone is UTC. Available formats:
- YYYY/mm/dd
- YYYY/mm/dd HH:MM:SS
- English phrases like 1 day ago
published_date_precision
string
There are 3 types of date precision we define:
- full: the day and time of an article are correctly identified with the appropriate timezone
- timezone unknown: the day and time of an article are correctly identified without a timezone
- date: only the day is identified, without an exact time
by_parse_date
boolean
When set to True, your from_ and to_ parameters filter by parse_date instead of published_date.
Be aware that a new field, parse_date, will be added to each article in the output.
sort_by
string
- relevancy (default value): the most relevant results first
- date: the most recently published results first
- rank: the results from the highest-ranked sources first
ranked_only
boolean
Default: True. Limits the search to sources that are in the top 1 million online websites. Unranked sources are assigned a rank of 999999.
from_rank
integer
[0:999999] The lower boundary of the rank of a news website to filter by. Important: a lower rank means that a source is more popular.
to_rank
integer
[0:999999] The upper boundary of the rank of a news website to filter by.
is_headline
boolean
When set to True, only articles that were posted on the home page of a given news domain will be shown.
is_paid_content
boolean
[Still in development phase]
When set to False, only articles with fully publicly available content will be shown.
Some news publishers partially block the content of their articles, so we get only a few sentences from them. This filter helps you get full content.
parent_url
array
One or more category URLs to filter your search. Each should be in the normal form of the URL. For example, https://www.washingtonpost.com/politics,https://www.washingtonpost.com/technology,https://www.washingtonpost.com/business
all_links
array
Search for a specific URL mentioned in the article.
Please refer to the All Links And Domains Format section for more examples and explanations.
all_domain_links
array
Search for a specific domain mentioned in the article.
Please refer to the All Links And Domains Format section for more examples and explanations.
word_count_min
integer
Set a minimum number of words that an article must contain.
Use it to avoid articles with too little content.
word_count_max
integer
Set a maximum number of words that an article may contain.
Use it to avoid articles with too much content.
page_size
integer
[1:1000] How many articles to return per page.
page
integer
The page number. Use it to paginate through the results, because one API response cannot return more than 1000 articles.
clustering_enabled
boolean
[Available only if NLP is enabled for your API key]
When set to True, enables clustering on articles. Instead of a flat list of articles, you get a list of clusters that group similar articles together.
Please refer to the Deduplicate Data With Clustering section for more examples and explanations.
clustering_threshold
float
[Available only if NLP is enabled for your API key]
Set the similarity threshold used to decide whether articles are similar.
Default value: 0.6
The value can vary from 0 to 1.
clustering_variable
string
[Available only if NLP is enabled for your API key]
Select the data on which you want the similarity to be calculated.
Accepted values:
content, title, summary
Default value:
content
include_nlp_data
boolean
When set to True, adds an NLP layer to each article.
Not available for all plans. Please contact us to enable it.
has_nlp
boolean
[Available only if NLP is enabled for your API key]
When set to True, returns only articles that have an NLP layer.
theme
string
[Available only if NLP is enabled for your API key]
Accepted values:
Business, Economics, Entertainment, Finance, Health, Politics, Science, Sports, Tech, Crime, Lifestyle, Automotive, Travel, Weather, General
Comma-separated string or a list/array.
Multiple themes can be selected.
For example:
Business
Business, Finance
Topic labeling is based on the actual content of an article.
ORG_entity_name
string
[Available only if NLP is enabled for your API key]
ORG stands for Organisation.
We identify company names mentioned in articles and enable you to search for them.
More information on Search By Entity
PER_entity_name
string
[Available only if NLP is enabled for your API key]
PER stands for Person.
We identify people's names mentioned in articles and enable you to search for them.
More information on Search By Entity
LOC_entity_name
string
[Available only if NLP is enabled for your API key]
LOC stands for Location.
We identify geographical locations mentioned in articles and enable you to search for them.
More information on Search By Entity
MISC_entity_name
string
[Available only if NLP is enabled for your API key]
MISC stands for Miscellaneous.
We identify products and other names mentioned in articles and enable you to search for them.
More information on Search By Entity
title_sentiment_min
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's title sentiment.
The value can vary from -1 to 1.
title_sentiment_max
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's title sentiment.
The value can vary from -1 to 1.
content_sentiment_min
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's content sentiment.
The value can vary from -1 to 1.
content_sentiment_max
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's content sentiment.
The value can vary from -1 to 1.
iptc_tags
string
[Available only if tags are enabled for your API key]
We label articles with IPTC tags based on the content and enable you to filter articles based on the tags.
Only IPTC tag IDs can be used in this parameter.
For example, 20000183,20000199,20000188 or just 20000188
not_iptc_tags
string
[Available only if tags are enabled for your API key]
Inverse of the iptc_tags parameter; it enables you to filter articles based on their IPTC tags.
Responses
200
Success
403: Forbidden
Invalid API Key
406
Unsupported Parameters
408
Request Timeout
422: Unprocessable Entity
Parameter is not allowed
429
Too many API calls
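For the POST variant, the same parameters travel in the request body instead of the query string. The sketch below assumes the body is sent as JSON with a Content-Type of application/json; the page does not state the body encoding, so treat that as an assumption. build_search_post is a hypothetical name and YOUR_API_TOKEN is a placeholder. The request is only built, not sent.

```python
import json
from urllib.request import Request

SEARCH_URL = "https://v3-api.newscatcherapi.com/api/search"


def build_search_post(token, body):
    """Return a POST request carrying the search parameters as a JSON body."""
    return Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-token": token,
            # Assumption: the endpoint expects a JSON-encoded body.
            "Content-Type": "application/json",
        },
        method="POST",
    )


post_request = build_search_post(
    "YOUR_API_TOKEN",
    {"q": "Apple", "from_": "1 day ago", "countries": ["CA"], "page_size": 1},
)
print(post_request.get_method(), post_request.full_url)
print(post_request.data.decode("utf-8"))
```

A body makes it easier to pass list-valued parameters such as countries or sources without worrying about URL encoding.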

Successful Request Response

{
  "status": "ok",
  "total_hits": 6969,
  "page": 1,
  "total_pages": 70,
  "page_size": 100,
  "articles": [
    {
      "title": "Rabbit sells more than 10,000 units of its extremely interesting pocket AI companion",
      "author": "Jak Connor",
      "authors": [
        "Jak Connor"
      ],
      "journalists": [
        "Jak Connor"
      ],
      "published_date": "2024-01-17 16:05:03",
      "published_date_precision": "full",
      "updated_date": null,
      "updated_date_precision": null,
      "link": "https://www.tweaktown.com/news/95657/rabbit-sells-more-than-10-000-units-of-its-extremely-interesting-pocket-ai-companion/index.html",
      "domain_url": "tweaktown.com",
      "full_domain_url": "tweaktown.com",
      "name_source": "TweakTown",
      "is_headline": true,
      "paid_content": false,
      "parent_url": "https://www.tweaktown.com/news",
      "country": "US",
      "rights": "tweaktown.com",
      "rank": 7324,
      "media": "https://static.tweaktown.com/news/9/5/95657_8481_rabbit_full.png",
      "language": "en",
      "description": "AI startup company Rabbit has announced that it has sold out two batches of its R1, an AI companion device designed to be a pocket virtual assistant.",
      "content": "AI startup company Rabbit has announced that it has sold out two batches of its R1, an AI companion device designed to be a pocket virtual assistant.\nStartup company Rabbit gained massive attention at the CES event this year and has had to open up a second production run of its new AI-powered pocket assistant, the R1, after it sold 10,000 units on the very first day of pre-orders. After just one day of pre-orders being live, Rabbit announced it has sold out of their first run of AI companions, taking to social platform X to say, \"When we started building R1, we said internally that we'd be happy if we sold 500 devices on launch day.\" absolutely smashing that they added, \"In 24 hours, we already beat that by 20x!\" Rabbit unveiled the funky orange pocket pal during a showcase on Tuesday, which comes with a 2.88-inch touchscreen and runs on Rabbits OS. The device uses its \"Large Action Model\" as a universal controller for apps to allow it to do things like play music, order an Uber, buy groceries, and send messages through one interface without the need for a phone or computer. The device is also trainable, allowing users to set how the R1 interacts with apps. 2 VIEW GALLERY - 2 IMAGES Although Rabbit is completely sold out on their first line of production, due between March and April of this year, you can still pre-order the R1 directly through Rabbit's website. Rabbit says consumers can expect the delivery date for the device to be between April and May of this year, meaning those who missed out on the first set of pre-orders won't have to wait very long.",
      "word_count": 283,
      "is_opinion": false,
      "twitter_account": "@TweakTown",
      "all_links": [
        "https://twitter.com/1/status/1745186588475502718",
        "https://click.linksynergy.com/deeplink?id=RZrrn*9L87M&mid=44583&murl=https://www.newegg.com",
        "https://www.amazon.com/dp/B0BY3J3ZPZ?tag=twea-20",
        "https://www.tweaktownforum.com",
        "https://www.amazon.com/dp/B07WTS8T2W?tag=twea-20",
        "https://www.amazon.com/dp/B0B469JRGC?tag=twea-20",
        "https://www.youtube.com/user/tweaktown?sub_confirmation=1",
        "https://www.amazon.com/dp/B00OAJ412U?tag=twea-20",
        "https://twitter.com/TweakTown",
        "https://www.amazon.com/dp/B07GCKQD77?tag=twea-20",
        "https://www.amazon.com/dp/B08166SLDF?tag=twea-20",
        "https://www.pinterest.com/tweaktown/",
        "https://www.amazon.com/dp/B07NY9ZRZG?tag=twea-20",
        "https://www.facebook.com/TweakTown",
        "https://www.amazon.com/dp/B0BLCBLCDR?tag=twea-20",
        "https://www.amazon.com/dp/B09WCHGP12?tag=twea-20",
        "https://www.amazon.com/dp/B07C438TMN?tag=twea-20&linkCode=ogi&th=1&psc=1",
        "https://www.amazon.com/dp/B07SZXBTNW?tag=twea-20",
        "https://www.amazon.com/dp/B0BP8B6M7Y?tag=twea-20",
        "https://www.theverge.com/2024/1/10/24033498/rabbit-r1-sold-out-ces-ai"
      ],
      "all_domain_links": [
        "tweaktownforum.com",
        "twitter.com",
        "facebook.com",
        "amazon.com",
        "theverge.com",
        "linksynergy.com",
        "pinterest.com",
        "youtube.com"
      ],
      "nlp": {
        "theme": "Business, Tech",
        "summary": "Rabbit sold 10,000 units of their pocket virtual assistant R1 on the first day of pre-orders. The R1 is an AI-powered device with a 2.88-inch touchscreen and runs on Rabbits OS. The first production run of the R1 sold out in 24 hours. The second production run will be ready between April and May.",
        "sentiment": {
          "title": 0,
          "content": 0.9992566704750061
        },
        "ner_PER": [],
        "ner_ORG": [
          {
            "entity_name": "Rabbit",
            "count": 7
          },
          {
            "entity_name": "Uber",
            "count": 1
          }
        ],
        "ner_MISC": [
          {
            "entity_name": "R1",
            "count": 5
          },
          {
            "entity_name": "CES",
            "count": 1
          },
          {
            "entity_name": "Rabbits OS",
            "count": 1
          },
          {
            "entity_name": "allowing",
            "count": 1
          },
          {
            "entity_name": "VIEW",
            "count": 1
          }
        ],
        "ner_LOC": [],
        "iptc_tags_name": [
          "science and technology / technology and engineering / agricultural technology",
          "economy, business and finance / products and services / consumer goods / consumer electronics",
          "economy, business and finance / business information / business strategy and marketing / new product or service"
        ],
        "iptc_tags_id": [
          "20000192",
          "20000205",
          "20000170",
          "13000000",
          "20000243",
          "04000000",
          "20001258",
          "20000756",
          "20000209",
          "20000759"
        ]
      },
      "id": "164de9168279ea19a096bc5e24428753",
      "score": 30.279182
    },
    ...
  ],
  "user_input": {...}
}
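Reading a response like the one above might look like the sketch below. It is illustrative only: summarize_articles is a hypothetical helper, and sample_body is a hand-trimmed copy of the example response, not real API output. The nlp object is treated as optional because it depends on your plan.

```python
def summarize_articles(body):
    """Collect the title, source domain, themes, and organisations per article."""
    rows = []
    for article in body.get("articles", []):
        nlp = article.get("nlp") or {}  # absent when NLP is not enabled
        themes = [t.strip() for t in (nlp.get("theme") or "").split(",") if t.strip()]
        orgs = [entity["entity_name"] for entity in nlp.get("ner_ORG", [])]
        rows.append(
            {
                "title": article.get("title"),
                "domain": article.get("domain_url"),
                "themes": themes,
                "orgs": orgs,
            }
        )
    return rows


# Trimmed copy of the example response shown above.
sample_body = {
    "status": "ok",
    "articles": [
        {
            "title": "Rabbit sells more than 10,000 units of its extremely interesting pocket AI companion",
            "domain_url": "tweaktown.com",
            "nlp": {
                "theme": "Business, Tech",
                "ner_ORG": [
                    {"entity_name": "Rabbit", "count": 7},
                    {"entity_name": "Uber", "count": 1},
                ],
            },
        }
    ],
}

for row in summarize_articles(sample_body):
    print(row)
```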

Return Body Fields

Object
Sub Object
Description
status
Returns ok if everything went well.
Returns error if something went wrong, along with two additional fields: error_code and message.
total_hits
The number of news articles that match your search criteria. The maximum is 10,000.
page
The current page number
total_pages
How many pages you can access given your page_size parameter
page_size
How many news articles are in the returned JSON object
articles:
A list of the news articles found
title
The title of the article
author
The author of the article
authors
List of all author names
journalists
Clean list of journalists. No news publication names, only people.
published_date
Published date & time
published_date_precision
Accuracy of the published_date field.
There are 3 types of date precision we define:
full — day and time of an article is correctly identified with the appropriate timezone
timezone unknown — day and time of an article is correctly identified without timezone
date — only the day is identified without an exact time
updated_date
Updated date & time
updated_date_precision
Accuracy of the updated_date field.
There are 3 types of date precision we define:
full — day and time of an article is correctly identified with the appropriate timezone
timezone unknown — day and time of an article is correctly identified without timezone
date — only the day is identified without an exact time
link
Full URL where the article was originally published
domain_url
The domain URL of the article's source
full_domain_url
The full domain URL with a subcategory of the article's source
name_source
The common name of the News Source
is_headline
True when an article has been seen on the main page of the news source.
parent_url
The URL where an article was initially found
country
The country of the publisher
rights
Copyright
rank
The page rank of the source website (the one given in domain_url)
media
A link to a thumbnail image of the article
language
The language of the article
description
Short summary of the article provided by the publisher
content
The full content of the article
word_count
Number of words in the article's content
is_opinion
True if the article is an "Opinion" article
twitter_account
The Twitter account of the publisher
all_links
All URL links embedded in the article's content HTML
all_domain_links
All domain URL embedded in the article's content HTML
nlp
Depending on your plan, you can have: summary, sentiment, theme, ner, embeddings, iptc_tags_name, iptc_tags_id
id
Newscatcher API's unique identifier for each news article
score
How well the article matches your search criteria. The score is different for each search you make. The best-matching article has the highest score.
user_input
An object that returns how the API saw your request. It shows you which parameters have been used to perform a search. Useful for debugging, especially to check if there is any problem with URL encoding
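Two of the fields above lend themselves to a quick check in client code. The sketch below is hedged: check_status and expected_total_pages are hypothetical names, and the ceiling relation between total_hits, page_size, and total_pages is inferred from the sample response (6969 hits at 100 per page gives 70 pages) rather than stated by the API.

```python
import math


def check_status(body):
    """Return the body when status is ok, otherwise raise with the error fields."""
    if body.get("status") != "ok":
        raise RuntimeError(f"{body.get('error_code')}: {body.get('message')}")
    return body


def expected_total_pages(total_hits, page_size):
    """Number of pages needed to hold total_hits articles at page_size per page."""
    return math.ceil(total_hits / page_size)


body = check_status({"status": "ok", "total_hits": 6969, "page_size": 100})
print(expected_total_pages(body["total_hits"], body["page_size"]))  # prints 70
```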

Supercharge Your News Searches

A lot can be done using the search endpoint, and we are constantly working on new functionality to make it more powerful and valuable. Here are the features that we recommend:

Make More Precise Query

There's a lot you can do with simple keyword-based searches. Using exact matching and boolean operators, you can exclude a set of words or combine multiple queries related to a topic of interest into one query. There's also proximity-based searching for more sophisticated querying.

Deduplicate Articles

Most significant events and topics get massive coverage from dozens of news publications. Some media conglomerates even republish the articles. You or your analysts don't need to sift through all of these articles with duplicate information; use the clustering functionality in our search endpoint to get groups of articles with distinct bits of information.
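A clustering request can be put together from the three clustering parameters documented in the parameter list. The helper below is a sketch: clustering_params is a hypothetical name, and the local validation simply mirrors the documented range (0 to 1) and accepted values (content, title, summary).

```python
ACCEPTED_VARIABLES = ("content", "title", "summary")


def clustering_params(threshold=0.6, variable="content"):
    """Return the clustering parameters, validated against the documented limits."""
    if not 0 <= threshold <= 1:
        raise ValueError("clustering_threshold must be between 0 and 1")
    if variable not in ACCEPTED_VARIABLES:
        raise ValueError(f"clustering_variable must be one of {ACCEPTED_VARIABLES}")
    return {
        "clustering_enabled": True,
        "clustering_threshold": threshold,
        "clustering_variable": variable,
    }


# Merge with the rest of the search parameters before sending the request.
params = {"q": "Apple", **clustering_params(threshold=0.7, variable="title")}
print(params)
```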

Apple Or apple?

Sometimes, it's necessary to specify whether the keyword you're looking for is an organization or a person. For instance, take the tech giant Apple. If you were looking for articles about a recent development at Apple, you wouldn't want to get articles about apple prices or orchards, would you?

What Is This Linked To?

The links in an article can serve as valuable markers. Want to find all articles talking about a specific research paper or a press release? Which company named XYZ is this referring to?