v3 Authors
Get the articles published by an author
GET
https://v3-api.newscatcherapi.com/api/authors?author_name=Fiona Jackson&by_parse_date=false&sort_by=relevancy&page=1&page_size=100
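The GET URL above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name `build_authors_url` is hypothetical, and sending booleans as lowercase `true`/`false` and arrays as comma-separated strings is an assumption based on the example URL and the parameter descriptions on this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://v3-api.newscatcherapi.com/api/authors"


def build_authors_url(author_name, **params):
    """Return a GET URL for an author search (hypothetical helper)."""
    query = {"author_name": author_name}
    for key, value in params.items():
        if isinstance(value, bool):
            # Assumption: booleans are sent as "true"/"false", as in the example URL
            value = str(value).lower()
        elif isinstance(value, (list, tuple)):
            # Arrays are sent as comma-separated strings
            value = ",".join(map(str, value))
        query[key] = value
    # urlencode percent-encodes the values (spaces become "+")
    return f"{BASE_URL}?{urlencode(query)}"


url = build_authors_url(
    "Fiona Jackson",
    by_parse_date=False,
    sort_by="relevancy",
    page=1,
    page_size=100,
)
print(url)
```

The request still needs the x-api-token header before it can be sent.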
Name | Type | Description |
---|---|---|
POST
https://v3-api.newscatcherapi.com/api/authors
author_name*
string
The author you're searching for. This parameter returns exact matches only.
not_author_name
string
Inverse of the author_name parameter.
lang
array
Specifies the languages of the search. For example, en.
The only accepted format is the two-letter ISO 639-1 code.
Refer to the language format section for more details.
not_lang
array
Inverse of the lang parameter.
published_date_precision
string
There are 3 types of date precision:
- full: the day and time of an article are correctly identified with the appropriate timezone
- timezone unknown: the day and time of an article are correctly identified without a timezone
- date: only the day is identified, without an exact time
countries
array
Countries where the news publisher is located.
Important: this parameter filters by the publisher's location, not by the countries mentioned in the news article.
One or multiple countries can be used in the search.
The only accepted format is ISO 3166-1 alpha-2.
For example, US,CA,MX or just US.
not_countries
array
Inverse of the countries parameter.
sources
array
One or more news sources to narrow down your search.
The format should be a domain URL. Subdomains, like finance.yahoo.com, are also accepted. Comma-separated string or a list/array.
For example, nytimes.com,theguardian.com,finance.yahoo.com
not_sources
array
One or more sources to be excluded from the search. Comma-separated string or a list/array.
For example, cnn.com,wsj.com
ranked_only
boolean
Default: True
Limits the search to sources ranked in the top 1 million online websites. Unranked sources are assigned a rank of 999999.
from_rank
integer
[0:999999]
The lower boundary of the rank of a news website to filter by.
Important: a lower rank means that a source is more popular.
to_rank
integer
[0:999999]
The upper boundary of the rank of a news website to filter by.
sort_by
string
- relevancy (default value): the most relevant results first
- date: the most recently published results first
- rank: the results from the highest-ranked sources first
page_size
integer
[1:1000]
How many articles to return per page.
page
integer
The page number. Use it to paginate through the results, since a single API response cannot return more than 1000 articles.
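As an illustration of paginating with page and page_size, here is a minimal sketch that walks every page until total_pages is reached. The `fetch_page` callable is a hypothetical stand-in for whatever function performs the HTTP request and returns the parsed JSON response; `fake_fetch` and its data are made up so that the example runs without network access.

```python
def iter_articles(fetch_page, page_size=100):
    """Yield articles from every page of a search.

    fetch_page(page, page_size) is assumed to return the parsed JSON
    response, including the "articles" and "total_pages" fields.
    """
    page = 1
    while True:
        response = fetch_page(page, page_size)
        yield from response.get("articles", [])
        if page >= response.get("total_pages", 1):
            break
        page += 1


def fake_fetch(page, page_size):
    """Stand-in for a real HTTP call: serves 5 made-up articles."""
    data = [{"title": f"Article {i}"} for i in range(5)]
    start = (page - 1) * page_size
    total_pages = -(-len(data) // page_size)  # ceiling division
    return {
        "status": "ok",
        "page": page,
        "total_pages": total_pages,
        "articles": data[start:start + page_size],
    }


titles = [article["title"] for article in iter_articles(fake_fetch, page_size=2)]
print(titles)
```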
to_
string
Until which point in time to search for. The default timezone is UTC.
Available formats:
- YYYY/mm/dd
- YYYY/mm/dd HH:MM:SS
- English phrases like 1 day ago
from_
string
From which point in time to start the search. Defaults to the past week.
Available formats:
- YYYY/mm/dd
- YYYY/mm/dd HH:MM:SS
- English phrases like 1 day ago
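The YYYY/mm/dd HH:MM:SS format accepted by from_ and to_ can be produced from Python datetime objects. This is a minimal sketch; the helper name `to_api_datetime` is hypothetical, and treating naive datetimes as already being in UTC is an assumption based on the stated default timezone.

```python
from datetime import datetime, timedelta, timezone


def to_api_datetime(moment):
    """Format a datetime as YYYY/mm/dd HH:MM:SS (hypothetical helper)."""
    if moment.tzinfo is not None:
        # Convert aware datetimes to UTC, the documented default timezone
        moment = moment.astimezone(timezone.utc)
    # Assumption: naive datetimes are already expressed in UTC
    return moment.strftime("%Y/%m/%d %H:%M:%S")


end = datetime(2024, 3, 15, 12, 30, 0, tzinfo=timezone.utc)
start = end - timedelta(days=7)
date_filter = {"from_": to_api_datetime(start), "to_": to_api_datetime(end)}
print(date_filter)
```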
by_parse_date
boolean
When set to True, the from_ and to_ parameters filter by parse_date instead of published_date.
Be aware that a new variable, parse_date, will be added to each article in the output.
is_headline
boolean
When set to True, only articles that were posted on the home page of a given news domain are shown.
is_opinion
boolean
[Still in development phase]
When set to True, only articles that are determined to be opinion pieces are returned. Set to False to exclude opinion-based articles and receive news only.
parent_url
array
One or more category URLs to filter your search. Each must be the normal form of the URL.
For example:
https://www.washingtonpost.com/politics
https://www.washingtonpost.com/technology,https://www.washingtonpost.com/business
all_links
array
Search for a URL mentioned in the article.
Please refer to the All Links And Domains Format page for more examples and explanations.
all_domain_links
array
Search for a domain URL mentioned in the article.
Please refer to the All Links And Domains Format page for more examples and explanations.
word_count_min
integer
Set a minimum number of words that an article must contain.
Use it to exclude articles with little content.
word_count_max
integer
Set a maximum number of words that an article may contain.
Use it to exclude articles with very long content.
include_nlp_data
boolean
When set to True, adds an NLP layer to each article.
Not available for all plans. Please contact us to enable it.
theme
string
[Available only if NLP is enabled for your API key]
A general topic of an article. Topic labelling is based on the actual content of an article.
Accepted values: Business, Economics, Entertainment, Finance, Health, Politics, Science, Sports, Tech, Crime, Lifestyle, Automotive, Travel, Weather, General
Comma-separated string or a list/array.
Multiple themes can be selected.
For example:
Business
Business, Finance
ORG_entity_name
string
[Available only if NLP is enabled for your API key]
ORG stands for Organisation.
We identify company names mentioned in articles and enable you to search for them.
More information on Search By Entity
has_nlp
boolean
[Available only if NLP is enabled for your API key]
When set to True, returns only articles that have an NLP layer.
title_sentiment_min
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's title sentiment.
The value can vary from -1 to 1.
title_sentiment_max
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's title sentiment.
The value can vary from -1 to 1.
content_sentiment_min
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's content sentiment.
The value can vary from -1 to 1.
content_sentiment_max
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's content sentiment.
The value can vary from -1 to 1.
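The four sentiment parameters share the same -1 to 1 range, so the bounds can be checked before a request is sent. The sketch below is illustrative only: the helper name `sentiment_filter` is hypothetical, and the 0.5 cut-off for "positive" titles is an arbitrary example value, not a recommendation from this documentation.

```python
def sentiment_filter(field, minimum=None, maximum=None):
    """Build title/content sentiment parameters (hypothetical helper)."""
    if field not in ("title", "content"):
        raise ValueError("field must be 'title' or 'content'")
    params = {}
    for suffix, value in (("min", minimum), ("max", maximum)):
        if value is None:
            continue
        # The documented range for every sentiment parameter is -1 to 1
        if not -1 <= value <= 1:
            raise ValueError(f"{field}_sentiment_{suffix} must be between -1 and 1")
        params[f"{field}_sentiment_{suffix}"] = value
    if minimum is not None and maximum is not None and minimum > maximum:
        raise ValueError("minimum cannot exceed maximum")
    return params


# Example: only articles whose title sentiment is at least 0.5
positive_titles = sentiment_filter("title", minimum=0.5)
print(positive_titles)
```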
is_paid_content
boolean
[Still in development phase]
When set to False, only articles with fully publicly available content are shown.
Some news publishers partially block the content of their articles, so we get only a few sentences from them. This filter helps you get full content.
clustering_enabled
boolean
[Available only if NLP is enabled for your API key]
When set to True, enables clustering on articles. Instead of a list of articles, you receive a list of clusters that group similar articles together.
Please refer to the Clustering News Articles page for more examples and explanations.
clustering_threshold
float
[Available only if NLP is enabled for your API key]
Sets the similarity threshold for articles to be considered similar.
Default value: 0.6
The value can vary from 0 to 1.
clustering_variable
string
[Available only if NLP is enabled for your API key]
Select the data on which the similarity is calculated.
Accepted values: content, title, summary
Default value: content
PER_entity_name
string
[Available only if NLP is enabled for your API key]
PER stands for Person.
We identify people's names mentioned in articles and enable you to search for them.
More information on Search By Entity
LOC_entity_name
string
[Available only if NLP is enabled for your API key]
LOC stands for Location.
We identify geographical locations mentioned in articles and enable you to search for them.
More information on Search By Entity
MISC_entity_name
string
[Available only if NLP is enabled for your API key]
MISC stands for Miscellaneous.
We identify products and other names mentioned in articles and enable you to search for them.
More information on Search By Entity
predefined_sources
string
Use our top predefined sources per country.
Later we are going to improve it and add more functionality, like top categories.
The format must strictly follow this pattern:
- start with the word top
- then the number of desired top sources
- then the two-letter ISO 3166-1 alpha-2 country code
For example:
top 100 US
top 33 AT
top 5 GB
It is also possible to specify multiple countries, each with a custom number of top sources, separated by commas.
For example:
top 100 US, GB
top 33 AT, 55 IT
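Because the predefined_sources format is strict, it may help to generate the string rather than type it by hand. The following sketch uses a hypothetical helper, `predefined_sources`, that covers the form in which every country has its own count (as in top 33 AT, 55 IT); it does not produce the shared-count form top 100 US, GB.

```python
def predefined_sources(*pairs):
    """Build a value such as 'top 33 AT, 55 IT' from (count, country) pairs."""
    if not pairs:
        raise ValueError("at least one (count, country_code) pair is required")
    parts = []
    for count, country in pairs:
        if count < 1:
            raise ValueError("the number of sources must be positive")
        if len(country) != 2 or not country.isalpha():
            raise ValueError("country must be a two-letter ISO 3166-1 alpha-2 code")
        parts.append(f"{count} {country.upper()}")
    # The value starts with the word "top"; further countries are comma-separated
    return "top " + ", ".join(parts)


print(predefined_sources((100, "US")))
print(predefined_sources((33, "AT"), (55, "IT")))
```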
iptc_tags
string
[Available only if tags are enabled for your API key]
We label articles with IPTC tags based on the content and enable you to filter articles based on the tags.
Only IPTC tag IDs can be used in this parameter.
For example, 20000183,20000199,20000188 or just 20000188.
not_iptc_tags
string
[Available only if tags are enabled for your API key]
Inverse of the iptc_tags parameter; it enables you to filter articles based on their IPTC tags.
not_author_name
array
List of author names that you want to exclude from your search.
Usually, you might want to exclude articles where one of the authors is Associated Press, or PRNewswire.
For example:
PRNewswire, AOL Staff
not_theme
string
[Available only if NLP is enabled for your API key]
Inverse of the theme parameter; it enables you to filter articles based on their general topic.
source_name
array
The NewsCatcher team does not recommend using source_name in production. The best parameter for getting data from a specific news domain is sources.
One or more news source names to narrow down your search.
Comma-separated string or a list/array.
For example:
CryptoPotato,thethings
x-api-token*
string
Your unique authentication token
x-api-token*
string
Your unique authentication token
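A POST request to the endpoint can be prepared as sketched below, using only the Python standard library. Sending the parameters as a JSON body with a Content-Type: application/json header is an assumption, as is the helper name `build_post_request`; "YOUR_API_TOKEN" is a placeholder. The request is only constructed here, not sent.

```python
import json
from urllib.request import Request

API_URL = "https://v3-api.newscatcherapi.com/api/authors"


def build_post_request(api_token, author_name, **body_params):
    """Prepare (but do not send) a POST request for an author search."""
    body = {"author_name": author_name, **body_params}
    return Request(
        API_URL,
        # Assumption: the endpoint accepts its parameters as a JSON body
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-token": api_token, "Content-Type": "application/json"},
        method="POST",
    )


request = build_post_request(
    "YOUR_API_TOKEN",
    "Fiona Jackson",
    lang=["en"],
    not_author_name=["PRNewswire", "AOL Staff"],
    page_size=100,
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` and decode the response with `json.load`.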
author_name*
string
The author you're searching for. This parameter returns exact matches only.
not_author_name
string
Inverse of the author_name parameter.
lang
array
Specifies the languages of the search. For example, en.
The only accepted format is the two-letter ISO 639-1 code.
Refer to the language format section for more details.
not_lang
array
Inverse of the lang parameter.
published_date_precision
string
There are 3 types of date precision:
- full: the day and time of an article are correctly identified with the appropriate timezone
- timezone unknown: the day and time of an article are correctly identified without a timezone
- date: only the day is identified, without an exact time
countries
array
Countries where the news publisher is located.
Important: this parameter filters by the publisher's location, not by the countries mentioned in the news article.
One or multiple countries can be used in the search.
The only accepted format is ISO 3166-1 alpha-2.
For example, US,CA,MX or just US.
not_countries
array
Inverse of the countries parameter.
sources
array
One or more news sources to narrow down your search.
The format should be a domain URL. Subdomains, like finance.yahoo.com, are also accepted. Comma-separated string or a list/array.
For example, nytimes.com,theguardian.com,finance.yahoo.com
not_sources
array
One or more sources to be excluded from the search. Comma-separated string or a list/array.
For example, cnn.com,wsj.com
ranked_only
boolean
Default: True
Limits the search to sources ranked in the top 1 million online websites. Unranked sources are assigned a rank of 999999.
from_rank
integer
[0:999999]
The lower boundary of the rank of a news website to filter by.
Important: a lower rank means that a source is more popular.
to_rank
integer
[0:999999]
The upper boundary of the rank of a news website to filter by.
sort_by
string
- relevancy (default value): the most relevant results first
- date: the most recently published results first
- rank: the results from the highest-ranked sources first
page_size
integer
[1:1000]
How many articles to return per page.
page
integer
The page number. Use it to paginate through the results, since a single API response cannot return more than 1000 articles.
to_
string
Until which point in time to search for. The default timezone is UTC.
Available formats:
- YYYY/mm/dd
- YYYY/mm/dd HH:MM:SS
- English phrases like 1 day ago
from_
string
From which point in time to start the search. Defaults to the past week.
Available formats:
- YYYY/mm/dd
- YYYY/mm/dd HH:MM:SS
- English phrases like 1 day ago
by_parse_date
boolean
When set to True, the from_ and to_ parameters filter by parse_date instead of published_date.
Be aware that a new variable, parse_date, will be added to each article in the output.
is_headline
boolean
When set to True, only articles that were posted on the home page of a given news domain are shown.
is_opinion
boolean
[Still in development phase]
When set to True, only articles that are determined to be opinion pieces are returned. Set to False to exclude opinion-based articles and receive news only.
parent_url
array
One or more category URLs to filter your search. Each must be the normal form of the URL.
For example:
https://www.washingtonpost.com/politics
https://www.washingtonpost.com/technology,https://www.washingtonpost.com/business
all_links
array
Search for a URL mentioned in the article.
Please refer to the All Links And Domains Format section for more examples and explanations.
all_domain_links
array
Search for a domain URL mentioned in the article.
Please refer to the All Links And Domains Format section for more examples and explanations.
word_count_min
integer
Set a minimum number of words that an article must contain.
Use it to exclude articles with little content.
word_count_max
integer
Set a maximum number of words that an article may contain.
Use it to exclude articles with very long content.
include_nlp_data
boolean
When set to True, adds an NLP layer to each article.
Not available for all plans. Please contact us to enable it.
theme
string
[Available only if NLP is enabled for your API key]
A general topic of an article. Topic labelling is based on the actual content of an article.
Accepted values: Business, Economics, Entertainment, Finance, Health, Politics, Science, Sports, Tech, Crime, Lifestyle, Automotive, Travel, Weather, General
Comma-separated string or a list/array.
Multiple themes can be selected.
For example:
Business
Business, Finance
ORG_entity_name
string
[Available only if NLP is enabled for your API key]
ORG stands for Organisation.
We identify company names mentioned in articles and enable you to search for them.
More information on Search By Entity
has_nlp
boolean
[Available only if NLP is enabled for your API key]
When set to True, returns only articles that have an NLP layer.
title_sentiment_min
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's title sentiment.
The value can vary from -1 to 1.
title_sentiment_max
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's title sentiment.
The value can vary from -1 to 1.
content_sentiment_min
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's content sentiment.
The value can vary from -1 to 1.
content_sentiment_max
float
[Available only if NLP is enabled for your API key]
Narrow down your search to only positive or negative news based on the article's content sentiment.
The value can vary from -1 to 1.
is_paid_content
boolean
[Still in development phase]
When set to False, only articles with fully publicly available content are shown.
Some news publishers partially block the content of their articles, so we get only a few sentences from them. This filter helps you get full content.
clustering_enabled
boolean
[Available only if NLP is enabled for your API key]
When set to True, enables clustering on articles. Instead of a list of articles, you receive a list of clusters that group similar articles together.
Please refer to the Deduplicate Data With Clustering section for more examples and explanations.
clustering_threshold
float
[Available only if NLP is enabled for your API key]
Sets the similarity threshold for articles to be considered similar.
Default value: 0.6
The value can vary from 0 to 1.
clustering_variable
string
[Available only if NLP is enabled for your API key]
Select the data on which the similarity is calculated.
Accepted values: content, title, summary
Default value: content
PER_entity_name
string
[Available only if NLP is enabled for your API key]
PER stands for Person.
We identify people's names mentioned in articles and enable you to search for them.
More information on Search By Entity
LOC_entity_name
string
[Available only if NLP is enabled for your API key]
LOC stands for Location.
We identify geographical locations mentioned in articles and enable you to search for them.
More information on Search By Entity
MISC_entity_name
string
[Available only if NLP is enabled for your API key]
MISC stands for Miscellaneous.
We identify products and other names mentioned in articles and enable you to search for them.
More information on Search By Entity
predefined_sources
string
Use our top predefined sources per country.
Later we are going to improve it and add more functionality, like top categories.
The format must strictly follow this pattern:
- start with the word top
- then the number of desired top sources
- then the two-letter ISO 3166-1 alpha-2 country code
For example:
top 100 US
top 33 AT
top 5 GB
It is also possible to specify multiple countries, each with a custom number of top sources, separated by commas.
For example:
top 100 US, GB
top 33 AT, 55 IT
not_iptc_tags
string
[Available only if tags are enabled for your API key]
Inverse of the iptc_tags parameter; it enables you to filter articles based on their IPTC tags.
iptc_tags
string
[Available only if tags are enabled for your API key]
We label articles with IPTC tags based on the content and enable you to filter articles based on the tags.
Only IPTC tag IDs can be used in this parameter.
For example, 20000183,20000199,20000188 or just 20000188.
not_author_name
array
List of author names that you want to exclude from your search.
Usually, you might want to exclude articles where one of the authors is Associated Press, or PRNewswire.
For example:
PRNewswire, AOL Staff
not_theme
string
[Available only if NLP is enabled for your API key]
Inverse of the theme parameter; it enables you to filter articles based on their general topic.
source_name
array
The NewsCatcher team does not recommend using source_name in production. The best parameter for getting data from a specific news domain is sources.
One or more news source names to narrow down your search.
Comma-separated string or a list/array.
For example:
CryptoPotato,thethings
Object
Sub Object
Description
status
Returns ok if everything went well.
Returns error in case of an error, plus 2 additional fields: error_code and message.
total_hits
How many news articles match your search criteria. The maximum is 10,000.
page
The current page number
total_pages
How many pages you can access given your page_size parameter
page_size
How many news articles are in the returned JSON object
articles
List of the news articles found.
title
The title of the article
author
The author of the article
authors
List of all author names
journalists
Clean list of journalists. No news publication names, only people.
published_date
Published date & time
published_date_precision
Accuracy of the published_date field.
There are 3 types of date precision:
- full: the day and time of an article are correctly identified with the appropriate timezone
- timezone unknown: the day and time of an article are correctly identified without a timezone
- date: only the day is identified, without an exact time
updated_date
Updated date & time
updated_date_precision
Accuracy of the updated_date field.
There are 3 types of date precision:
- full: the day and time of an article are correctly identified with the appropriate timezone
- timezone unknown: the day and time of an article are correctly identified without a timezone
- date: only the day is identified, without an exact time
link
Full URL where the article was originally published
domain_url
The domain URL of the article's source
full_domain_url
The full domain URL with a subcategory of the article's source
name_source
The common name of the News Source
is_headline
True when an article has been seen on the main page of the news source.
parent_url
The URL where an article was initially found
country
The country of the publisher
rights
Copyright
rank
The page rank of the source website (which is given in the clean_url)
media
A link to a thumbnail image of the article
language
The language of the article
description
Short summary of the article provided by the publisher
content
The full content of the article
word_count
Number of words in the article's content
is_opinion
True if the article is an "Opinion" article
twitter_account
The Twitter account of the publisher
all_links
All URL links embedded in the article's content HTML
all_domain_links
All domain URLs embedded in the article's content HTML
nlp
Depending on your plan, you can have:
- summary
- sentiment
- theme
- ner
- embeddings
- iptc_tags_name
- iptc_tags_ids
id
Newscatcher API's unique identifier for each news article
score
How well the article matches your search criteria. The score is different for each search you make. The best-matching article has the highest score.
user_input
An object that returns how the API saw your request. It shows you which parameters have been used to perform a search. Useful for debugging, especially to check if there is any problem with URL encoding
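A response with the fields described above can be handled along these lines. This is a minimal sketch: the helper name `extract_articles` is hypothetical, and `sample` is a hand-written dictionary that mirrors the documented field names, not real API output.

```python
def extract_articles(response):
    """Return (title, author, link) tuples, or raise if the API reported an error."""
    if response.get("status") != "ok":
        # Error responses carry two extra fields: error_code and message
        raise RuntimeError(
            f"API error {response.get('error_code')}: {response.get('message')}"
        )
    return [
        (article.get("title"), article.get("author"), article.get("link"))
        for article in response.get("articles", [])
    ]


# Hand-written sample that mirrors the documented fields; not real API output
sample = {
    "status": "ok",
    "total_hits": 1,
    "page": 1,
    "total_pages": 1,
    "page_size": 1,
    "articles": [
        {
            "title": "Example headline",
            "author": "Fiona Jackson",
            "link": "https://example.com/story",
            "score": 12.3,
        }
    ],
    "user_input": {"author_name": "Fiona Jackson"},
}

rows = extract_articles(sample)
print(rows)
```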