Article

Learn the common fields for article PDP endpoints and example domains.

Article PDP Cache agents extract structured data from article and news detail pages, including headline, body, dates, authors, media, and breadcrumbs.

Description

Article agents return normalized fields for blog posts, news articles, and similar content: title, full text, HTML body, publication and modification dates (each in ISO 8601 and as displayed on the page), authors, language, breadcrumbs, main image, lists of images, videos, and audio files, and the page and canonical URLs.
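As a rough illustration of consuming these normalized fields, the sketch below reads a few of them from an already-decoded record. It assumes the record arrives as a plain Python dict keyed by the field names listed under Common Fields; the helper names (parse_iso_date, summarize_article) and the sample values are hypothetical, and the transport or response envelope around the record is not covered here.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def parse_iso_date(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string; return None if it is missing or unparseable."""
    if not value:
        return None
    # Python versions before 3.11 reject a trailing "Z", so normalize it first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


def summarize_article(record: Dict[str, Any]) -> Dict[str, Any]:
    """Pick out a few common fields, tolerating absent or empty values."""
    authors = record.get("authors") or []
    return {
        "headline": record.get("headline"),
        "author_names": [a.get("name") for a in authors if a.get("name")],
        "published": parse_iso_date(record.get("datePublished")),
        # Keep the raw string so callers can fall back to it if parsing fails.
        "published_raw": record.get("datePublishedRaw"),
        # Assumption: prefer the canonical URL when both URLs are present.
        "url": record.get("canonicalUrl") or record.get("url"),
    }


# Hypothetical record, invented for illustration only.
sample = {
    "headline": "Example headline",
    "datePublished": "2024-05-01T12:30:00Z",
    "datePublishedRaw": "May 1, 2024",
    "authors": [{"name": "Jane Doe", "nameRaw": "By Jane Doe"}],
    "url": "https://example.com/news/example?ref=home",
    "canonicalUrl": "https://example.com/news/example",
}
summary = summarize_article(sample)
```

Treating every field as optional is a defensive choice in this sketch, not a documented guarantee; the raw date is carried alongside the parsed one for the same reason.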

Common Fields

| Field | Type | Description |
|---|---|---|
| headline | string | Article headline or title |
| articleBody | string | Full text content of the article |
| articleBodyHtml | string | HTML markup of the article body |
| description | string | Short summary or description of the article |
| datePublished | string | Publication date in ISO 8601 format |
| datePublishedRaw | string | Publication date as displayed on the page |
| dateModified | string | Last modified date in ISO 8601 format |
| dateModifiedRaw | string | Last modified date as displayed on the page |
| authors | array | List of article authors. Each item: name (string), nameRaw (string) |
| inLanguage | string | Language code of the article (e.g., en) |
| breadcrumbs | array | Navigation breadcrumb trail. Each item: url (string), name (string) |
| mainImage | object | Primary image. Structure: url (string) |
| images | array | All images in the article. Each item: url (string) |
| videos | array | All videos in the article. Each item: url (string) |
| audios | array | All audio files in the article. Each item: url (string) |
| url | string | URL of the article page |
| canonicalUrl | string | Canonical URL of the article |
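The field list above can be mirrored as type definitions on the client side. The following is a minimal sketch, not an official schema: the class names (Author, Breadcrumb, Media, Article) and the media_urls helper are hypothetical, and marking every field optional with total=False is an assumption, since the list does not say which fields are always present.

```python
from typing import List, TypedDict


class Author(TypedDict, total=False):
    name: str
    nameRaw: str


class Breadcrumb(TypedDict, total=False):
    url: str
    name: str


class Media(TypedDict, total=False):
    url: str


class Article(TypedDict, total=False):
    headline: str
    articleBody: str
    articleBodyHtml: str
    description: str
    datePublished: str       # ISO 8601
    datePublishedRaw: str    # as displayed on the page
    dateModified: str        # ISO 8601
    dateModifiedRaw: str     # as displayed on the page
    authors: List[Author]
    inLanguage: str
    breadcrumbs: List[Breadcrumb]
    mainImage: Media
    images: List[Media]
    videos: List[Media]
    audios: List[Media]
    url: str
    canonicalUrl: str


def media_urls(article: Article) -> List[str]:
    """Collect unique media URLs, main image first, preserving order."""
    items: List[Media] = []
    main = article.get("mainImage")
    if main:
        items.append(main)
    for group in (article.get("images"), article.get("videos"), article.get("audios")):
        items.extend(group or [])

    seen = set()
    urls: List[str] = []
    for item in items:
        url = item.get("url")
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Hypothetical record, invented for illustration only.
example: Article = {
    "mainImage": {"url": "https://example.com/a.jpg"},
    "images": [
        {"url": "https://example.com/a.jpg"},
        {"url": "https://example.com/b.jpg"},
    ],
    "videos": [{"url": "https://example.com/v.mp4"}],
}
all_media = media_urls(example)
```

Deduplicating is a design choice for this sketch: the main image may plausibly also appear in the images list, so the helper keeps only the first occurrence of each URL.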

Example Domains / Websites

Article PDP agents are typically available for commonly used news and blog domains, for example:

  • BBC — e.g. https://www.bbc.com/news/..., https://www.bbc.co.uk/news/articles/...
  • The New York Times — e.g. https://www.nytimes.com/...
  • Medium — e.g. https://medium.com/...
  • Reuters — e.g. https://www.reuters.com/...
  • The Guardian — e.g. https://www.theguardian.com/...
  • CNN — e.g. https://edition.cnn.com/...
  • TechCrunch — e.g. https://techcrunch.com/...
  • Wikipedia — e.g. https://en.wikipedia.org/wiki/...

Exact availability depends on the Marketplace. Check the Marketplace for the full list of supported article domains.
