Operations

All API operations are listed below.

Document search

/documents - gets the latest documents or documents matching a query.

Parameter Explanation
q The query string to search for.
types Comma-separated list of document types that should be returned. Valid types are:
  • Article
  • Blog
  • Fact Sheet
  • Press Release
  • Audio/Video
pageSize Number of documents to return. Default is 20.
pageNumber Window the result to a specific page. First page is 0.
sortBy How the result should be sorted. Default is by publication date. The following sort values are supported:

relevance - by relevance with most relevant document first.

publicationdate - by publication date with most recent document first.

Example: Return the 5 most relevant blogs mentioning "United Nations". This can be requested using the following URI: /documents?q=United Nations&types=blog&pageSize=5&sortBy=relevance

The result can look like this:

<?xml version="1.0" encoding="utf-16"?>
<ResultList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Items>
    <DocumentData>
      <Id>5_2261700798855512087</Id>
      <Description>NYT's Krugman: Hopefully, 'Globalization 2.0' will fare better than 1.0</Description>
      <Type>Blog</Type>
      <PublicationDate>2008-12-22T14:19:00Z</PublicationDate>
      <Publisher>BloggingStocks</Publisher>
    </DocumentData>
    <DocumentData>
      <Id>5_2261700811388092416</Id>
      <Description>US Refuses to Sign UN Resolution on Gay Rights [Dispatches from the Culture Wars]</Description>
      <Type>Blog</Type>
      <PublicationDate>2008-12-22T14:31:00Z</PublicationDate>
      <Publisher>ScienceBlogs : Combined Feed</Publisher>
    </DocumentData>
    <DocumentData>
      <Id>5_2261701110693625856</Id>
      <Description>Michealene Cristini Risley: "A Christmas Carol in Zimbabwe"</Description>
      <Type>Blog</Type>
      <PublicationDate>2008-12-22T18:53:00Z</PublicationDate>
      <Publisher>The Huffington Post Full Blog Feed</Publisher>
    </DocumentData>
    <DocumentData>
      <Id>5_2261701122454454272</Id>
      <Description>The protectionism comeback and a possible solution</Description>
      <Type>Blog</Type>
      <PublicationDate>2008-12-22T19:00:00Z</PublicationDate>
      <Publisher>The Curious Capitalist - TIME.com</Publisher>
    </DocumentData>
    <DocumentData>
      <Id>5_2261701122940993536</Id>
      <Description>Allan Clear: Harm Reduction and Allan's Diplomatic Faux Pas, on the Final Day of the U.N. Drug Treatment Conference, Vienna</Description>
      <Type>Blog</Type>
      <PublicationDate>2008-12-22T19:01:00Z</PublicationDate>
      <Publisher>The Huffington Post Full Blog Feed</Publisher>
    </DocumentData>
  </Items>
  <TotalCount>1093</TotalCount>
</ResultList>
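
A minimal sketch of how the example request URI could be built and how a result of this shape could be read, using only Python's standard library. The host `http://www.example.com` is borrowed from the authentication example and is a placeholder; the helper names `document_search_url` and `parse_documents` are hypothetical, and the XML is a shortened copy of the sample above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urlencode

BASE_URL = "http://www.example.com"  # assumption: placeholder host


def document_search_url(q, types=None, page_size=None, sort_by=None):
    """Build a /documents URI; parameters left unset are omitted."""
    params = {"q": q}
    if types:
        params["types"] = ",".join(types)
    if page_size is not None:
        params["pageSize"] = page_size
    if sort_by:
        params["sortBy"] = sort_by
    # quote_via=quote encodes the space in "United Nations" as %20.
    return BASE_URL + "/documents?" + urlencode(params, quote_via=quote)


def parse_documents(xml_text):
    """Return (list of document dicts, total count) from a ResultList."""
    root = ET.fromstring(xml_text)
    docs = [
        {child.tag: child.text for child in item}
        for item in root.iter("DocumentData")
    ]
    return docs, int(root.findtext("TotalCount"))


# Shortened copy of the sample response (one document, fewer fields).
SAMPLE = """<ResultList>
  <Items>
    <DocumentData>
      <Id>5_2261700798855512087</Id>
      <Type>Blog</Type>
      <Publisher>BloggingStocks</Publisher>
    </DocumentData>
  </Items>
  <TotalCount>1093</TotalCount>
</ResultList>"""

url = document_search_url("United Nations", ["blog"], 5, "relevance")
docs, total = parse_documents(SAMPLE)
print(url)
print(docs[0]["Publisher"], total)
```

Note that TotalCount reports all matching documents (1093 here), not the number of items on the returned page.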

Entity search

/entities - searches for entities matching a query string.

Parameter Explanation
q The query string to search for.
types Comma-separated list of entity types that should be returned. Valid types are:
  • Person
  • Company
  • Organization
  • Keyphrase
  • Country
  • City
pageSize Number of entities to return. Default is 20.
pageNumber Window the result to a specific page. First page is 0.
Note: Entities are always sorted by relevance, with the most relevant entity first.

Example: Return the 5 most relevant entities having Obama in their names. This can be requested using this URI: /entities?q=Obama&pageSize=5

The result can look like this:

<?xml version="1.0" encoding="utf-16"?>
<ResultList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Items>
    <TermData>
      <Id>11_98035</Id>
      <Description>Obama (Japan)</Description>
      <Type>City</Type>
    </TermData>
    <TermData>
      <Id>11_8081593</Id>
      <Description>Michelle Obama</Description>
      <Type>Person</Type>
    </TermData>
    <TermData>
      <Id>11_11332962</Id>
      <Description>Hussein Obama</Description>
      <Type>Person</Type>
    </TermData>
    <TermData>
      <Id>11_46954190</Id>
      <Description>Natasha Obama</Description>
      <Type>Person</Type>
    </TermData>
    <TermData>
      <Id>11_240886</Id>
      <Description>Barack Obama</Description>
      <Type>Person</Type>
    </TermData>
  </Items>
  <TotalCount>516</TotalCount>
</ResultList>
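
Because pageNumber starts at 0 and TotalCount reports all matches, the page numbers needed to walk a full result can be derived with a little arithmetic. The sketch below uses hypothetical helper names and assumes the API lets a client page through every match, which the text above does not state.

```python
import math


def page_count(total_count, page_size=20):
    """Number of pages needed to cover total_count results."""
    return math.ceil(total_count / page_size)


def page_numbers(total_count, page_size=20):
    """Zero-based pageNumber values, since the first page is 0."""
    return range(page_count(total_count, page_size))


# The sample above: 516 matching entities, requested 5 at a time.
pages = page_numbers(516, page_size=5)
print(len(pages), pages[0], pages[-1])  # prints "104 0 103"
```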

Related entity search

/documents/{query}/entities - returns entities mentioned in documents matching the query string.

/documents/{query}/entities/{types} - returns entities of specific types mentioned in documents matching the query string.

Parameter Explanation
entityTypeLimits Maximum number of entities for each type that will be included in the result. Can only be used in URIs containing the {types} parameter.
pageSize Number of entities to return. Default is 20.
pageNumber Window the result to a specific page. First page is 0.
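
Since the query is part of the path here, it has to be encoded as a single path segment. A minimal sketch under stated assumptions: the host is a placeholder, `related_entities_url` is a hypothetical helper, multiple types are assumed to be comma-separated as for the entity graph, and the value format of entityTypeLimits is not documented, so it is passed through unchanged.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://www.example.com"  # assumption: placeholder host


def related_entities_url(query, types=None, entity_type_limits=None,
                         page_size=None, page_number=None):
    """Build /documents/{query}/entities[/{types}] with optional parameters."""
    if entity_type_limits is not None and not types:
        # Documented constraint: only valid in URIs containing {types}.
        raise ValueError("entityTypeLimits requires the {types} segment")
    # safe="" also encodes "/" so the query stays one path segment.
    path = "/documents/" + quote(query, safe="") + "/entities"
    if types:
        # Assumption: several types are joined with commas.
        path += "/" + ",".join(quote(t, safe="") for t in types)
    params = {}
    if entity_type_limits is not None:
        params["entityTypeLimits"] = entity_type_limits  # format not documented
    if page_size is not None:
        params["pageSize"] = page_size
    if page_number is not None:
        params["pageNumber"] = page_number
    query_string = "?" + urlencode(params) if params else ""
    return BASE_URL + path + query_string


print(related_entities_url("United Nations", ["Person", "Company"], page_size=10))
```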

Entity Graph

/entitygraph/{query} - gets an entity graph based on a query string.

/entitygraph/{query}/{types} - gets an entity graph containing only specific entity types. Multiple types can be separated by commas. Valid types are:

  • Person
  • Company
  • Organization
  • Keyphrase
  • Country
  • City

Parameter Explanation
noNodes Number of nodes to return. Default is 20.
graphEntities List of entities that should be included in the graph. Entities are referenced with ids and separated by commas.
entityTypeLimits Maximum number of entities for each type that will be included in the graph. Must be combined with the {types} parameter.
expandEntities Entities that will be included and expanded in the graph.

Top Stories

/topstories - gets today's top stories.

/topstories/{query} - gets the top stories related to a query string.

Parameter Explanation
noStories Number of stories to return. Default is 5.
minRelevance The minimum relevance for a story to be included in the result. The value should be between 0 and 1, where 1 means very relevant and indicates a story whose documents mention only the query itself and nothing else. Default is approximately 0.1.
minNoDocuments The minimum number of documents that a story should consist of. Default is 2.
Extras Provides extra data for each result row. The following values are supported (and can be combined as a comma-separated list):

RelatedEntities - when given, the result will also include a list of the entities most related to each story.

DocumentTeasers - when given, the result will include a short summary of each story.

Images - when given, the result will include images for each story, if available.
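
The parameters above can be checked on the client before a request is sent. The following sketch is illustrative only: the host is a placeholder, `top_stories_url` is a hypothetical helper, and whether the server itself rejects out-of-range values is not documented.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://www.example.com"  # assumption: placeholder host
SUPPORTED_EXTRAS = {"RelatedEntities", "DocumentTeasers", "Images"}


def top_stories_url(query=None, no_stories=None, min_relevance=None,
                    min_no_documents=None, extras=()):
    """Build /topstories or /topstories/{query} with validated parameters."""
    unknown = set(extras) - SUPPORTED_EXTRAS
    if unknown:
        raise ValueError(f"unsupported Extras values: {sorted(unknown)}")
    if min_relevance is not None and not 0 <= min_relevance <= 1:
        raise ValueError("minRelevance should be between 0 and 1")
    path = "/topstories"
    if query:
        path += "/" + quote(query, safe="")
    params = {}
    if no_stories is not None:
        params["noStories"] = no_stories
    if min_relevance is not None:
        params["minRelevance"] = min_relevance
    if min_no_documents is not None:
        params["minNoDocuments"] = min_no_documents
    if extras:
        params["Extras"] = ",".join(extras)
    # safe="," keeps the comma-separated Extras list readable in the URI.
    query_string = urlencode(params, safe=",", quote_via=quote)
    return BASE_URL + path + ("?" + query_string if query_string else "")


print(top_stories_url("United Nations", no_stories=3,
                      extras=["RelatedEntities", "Images"]))
```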

Authentication

A key is required to make requests to the API. If no key is provided, the key is treated as the empty string. The key is set by adding the parameter "apiKey" to the URL.

Keys may require a digest to ensure that the request is valid. The digest is provided by adding the parameter "digest" to the URL. When the API is used through an external service, that service has to calculate the digest if one is required. The digest is calculated with a shared secret key that is provided when the apiKey is acquired.

The digest is calculated as described by the following pseudo-code:

Key = "MySharedKey"

UTF8Key = UTF8Encode(Key)

// This is an HTTP GET request
Message = "GET http://www.example.com/documents/Sweden"

UTF8Message = UTF8Encode(Message)

// Will output Gh9YtyvJvF39Isfn9QFo2nNkQkI=
Signature = Base64(HMAC-SHA1(UTF8Message, UTF8Key))

The final request, with the digest URL-encoded, will look like http://www.example.com/documents/Sweden?apiKey=MySharedKey&digest=Gh9YtyvJvF39Isfn9QFo2nNkQkI%3D
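
The pseudo-code maps onto Python's standard `hmac`, `hashlib` and `base64` modules. The sketch below is one possible reading, not a reference implementation: `compute_digest` and `signed_url` are hypothetical names, and it signs the URL without a query string because the text does not say whether other query parameters belong in the message.

```python
import base64
import hashlib
import hmac
from urllib.parse import urlencode


def compute_digest(secret, method, url):
    """Base64(HMAC-SHA1(message, key)) where message is "<METHOD> <url>"."""
    message = f"{method} {url}".encode("utf-8")
    key = secret.encode("utf-8")
    # hmac.new takes the key first and the message second.
    mac = hmac.new(key, message, hashlib.sha1)
    return base64.b64encode(mac.digest()).decode("ascii")


def signed_url(url, api_key, secret):
    """Append apiKey and digest to a URL that has no query string yet."""
    digest = compute_digest(secret, "GET", url)
    # urlencode percent-encodes "=", "+" and "/" in the Base64 digest.
    return url + "?" + urlencode({"apiKey": api_key, "digest": digest})


# Values taken from the pseudo-code; the example reuses one string
# for both the apiKey and the shared secret.
print(signed_url("http://www.example.com/documents/Sweden",
                 api_key="MySharedKey", secret="MySharedKey"))
```

An HMAC-SHA1 value is always 20 bytes, so the Base64 digest is 28 characters long and ends in a single "=", which is why the digest in the final request ends in %3D.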

Response Types

The API supports several response formats. The most common ones are Xml and Json, both of which can be used for all requests. The more specialized formats Atom 1.0 and Html can be used for document list results.

To switch response type, set the parameter "type" to the desired format. Example: /documents/United Nations?type=atom10.

The following formats are supported:

Type Parameter Response Format
xml Xml (default)
json Json
jsonp JsonP (use this with parameter "callback" to specify callback function)
atom10 Atom 1.0 (only supported by document lists)
html Html (only supported by document lists)

Error Codes

When a request to the API fails, one of the following HTTP status codes is returned to the client in the response.

Code Explanation
200 OK No error.
400 BAD REQUEST Invalid request URI or unsupported parameter.
401 UNAUTHORIZED Authorization is required.
500 INTERNAL SERVER ERROR Internal error occurred. The default error code that is used for all unrecognized errors.
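
A client can turn these status codes into exceptions. A minimal sketch, assuming the status code has already been read from the HTTP response; `ApiError` and `check_status` are hypothetical names, and any code not in the table is reported as undocumented rather than guessed at.

```python
class ApiError(Exception):
    """Raised for any status code other than 200 OK."""

    def __init__(self, status, reason):
        super().__init__(f"{status} {reason}")
        self.status = status
        self.reason = reason


# Explanations copied from the table above.
DOCUMENTED_ERRORS = {
    400: "Invalid request URI or unsupported parameter.",
    401: "Authorization is required.",
    500: "Internal error occurred.",
}


def check_status(status):
    """Return None on 200 and raise ApiError otherwise."""
    if status == 200:
        return
    reason = DOCUMENTED_ERRORS.get(status, "Undocumented status code.")
    raise ApiError(status, reason)


try:
    check_status(401)
except ApiError as err:
    print(err)  # prints "401 Authorization is required."
```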