The Search API allows queries to be made across FT content via a RESTful service.
The base URI for calls to the Search API is https://api.ft.com/content/search/v1.
To access the API using this URI you must provide an API Key and Content-Type header as documented in the API Reference.
The Search API accepts HTTP POST requests with Content-Type set to application/json. The simplest request has the form:
{ "queryString": "banks" }
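As a rough illustration, the request above could be assembled in Python with only the standard library. The `X-Api-Key` header name, the `YOUR_API_KEY` placeholder and the `build_search_request` helper are assumptions made for this sketch; the API Reference defines how the key is actually supplied. The request is built but not sent.

```python
import json
import urllib.request

SEARCH_URL = "https://api.ft.com/content/search/v1"


def build_search_request(query_string, api_key):
    """Build (but do not send) a POST request for the Search API."""
    body = json.dumps({"queryString": query_string}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumption: the API key travels in an X-Api-Key header.
        "X-Api-Key": api_key,
    }
    return urllib.request.Request(
        SEARCH_URL, data=body, headers=headers, method="POST"
    )


request = build_search_request("banks", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```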
The response will echo back the provided query and show the queryContext and resultContext of the executed query along with the results:
{ "query":{ "queryString":"banks", "queryContext":{ "curations":[ "ARTICLES", "BLOGS" ] }, "resultContext":{ "maxResults":100, "offset":0 } }, "results":[ { "indexCount":100, "curations":[ "ARTICLES", "BLOGS" ], "results":[ { "aspectSet":"article", "modelVersion":"1", "id":"cbc4a190-638d-11e1-8e79-00144feabb8e" } ] } ], … }
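A small sketch of how a client might walk this response shape to collect content ids. The `result_ids` helper is hypothetical, and the sample dictionary simply mirrors the fields shown above with the elided parts left out.

```python
def result_ids(response):
    """Collect the id of every item across all result groups."""
    ids = []
    for group in response.get("results", []):
        for item in group.get("results", []):
            ids.append(item["id"])
    return ids


sample = {
    "query": {"queryString": "banks"},
    "results": [
        {
            "indexCount": 100,
            "curations": ["ARTICLES", "BLOGS"],
            "results": [
                {
                    "aspectSet": "article",
                    "modelVersion": "1",
                    "id": "cbc4a190-638d-11e1-8e79-00144feabb8e",
                }
            ],
        }
    ],
}

print(result_ids(sample))
```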
FT content is organised into curations. For example, the following query can be used to request only articles:
{ "queryString": "banks", "queryContext" : { "curations" : [ "ARTICLES" ] } }
For more information on curations, see the curations discovery method. If no curations are specified, the default behaviour will be to search across all available curations.
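The optional nature of curations can be sketched with a small payload builder. `curated_query` is a hypothetical helper name; it only adds `queryContext` when curations are given, so the default of searching all curations is left to the service.

```python
import json


def curated_query(query_string, curations=None):
    """Build a request body, restricting curations only when asked to."""
    payload = {"queryString": query_string}
    if curations:
        payload["queryContext"] = {"curations": list(curations)}
    return payload


print(json.dumps(curated_query("banks", ["ARTICLES"])))
print(json.dumps(curated_query("banks")))
```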
Aspects allow consumers to specify which elements of content they wish to receive within the results. Aspects can be provided in the resultContext:
{ "queryString": "banks", "resultContext" : { "aspects" : [ "title", "lifecycle" ] } }
This results in more information being provided for each result in the response, for example:
{ "query":{ … }, "results":[ { "aspectSet":"article", "modelVersion":"1", "id":"72920729-c340-1fb0-7023-ba7436373c78", "title":{ "title":"The temptation of higher leverage" }, "lifecycle":{ "initialPublishDateTime":"2012-10-24T04:33:00Z", "lastPublishDateTime":"2012-10-24T04:33:00Z" } } ] }
For more information on aspects, see the aspects discovery method.
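A sketch of requesting aspects and reading them back from a single result item. Both helper names are hypothetical, and the sample item copies the fields from the example response above.

```python
def with_aspects(query_string, aspects):
    """Build a request body asking for the given aspects."""
    return {
        "queryString": query_string,
        "resultContext": {"aspects": list(aspects)},
    }


def title_and_date(item):
    """Return (title, lastPublishDateTime); either is None if absent."""
    title = item.get("title", {}).get("title")
    published = item.get("lifecycle", {}).get("lastPublishDateTime")
    return title, published


item = {
    "aspectSet": "article",
    "modelVersion": "1",
    "id": "72920729-c340-1fb0-7023-ba7436373c78",
    "title": {"title": "The temptation of higher leverage"},
    "lifecycle": {
        "initialPublishDateTime": "2012-10-24T04:33:00Z",
        "lastPublishDateTime": "2012-10-24T04:33:00Z",
    },
}

print(with_aspects("banks", ["title", "lifecycle"]))
print(title_and_date(item))
```

Using `.get` with empty defaults keeps the reader tolerant of results that were requested without one of the aspects.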
Pagination is supported through two fields of resultContext: maxResults and offset.
Example of a request with pagination:
{ "queryString": "banks", "resultContext" : { "maxResults" : "20", "offset" : "21" } }
Note that the maximum number of addressable results is 4000.
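One way to page through results while respecting the 4000-result ceiling is sketched below. It rests on several assumptions not confirmed here: that offset is zero-based (as the echoed default of 0 suggests), that the ceiling applies to offset plus maxResults, and that integer values are accepted as in the echoed resultContext. `page_contexts` is a hypothetical helper.

```python
MAX_ADDRESSABLE = 4000  # ceiling stated in the documentation


def page_contexts(page_size, limit=MAX_ADDRESSABLE):
    """Yield one resultContext per page without exceeding the limit."""
    offset = 0  # assumption: offsets are zero-based
    while offset < limit:
        size = min(page_size, limit - offset)
        yield {"maxResults": size, "offset": offset}
        offset += size


for context in page_contexts(1500):
    print({"queryString": "banks", "resultContext": context})
```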
Sorting is supported through two fields of resultContext: sortOrder and sortField.
Both fields must be provided. Example of a request with sorting:
{ "queryString": "banks", "resultContext" : { "sortOrder" : "ASC", "sortField" : "title" } }
For more information on sortable fields, see the sortable fields discovery method.
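Because both sorting fields are required, a client-side builder can refuse to emit a half-specified sort. This is a sketch with a hypothetical `sorted_query` helper; it does not check the field against the sortable fields discovery method.

```python
def sorted_query(query_string, sort_field, sort_order):
    """Build a sorted request body, insisting on both sort fields."""
    if not sort_field or not sort_order:
        raise ValueError("sortField and sortOrder must both be provided")
    return {
        "queryString": query_string,
        "resultContext": {"sortOrder": sort_order, "sortField": sort_field},
    }


print(sorted_query("banks", "title", "ASC"))
```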
Facets allow consumers to navigate through their results by refining their query. Facets can be provided in the resultContext:
{ "queryString": "banks", "resultContext" : { "facets" : { "names":[ "people" ] } } }
Which will result in the facets for people being included in the response:
{ "query":{ … }, "results":[ … ], "facets":[ { "name":"people", "facetElements":[ { "name":"David Cameron", "count":10 }, … ] } ] }
Based on the above facetElement, the query can be refined by making a fielded query using the facet name, in this case people, with the value “David Cameron”. This is explained in more detail below, but in this example the refined query string would take the form:
{ "queryString":"banks AND people:=\"David Cameron\"" }
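The refinement step can be sketched as a small string helper; `json.dumps` then takes care of escaping the inner quotes for the request body. `refine` is a hypothetical name, and values that themselves contain double quotes are not handled here.

```python
import json


def refine(query_string, facet_name, value):
    """Narrow a query to an exact facet value using the := operator."""
    return f'{query_string} AND {facet_name}:="{value}"'


payload = {"queryString": refine("banks", "people", "David Cameron")}
print(json.dumps(payload))
```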
The number of facet elements can be controlled through maxElements and minThreshold. maxElements is the maximum number of facet elements to return (-1 returns all of them) and minThreshold is the minimum count an element needs in order to be included.
{ "queryString": "banks", "resultContext" : { "facets" : { "names":[ "people" ], "maxElements":20, "minThreshold":1 } } }
For more information on available facets, see the facets discovery method.
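A sketch of building the facets part of resultContext and of reading facet counts back out of a response. Both helper names are hypothetical; the response shape follows the example shown earlier in this section.

```python
def facet_context(names, max_elements=None, min_threshold=None):
    """Build a resultContext requesting facets, with optional limits."""
    facets = {"names": list(names)}
    if max_elements is not None:
        facets["maxElements"] = max_elements
    if min_threshold is not None:
        facets["minThreshold"] = min_threshold
    return {"facets": facets}


def facet_counts(response, facet_name):
    """Map each facet element name to its count for one facet."""
    for facet in response.get("facets", []):
        if facet["name"] == facet_name:
            return {e["name"]: e["count"] for e in facet["facetElements"]}
    return {}


response = {
    "facets": [
        {
            "name": "people",
            "facetElements": [{"name": "David Cameron", "count": 10}],
        }
    ]
}

print(facet_context(["people"], max_elements=20, min_threshold=1))
print(facet_counts(response, "people"))
```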
To search for “banks”, the query will take the form:
banks
and will return results containing the word “banks”.
A query of the form:
Financial Times
will return all results containing both “Financial” and “Times”, with the keywords potentially separated and in any order. This is because the above example is equivalent to:
Financial AND Times
and
Financial + Times
AND is implicit and can alternatively be written as a plus sign (+). AND must be uppercase; otherwise the search will return articles containing the keyword “and”. To match the phrase “Financial Times”, use quotes in the queryString:
"Financial Times"
To search for content about both the Financial Times and the New York Times, use the queryString:
"Financial Times" AND "New York Times"
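Phrase quoting and explicit AND can be wrapped in two small helpers when query strings are built programmatically. This is only a sketch; `phrase` and `all_of` are hypothetical names.

```python
def phrase(text):
    """Wrap text in double quotes so it is matched as a phrase."""
    return f'"{text}"'


def all_of(*terms):
    """Join terms with an explicit uppercase AND."""
    return " AND ".join(terms)


query = all_of(phrase("Financial Times"), phrase("New York Times"))
print(query)
```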
To search for content about “Times” but not “Financial”, negation can be used as follows:
Times -Financial
or
Times NOT Financial
or
NOT Financial Times
As these examples show, NOT is equivalent to the - symbol, and whitespace after - is allowed. To ask for all content about the Financial Times or the New York Times, use the OR operator, which must be uppercase:
"Financial Times" OR "New York Times"
or
"Financial Times" | "New York Times"
The OR operator and the | symbol are interchangeable.
Operators can be combined. To query for content containing “Financial” or “New York”, but only where it also contains the word “Times”, construct the queryString:
(Financial OR "New York") AND Times
or
(Financial OR "New York") Times
Without the brackets, the above query would be interpreted as:
Financial OR ("New York" AND Times)
This is because AND takes precedence over OR. Precedence can be overridden using brackets, and any level of bracket nesting is allowed.
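Since AND binds more tightly than OR, a query builder can always bracket OR groups so the intended grouping survives. A minimal sketch with hypothetical helper names:

```python
def any_of(*terms):
    """Join terms with OR and bracket the group to fix its precedence."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms):
    """Join terms with an explicit uppercase AND."""
    return " AND ".join(terms)


query = all_of(any_of("Financial", '"New York"'), "Times")
print(query)
```

Always emitting the brackets is slightly verbose for single-term groups, but it avoids relying on operator precedence when groups are nested.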
Fielded queries are supported across all searchable fields defined by the discovery method. For example, if we are looking for all results about the person David Cameron we would use the queryString:
people:"David Cameron"
For all content about David Cameron last published before 2010 we can construct the queryString:
people:"David Cameron" lastPublishDateTime:<2010-01-01T00:00:00Z
To find content whose title contains the phrase “Cameron turns to Olympics”, use:
title:"Cameron turns to Olympics"
To find content whose title matches exactly, use the := operator with the full title:
title:="Cameron turns to Olympics in difficult week."
To find all content except the item with that exact title, use:
title:NOT "Cameron turns to Olympics in difficult week."
Fielded queries can be combined with the AND and OR operators and with brackets. For example, to query for all content about David Cameron or Gordon Brown from 2011 with a title containing “Olympics”, use the query:
(people:"David Cameron" OR people:"Gordon Brown") AND (lastPublishDateTime:>2011-01-01T00:00:00Z AND lastPublishDateTime:<2012-01-01T00:00:00Z) AND title:Olympics
or shorter:
(people:"David Cameron" OR people:"Gordon Brown") lastPublishDateTime:>2011-01-01T00:00:00Z lastPublishDateTime:<2012-01-01T00:00:00Z title:Olympics
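The shorter form of the combined query could be generated as follows. This is a sketch: `field` and `published_between` are hypothetical helpers, and the datetimes are assumed to already be in UTC because the timestamp is formatted with a literal Z suffix rather than converted.

```python
from datetime import datetime, timezone

TIMESTAMP = "%Y-%m-%dT%H:%M:%SZ"


def field(name, value):
    """Build a fielded phrase query such as people:"David Cameron"."""
    return f'{name}:"{value}"'


def published_between(start, end):
    """Restrict lastPublishDateTime to the open interval (start, end)."""
    return (
        f"lastPublishDateTime:>{start.strftime(TIMESTAMP)} "
        f"lastPublishDateTime:<{end.strftime(TIMESTAMP)}"
    )


people = "(" + " OR ".join(
    field("people", name) for name in ("David Cameron", "Gordon Brown")
) + ")"
dates = published_between(
    datetime(2011, 1, 1, tzinfo=timezone.utc),
    datetime(2012, 1, 1, tzinfo=timezone.utc),
)
query = f"{people} {dates} title:Olympics"
print(query)
```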
Response | Description |
---|---|
200 - OK | Success - the Response will be returned, containing the Results (if any) |
400 - Bad Request | The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications |
415 - Method not Supported | The request used an HTTP method that is not supported |
422 - Unprocessable Entity | The request was well-formed but could not be processed due to semantic errors. The client SHOULD NOT repeat the request without modifications |
500 - Internal Server Error | The server encountered an unexpected condition which prevented it from fulfilling the request |
501 - Not Implemented | The server does not support the functionality required to fulfill the request |
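
A client might group these status codes to decide what to do next. The sketch below is one possible reading of the table: the category labels and the `classify` helper are hypothetical, and only the rule that 400 and 422 must not be repeated unmodified comes from the table itself.

```python
# Per the table, these must not be repeated without modifications.
MUST_MODIFY_BEFORE_RETRY = {400, 422}


def classify(status):
    """Map an HTTP status code to a coarse client-side category."""
    if status == 200:
        return "ok"
    if status in MUST_MODIFY_BEFORE_RETRY:
        return "fix-request"
    if 400 <= status < 500:
        return "client-error"
    return "server-error"


for code in (200, 400, 415, 422, 500, 501):
    print(code, classify(code))
```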