
SIBiLS PubMed Central full-texts and annotations fetch API


This API retrieves annotated content from PMC Open Access. The input is a set of PMCIDs (up to 1,000 per request). The output is a set of parsed and annotated full texts, in both JATS and BioC formats. Delivered and annotated fields include, for example, full texts, paragraphs with their hierarchical level in the document structure, and figure captions. Annotations carry many features, including the type of the mapped entity (drug, gene, disease...), the vocabulary used, the vocabulary's unique identifier and preferred term, and the character offsets of the mapping.

URL endpoint :

Mandatory input : the list of requested PMCIDs (&ids=). PMCIDs are separated by any non-digit character.

Example : fetch two PMCIDs,PMC3706742
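The input rules above (digits separated by any non-digit character, at most 1,000 PMCIDs per request) can be sketched on the client side as follows. This is only an illustration: the helper names `normalize_pmcids`, `batch_ids` and `ids_parameter` are hypothetical, the comma separator is an arbitrary choice among the allowed non-digit characters, and `PMC1234567` is a made-up placeholder identifier.

```python
import re

MAX_IDS_PER_REQUEST = 1000  # per-request limit stated in the description


def normalize_pmcids(raw):
    """Return the digit part of each PMCID found in `raw`.

    Any non-digit character acts as a separator, so "PMC", commas,
    spaces and semicolons are all ignored. Duplicates are dropped
    and the original order is kept.
    """
    unique = {}
    for digits in re.findall(r"\d+", raw):
        unique.setdefault(digits, None)
    return list(unique)


def batch_ids(ids, size=MAX_IDS_PER_REQUEST):
    """Split a list of IDs into chunks of at most `size` items."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def ids_parameter(batch):
    """Build a value for the ids parameter (comma chosen as separator)."""
    return ",".join("PMC" + digits for digits in batch)


if __name__ == "__main__":
    # PMC1234567 is a placeholder, not a real reference.
    ids = normalize_pmcids("PMC3706742; PMC1234567 PMC3706742")
    for batch in batch_ids(ids):
        print(ids_parameter(batch))
```

Splitting long lists this way keeps every call under the documented limit; whether the service rejects or truncates larger requests is not stated here.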

Code sample : a Python script demonstrating POST calls to the API is available at
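Independently of the script referenced above, a POST call could look roughly like the sketch below. It is not the official sample: the endpoint URL is not reproduced on this page, so `FETCH_URL` is a placeholder to replace with the real address, and sending `ids` as a form-encoded body is an assumption. The function names `build_fetch_request` and `fetch` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder only: substitute the real endpoint of the fetch API.
FETCH_URL = "https://example.org/fetch"


def build_fetch_request(url, pmcids):
    """Prepare a POST request carrying the PMCIDs in an `ids` field.

    Assumption: the service accepts a form-encoded body.
    """
    body = urlencode({"ids": ",".join(pmcids)}).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch(url, pmcids, timeout=60):
    """Send the request and decode the JSON response."""
    with urlopen(build_fetch_request(url, pmcids), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    request = build_fetch_request(FETCH_URL, ["PMC3706742"])
    print(request.get_method(), request.full_url, request.data)
```

Building the request separately from sending it makes the payload easy to inspect before any network traffic happens.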


Output is formatted as BioC JSON :

	[Array of documents corresponding to the query]
			[Array of PubMed-like information about the document]
			[Array of passages split from the document]
				[Array of information relative to the passage]
				{"section_title": section where passage (text) is found
				{"type": nature of the passage (e.g. "title", "abstract")
				{"text": the passage
				[Array of annotations relative to the passage]
				{"text": concept annotated
					[Array of information about each annotation]
					{"type": type of entity (text) matched
					{"source": terminology used
					{"source_id": ID from the terminology
					{"searchable_id": code(terminology)+";"+ source_id
					{"prefered_term": preferred term extracted from the terminology
					{"sentence": sentence where the text is found
					{"offset": start position of the text in the sentence
					{"length": length of the text annotated
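As an illustration of how the "sentence", "offset", "length" and "searchable_id" fields described above fit together, here is a minimal sketch. It assumes that "offset" is a 0-based character index and that "offset" and "length" are integers, neither of which is stated on this page. The helper names and the sample record are made up for the example and do not come from a real API response.

```python
def annotation_span(info):
    """Return the part of the sentence selected by offset and length.

    Assumption: `offset` counts characters from 0.
    """
    start = info["offset"]
    return info["sentence"][start:start + info["length"]]


def split_searchable_id(searchable_id):
    """Split a searchable_id into (terminology code, source_id)."""
    code, _, source_id = searchable_id.partition(";")
    return code, source_id


if __name__ == "__main__":
    # Hand-written record using the documented field names only.
    info = {
        "sentence": "Aspirin reduces fever.",
        "offset": 0,
        "length": 7,
        "searchable_id": "XX;123",  # hypothetical code and identifier
    }
    print(annotation_span(info))
    print(split_searchable_id(info["searchable_id"]))
```

Comparing the result of `annotation_span` with the annotated "text" is a quick way to check whether the offset convention assumed here matches the actual data.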


Tagged entities are licensed under a Creative Commons Attribution 3.0 Switzerland License.

Questions specifically relating to these services may be sent to