
EL PAÍS’s tree of knowledge

26-01-2017

Seen from the sky, it would look like a tree. Or a vast river. But instead of water, what flows here are words: Donald Trump, Ryan Gosling, Brexit, the Gürtel case and so on, forming an interlinked network of 130,000 concepts that lets readers track all the news related to each one. This collabulary, a collaborative vocabulary for tagging web content that is shared by PRISA’s 15 media outlets, endeavors to provide context to the news and a collective memory of all the terms that matter to us.

 

"In One Hundred Years of Solitude, when Gabriel García Márquez tells the story of the plague of insomnia in Macondo, people forget things and what they are for. So one of the characters decides to write down the words that define a specific term, so they won’t forget what objects are. Our work is similar, in that we are trying to endow a word with a concrete reality so that it comes to form part of our conversation," said Felipe Díez Núñez (Ávila, 1976), in charge of the collabulary, as he described the work that he and the team of documentalists from EL PAÍS are doing with this tool.

 

EL PAÍS tags are not the same as those we might see in a blog or on social media. The latter belong to a type of taxonomy, or ordering of concepts, known as a folksonomy, in which each Internet user can create their own tags. "Imagine what would happen if that were our system of ordering things. With hundreds of journalists from 15 different media outlets, we would have a serious problem. At EL PAÍS, each term in the collabulary is chosen by our team using documentalist criteria, not SEO (positioning strategies that serve to appear at the top of a Google search or similar). That is to say, we choose the most timeless words to serve as a form of memory," explained Díez, coordinator of the team that manages the tool and monitors the quality of news tagging. The collabulary also establishes a hierarchy of terms, so that a single word can encompass a big topic and then branch into finer details.
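The hierarchy described above can be pictured with a small sketch. It is only an illustration, not the EL PAÍS system: the class name, the method names and the sample branch (Politics, Corruption, Gürtel case) are all hypothetical, and it assumes each term has at most one broader term.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Collabulary:
    """Toy controlled vocabulary (hypothetical): each term has at most one broader term."""

    parent: dict[str, str | None] = field(default_factory=dict)

    def add_term(self, term: str, broader: str | None = None) -> None:
        # Terms are curated: each is added once, under a broader term that already exists.
        if term in self.parent:
            raise ValueError(f"term already exists: {term}")
        if broader is not None and broader not in self.parent:
            raise KeyError(f"unknown broader term: {broader}")
        self.parent[term] = broader

    def ancestors(self, term: str) -> list[str]:
        """Return the term itself followed by each broader term up to the root."""
        if term not in self.parent:
            raise KeyError(f"unknown term: {term}")
        chain: list[str] = []
        current: str | None = term
        while current is not None:
            chain.append(current)
            current = self.parent[current]
        return chain

    def matches(self, article_tags: list[str], query: str) -> bool:
        """True if any tag on the article is the query term or sits beneath it."""
        return any(query in self.ancestors(tag) for tag in article_tags)


# Hypothetical branch of the tree, for illustration only.
vocab = Collabulary()
vocab.add_term("Politics")
vocab.add_term("Corruption", broader="Politics")
vocab.add_term("Gürtel case", broader="Corruption")

article_tags = ["Gürtel case"]
found_under_politics = vocab.matches(article_tags, "Politics")
```

In this sketch, an article tagged only with the narrow term is still found when a reader follows the broader one, which is the branching the hierarchy is meant to provide. A free-for-all folksonomy has no such parent links to follow.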

 

In the future, collaboration will be one of the keys to the wealth of news and information resources at the disposal of PRISA, with newspaper EL PAÍS at the forefront of this digital transformation. "We are steering the collabulary towards ontology. This tool needs internal semantics so that the machines (the search-engine bots that detect content) recognize its terms and know how to interpret them. The future will be the semantic web, something that, in Spain, no media outlet yet has," says Díez. This will allow search bots to recognize not only a particular tag, but all the terms related to it in the aforementioned giant tree or river. "This research no longer merely interrogates how we use the Internet today. It anticipates how we will use it in the future."

 

Díez finished by defining the collabulary and explaining why it is an essential part of the newspaper, which brought him back to Macondo: "At one point, the characters, after writing so many labels, forget what they mean as well. Then there’s another twist: they go one step further and add labels to the labels so they know what they mean. ‘Table’ is something we eat off. This is what we are doing with ontology. The collabulary is the table. Ontology, its description. In short, it is a struggle against forgetting."

Source: EL PAÍS
