The Pandora Papers

The Pandora Papers: How and Where the World's Predator Class Hide and Invest Their Money While Avoiding Taxes

The Pandora Papers are nearly 12 million files leaked from 14 companies that provide corporate services in offshore jurisdictions. The documents offer the most comprehensive look to date at how such service providers help the rich and famous - including celebrities, the ultra-wealthy, politicians and criminals - to hide their money in financial secrecy jurisdictions.

The International Consortium of Investigative Journalists (ICIJ) received the leaked files and coordinated a worldwide investigation into their contents. The project involves more than 600 journalists from 117 countries.

This is a well-organized website about what was revealed: The Pandora Papers - OCCRP

Three Gibraltar-registered companies among those identified in Pandora Papers leak

The “Pandora Papers” identify 330 politicians and 130 billionaires in 117 countries – with Gibraltar named in connection with three companies linked to reports of bribery and corruption.

Gibraltar-registered company Takilant Limited was under the control of Gulnara Karimova, the daughter of the president of Uzbekistan, when it received a payment of USD220 million in 2010. The money was the result of a controversial deal with a Swedish telecoms company called Telia, and was engineered by Conservative Party donor Mohamed Amersi.

Two other Gibraltar companies named in the Pandora Papers are linked to the former Minister of Mauritania, Mohamed Abdellahi Ould Yaha.

The Pandora Papers: Secret Files Put Big Names in Spain in The Spotlight for Offshore Funds

Spain's former king Juan Carlos, football club manager Pep Guardiola, crooner Julio Iglesias, and at least five politicians are among 600 Spaniards under scrutiny after being named in the Pandora Papers.

More on Mohamed Amersi - Pandora papers: Tory donor Mohamed Amersi ‘involved in £162m corruption scandal’ – TodayHeadline - The Tory donor was involved in the controversial payment to Ms Karimova using a Gibraltar-based offshore company in 2010, according to an investigation by BBC Panorama and The Guardian.

The Pandora Papers show the true face of global Britain

The British Virgin Islands are just one part of Britain’s offshore network. There are around 18 legislatures across the globe that Westminster is ultimately responsible for. These include some of the worst offenders in the world of money laundering, tax dodging and financial secrecy. The Cayman Islands are British. So is Gibraltar. So are Anguilla and Bermuda.

These places aren’t just British in an abstract sense. Under the 2002 British Overseas Territories Act, their citizens are British citizens. They operate under the protection of the British diplomatic service. And, when need be, they can rely on Her Majesty’s Armed Forces: in the last 40 years, Britain has twice gone to war to defend Overseas Territories.

Although no one knows for sure how much money is hidden in tax havens, of which the British territories make up a significant chunk, the figures involved are so vast that academics at the Transnational Institute in the Netherlands have described them as “the backbone of global capitalism”.


The Pandora Papers have rocked the world. Since news organisations began publishing their explosive contents on October 3, the giant leak has dominated headlines and posed questions of some of the world’s most powerful people and their financial propriety.

Everyone from former UK prime minister Tony Blair to the King of Jordan has been dragged into a murky world of offshore finance, with stunning allegations being uncovered daily. And not for the first time, calls have been made to crack down on offshore financial products and institutions, and to instigate a fairer tax regime.

The Pandora Papers revelations came from an unfathomably big tranche of documents: 2.94 terabytes of data in all, 11.9 million records and documents dating back to the 1970s. But how do you securely handle a leak of that size, when documents come in all sizes and formats, some dating back five decades?

The organisation behind the Pandora Papers leak, the International Consortium of Investigative Journalists (ICIJ), has spent the best part of a year coordinating simultaneous reporting from 150 different media outlets in 117 countries. Bringing those financial stories to light takes a lot of technical infrastructure. “We had data from 14 different offshore providers,” says Delphine Reuter, a Belgian data journalist and researcher at the ICIJ. Work on analysing the data began in November 2020.

“The first challenge for us was to get the data,” explains Pierre Romera, chief technology officer at the ICIJ. “We exchanged for weeks and months with the sources, and at a point we had to find a way to get the data.” Initially, the ICIJ brokered a deal with its sources that would allow them to send the data remotely without needing to travel, but as the size of the document dump grew, so did the challenges in ensuring it all could be sent to a secure server. Some members of the ICIJ team met directly with sources and collected huge hard drives containing the documents.

But the sheer size of the leak was still tricky to navigate. “They’re massive,” Romera says. Analysing such a volume of data isn’t a job for Excel or existing database management programs. “You can’t just go at it with classic tools. There’s nothing in the market for journalists that can ingest so much data.” Worse, four million of the files were PDFs – notoriously bad to interrogate. “PDFs are horrible to extract information from,” says Reuter. And they weren’t ordinary PDFs either: seemingly unrelated documents were scanned together into single PDF files without rhyme or reason. “You might have copies or emails or registers of directors within the information we were interested in,” she adds.

However, the ICIJ has had practice in parsing huge troves of information. The Panama Papers, which in 2016 exposed the rogue offshore finance industry through 11.5 million leaked documents totalling 2.6 terabytes of data, gave the coalition of investigative journalists a set of best practices on how to handle all that data. “We created our own tools and technology to extract the text and make it searchable,” says Romera. The task of preparing the data so that it was accessible to scores of reporters worldwide fell to a team including Bruno Thomas, senior developer at the ICIJ.

The ICIJ used two self-developed technologies in combination to comb through the documents. One, Extract, is able to share the computational load of extracting information between multiple servers. “When you have millions of documents, Extract is able to tell a server to look at one document and another server to look at another,” Romera says. Extract is part of a larger ICIJ project, called Datashare, which is a data structuring tool. “Everyone has to use Datashare to explore the documents,” says Reuter. “They can download documents to their own machine, but they have to use Datashare to search the documents because it’s not doable to go through 11.9 million documents without the system.”
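The load-sharing idea Romera describes can be sketched roughly as follows. This is not the ICIJ's Extract code, only a minimal illustration under stated assumptions: threads stand in for separate servers, and the names `extract_text`, `worker` and `run_pool` are hypothetical.

```python
import queue
import threading


def extract_text(doc_name):
    # Hypothetical placeholder for real text extraction.
    return f"text of {doc_name}"


def worker(worker_id, tasks, results, lock):
    # Each worker keeps claiming the next unprocessed document
    # until the shared queue is empty.
    while True:
        try:
            doc = tasks.get_nowait()
        except queue.Empty:
            return
        text = extract_text(doc)
        with lock:
            results[doc] = (worker_id, text)


def run_pool(documents, n_workers=3):
    # Put every document on one shared queue, then let the workers drain it.
    tasks = queue.Queue()
    for doc in documents:
        tasks.put(doc)
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(i, tasks, results, lock))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because every worker pulls from the same queue, no document is processed twice, and a slow document on one worker does not hold up the others.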

Datashare was vital because just four per cent of the 11.9 million files the ICIJ received as part of the Pandora Papers were ‘structured’ – that is, organised in table-based file formats such as spreadsheets and CSV files. Those structured files are far easier to handle and interrogate. Emails, PDFs and Word documents are more difficult to search for data. Images, of which there were 2.9 million, are even more complicated to analyse computationally. Datashare parses all the documents, including running scanned PDF files through optical character recognition (OCR) with Tesseract, an open-source OCR engine. Apache’s Tika Java framework was used to extract text from all the documents. “Tika can handle 50 or more different documents,” says Thomas. The data Tika extracts is then ultimately accessed through Datashare by the end user.
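To make the split between structured and unstructured files concrete, here is a small sketch of how files might be routed to different processing paths. It does not call Tika or Tesseract and is not how Datashare is implemented; the extension sets, the `has_text_layer` flag and the `route` function are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed extension sets, for illustration only.
STRUCTURED = {".csv", ".xls", ".xlsx"}
IMAGES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def route(filename, has_text_layer=True):
    """Pick a processing path for one file (hypothetical helper)."""
    ext = Path(filename).suffix.lower()
    if ext in STRUCTURED:
        return "table"  # already organised in rows and columns
    if ext in IMAGES:
        return "ocr"    # pixels only, so text must be recognised first
    if ext == ".pdf" and not has_text_layer:
        return "ocr"    # scanned PDF: pages are images
    return "text"       # emails, Word files, PDFs with embedded text


# Four per cent of 11.9 million files, using integer arithmetic.
structured_files = 11_900_000 * 4 // 100
```

Here `structured_files` comes to 476,000, which leaves more than 11 million files that need text extraction or OCR before they can be searched.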

Without some kind of structure, the 600 partner journalists that the ICIJ worked with on the Pandora Papers would struggle to identify newsworthy nuggets of information contained within the millions of files they had access to. “The first step is to get the data and make it searchable,” says Romera.

The ICIJ tries to make it easier by offering them access to Datashare, but also by directing them to the newsworthy stories in each country at the beginning of the project. The team at the ICIJ developed a ‘country list’ – a list of the number of times countries or people of interest appear in the documents. They’re then identified by country, and partners are contacted to say that there is a list of people of interest connected to the country.
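A ‘country list’ of this kind can be sketched as a simple tally. The example assumes the mentions have already been extracted as (country, person) pairs, which is the hard part in practice; the function name `build_country_list` and the input shape are hypothetical, not the ICIJ's format.

```python
from collections import Counter, defaultdict


def build_country_list(mentions):
    """mentions: list of (country, person) pairs found in the documents."""
    # Count how often each country appears.
    counts = Counter(country for country, _ in mentions)
    # Collect the distinct people of interest linked to each country.
    people = defaultdict(set)
    for country, person in mentions:
        people[country].add(person)
    # Most frequently mentioned countries come first.
    return [
        (country, n, sorted(people[country]))
        for country, n in counts.most_common()
    ]
```

Each row of the result gives a country, its number of mentions and the people linked to it, which is the kind of summary that could be sent to partners in that country.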

One of the ways Datashare manages to pull out those lists of names is through batch searches. The ICIJ has developed a tool that allows people wanting to interrogate the documents to supply a list of names or different queries in CSV format that are cross-checked against the metadata in the documents themselves. “That’s incredibly helpful, because then the information is already structured, and you can export the results in CSV into any spreadsheet software and go through the results,” says Reuter. The ICIJ also uses machine learning to try and classify documents into broad clusters, helping differentiate, for instance, between documents related to the creation of a company, or a personal letter, or a duplicate of other documents.
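The batch search workflow, CSV in and CSV out, might look something like the sketch below. It is not Datashare's implementation: matching is a plain case-insensitive substring test over an in-memory dictionary, and the `query` column name, the `batch_search` function and the document IDs are assumptions.

```python
import csv
import io


def batch_search(queries_csv, documents):
    """Match each query against each document and return the hits as CSV.

    queries_csv: CSV text with a 'query' column (assumed layout).
    documents: dict mapping a document ID to its text or metadata.
    """
    reader = csv.DictReader(io.StringIO(queries_csv))
    queries = [row["query"] for row in reader]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["query", "doc_id"])
    for q in queries:
        for doc_id, text in documents.items():
            # Case-insensitive substring match; a real search engine
            # would use an index rather than scanning every document.
            if q.lower() in text.lower():
                writer.writerow([q, doc_id])
    return out.getvalue()
```

The returned text can be saved as a `.csv` file and opened in any spreadsheet software, which mirrors the export step Reuter describes.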

“Graph databases excel at spotting data relationships at scale,” says Emil Eifrem, CEO of Neo4j, a graph technology company whose products are used by the ICIJ. Instead of breaking up data artificially, graph databases more closely mimic the way humans think about information. “Once that data model is coded in a scalable architecture, a graph database is matchless at mining connections in huge and complex datasets,” Eifrem says.
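To illustrate what “mining connections” means, the sketch below stores people and companies as nodes in a plain Python dictionary and uses a breadth-first search to find the shortest chain linking two of them. It does not use Neo4j or its query language; the functions `build_graph` and `find_connection` and the placeholder names in the usage note are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(edges):
    # Undirected graph: each relationship links both ways.
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def find_connection(graph, start, goal):
    """Return the shortest chain of nodes from start to goal, or None."""
    if start == goal:
        return [start]
    seen = {start}
    pending = deque([[start]])
    while pending:
        path = pending.popleft()
        # Sorted so the result is deterministic when paths tie.
        for neighbour in sorted(graph.get(path[-1], ())):
            if neighbour in seen:
                continue
            if neighbour == goal:
                return path + [neighbour]
            seen.add(neighbour)
            pending.append(path + [neighbour])
    return None
```

Given edges such as ("Person A", "Company X"), ("Company X", "Company Y") and ("Company Y", "Person B"), `find_connection` returns the chain from Person A to Person B through both companies. A graph database answers this kind of question natively and at far larger scale.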

Sorting and interrogating the data was “much harder than the Panama or Paradise Papers,” says Romera. Although the datasets are of a similar size to those two leaks, the individual documents are significantly bigger in page count – around ten times bigger – than the Panama Papers. “The system we used until now to search into the documents was not powerful enough to handle such a massive amount of big documents,” says Romera. As a result, the ICIJ had to improve the configuration of its servers, and the way its search tools operated, to handle these new files. “There were huge 10,000-page PDF files,” says Thomas. “We had to cut those PDF files into pages, gather those pages into logical forms, and then we had to extract the data – like beneficial owners and their nationalities from unstructured data.”
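The step Thomas describes, cutting a huge PDF into pages and regrouping them into logical documents, can be sketched as follows. The example assumes the pages have already been split out and converted to text, and it relies on a caller-supplied `is_first_page` test; that heuristic and the `group_pages` function are hypothetical, not the ICIJ's method.

```python
def group_pages(pages, is_first_page):
    """Regroup a flat list of page texts into logical documents.

    pages: list of page texts, in their original order.
    is_first_page: function returning True if a page starts a new document.
    """
    documents = []
    for page in pages:
        if is_first_page(page) or not documents:
            # Start a new logical document.
            documents.append([page])
        else:
            # Continuation of the current document.
            documents[-1].append(page)
    return documents
```

In practice the difficult part is the `is_first_page` decision, which might rely on letterheads, titles such as a register of directors, or a trained classifier.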

In addition, the Pandora Papers included a broader range of file types and formatting. The machine learning systems the ICIJ had previously used needed to learn to parse and identify these before they could sort them. “It’s now able to read very specific financial documents and very specific PDFs,” says Romera.

The 600 or so partner journalists then interrogated the data by accessing the ICIJ files through a secure authentication platform. Email contact with the ICIJ is encrypted with PGP, and access to the servers requires multi-factor authentication. There are up to 60 servers running, a number that can expand to 80 when indexing files. SSL client certificates were also a must-have for partner journalists. “Sometimes it can be hard for partners to just connect to our servers,” admits Romera. However, once they have access, the media partners are able to perform their own analysis on the data. A data-sharing API allows data scientists working for media partners to mine the documents within the Pandora Papers themselves using their own scripts or machine learning tools.

“We have to be ready for anything all the time,” says Romera. “It can turn you paranoid, because there’s so much at stake here.”

And for good reason: the ICIJ believes it has been subject to at least two attempts to break into the servers hosting the Pandora Papers since they and their partners began approaching politicians and businesspeople named in the documents for their stories in the last week. “As soon as we started to send comment papers, we started to have attacks on the servers,” says Romera. On October 1, the ICIJ website withstood a distributed denial of service (DDoS) attack that saw it bombarded with six million requests a minute, Romera says. Another suspected attack occurred on October 3, when the servers started showing unusual behaviour. This is currently under investigation. “When the server’s thought to be crazy, the priority is to fix it, not to find someone in the system,” says Romera. “We’re investigating to know if we had an intrusion.”

It also reinforces the importance of the ICIJ’s standard operating procedure: partner access to the documents is withdrawn within a few weeks of the first stories breaking, and partners must restate their interest to regain it, so that no bad actors can work their way in through insecure third-party contacts.