We are a group of technology enthusiasts. The project started as an initiative of Data Science Brigades, the data science division of CODELAND, but it has since grown and taken on a life of its own: many of the professionals involved are not from these companies. All the code is open source, and we have hundreds of volunteers from all over the world, including journalists and researchers.
Our team is scattered across different cities and countries. We rely on technology to hold meetings, discuss the project and work remotely, without needing a physical space.
Open source means all the code we write is public, so anyone interested in contributing to the evolution of the project can freely collaborate with their ideas.
We use public data, whether it is data released under the Law on Access to Information or private data made publicly available by companies such as Google, Foursquare and Yelp. In the public sphere, we obtain data from the Chamber of Deputies, the Federal Revenue Service, the Transparency Portal, data.gov.br and others.
The Quota for Exercise of Parliamentary Activity (CEAP, its acronym in Portuguese) is a monthly allowance of up to R$ 45 thousand that each member of the Chamber of Deputies can claim as reimbursement for expenses that are not suited to public bidding, such as a lunch or a taxi fare.
Reports are made directly to the Chamber of Deputies, through a panel of the cases that Rosie found most likely to be illegal. We have created some internal tools that we intend to refine so that we can offer them publicly soon.
Currently we have an open Telegram group. You are very welcome to join it. The official language there is English.
The community around the project goes beyond Brazil: people from other countries are interested in contributing. In the GitHub repository and the Telegram group we chose to use English so that we can include their views in the discussion. In addition, several people from other countries have shown interest in our code since day zero. By keeping the code and all technical documentation in English, we make it easier for our work to be used in other countries.
To talk about Machine Learning we have selected a video that summarizes everything we need to know about it: What is machine learning?
We wrote a post about the difference between Artificial Intelligence, Machine Learning and Deep Learning that may help you.
We use Python because it provides tons of libraries for data analysis and machine learning.
Jarbas is a tool we created to make it easy to visualize the data mined by our robot. It is essential for investigative work. It started as an internal tool, but we will gradually improve its interface so that it becomes a tool of public utility.
Rosie is our robot, programmed to identify illegal uses of public funds, beginning with CEAP. She judges every reimbursement claimed by our congresspeople and tells us the reasons that make each one suspicious.
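To give a rough feel for what "judging a reimbursement and giving reasons" can look like in Python, here is a minimal sketch. It is not Rosie's actual code: the `Reimbursement` fields, the `judge` function, the rules and the thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Reimbursement:
    # Hypothetical fields; the real CEAP dataset has its own schema.
    congressperson: str
    category: str
    amount: float  # in R$


# Each rule pairs a human-readable reason with a check.
# Both the rules and the thresholds are made up for this example.
Rule = Tuple[str, Callable[[Reimbursement], bool]]

RULES: List[Rule] = [
    ("meal price above hypothetical threshold",
     lambda r: r.category == "meal" and r.amount > 500.0),
    ("non-positive amount",
     lambda r: r.amount <= 0),
]


def judge(reimbursement: Reimbursement) -> List[str]:
    """Return the reasons a reimbursement looks suspicious (empty if none)."""
    return [reason for reason, check in RULES if check(reimbursement)]


claims = [
    Reimbursement("Deputy A", "meal", 1200.0),
    Reimbursement("Deputy B", "taxi", 35.0),
]

# Map each congressperson to the list of reasons found for their claim.
report = {c.congressperson: judge(c) for c in claims}
```

Keeping each reason next to its check means every flagged reimbursement comes with an explanation a person can review, rather than a bare yes or no.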
There are three reasons:
1 - We are directly inspired by the Toblerone Affair, a famous case in which a politician in Sweden resigned after being caught with a simple Toblerone on her corporate card invoice. We want to do this: find corruption at small scale, but in large volume.
2 - It sounds like one of the nonsensical operation names typical of the Brazilian Federal Police, and this is very cool.
3 - It literally means love serenade, so this is our love serenade to Brazil.
Sure, this idea is awesome, and it is one of the main reasons we adopted open source.
Have questions? Ask on our Facebook page or find us on Twitter. For technical issues you can drop a line in English at our GitHub or our technical group on Telegram. If none of this works, reach out to Cabral, our communication guru.