Thinking About EU

MEPs' Topics

Discover what MEPs are working on by seeing keywords and categories emerging from their questions.

Last Updated: 2017-01-31

About this classification:

The classification works on the corpus of all parliamentary questions (oral and written) submitted during the VIII parliamentary term. The corpus is projected into a vector space whose dimensions are keywords selected with a technique based on Markov chains.
In this space, every text is represented by its TF-IDF (term frequency–inverse document frequency) vector.
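
As a rough illustration of the TF-IDF representation described above, here is a minimal sketch in plain Python. The keyword list, the sample questions, the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for the example; the Markov-chain keyword selection is not reproduced here.

```python
import math
from collections import Counter


def tfidf_vectors(documents, keywords):
    """Return one TF-IDF vector per document, one dimension per keyword."""
    # Naive whitespace tokenization (an assumption, not the site's method).
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each keyword.
    doc_freq = {
        kw: sum(1 for tokens in tokenized if kw in tokens) for kw in keywords
    }

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vector = []
        for kw in keywords:
            tf = counts[kw] / total
            # Smoothed IDF (an assumption): stays finite for unseen keywords.
            idf = math.log((1 + n_docs) / (1 + doc_freq[kw])) + 1
            vector.append(tf * idf)
        vectors.append(vector)
    return vectors


# Hypothetical keywords and question texts, for illustration only.
keywords = ["energy", "transport", "fisheries"]
questions = [
    "renewable energy targets and energy efficiency",
    "rail transport funding",
    "energy costs for transport operators",
]
vectors = tfidf_vectors(questions, keywords)
```

Each question becomes a vector with one weight per keyword: a keyword that appears often in a question but in few other questions gets a high weight, and a keyword that never appears gets zero.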

On this vector space we trained two different classifiers (an SVM and a random forest). Combining the two classifiers, we reach a precision of 81% on our test set.
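
The text does not say how the two classifiers are combined. One common option, shown here purely as an assumption, is soft voting: average the class probabilities from both models and pick the most likely category. The sketch also shows how precision for a category can be computed on a test set. The probabilities and labels below are made up for illustration.

```python
def soft_vote(probs_a, probs_b):
    """Average two classifiers' class probabilities and return the top class."""
    combined = {label: (probs_a[label] + probs_b[label]) / 2 for label in probs_a}
    return max(combined, key=combined.get)


def precision(predicted, actual, label):
    """Of the items predicted as `label`, the fraction whose true class is `label`."""
    truths = [a for p, a in zip(predicted, actual) if p == label]
    if not truths:
        return 0.0
    return sum(1 for a in truths if a == label) / len(truths)


# Hypothetical per-question probabilities from the two models.
svm_probs = [
    {"energy": 0.7, "transport": 0.3},
    {"energy": 0.4, "transport": 0.6},
    {"energy": 0.55, "transport": 0.45},
]
forest_probs = [
    {"energy": 0.6, "transport": 0.4},
    {"energy": 0.3, "transport": 0.7},
    {"energy": 0.2, "transport": 0.8},
]
actual = ["energy", "transport", "energy"]  # hypothetical true categories

predicted = [soft_vote(a, b) for a, b in zip(svm_probs, forest_probs)]
energy_precision = precision(predicted, actual, "energy")
transport_precision = precision(predicted, actual, "transport")
```

In the third question the two models disagree, and the more confident random forest wins the vote. Averaging lets one model override the other only when its confidence is clearly higher.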

As you may understand, classifying parliamentary texts requires knowledge of the domain, care when combining the classifiers, and high-quality training. Even when all these elements are in place, this semi-automatic classification can hardly be perfect, but it is worth continuously trying to improve it.

Any feedback or help is therefore more than welcome!


Group of the Greens/European Free Alliance
Miljöpartiet de gröna
Sweden
Magic Circle
Industry, Research and Energy (ITRE): Member
Transport and Tourism (TRAN): Substitute




Keywords Extraction:

Not enough questions for keywords extraction

