Funding year 2019 / Scholarship Call #14 / Project ID: 4563 / Project: Automated Identification of Information Disorder in Social Media from Multimodal Data
Since the last blog post in November, I have done a lot of research on the topic of fake news detection. "Fake news detection" is only one part of the huge and broad continuum of misinformation detection ....
Other related terms are, for example, "spam" or "rumor". Another outcome of the literature research was that fake news is not only spread over "classical" social communication channels such as Facebook or Twitter. Another major distribution platform is the well-known platform Reddit. Reddit also differs in the way it displays and organizes content: it uses the concept of subreddits, where different topics are discussed, each moderated by its own administrators. For more details, see: https://nealschaffer.com/subreddit/
Over the past few weeks, I took a deep dive into the state of the art and got a very good insight into what is currently going on in the field of fact-checking research. There is a huge variety of well-written, high-quality papers, but also a lot of crappy ones, mostly with an impressive-sounding title but more or less nothing behind it.
Furthermore, I got a very good insight into the network architectures currently used for fake news detection and into the most common problems in this field.
Since fake news is still a relatively new topic in machine learning, it is difficult to find a suitable fake news dataset for training. It is essential to have a huge and well-verified dataset so that the Artificial Intelligence (AI) can be trained properly. There are many different fake news fact-checking websites that provide an API to download their content, but not for free. Furthermore, their content is mostly not well structured, and the data preprocessing time I would need to spend on it would eat up my tight time budget. But they gave me a very good picture of the quality of these kinds of services.
Some fact-checking websites are:
I recently found some very interesting fake news detection datasets. However, they are not yet available, so I wrote to the authors; hopefully they will reply.
So, after showing you my gains and pains, here is the current state of my diploma thesis:
I am still evaluating some candidates to solve my problem of finding a well-structured and consistent fake news dataset.
I have developed a first method, which needs to be approved by my advisors.
And now?
The next few weeks are critical: the deadline for the diploma thesis (the first attempt at our university) is on 18 May. The written part of my thesis is still very thin, due to the problems mentioned above. Alongside my thesis, I am working at two different startups and doing my work at the university. The time from November until now was very stressful because of a lot of work packages that needed to be done first.
Within the next few weeks, I need to start implementing my method and rewrite the state-of-the-art section of my diploma thesis.
So stay tuned :)
Contributions: The blog image was created by starline - www.freepik.com