Funding year 2019 / Scholarship Call #14 / Project ID: 4563 / Project: Automated Identification of Information Disorder in Social Media from Multimodal Data
Covid-19 has had a significant effect on my master's thesis. On the one hand the crisis stopped our daily lives, and on the other hand everyone expects you to keep working as usual. A status update...
Since the last blog entry in March 2020 a lot has happened, not only to me but to all of humanity: Covid-19. It stopped our daily lives in a dramatic manner and forced us to change our daily work routines. It changed our way of thinking and moved most of us into working from home.
As the head of the students' union of the UAS Sankt Pölten, this is still a very hard time for me. My team and I have invested a lot of time in information campaigns and in managing workflows together with the UAS Services. I am very glad to study and work at the UAS Sankt Pölten because, despite the huge time investment, most of our lectures and exams are now online.
The dataset I am working with consists of around 1 million Reddit posts and comments and a very large amount of image data, which took several days to download.
The drawback here was that a new version of the dataset was released a few weeks after I had downloaded it and started applying my method to it.
So I decided to go into more detail in my work and moved my deadline from May to August.
Enough of the bad news, on to the good news:
Since the dataset has text (titles and comments), images (for around 60% of the posts) and some metadata (authors, scores), I decided to try a multimodal approach using:
- Text data
- Image data
- Social metadata
In the last few weeks I have become familiar with many preprocessing methods. I managed to write a working model that processes text and image data simultaneously, in an end-to-end manner. Currently I am investigating methods for normalizing image data and improving the classification part.
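To give an idea of what normalizing image data can mean, here is a minimal sketch of one common option: per-channel standardization. It is not the method used in the thesis, only an illustration. The function names `channel_stats` and `normalize` and the tiny list-of-pixels image format are assumptions made for this example; in practice this would usually be done with an array library.

```python
from statistics import fmean, pstdev


def channel_stats(images):
    """Compute (mean, std) per RGB channel over a collection of images.

    Illustrative format: each image is a list of (r, g, b) pixels in [0, 255].
    """
    stats = []
    for c in range(3):
        # Scale to [0, 1] first so the statistics do not depend on bit depth.
        values = [pixel[c] / 255.0 for image in images for pixel in image]
        stats.append((fmean(values), pstdev(values)))
    return stats


def normalize(image, stats, eps=1e-8):
    """Scale pixels to [0, 1], then standardize every channel."""
    return [
        tuple(
            (pixel[c] / 255.0 - mean) / (std + eps)  # eps avoids division by zero
            for c, (mean, std) in enumerate(stats)
        )
        for pixel in image
    ]


# Usage: statistics come from the training images and are reused everywhere.
train_images = [[(0, 0, 0), (255, 255, 255)]]
stats = channel_stats(train_images)
normalized = normalize(train_images[0], stats)
```

The statistics are computed once on the training data and then applied unchanged to validation and test images, so that all splits are scaled in the same way.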
The next few weeks are very important for me. I want to include features from the social metadata provided by the authors, tune my network so that it adapts better to my problem, and start systematic experiments.
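One simple way such metadata could enter the model is to turn it into a few numeric features and concatenate them with the text and image features. The sketch below is only an illustration under assumptions: the helper names `metadata_features` and `fuse` are hypothetical, `num_comments` is an invented example field (the dataset description above only mentions authors and scores), and the short vectors stand in for whatever the text and image branches actually produce.

```python
import math


def signed_log(x):
    """Compress large magnitudes while keeping the sign (scores can be negative)."""
    return math.copysign(math.log1p(abs(x)), x)


def metadata_features(score, num_comments):
    """Map raw counts to compressed numeric features.

    `num_comments` is a hypothetical field used for illustration only.
    """
    return [signed_log(score), signed_log(num_comments)]


def fuse(text_vec, image_vec, meta_vec):
    """Combine the modalities by concatenating their feature vectors."""
    return list(text_vec) + list(image_vec) + list(meta_vec)


# Usage with placeholder feature vectors for a single post.
text_vec = [0.2, -0.1, 0.7]
image_vec = [0.5, 0.3]
joint = fuse(text_vec, image_vec, metadata_features(score=120, num_comments=8))
```

The log scaling keeps a handful of very popular posts from dominating the metadata features; the joint vector would then be passed to the classification part.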
Stay tuned and healthy ;)
Image: Designed by starline / Freepik