It is our mandate to support science with machine learning applications in a collaboration-as-a-service fashion. We support matter research with simulation-based inference, anomaly detection, pattern recognition and segmentation as well as image denoising tasks.
What is the data science project’s research question? When conducting workshops or trainings, we typically try to assure the quality of the delivered training by asking learners to fill out surveys. While this is standard procedure, an instructor would love to know what annoyed people the most and what they liked most. Predefined questions cannot address this, because one simply does not know in advance what learners will like or dislike most. Some topics also come up at random: say, on one day the presenter keeps having problems with the projector due to an OS update. To capture these unforeseen but important events, we offer free-text fields in the survey.
The research question hence is: What are the major topics that annoyed learners during our workshop, and what topics did they like most?
What data will be worked on? We have data (from Google Forms) as an .ods spreadsheet in English. It contains 112 replies in total.
What tasks will this project involve?
– identifying common topics in the English-language replies
– ranking the identified topics by order of frequency
– ranking the identified topics by the emotional context (if that is possible)
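The three tasks above can be illustrated with a deliberately small sketch. It is not the project's method: the stop-word list, the positive and negative word lists, the function names and the example replies are all hypothetical, and it treats every remaining word as a candidate "topic". A real analysis would likely swap in a proper NLP library or topic model; the sketch only shows the shape of the two rankings we are after.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny word lists (assumptions for illustration only).
STOP_WORDS = {"the", "a", "an", "was", "were", "is", "and", "but",
              "to", "of", "very", "kept", "i", "it", "in", "for"}
POSITIVE = {"great", "good", "helpful", "clear", "liked", "loved"}
NEGATIVE = {"bad", "slow", "annoying", "confusing", "broken", "failing"}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def rank_topics(replies):
    """Return candidate topics ranked by frequency and by summed sentiment."""
    freq = Counter()
    sentiment = defaultdict(int)
    for reply in replies:
        tokens = tokenize(reply)
        # Reply-level score: positive words minus negative words.
        score = (sum(t in POSITIVE for t in tokens)
                 - sum(t in NEGATIVE for t in tokens))
        # Count each candidate term at most once per reply.
        terms = {t for t in tokens
                 if t not in STOP_WORDS | POSITIVE | NEGATIVE}
        for term in terms:
            freq[term] += 1
            sentiment[term] += score
    by_frequency = freq.most_common()
    # Most negative topics first, most positive last.
    by_sentiment = sorted(sentiment.items(), key=lambda item: item[1])
    return by_frequency, by_sentiment


# Made-up replies, not taken from the survey data.
replies = [
    "The exercises were great and helpful",
    "The projector kept failing, very annoying",
    "Great exercises, but the projector was broken",
]
by_frequency, by_sentiment = rank_topics(replies)
print(by_frequency)
print(by_sentiment)
```

Counting a term once per reply keeps a single long answer from dominating the frequency ranking, which seems reasonable for 112 short replies. Whether a word-list score captures the emotional context well enough is exactly the open question in the third task.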
What makes this project interesting to work on? As we deliver a lot of trainings, a free-text field lets learners directly express their views and opinions on the delivered content. This makes it the most direct way of learning how the workshop was received.
What is the expected outcome? Contribution to software development, potentially a public blog post
What infrastructure, programs and tools will be used? Can they be used remotely? Not knowing the NLP domain too well, I suspect Python, spaCy and other NLP libraries and tools.
What skills are necessary for this project? Data analytics / statistics, Data mining / Machine learning, Visualization
Is the data open source? Not yet.
Interested candidates should be at Bachelor level (3+). Peter Steinbach is looking for 1 visiting scientist, working on the project together with the team.