Ivan Smirnov, PhD

computational social scientist

I am a computational social scientist working at the University of Mannheim. After receiving a doctoral degree from HSE University in Moscow, I founded and led a computational social science research group there. We used big data and machine learning to better understand human behavior and complex social phenomena. Our main focus was on inequality in education and the psychological well-being of students.

Our research was supported by the Russian Science Foundation, presented at IC2S2 and ICWSM, and published in Proceedings of the National Academy of Sciences, EPJ Data Science, and Royal Society Open Science. Our work was also covered by leading Russian and international media including MIT Technology Review and The Times.

To further promote computational social science in Russia, I developed and taught the course Introduction to Computational Social Science, the first of its kind in the country, and co-organized the Summer Institute in Computational Social Science at HSE.

I then moved to Germany, where I am currently working in Markus Strohmaier's group.

Prior to academia, I worked as a web developer and a technical leader at several startups. I also worked on various non-profit projects and was part of the Teach for Russia founding team.

In my free time, I am working on an open online course, Introduction to Computational Social Science (in Russian).

Email: ivan@ismirnov.eu
Curriculum Vitae

Research Highlights

While there are many studies of friendship between students, most of them focus on students from a single educational institution, that is, they study friendship ties within one school or one university. As a result, little is known about social connections between students from different schools. In this paper, I used digital traces to investigate interschool friendship at the scale of an entire city. I analyzed data on 37,000 students from 590 schools and their friendship links on VK and found that students from similarly performing schools tend to become online friends. One might assume that this is a trivial consequence of the geographical segregation of schools. However, by adding data on school locations and apartment prices, I was able to show that segregation in the digital space is in fact much stronger than geographical segregation.

Read the paper

In this paper, I built a model to predict the academic performance of students from their posts on social media. I combined unsupervised learning of word embeddings on a large corpus of social media posts with a supervised model trained on data from a nationally representative sample of young adults. This data set contains the academic performance of students measured by a standardized test as well as information on their public activity on social media. I used a continuous-vocabulary approach that made it possible to achieve high accuracy with a relatively small training data set. It also makes it possible to compute interpretable scores for millions of words that are fun to explore!
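To give a rough feel for the idea, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the model from the paper: the two-dimensional word vectors, the posts, and the test scores are all made up, and the supervised step is a plain linear regression fitted by gradient descent. Each post is represented by the average of its word vectors, the regression is fitted on post vectors, and the fitted model is then applied to individual word vectors, so any word that has an embedding gets a score even if it never appeared in the training posts.

```python
# Toy sketch: every vector, post, and score below is hypothetical illustration data.

def mean_vector(words, embeddings):
    """Average the embeddings of the known words in a post."""
    dim = len(next(iter(embeddings.values())))
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def predict(weights, bias, x):
    """Linear model: bias plus dot product of weights and features."""
    return bias + sum(w * xi for w, xi in zip(weights, x))

def fit_linear(X, y, lr=0.1, epochs=2000):
    """Least-squares linear regression fitted by batch gradient descent."""
    dim, n = len(X[0]), len(X)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        errors = [predict(weights, bias, x) - t for x, t in zip(X, y)]
        for i in range(dim):
            weights[i] -= lr * sum(e * x[i] for e, x in zip(errors, X)) / n
        bias -= lr * sum(errors) / n
    return weights, bias

# Stand-in for embeddings learned without labels on a large corpus.
embeddings = {
    "theorem": [1.0, 0.1],
    "physics": [0.9, 0.2],
    "essay": [0.8, 0.3],
    "party": [0.1, 0.9],
    "lol": [0.0, 1.0],
    "novel": [0.85, 0.25],  # never appears in the training posts
}

# Stand-in for the small labeled sample: (words in a post, test score).
train_posts = [
    (["theorem", "physics"], 600.0),
    (["essay", "physics"], 580.0),
    (["party", "lol"], 420.0),
    (["lol", "party", "essay"], 470.0),
]

X = [mean_vector(words, embeddings) for words, _ in train_posts]
y = [score for _, score in train_posts]
weights, bias = fit_linear(X, y)

# Score every word in the vocabulary, including words unseen in training.
word_scores = {w: predict(weights, bias, v) for w, v in embeddings.items()}
for word, score in sorted(word_scores.items(), key=lambda kv: -kv[1]):
    print(f"{word:10s} {score:7.1f}")
```

Because the model works on vectors rather than on a fixed word list, the hypothetical word "novel" receives a score from its position in the embedding space alone, which is what makes a small labeled sample go a long way in this kind of setup.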

Read the paper

Parents’ preference for sons is a well-known phenomenon that manifests in various forms, from sex-selective abortions to higher investments in sons. In this paper, we used public posts made by 635,665 users on a popular Russian social networking site to investigate public mentions of daughters and sons on social media. We found that both men and women mention sons more often than daughters in their posts. We also found that posts featuring sons receive more “likes” on average. Our results indicate that girls are underrepresented in parents’ digital narratives about their children. Previous studies have shown that female characters are underrepresented in children’s books, textbooks, movies, and on Wikipedia. Gender imbalance in public posts may send yet another message that girls are less important and interesting than boys and deserve less attention, thus presenting an invisible obstacle to gender equality.

Read the paper

Selected Talks