INTRODUCTION
CONTEXT
Things are increasingly turning digital:
o Financial transactions
o Exercising -> fitness trackers & smartwatches
o Music, video,…
o Shopping
o News consumption
o Social networking
o Navigation
Two important observations in the digital age:
1. Vast increase in digital data collection
o “Big data” is collected on what you do online
2. Vast increase in computing power
o “Computers are everywhere” and becoming better
o “Ubiquitous computing”, “internet of things”
o Calculating, sensing, storing and transmitting of information
A lot of data has been available since the 1990s, but computing power was not yet sufficient to analyze it.
Presence of ubiquitous computing and big data
Think of the digital society as a (fully) measurable and amenable context
Everything you do can be/is logged
Exciting? Or Scary?
Exciting:
1. For social researchers: new ways to study social phenomena
2. For data scientists: new opportunities for machine learning/artificial intelligence
development
3. For business: more information on customer behavior (Amazon collects huge amounts of
information about its customers)
4. For policy: more/better? information for policy development
Scary:
1. What about privacy?
2. Algorithmic development based on big data: what about potential bias? (making
wrong recommendations)
3. Experimental manipulation?
a. Part of Chapter 6 of the book
4. Either way, social research is changing and new opportunities/challenges present
themselves
DAWN OF COMPUTATIONAL SOCIAL SCIENCE
Computational: we are going to use computational methods to answer research questions
Combination of social and data science
So…
The merger of two perspectives
1. Social science:
a. Starts from theoretical perspectives (Theory of Planned Behavior (TPB), self-determination theory…)
b. Collect data to answer research question
i. ‘Custom-made’ data -> collected in the right format to answer a specific
research question
ii. Small-scale data collection (small number of respondents in surveys,
interviews…)
iii. Established methods: surveys, interviews, content analysis
2. Data science:
a. Starts from available data -> “ready-made data”
i. Take existing data and extract the information relevant to the research
question
b. Repurposing data for research purposes
c. Computational methods: natural language processing, social network analysis,
predictive modeling, …
Computational social science (CSS) = convergence of two disciplines
Researchers active in CSS require integration of two disciplines in their daily work
o Data scientists in CSS: strong skills in computational/digital methods, but need to
become more familiar with theoretical frameworks on human behavior
o Social scientists in CSS: familiarity with theoretical frameworks, but need additional
training in computational methods to handle large datasets
DIGITAL METHODS IN ACTION
Studying misinformation on Twitter
RQ: “Can misinformation campaigns alter public opinion and endanger the integrity of the
presidential election?”
Method:
o Collecting 171 million tweets about the two US presidential candidates (2016,
Clinton vs. Trump) from 11 million Twitter users (Twitter API), in the 5 months prior to the
election
Imagine doing this manually?
However: size doesn’t always matter!
o Validity? Reliability? (having more data does not automatically make the results more
valid or reliable)
Result:
Each line is a tweet, stored together with information about it (the person who sent the tweet,
their location, the user description, the date they joined, followers, friends, …)
What to do with this data?
Method:
o Extract URLs from the tweet linking to websites outside Twitter
o Classify website content into categories: misinformation sites vs fact-based news
outlets (right to left orientation)
– ‘Fake news & extreme bias’, ‘Right’, ‘Right leaning’, ‘Center’, ‘Left
leaning’, ‘Left’
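The two method steps above (extract URLs, then classify the linked website) could be sketched roughly as follows in Python. This is only an illustration, not the study's actual pipeline: the lookup table DOMAIN_CATEGORIES and its domains are hypothetical placeholders, the function names are made up for this example, and it assumes the URLs in the tweet text are already expanded (links in tweets are usually shortened and would need to be resolved first).

```python
import re
from urllib.parse import urlparse

# Hypothetical lookup table: placeholder domains, NOT the list used in the study.
DOMAIN_CATEGORIES = {
    "fake-example.test": "Fake news & extreme bias",
    "right-example.test": "Right",
    "center-example.test": "Center",
    "left-example.test": "Left",
}

# Simple pattern for http(s) links; real tweet text may need a more careful parser.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(tweet_text):
    """Return all http(s) URLs in the tweet text, without trailing punctuation."""
    return [url.rstrip(".,;:!?)\"'") for url in URL_PATTERN.findall(tweet_text)]


def domain_of(url):
    """Return the lower-cased host name of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def classify_tweet(tweet_text):
    """Return the category of every linked outlet that appears in the lookup table."""
    categories = []
    for url in extract_urls(tweet_text):
        category = DOMAIN_CATEGORIES.get(domain_of(url))
        if category is not None:
            categories.append(category)
    return categories
```

Links to websites that are not in the lookup table are simply ignored here; a real analysis would have to decide how to treat such unclassified links.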
Perform analyses:
o Descriptive statistics
o Social network analysis
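As a rough illustration of what these two analyses can look like on classified tweets, here is a minimal Python sketch. The records are invented toy data, and the choice of a retweet network is an assumption made for this example; the notes do not specify which network was analysed.

```python
from collections import Counter

# Invented toy records: (user, category of the linked outlet, retweeted user or None).
tweets = [
    ("alice", "Center", None),
    ("bob", "Fake news & extreme bias", "alice"),
    ("carol", "Center", "alice"),
    ("dave", "Left", "bob"),
]


def category_shares(records):
    """Descriptive statistic: share of tweets linking to each outlet category."""
    counts = Counter(category for _, category, _ in records)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def retweet_in_degree(records):
    """Network measure: how often each user is retweeted.

    Each retweet is treated as a directed edge from the retweeting user to the
    retweeted user, so this count is the in-degree of the retweeted user.
    """
    return Counter(source for _, _, source in records if source is not None)
```

Users with a high in-degree are the ones whose tweets spread most widely in this toy network, which is one simple way to look for influential accounts.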
Results: