FULL SUMMARY Computational Analysis in Digital Communication

FULL SUMMARY of the course Computational Analysis in Digital Communication for the Master Communication Science on the VU, written in English. The summary includes all lectures, articles, and exam questions. Good luck studying!

  • November 22, 2023
  • 123
  • 2023/2024
Computational Analysis of Digital Communication – FULL SUMMARY

WEEK 1:
Article 1: When Communication Meets Computation: Opportunities, Challenges, and
Pitfalls in Computational Communication Science

The role of computational methods in communication science
The recent acceleration in the promise and use of computational methods for
communication science is primarily fueled by the confluence of different developments:
- A deluge (overflow) of digitally available data, ranging from social media messages
and other digital traces to web archives and newly digitized newspaper and other
historical archives.
- Improved tools to analyze this data, including network analysis methods and
automatic text analysis methods such as supervised text classification, topic
modelling, and syntactic methods.
- The emergence of powerful and cheap processing power, and easy-to-use computing
infrastructure for processing these data, including scientific and commercial cloud
computing, sharing platforms such as GitHub and Dataverse, and crowd coding
platforms such as Amazon MTurk and Crowdflower.

Many of these new data sets contain communication artifacts such as tweets, posts, emails,
and reviews. These new methods are aimed at analyzing the structure and dynamics of
human communication.
These three developments have the potential to give an unprecedented boost to progress in
communication science, provided we can overcome the technical, social, and ethical
challenges presented by these developments.

Big data can be defined by:
- Large and complex data sets
- Consisting of digital traces and other naturally occurring data
- Requiring algorithmic solutions to analyze
- Allowing the study of human communication by applying and testing communication
theory

Computational methods do not replace the existing methodological approaches, but rather
complement them. Computational methods are an expansion and enhancement of the existing
methodological toolbox, while traditional methods can also contribute to the development,
calibration, and validation of computational methods.

Opportunities offered by computational methods
Computational methods allow us to analyze social behavior and communication in ways that
were not possible before and have the potential to radically change our discipline in at
least four ways:
- From self-report to real behavior: Digital traces of online social behavior can
function as a new behavioral lab available for communication researchers. These
data allow us to measure actual behavior in an unobtrusive way rather than
self-reported attitudes or intentions. This can help overcome social desirability
problems, and it does not rely on people’s imperfect estimates of their own desires
and intentions. It becomes methodologically viable to unravel the dynamics underlying
human communication and disentangle the interdependent relationships between
multiple communication processes. It is now possible to trace news consumption in
real time and combine it with survey data to get a more sophisticated measurement of
news consumption and effects.

- From lab experiments to studies of the actual social environment: We can observe
the reaction of persons to stimuli in their actual environment rather than in an
artificial lab setting. In their daily lives, people are exposed to a multitude of stimuli
simultaneously, and their reactions are also conditioned by how a stimulus fits into
their overall perception and daily routine. Researchers are mostly interested
in social behavior, and how people act strongly depends on the actions and
attitudes in their social network. The emergence of social media facilitates the design
and implementation of experimental research. Crowdsourcing platforms lower the
obstacles in recruiting research subjects. However, implementing an experimental
design on social media is not an easy task. Social media companies are very selective
about their collaborators and about research topics, because they fear damage to
their reputation, and setting up such a collaboration can be extremely time-consuming.

- From small-N to large-N: Increasing the scale of measurement can enable the
researchers to study more subtle relations or effects in smaller subpopulations than
possible with the sample sizes normally available in communication research. In
order to leverage the more complex models afforded by larger data sets we need to
change the way we build and test our models. It is useful to consider techniques
developed in machine learning research for model selection and model shrinkage
(penalized regression and cross-validation) which are aimed at out-of-sample
prediction rather than within-sample explanation. These techniques estimate more
parsimonious models and hence alleviate the problems of overfitting that can occur
with large data sets.

- From solitary to collaborative research: Digital data and computational tools make it
easier to share and reuse resources. An increased focus on sharing data and tools
will also force us to be more rigorous in defining operationalizations and
documenting the data and analysis process. Fostering the interdisciplinary
collaboration needed to deal with larger data sets and more complex computational
techniques can change the way we do research. By offering a chance to zoom in from
the macro level down to the individual data points, digital methods can also bring
quantitative and qualitative research closer together, allowing qualitative research to
improve our understanding of data and build theory, while keeping the link to
large-scale quantitative research to test the resulting hypotheses.
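
The model selection techniques mentioned under "From small-N to large-N" (penalized regression and cross-validation) can be illustrated with a minimal Python sketch. It is not taken from the article: the simulated data, the lambda grid, and the helper names (solve, ridge_fit, mse, cv_error) are assumptions made for illustration, and the predictors are assumed to be centered so that no intercept is needed. The sketch fits a ridge (L2-penalized) regression and compares the within-sample error with the out-of-sample error estimated by 5-fold cross-validation.

```python
import random


def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        partial = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - partial) / m[i][i]
    return x


def ridge_fit(X, y, lam):
    """Ridge coefficients without intercept: solve (X'X + lam*I) w = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * target for row, target in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def mse(X, y, w):
    """Mean squared prediction error of coefficients w on the data (X, y)."""
    return sum((sum(a * b for a, b in zip(row, w)) - target) ** 2
               for row, target in zip(X, y)) / len(y)


def cv_error(X, y, lam, k=5):
    """Out-of-sample error: average test-fold MSE over k folds."""
    n = len(y)
    fold_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th case forms the test fold
        X_train = [X[i] for i in range(n) if i not in test_idx]
        y_train = [y[i] for i in range(n) if i not in test_idx]
        X_test = [X[i] for i in sorted(test_idx)]
        y_test = [y[i] for i in sorted(test_idx)]
        w = ridge_fit(X_train, y_train, lam)
        fold_errors.append(mse(X_test, y_test, w))
    return sum(fold_errors) / k


# Simulated data (assumption): 15 predictors, only the first two matter.
rng = random.Random(0)
n_cases, n_predictors = 60, 15
X = [[rng.gauss(0, 1) for _ in range(n_predictors)] for _ in range(n_cases)]
y = [2.0 * row[0] - 1.0 * row[1] + rng.gauss(0, 1) for row in X]

grid = [0.0, 1.0, 10.0, 100.0]  # lam = 0.0 is ordinary least squares
for lam in grid:
    within = mse(X, y, ridge_fit(X, y, lam))
    print(f"lambda={lam:6.1f}  within-sample MSE={within:.3f}  "
          f"cross-validated MSE={cv_error(X, y, lam):.3f}")

best_lam = min(grid, key=lambda lam: cv_error(X, y, lam))
print("lambda with the lowest cross-validated error:", best_lam)
```

The within-sample error can only grow as lambda increases, so judging models on it always favors the unpenalized model. The cross-validated error is the quantity that indicates whether shrinkage helps on unseen cases.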

Challenges and pitfalls in computational methods
As said before, computational methods offer a wide range of possibilities for communication
researchers to explore new research questions and re-examine classical theories from new
perspectives. By observing actual behavior in the social environment, and if possible that of a
whole network of connected people, we get a better measurement of how people actually
react, rather than of how they react in the artificial isolation of the lab setting. Large-scale
exploratory research can help formulate theories and identify interesting cases or subsets
for further study, while at the same time smaller and qualitative studies can help make
sense of the results of big data research. Big data research can help test whether causal
relations found in experimental studies actually hold in the wild on large populations and in
real social settings.

Using these new methods and data sets also creates a new set of challenges and pitfalls:
- How do we keep research datasets accessible?
Although the volume, variety, velocity, and veracity of big data have been repeatedly touted
in both news reports and scholarly writings, it is a hard truth that many big data sets
are proprietary and very difficult for most communication researchers to access. Even
researchers with connections to the data owners generally have access to only a single
platform, which makes it challenging to develop a panoramic understanding of users’
behavior on social media as a holistic ecosystem and increases generalizability problems.
Such privileged access to big data also thwarts the reproducibility of computational research,
which serves as the minimum standard by which scientific claims are judged.

Samples of big data on social media are made accessible to the public either in their original
form or in aggregate format. External parties also create accessible archives of web data.
However, the sampling, aggregation, and other transformations imposed on the released
data are a black box, which makes it very difficult for communication researchers to evaluate
the quality and representativeness of the data and then assess the external validity of
findings derived from such data.

It is important to make sure that the data is open and transparent and to make sure that
research is not reserved to the privileged few who have the network or resources to acquire
data sets. It is vital that we stimulate sharing and publishing data sets. Where possible these
should be fully open and published on platforms such as Dataverse; where privacy or
copyright reasons prevent this, the data should be securely stored but accessible under clear
conditions. A corpus management tool can help alleviate copyright restrictions by allowing
data to be queried and analyzed even if the full text of the data set cannot be published.
By working with funding agencies and data providers such as newspaper publishers and
social media platforms, researchers can make standardized data sets available to everyone.

- Is big data always good data?
Big data is found while survey data is made. Most big data are secondary data that were
intended for other primary uses, most of which have little relevance to academic research.
Most survey data, on the other hand, are made by researchers who design and implement
their studies and questionnaires with specific research purposes in mind. Big data is
found and then tailored or curated by researchers to address their own theoretical or
practical concerns. The gap between the primary purpose intended for big data and the
secondary purpose found for big data poses a threat to the validity of design,
measurement, and analysis in computational communication research.
That data is ‘big’ does not mean that it is representative of a certain population.
Representative survey data show that people do not randomly select into social media
platforms, and very limited information is available for communication researchers to
assess the representativeness of big data retrieved from social media. Specialized actors on
social media (issue experts, professionals, institutional users) are over-represented while
the ordinary public is under-represented in computational research, which leads to a
sampling bias that must be handled carefully. This means that p-values are less meaningful
as a measure of validity: for very large data sets, problems with representativeness,
selection, and measurement bias are a much greater threat to validity than small sample
sizes, so p-values are not a very meaningful indicator of effect.
Size of data is neither a sign of validity nor of invalidity of the conclusions. For big data
studies you should focus more on substantive effect size and validity than on mere
statistical significance, by showing confidence intervals and using simulations or
bootstrapping to show the estimated real effects of the relations found.
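
Why p-values say little in very large data sets, and how an effect size and a bootstrapped confidence interval can be reported instead, can be illustrated with a small simulation. This is a sketch under assumptions that are not in the article: two simulated groups of 20,000 cases each, a true difference of 0.08 standard deviations, a large-sample z-test, and a percentile bootstrap with 200 resamples. All function names are hypothetical.

```python
import math
import random
import statistics


def two_sample_z(a, b):
    """Return (difference in means, two-sided p-value) from a large-sample z-test."""
    diff = statistics.fmean(a) - statistics.fmean(b)
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = diff / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return diff, p_value


def cohens_d(a, b):
    """Standardized mean difference (pooled SD, assuming equal group sizes)."""
    pooled_sd = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_sd


def bootstrap_ci(a, b, rng, n_boot=200, level=0.95):
    """Percentile bootstrap confidence interval for the difference in means."""
    diffs = []
    for _ in range(n_boot):
        resample_a = rng.choices(a, k=len(a))  # sample cases with replacement
        resample_b = rng.choices(b, k=len(b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    lower = diffs[int((1 - level) / 2 * n_boot)]
    upper = diffs[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Simulated data (assumption): a tiny true difference of 0.08 SD between groups.
rng = random.Random(42)
n = 20_000
group_a = [rng.gauss(0.08, 1.0) for _ in range(n)]
group_b = [rng.gauss(0.00, 1.0) for _ in range(n)]

diff, p_value = two_sample_z(group_a, group_b)
d = cohens_d(group_a, group_b)
ci_low, ci_high = bootstrap_ci(group_a, group_b, rng)

print(f"difference in means: {diff:.3f}")
print(f"p-value:             {p_value:.2e}")
print(f"Cohen's d:           {d:.3f}")
print(f"95% bootstrap CI:    [{ci_low:.3f}, {ci_high:.3f}]")
```

With this many cases the p-value is far below any conventional threshold, although the standardized difference is very small. The effect size and the interval around it show how little the groups actually differ. The simulation does not address representativeness or selection bias, which no significance test can repair.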

- Are computational measurement methods valid and reliable?
The unobtrusiveness of social media data makes them less vulnerable to traditional
measurement bias, such as instrument bias, interviewer bias, and social desirability bias.
However, this does not imply that they are free of measurement errors.
Measurement errors can be introduced when text mining techniques are employed to
identify semantic features in user-generated content, whether using dictionaries, machine
learning, or unsupervised techniques and when social and communication networks are
constructed from user-initiated behavior.

Researchers have found that different sentiment dictionaries capture different underlying
phenomena, which highlights the importance of tailoring lexicons to domains to improve
construct validity.
Researchers have also observed a lack of correlation between sentiment dictionaries, and
similarly argue for the need for domain adaptation of dictionaries. Similar to techniques like
factor analysis, unsupervised methods such as topic modelling require the researcher to
interpret and validate the resulting topics, and although quantitative measures of topic
coherence exist, these do not always correlate with human judgments of topic quality.
It should be noted that classical methods of manual content analysis are also no guarantee
of valid or reliable data. Researchers show that using trained manual coders to extract
subjective features such as moral claims can lead to overestimation of reliability, and argue
that untrained (crowd) coders can actually be better at capturing intuitive judgements.
Measurement errors can introduce systematic biases in subsequent multivariate analysis and
threaten the validity of statistical inference. This means that we need to emphasize the
validity of measurements of social media and other digital data.
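
The point that different sentiment dictionaries can disagree on the same texts can be made concrete with a toy Python sketch. Everything in it is made up for illustration and not taken from the article: the two small dictionaries, the five example texts, and the function names score and pearson. Real lexicons contain thousands of entries and need proper tokenization.

```python
import math

# Hypothetical general-purpose dictionary: "sick" counts as negative.
DICT_A = {"good": 1, "great": 1, "love": 1, "bad": -1, "sick": -1, "crisis": -1}
# Hypothetical dictionary adapted to informal posts: "sick" and "fire" are praise.
DICT_B = {"good": 1, "great": 1, "sick": 1, "fire": 1, "bad": -1, "boring": -1}

TEXTS = [
    "that concert was sick great show",
    "bad and boring debate",
    "good news on the crisis",
    "i love this great team",
    "new track is fire",
]


def score(text, dictionary):
    """Sentiment score: summed word polarity divided by the number of words."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(dictionary.get(token, 0) for token in tokens) / len(tokens)


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


scores_a = [score(text, DICT_A) for text in TEXTS]
scores_b = [score(text, DICT_B) for text in TEXTS]

for text, a, b in zip(TEXTS, scores_a, scores_b):
    print(f"{a:+.2f}  {b:+.2f}  {text}")
print(f"correlation between the two dictionaries: {pearson(scores_a, scores_b):.2f}")
```

A correlation well below 1 on the same texts means the two dictionaries do not measure the same construct. Comparing both against manually coded texts from the target domain would be the next validation step.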

- What is responsible and ethical conduct in computational communication
research?
The scientific community and the general public have expressed growing concern about ethical
conduct in computational social science. Such concerns can arise at different steps of
computational communication research. For example, in field experiments on social media, how
can researchers get informed consent from the subjects? When users of a social media platform
accept the terms of service of the platform, can researchers assume that the users have
given explicit or implicit consent to participate in any type of experiment conducted on
the platform? There is no unambiguous answer to these questions, but it is also not possible
to ignore these problems and risk losing the trust of the general public. This calls for a collective

Sold on Stuvia by seller juliaschachtschabel.
