Introduction to political science research (73210025IY)
PoliticalScienceUvA
Data, visualisation, and measurement
Chapter 1: Introduction - Are Statistics Relevant To Real Life?
● Statistics is all about weighing up the chances of something happening or being true
● Statistics are relevant to real life because without real life we would not need statistics
● Statistics are frequently used to highlight the differences between groups of people or
places
How are statistics used?
● Data provide information which governments and organisations use to make policy
decisions (and to evaluate effectiveness of existing policies)
● Statistics about institutions (e.g. schools) are increasingly collected and made
available to the public
● This century has been called the century of “big data”
○ “Big data” = every piece of knowledge that has been digitised and stored on a
computer hard drive, database, or cloud.
■ Vital to millions of companies e.g. Facebook, Google, Twitter etc.
Two main kinds of statistics:
Descriptive statistics: A set of methods used to describe data and their characteristics
● E.g. If you were investigating the number of visitors to a beach in August, you might
draw a graph to see how the number of visitors varied each day, work out how many
people visit on average and calculate the proportion of male/female visitors.
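As a minimal sketch of the beach example above, the Python below summarises one week of visitor counts. The numbers and the names `daily_visitors` and `female_visitors` are invented for illustration and do not come from the course material.

```python
from statistics import mean

# Hypothetical daily visitor counts for one week in August (invented data)
daily_visitors = [120, 95, 143, 110, 160, 210, 198]
# Hypothetical number of those visitors recorded as female each day
female_visitors = [66, 50, 70, 52, 85, 100, 95]

# Average number of visitors per day
average_visitors = mean(daily_visitors)

# Proportion of all visitors over the week who were female or male
total_visitors = sum(daily_visitors)
proportion_female = sum(female_visitors) / total_visitors
proportion_male = 1 - proportion_female

print(f"Average visitors per day: {average_visitors:.1f}")
print(f"Proportion female: {proportion_female:.2f}, male: {proportion_male:.2f}")
```

All three results only describe the data that were collected; none of them says anything about days or beaches that were not observed.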
Inferential statistics: Involves using what we know to make inferences (estimates or
predictions) about what we do not know.
● E.g. If we asked 200 people who they were going to vote for on the day before a local
election we could try and predict which party would win.
○ We would never be able to say for sure who would definitely win the election
but we are able to predict the likely outcome or proportion
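The election example can be sketched in Python as follows. The poll counts are invented, and the interval uses the normal approximation for a proportion, which is one common choice and an assumption here (the notes do not name a method). It also assumes the 200 people were a simple random sample of voters.

```python
import math

# Hypothetical poll: 200 voters asked the day before the election (invented counts)
poll_counts = {"Party A": 90, "Party B": 70, "Party C": 40}
sample_size = sum(poll_counts.values())

def estimate_share(count, n, z=1.96):
    """Return the sample proportion and an approximate 95% interval.

    Uses the normal approximation p +/- z * sqrt(p * (1 - p) / n).
    """
    p = count / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - margin), min(1.0, p + margin)

for party, count in poll_counts.items():
    share, low, high = estimate_share(count, sample_size)
    print(f"{party}: {share:.1%} (roughly {low:.1%} to {high:.1%})")
```

The interval is what makes this inferential rather than descriptive: it expresses how uncertain the prediction about all voters is, given that only 200 were asked.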
The Emergence of statistics
● Using statistics goes back to the earliest city states
○ Babylonians & Egyptians collected numerical data on crops and growing
conditions
● The word statistics is derived from the Latin term for “state” or “government”
● In the Western world it was particularly the birth of industrialism that led to an
interest in social data (18th century)
● The potential of past statistics was shown by Charles Booth (1886)
○ Work on occupation patterns was derived from an analysis of the 1801-1881
UK censuses
○ The first journal of the Royal Statistical Society in 1838 had lots of articles
describing the social conditions of the time
● Following WW2 statistical methods have been enjoying increasing popularity in
social sciences
● There has also been an increase in large scale national datasets which have been used
by academics and policy-makers alike
○ This increase has been accompanied by advances in technology - especially
computers, particularly from the 1980s and 1990s onwards
○ With this came further access to data for larger groups of people and further
possibilities for its use
○ Specific computer programs such as IBM SPSS, SAS, Stata and R have
been developed which make analysis more accessible
Do we really need to know about statistics?
● For data to be useful it must be processed and analysed - this requires statistical skills
Two major reasons why learning about statistics is useful:
1. You are constantly exposed to statistics every day of your life, e.g. through marketing
surveys and polls. By learning about stats you will become a more effective consumer of
statistical information
2. You need to be able to understand and interpret statistics at university or in the
workplace. Even if conducting research is not part of your job/degree you will be
expected to understand and learn from other people’s research based on statistical
analysis.
“There are three kinds of lies: lies, damned lies and statistics”
● The expansion of data has also led to increasing debate about how figures are
constructed
● While numbers are often thought of as hard facts, they are the result of different
decisions about how something should be categorised or counted
● Statistical data can be misinterpreted, distorted or selected to serve a specific purpose
● Not all inaccurate statistics are deliberate - mistakes can occur
● Either way, inaccurate conclusions about findings can be made
Chapter 2: Data and Table Manners
Introduction
● Before interpreting data we need to know more about it
○ E.g. how data are measured or how levels of measurement affect how they are
presented and what you can do with them
● It is important to understand how to present data
● How data is presented can affect the way it is interpreted
○ E.g. something as simple as whether numbers or percentages are used can
create a very different picture
● This chapter will introduce using data and their presentation in the form of tables
● After this chapter you should be able to:
○ Identify different levels of measurement and associated terminology
○ Construct tables in a clear and appropriate manner
○ Work out and use percentages
○ Understand some of the issues with reporting data
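To illustrate how numbers and percentages can create a very different picture, here is a small sketch. The two towns and their counts are invented; `percentage` is a helper defined only for this example.

```python
# Hypothetical counts of people who voted in two towns (invented data)
towns = {
    "Town A": {"voted": 4000, "eligible": 10000},
    "Town B": {"voted": 900, "eligible": 1200},
}

def percentage(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole

for name, counts in towns.items():
    turnout = percentage(counts["voted"], counts["eligible"])
    print(f"{name}: {counts['voted']} voters, {turnout:.0f}% turnout")
```

Reported as raw numbers, Town A looks far more engaged; reported as percentages, Town B does. Which form is appropriate depends on the question being asked.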
Data
● Variables (e.g. “area” and “rent”)
● Each variable has attributes (e.g. “yes” or “no”)
● Case (e.g. each person questioned)
● Observations (e.g. the responses that you get from each questioned person)
○ Data then simply is a collection of observations
● Whatever a dataset may refer to, the cases are the individuals in the sample (e.g.
people, countries, cars etc.)
● Variables are the characteristics which make the cases different from each other (e.g.
opinion about a topic, type of political system)
● There are an infinite number of possible variables that we might be interested in, but
variables themselves can be divided into several types:
Continuous variables: Such variables are measured in numbers, and an observation may take
any value on a continuous scale.
● E.g. distance travelled to work: could take a value of 0 miles for people working from
home, 1.6 miles, 4.8 miles or any other value up to 100 miles
Similarly, any variable measured as a percentage can take a value of 0% to 100% or anything
in between. For continuous variables, the standard rules of arithmetic apply. If you travel 4
miles then that is twice the commute of someone who commutes 2 miles.
Discrete variables or categorical variables: are not measured on a continuous numerical
scale. Examples include:
● Sex: Female/Male etc.
● Religion: Buddhist/Christian/Hindu etc.
● Degree subject studied: Politics/Law/Biology etc.
Such variables have no numeric value (we may assign them a number e.g. Politics=1, Law=2,
Biology=3…), but the actual numbers have no mathematical meaning. For example, Law is
not two times greater than Politics.
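The point about codes having no mathematical meaning can be sketched in Python. The coding follows the example above; the six coded answers are invented for illustration.

```python
from collections import Counter
from statistics import mean

# Numeric codes assigned to degree subjects, as in the example above
subject_codes = {1: "Politics", 2: "Law", 3: "Biology"}

# Hypothetical coded answers from six students (invented data)
coded_answers = [1, 1, 2, 3, 3, 3]

# Python will happily average the codes, but the result is not meaningful:
# it does not correspond to any subject.
meaningless_average = mean(coded_answers)

# Counting how often each category occurs is the appropriate summary.
subject_counts = Counter(subject_codes[code] for code in coded_answers)

print(meaningless_average)
print(subject_counts.most_common())
```

Software cannot tell that the codes are only labels, so it is up to the analyst to choose a summary that suits the type of variable.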
Most variables can easily be classified as continuous or discrete - this is specified in codebooks
● Codebooks are used to describe your variables
● A codebook gives information about each variable such as:
○ Name
○ Type
○ Units of measurement
○ Categories
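A codebook with these four fields could be represented as a simple Python dictionary, as in the sketch below. The survey, the variable keys (`distance`, `subject`) and the three cases are hypothetical; real statistical packages store this information in their own formats.

```python
# A hypothetical codebook for a small commuting survey (all entries invented)
codebook = {
    "distance": {
        "name": "Distance travelled to work",
        "type": "continuous",
        "units": "miles",
        "categories": None,
    },
    "subject": {
        "name": "Degree subject studied",
        "type": "categorical",
        "units": None,
        "categories": {1: "Politics", 2: "Law", 3: "Biology"},
    },
}

# Each case is one respondent; each value is one observation of a variable
cases = [
    {"distance": 0.0, "subject": 1},
    {"distance": 1.6, "subject": 3},
    {"distance": 4.8, "subject": 2},
]

def describe(case):
    """Translate one case's raw values into readable labels using the codebook."""
    labelled = {}
    for variable, value in case.items():
        entry = codebook[variable]
        if entry["type"] == "categorical":
            labelled[entry["name"]] = entry["categories"][value]
        else:
            labelled[entry["name"]] = f"{value} {entry['units']}"
    return labelled

for case in cases:
    print(describe(case))
```

Without the codebook, a value such as `2` in the data cannot be interpreted: it could be a distance in miles or the code for Law.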
It is also possible to split categorical variables further into different levels of measurement:
Nominal, dichotomous, ordinal
● Nominal:
○ Nominal variables have two or more categories, but the categories do not
have an intrinsic order or inherent numerical quality in themselves
○ Nominal variables include things like marital status, ethnicity, religion, degree
subject studied
○ In some cases there may be many attributes to a nominal variable
■ E.g. if we were classifying where people live in the USA by state there
would be 50 attributes/states
● Dichotomous:
○ Dichotomous variables are nominal variables which have only two categories or levels
○ E.g. if we were looking at gender, we would generally categorise somebody as
“male” or “female”
○ E.g. If we asked someone if they had ever smoked giving them the possible
answers “yes” or “no”
● Ordinal:
○ Have two or more categories, like nominal variables, but the categories can
also be ordered or ranked by moving from greater to smaller values (or vice
versa)
○ E.g. An opinion poll might ask how likely you are to vote for a party at the next
election with the possible options: “very likely, likely, not sure, unlikely, very
unlikely”
○ E.g. Respond to the statement “I generally eat healthy.” on a scale from
strongly agree to strongly disagree
○ The distance between the points on the scale is not clear and continuous
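The ordering of ordinal categories can be sketched in Python using the opinion poll options above. The five responses are invented, and taking the middle response is an illustrative choice of summary rather than something prescribed by the notes.

```python
# Ordered response categories from the opinion poll example, least to most likely
likelihood_scale = ["very unlikely", "unlikely", "not sure", "likely", "very likely"]

# Rank of each category: its position in the ordered list
rank = {label: position for position, label in enumerate(likelihood_scale)}

# Hypothetical responses from five people (invented data)
responses = ["likely", "very unlikely", "not sure", "very likely", "likely"]

# Sorting by rank is meaningful for ordinal data
ordered_responses = sorted(responses, key=rank.get)

# The middle response relies only on order, not on distances between categories
median_response = ordered_responses[len(ordered_responses) // 2]

print(ordered_responses)
print(median_response)
```

The ranks 0 to 4 record order only. Subtracting or averaging them would assume equal gaps between the categories, which an ordinal scale does not guarantee.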
Continuous variables can be further categorised as either interval or ratio variables:
● Interval:
○ Measured along a continuum and have a numerical value
○ Distance between ranks/attributes is equal and meaningful
○ Has an arbitrary zero
○ Interval variables are rarely used in social research
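To illustrate what an arbitrary zero implies, the sketch below uses temperature in degrees Celsius, a standard example of an interval scale that is not taken from these notes. The two temperatures are arbitrary illustrative values.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

cool_c, warm_c = 10.0, 20.0
cool_f, warm_f = celsius_to_fahrenheit(cool_c), celsius_to_fahrenheit(warm_c)

# Differences keep their meaning: a gap of 10 degrees C is a gap of 18 degrees F
difference_c = warm_c - cool_c
difference_f = warm_f - cool_f

# Ratios do not: they change with the unit because the zero point is arbitrary
ratio_c = warm_c / cool_c
ratio_f = warm_f / cool_f

print(difference_c, difference_f)
print(ratio_c, ratio_f)
```

Because the ratio depends on the unit chosen, it is not valid to say that 20 degrees is "twice as hot" as 10 degrees. Compare this with the commuting example earlier, where 4 miles really is twice 2 miles because zero miles means no distance at all.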