1. INTRODUCTION AND DEALING WITH DATA
Goal
Getting familiar with collecting, analyzing and modeling financial data
1. Learn the standard econometrics techniques typically applied to financial data.
2. Learn how to collect data and how to deal with data.
3. Learn how to analyze and model the data using the standard econometric software package Gretl.
1.1 INTRODUCTION
Econometrics = “measurement in economics”
General Framework
Financial econometrics = The application of statistical techniques to
problems in finance
≠ Economic econometrics
→ Financial data is less exposed to:
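o Small-sample problems: with only a few data points it is harder to obtain statistically significant results
o Measurement error: the recorded value differs from the true value because of a mistake made when collecting it
o Data revisions: published figures are changed after their first release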
BUT financial data can be noisy ➔ More difficult to separate
trends/patterns from random and uninteresting features
Ex. Stock prices: sometimes there is no fundamental explanation for a certain movement
Things to consider when reading papers in the academic finance literature:
1. Does the paper involve the development of a theoretical model or is it merely a technique looking
for an application, or an exercise in data mining?
2. Is the data of “Good quality”? Is it from a reliable source? Is the size of the sample sufficiently large
for asymptotic theory to be invoked?
3. Have the techniques been validly applied? Have diagnostic tests for violations been conducted for
any assumptions made in the estimation of the model?
4. Have the results been interpreted sensibly? Is the strength of the results exaggerated? Do the
results actually address the questions posed by the authors?
5. Are the conclusions drawn appropriate given the results, or has the importance of the results of the
paper been overstated?
1.1.1 FUNCTIONS
= A mapping or relationship between an input or set of inputs and an output
y, the output, is a function f(x) of the input x, written y = f(x)
→ y could be a linear function of x, where the relationship can be drawn as a straight line
→ y could be a non-linear function of x, where the relationship can be drawn as a curve
Linear equation:
𝑦 = 𝑎 + 𝑏𝑥
y and x are called the variables, and a and b are the parameters: a is the intercept and b is the slope or gradient.
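To make the roles of the intercept and the slope concrete, here is a minimal sketch. The function name `linear` and the parameter values are assumptions chosen only for illustration.

```python
def linear(x, a, b):
    """Evaluate the linear function y = a + b*x."""
    return a + b * x


# Hypothetical parameter values: intercept a = 2.0, slope b = 0.5.
a, b = 2.0, 0.5

# y at x = 0 equals the intercept; each one-unit step in x adds the slope b.
ys = [linear(x, a, b) for x in (0, 1, 2)]
print(ys)
```

At x = 0 the output equals a, and the difference between consecutive values of y equals b.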
1.2 TYPES OF FINANCIAL DATA
1. Time-series data (y_t for t = 1, …, T) (ex. stock price every day for several years)
2. Cross-sectional data (y_i for i = 1, …, N) (ex. data on the stock price of N companies)
3. Panel data (y_it for i = 1, …, N and t = 1, …, T) (ex. stock price for N firms for T days)
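The three data types can be seen as different slices of one data set. The sketch below uses made-up prices for two hypothetical firms ("A" and "B") over three days. The dictionary layout and the helper names `time_series` and `cross_section` are assumptions for illustration, not a standard API.

```python
# Hypothetical closing prices for N = 2 firms over T = 3 days,
# keyed by (firm i, day t). This is a small panel data set.
panel = {
    ("A", 1): 10.0, ("A", 2): 10.5, ("A", 3): 10.2,
    ("B", 1): 20.0, ("B", 2): 19.8, ("B", 3): 20.4,
}


def time_series(panel, firm):
    """All observations of one firm, ordered by day (fix i, vary t)."""
    return [value for (i, t), value in sorted(panel.items()) if i == firm]


def cross_section(panel, day):
    """All firms observed on one day (fix t, vary i)."""
    return {i: value for (i, t), value in sorted(panel.items()) if t == day}


series_a = time_series(panel, "A")
section_day2 = cross_section(panel, 2)
print(series_a, section_day2)
```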
Aggregation
Ex. : Individual house price vs house price index
→ Using aggregated data: “You see the big picture, but lose a lot of detail”
1.2.1 TIME-SERIES DATA AND FREQUENCY
Series                           Frequency
GDP or unemployment              Monthly or quarterly
Government budget deficit        Annually
Money supply                     Weekly
Value of a stock market index    As transactions occur
! All data used in one model should have the same frequency of observation !
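When two series are observed at different frequencies, the higher-frequency series is usually converted to the lower frequency before modeling. The sketch below averages hypothetical monthly values into quarterly ones. Averaging is an assumption that suits rates. For other series, the end-of-period value or the sum may be more appropriate.

```python
from statistics import mean

# Hypothetical monthly observations, keyed by (year, month).
monthly_rate = {
    (2023, 1): 5.0, (2023, 2): 5.2, (2023, 3): 5.4,
    (2023, 4): 5.1, (2023, 5): 4.9, (2023, 6): 5.0,
}


def to_quarterly(monthly):
    """Average the monthly values that fall in each (year, quarter)."""
    buckets = {}
    for (year, month), value in monthly.items():
        quarter = (month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        buckets.setdefault((year, quarter), []).append(value)
    return {key: mean(values) for key, values in sorted(buckets.items())}


quarterly_rate = to_quarterly(monthly_rate)
print(quarterly_rate)
```

The resulting quarterly series can then be combined with other quarterly data, such as GDP, in one model.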
1.2.2 CROSS-SECTIONAL DATA
= Data on 1 or more variables collected at a single point in time
Examples:
- A poll of usage of internet stockbroking services
- A cross-section of stock returns on the New York Stock Exchange (NYSE)
- A sample of bond credit ratings for UK banks
1.2.3 TIME-SERIES VS CROSS-SECTIONAL DATA
Examples of problems that could be tackled using a time-series regression:
- How the value of a country’s stock index has varied with that country’s macroeconomic fundamentals.
- How the value of a company’s stock price has varied when it announced the value of its dividend payment.
- The effect on a country’s currency of an increase in its interest rate.
Examples of problems that could be tackled using a cross-sectional regression:
- The relationship between company size and the return to investing in its shares.
- The relationship between a country’s GDP level and the probability that the government will default on its sovereign debt.
Pooled data VS Panel data
Pooled data stacks all observations of a panel and treats them as one larger cross-sectional sample
1.2.4 QUALITATIVE AND QUANTITATIVE DATA
Quantitative variable = numerical
Ex. Share price is $25
Qualitative variable = not numerical
Ex. A survey asks companies whether their investments are financed through debt (as opposed to equity or retained earnings)
Dummy variable = 0 or 1
→ Used for turning qualitative data into quantitative.
Ex. Yes = 1, No = 0
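A minimal sketch of this encoding, assuming a hypothetical list of survey answers to the debt-financing question:

```python
# Hypothetical answers to "Are investments financed through debt?"
answers = ["Yes", "No", "Yes", "Yes", "No"]

# Dummy variable: 1 if the answer is "Yes", 0 otherwise.
debt_dummy = [1 if answer == "Yes" else 0 for answer in answers]

# The mean of a 0/1 dummy is the share of "Yes" answers in the sample.
share_debt = sum(debt_dummy) / len(debt_dummy)
print(debt_dummy, share_debt)
```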
1.3 OBTAINING DATA
Primary Data VS Secondary Data
Primary data:
→ Data you collect and process yourself
→ Takes some time to collect
→ How to obtain? Surveys, questionnaires, interviews, …
Secondary data:
→ Data collected and processed by somebody else
→ Less time-consuming, but could be expensive
→ Known as databases
→ How to obtain? Paid or free sources
1.3.1 PAID DATA SOURCES
→ Many web sources, often paid
→ University libraries often have access to databases
→ The Faculty of Economics and Business Administration has access to the following databases:
o Refinitiv Workspace/Datastream
o Orbis Europe
o Orbis financial institutions
o Bel-first
o Zephyr
1.3.2 FREE DATA SOURCES
- US economic and financial data: Federal Reserve Bank of St. Louis
- Financial data worldwide: Yahoo Finance
- Belgian data: NBB.stat (online database of the National Bank of Belgium with extensive macroeconomic statistics), NBB Balanscentrale (provides annual accounts of nearly all legal entities active in Belgium)
- Other: ECB, IMF, OECD
- Many academics also make the data sets they have used available on their websites
1.4 DEALING WITH DATA
1.4.1 DATA TRANSFORMATIONS
Original data is often transformed
Ex: If X = company earnings & W = number of shares, then you can create a new variable: Y = X/W which is earnings per share.
When working with asset prices (e.g. share prices), the following transformations are common:
→ Simple returns: $R_t = \frac{p_t - p_{t-1}}{p_{t-1}} \times 100\%$
→ Continuously compounded returns: $r_t = 100\% \times \ln\left(\frac{p_t}{p_{t-1}}\right)$
Advantage: returns are unit-free (prices are not) → They are expressed in percentages rather than in a currency unit, so returns on different assets can be compared easily.
NOTE: these formulas ignore dividends
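Both transformations can be sketched in a few lines. The price series is hypothetical, the function names are assumptions, and dividends are ignored, as in the note above.

```python
import math

# Hypothetical closing prices p_0, p_1, p_2.
prices = [100.0, 102.0, 99.96]


def simple_returns(p):
    """R_t = (p_t - p_{t-1}) / p_{t-1} * 100, in percent."""
    return [(p[t] - p[t - 1]) / p[t - 1] * 100 for t in range(1, len(p))]


def log_returns(p):
    """r_t = 100 * ln(p_t / p_{t-1}), in percent."""
    return [100 * math.log(p[t] / p[t - 1]) for t in range(1, len(p))]


simple_r = simple_returns(prices)
log_r = log_returns(prices)
print(simple_r, log_r)
```

For small price changes the two measures are close. In general the continuously compounded return is never larger than the simple return over the same period.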
Why log returns?
→ A continuously compounded return already includes the reinvestment of the returns of the previous days. Because of this, we can simply add the continuously compounded returns of several days to get the total return over those days.
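The additivity can be checked numerically. In this sketch the prices are hypothetical and returns are kept as fractions rather than percentages. The daily log returns sum to the log return over the whole period, while the daily simple returns do not sum to the simple return over the whole period.

```python
import math

# Hypothetical prices: up 10% on day 1, down 10% on day 2.
prices = [100.0, 110.0, 99.0]

daily_log = [math.log(prices[t] / prices[t - 1]) for t in range(1, len(prices))]
daily_simple = [prices[t] / prices[t - 1] - 1 for t in range(1, len(prices))]

# Returns over the whole period, computed directly from first and last price.
total_log = math.log(prices[-1] / prices[0])
total_simple = prices[-1] / prices[0] - 1

# Adding log returns reproduces the total; adding simple returns does not.
print(sum(daily_log), total_log)
print(sum(daily_simple), total_simple)
```

This works because ln(p_2/p_1) + ln(p_1/p_0) = ln(p_2/p_0): the intermediate prices cancel.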
→ Disadvantage: The simple return on a portfolio of assets is a weighted average of the simple returns on the individual assets: $R_{pt} = \sum_{i=1}^{N} w_i R_{it}$. This does not work for continuously compounded returns.
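A small numerical check of this point, assuming a hypothetical two-asset portfolio with weights 0.6 and 0.4 and returns expressed as fractions:

```python
import math

# Hypothetical portfolio weights (sum to 1) and one-period simple returns.
weights = [0.6, 0.4]
simple = [0.10, -0.05]

# Value at the end of the period of 1 unit invested according to the weights.
end_value = sum(w * (1 + r) for w, r in zip(weights, simple))

# Simple returns: the portfolio return equals the weighted average.
portfolio_simple = end_value - 1
weighted_simple = sum(w * r for w, r in zip(weights, simple))

# Log returns: the weighted average does NOT equal the portfolio log return.
asset_log = [math.log(1 + r) for r in simple]
portfolio_log = math.log(end_value)
weighted_log = sum(w * r for w, r in zip(weights, asset_log))

print(portfolio_simple, weighted_simple)
print(portfolio_log, weighted_log)
```

The gap arises because the logarithm of a weighted sum is not the weighted sum of the logarithms.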
Index Numbers
→ Stock market index: an index that measures the price level of the stock market as a whole. Ex. the S&P500 stock market index.
→ Index numbers are usually normalized so that the value in a base year is 100
→ Macro series:
o Consumer price index (CPI) VS Inflation
o Gross domestic product (GDP) VS GDP growth
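The two ideas above, normalizing an index to a base year and moving from a level (CPI, GDP) to a growth rate (inflation, GDP growth), can be sketched as follows. The CPI values are made up and the function names are assumptions.

```python
# Hypothetical annual CPI levels.
cpi = {2020: 250.0, 2021: 260.0, 2022: 286.0}


def rebase(series, base_year):
    """Rescale a series so that its value in base_year equals 100."""
    base = series[base_year]
    return {year: value / base * 100 for year, value in series.items()}


def pct_change(series):
    """Year-on-year percentage change, e.g. inflation from a CPI series."""
    years = sorted(series)
    return {
        year: (series[year] / series[prev] - 1) * 100
        for prev, year in zip(years, years[1:])
    }


cpi_index = rebase(cpi, 2020)
inflation = pct_change(cpi)
print(cpi_index, inflation)
```

Rebasing changes the level of the index but not its growth rates, so inflation computed from `cpi` and from `cpi_index` is the same.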