Summary of the slides and lectures of FSAN by Kris Boudt. In the absence of a course text, I wrote this one myself. Everything is in it, including the 'R Studio scripts' as seen in the slides.
FSAN 2020-2021
PART 1: TECHNIQUES TO ANALYZE DATA
1. INTRODUCTION TO THE COURSE
1.1. What is FSAN
1.2. FSAN due to changes
Changes in customer behavior
Changes in technology
Changes in profitability
Changes in competition
1.3. Hurdles in transformation
Costs
People
Technology
Vision
1.4. Belgian banks
Phygital approach
Data-driven
Digital-first
Data-driven & AI
1.5. Transform business problem into a data solution
Business problem
Data problem
Data solution
Business solution
1.6. Introduction to R Studio
2. FROM DATA TO INSIGHT USING FUNCTIONS
2.1. Today’s challenge
Analytics that transform the data into actionable insights (data science pipeline)
Engineering
Preparation
Analytics
Functions
R packages
2.2. Example: from clickstream data to marketing decisions
Server log
Transformations from clicks to time spent
2.3. Example: from prices to trend following investment decisions
Central paradigm of finance
Risk ON/OFF decision
Trend following by Mebane Faber
2.4. Example: from prices to return volatility
Returns & volatility
Returns
Return volatility
2.5. Optimization functions
Optimizations
Maximizations
Minimizations
Equivalence
Choice of solver
3. UNSUPERVISED & SUPERVISED LEARNING
3.1. Machine learning
Examples of machine learning
Machine learning & prediction
Overfitting
Solution: split the data
What is a good model?
Which features to use?
3.2. Supervised learning
Definitions
Garbage in, garbage out
Split in training set & test set
Choosing the model
Training the model
Inspecting the coefficients
Prediction accuracy
Cross-validation
3.3. Unsupervised learning
Definitions
K-means clustering
1 feature
2 features
Application to detecting macroeconomic regimes
4. ANALYZING HIGH-DIMENSIONAL DATA
4.1. Abundance of data
Data today
Analyze data
4.2. Screening the investment universe
Who is outperforming?
Conclusions
Estimates: differences because of luck
How many unique comparisons can we do?
Focus on testing underperformance compared to the best one
Compare to naïve benchmark portfolio
4.3. Regression analysis with many predictors – two step approach using PCA
Recall the linear model
Illustration on house price predictions
Solutions to reduce number of features
Transformation
Variable selection
Steps in principal component analysis (PCA)
4.4. Regression analysis with many predictors – regression analysis with feature selection
Feature selection
1st approach: k-variance regression
2nd approach: feature selection across models with different size
5. DATA CLEANING
5.1. Data cleaning
Importance of data cleaning
Data cleaning tasks
5.2. Duplicated data
Causes
Data entry & human errors
Join or merge errors
Bugs & design errors
Functions for duplicates in R
5.3. Missing data
What is it?
How to handle?
Remove (naïve approach)
Replace (imputation)
Functions in R to handle missing data
Visualizing
Naïve approach
Imputation
5.4. Outliers
Outliers
Outlier detection
Univariate model
Multivariate model
Multivariate regression model
1st approach: univariate approach to outlier detection
3 sigma rule
Outlier masking
Robust estimators
2nd approach: multivariate mean/covariance approach to outlier detection
Mahalanobis distance
3rd approach: multivariate regression approach to outlier detection
Vertical outliers
Bad leverage points
Good leverage points
Take home message
6. ANALYZING TEXTUAL DATA
6.1. Textual data
Past VS present
Journey to a machine-readable format
Digitization & digitalization
6.2. Use cases of textual data analysis
Case 1 – chatbots
Case 2 – reputation monitoring
Case 3 – investing based on media sentiment
Case 4 – early warnings in credit risk
Case 5 – monitoring the macroeconomy
6.3. Content topic analysis
Learning outcomes
Vocabulary
Corpus
Document
Tokens
Word sequences
→ Unigram
→ Bigram
→ Trigram
Tokenization
Analysis
Removing stop words after tokenization of text
Document feature matrix
How many tokens are there?
Dimension reduction
Illustration
Word cloud
Heterogeneity
Homogeneity
Example of topic analysis: ECB
6.4. Sentiment analysis
What is sentiment?
American Association of Individual Investors (AAII) sentiment survey
Issues with survey data
Estimated sentiment differs from true sentiment
Release lag
No ‘travel back in time’ possibility
Solutions
Realtime analysis of text
Lexicon approach to textual sentiment analysis
Example
Alternative calculations
Aggregation
PART 2: CASE STUDIES OF FSAN
7. ROBO-ADVISORY IN PORTFOLIO MANAGEMENT
7.1. Steps in personal financial advice that are automated in robo-advisory
Robo-advisory
WealthTech
FinTech
Pipeline of personal finance advice
Gather customer data
Gather financial data
Integrated asset allocation process
Choose the securities
Ensure the follow-up
Automation of step 1 – gather customer data
Automation of steps 2-4 – profile-portfolio matching
Automation of step 5 – portfolio rebalancing
7.2. Recent trends
Growing popularity of ETFs
Bottom-up approach
Growing popularity of robo-advisors
7.3. Why do robo-advisors exist?
Reason 1 – behavioral finance
Reason 2 – wealth distribution & market segmentation
Reason 3 – availability of technology
Reason 4 – advances in behavioral psychology & investing
7.4. Illustration of portfolio optimization with ETFs
Recap
Modern portfolio theory of Harry Markowitz
7.5. Conclusion
Conclusion about investment algorithm
Comparison of robo-advisors
Conclusion
8. DIGITAL TRANSFORMATION IN LOAN AND INSURANCE DECISIONS
8.1. Recall
Motives for digitalization in financial services
8.2. Digitalization and the bank-insurance business model
Bank-insurance is dominant
Digitalization strategy
8.3. Data-driven decisions in loan decisions
General problem
Difficulties
→ PD: probability of default
→ EAD: exposure at default
→ LGD: loss given default
→ EL: expected loss
Steps to take
→ Collect the data
→ Model specification
→ Train & evaluate model
Analytics to predict probability of default
8.4. Data-driven decision rules in insurance
General problem
Analytics to estimate the fair premium for car insurance
8.5. Conclusion
Bank-insurance ecosystem is changing
9. ELECTRONIC PAYMENTS, BLOCKCHAIN, DIGITAL CURRENCIES
9.1. Introduction
Shift in payment execution
Centralized ledger
Decentralized ledger (distributed)
Risk of payment fraud
Data solution
9.2. Fraud detection methods in payments
Fraud detection
Fraud detection algorithms
Authentication method
Something you own
Something you know
Something you are
Recency variable in fraud detection
Fraud detection based on outliers
Best fraud detection technique
Confusion matrix
Implementation of the fraud detecting rules
What is now the best system?
Naïve example
9.3. Secure authentication methods
Online security
Solutions
Illustration of hashing
Consequence of hashing
Hashing & encryption
Public key
Private key
Creating keys in R
Illustration
We need a ledger to avoid double-spending
Add data to the blockchain
Mining
9.4. Bitcoin value analysis
Bitcoins
Value of BTC
Return analysis
Implications for investors seeking a stable store of value
9.5. Implications for (central) banks
Implication for banks
Losses
→ Direct revenue loss
→ Indirect revenue loss
Solutions
→ Broaden distribution
→ Platformification
Implications for central banks
Conclusion
PART 3: LIMITS TO DATA-DRIVEN DECISION-MAKING
10. LIMITS TO DATA-DRIVEN DECISION-MAKING
10.1. Data-driven decision-making
In a digital society, many decisions are automated
Limits to data-driven decision-making in finance
Customers
Profitability
Ethics
Accuracy
Resiliency of the system
10.2. Digital literacy and financial literacy
Robot-recommended decisions lead to increased importance of individual decision making
Attention for this in the media
10.3. Digitalization improves profitability? It depends
Economies of scale
Consequences
The winner takes it all
10.4. Data-driven decision making is not as objective as it may seem
Accurateness is smarter
Illusion of objectivity
Issue of opaqueness
Issue of non-ethical decisions
Data bias & discrimination
Solution: avoid it
10.5. Specific cases of model risk
Case 1: data-driven decision-making may fail in a changing world
Case 2: model fails when scaling
10.6. Market impact: increased herding
Machine learning programs are often constructed on similar lines
Stop losses
Good outcomes
Bad outcomes
Ugly outcomes
Feedback loops
10.7. Concerns from the financial institutions' regulator
Artificial intelligence is reshaping finance
Diversification
Regulation
Dutch central bank imposes SAFEST principles
Soundness
Accountability
Fairness
Ethics
Skills
Transparency
10.8. Conclusion
FSAN – Kris Boudt
PART 1: TECHNIQUES TO ANALYZE DATA
LECTURE 1 – INTRODUCTION TO THE COURSE
WHAT IS FSAN?
Analytics: the systematic analysis of data with the business objective of growing the business, improving decisions, optimizing costs or managing risks.
Financial services: the sector that comprises the following activities:
Banking (handling deposits and money lending)
Insurance (pay a premium to be protected against financial losses due to random events)
Payment services
Wealth & asset management (investment advisory)
Consultancy
WHY FSAN? DUE TO CHANGES!
Changes in consumer behavior
More demanding in terms of user experience
Simple purchasing process
Quick response
Required personalization
Required low costs
Embraced digitalization
High trust in information technology (IT) firms
Interact with an increasing number of digital devices (PC, laptop, smartphone, tablet, smartwatch…)
Accept that user data is used for corporate purposes
Accept to interact with robots (chatbots, automated investment advice…)
Changes in technology
More communications on digital devices
More data is stored and processed
More decisions can be data-driven (evidence based) & even automated
Changes in profitability for banks
The profitability for banks is under pressure
ROE (return on equity) is declining year after year due to overcapacity & low margins
Solutions are mergers & digitalization
Aiming for state-of-the-art technology
Stay relevant (grow the business, what do consumers really want?)
Be efficient (reduce operating costs)
Success in digitalization (more revenues & lower costs, and therefore more profits)
Changes in competition
While traditional banks operate across payments, lending, deposits…, the new entrants in the market have each specialized in one specific service (Apple Pay, Google Pay, Rabobank…)
[Figure: the players competing for investors and customers]
Incumbent institutions: KBC, BNP Paribas Fortis, Deutsche Bank...
Other fintech firms: Armor, Transpay, Wirecard, Kickpay, C2FO...
New digital-based institutions: N26, Ion Bank, Hello Bank, Bunq...
Technology providers & ICT companies (incl. BigTech firms): Amazon, Google, Apple, Facebook, Microsoft...