Case Studies and Trends in Data Science
Lecture 1 – James Hinns (UA)
1. Explain the problem of mode collapse within GANs, and using an example, describe how the
architecture can allow this to happen. Discuss how the CycleGAN architecture is more resistant to
such problems.
2. In the context of CNNs (Convolutional Neural Networks), describe the following concepts and
how they interact with one another: Kernel, Feature Map, Activation Function and (Max) Pooling.
Referring to these concepts, discuss why CNNs are much more prevalent than MLPs for most vision
(image-based) machine learning tasks.
3. Identify and briefly explain three reasons why deep learning techniques are much more
prevalent now than when they were conceptualised. (2 lines answer)
Lecture 2 – Véronique Van Vlasselaer (Customs fraud detection)
4. Draw an analytical decision process for fraud detection at customs. Incorporate the following steps:
data enriching, whitelist/allowed-list, blacklist, business knowledge, taking action (intercepting/
inspecting a package). For each of the processes, mention if it should be done in real time, or can be
done at a different time (also mention why you would or wouldn’t do this in real time).
5. Should an AI take over the job of human customs inspection? Talk about ethical implications, as
well as technical drawbacks. What is the most fruitful reconciliation between the two?
6. What are some of the techniques used to monitor the model performance?
7. Fill in the following table: which step(s) can be done in batch, and which need to be done in
real time?
Lecture 3 – Tim Waegeman (Robovision)
8. Describe 3 use cases of deep learning in retail & agriculture, as described by Tim Waegeman.
9. What are the data-related challenges and solutions for these challenges in the Robovision smart
scale case?
10. Explain the need for dynamic AI vision.
11. What are 5 cases where Robovision AI solutions can be used, given by Tim Waegeman?
Lecture 4 – Galit Schmueli
12. Explain the business advantage of behavioral manipulation. Why would a company do this?
Lecture 5 – Walter Daelemans (UA)
13. What are the potential tasks of computer-generated transitions between speech, text and
images? Also list at least one application (real-life use case) for each transition.
14. What are the main criticisms of Large Language Models such as GPT, as discussed by Walter
Daelemans?
15. What is the main driver of the popularity of deep learning in NLP?
16. What is an autoregressive Language Model? What is an example of a known autoregressive
language model?
17. What is emergence in the context of training large language models?
Lecture 6 – Kris Laukens (UA)
18. Explain the difference between an experimental and computational approach to learn
protein interactions.
19. Why is it beneficial to use AI in order to predict protein interactions? Which problem does
this solve? What are the current challenges with this? How was this done previously?
20. Discuss how bioinformatics can improve vaccine development and discuss some ethical
implications.
21. Explain the following figure:
22. What is sequencing?
23. Why is sequencing so much puzzle work?
24. What are some biases in health AI and how can that be solved?
25. What is bioinformatics?
26. Do you think human doctors will replace algorithms in the future?
Lecture 7 – Vinayak Javaly (CUNY)
27. How was a Large Language Model used in one of the case studies by Vinayak Javaly? (small)
28. What are some useful skills a data scientist has to have, according to Vinayak Javaly? (small)
Lecture 8 – Annelies De Corte (KPMG)
29. What are the six discussed critical success factors in becoming a more data-driven
organisation, according to Annelies De Corte? For at least three of them, provide a business example.
30. Explain the difference between data usage and data management.
31. What was the issue with the case from Vlaamse Waterwegen as explained by Annelies?
32. What is the difference between data and IT, as explained by Annelies De Corte?
Lecture 9 – Agata Bak-Geerinck (Telenet)
33. How does Telenet use data to make predictions about football supporters?
34. Discuss the advanced advertising use case with data science at Telenet. What data is used,
which methods?
35. What kind of data does Telenet have about their customers to make predictions on?
36. How did Telenet check the accuracy of their football team prediction model?
Lecture 10 – Steven Latré (UA)
37. Discuss the safety use cases for the use of AI for cycling, as discussed by Steven Latré. Provide
details on the data used, what the output is, and how it helps safety for the UCI.
38. Explain how AI powers the “Weekly lactate test”, as discussed by Steven Latré.
39. What is the use case and data used for AI in hockey? (small)
Lecture 11 – Kevin Mets (UA)
40. What is Reinforcement learning, discuss the components of a MDP, and apply to the case of
Deep Q-Networks?
41. Discuss the value-based, policy-based and actor-critic methods with their advantages and
disadvantages, and provide an example for each category.
42. How does Reinforcement Learning differ from Supervised and Unsupervised Learning? (small)
43. What is a delayed reward?
44. Why are simulators often used in Reinforcement Learning?
Lecture 1 – James Hinns (UA)
1. Explain the problem of mode collapse within GANs, and using an example, describe how the
architecture can allow this to happen. Discuss how the CycleGAN architecture is more
resistant to such problems.
A GAN (Generative Adversarial Network) is trained without manual labels (the real/fake labels come for
free) and uses two neural networks, a Generator and a Discriminator. The Generator creates fake samples
and the Discriminator classifies samples as real or fake. Both are trained together until the Generator is
good enough to fool the Discriminator, i.e. until the Discriminator accepts generated samples as real.
GANs can generate realistic data, but a common problem is mode collapse, where the Generator
produces repetitive outputs. This happens when the Generator fails to capture the full diversity of the
data distribution and collapses onto a single mode or a small set of samples. Once it has found an
output that fools the Discriminator, the Generator keeps producing that same output.
An example is generating digits: instead of feeding the Discriminator a variety of different digits, the
Generator only produces the digit 6 and still manages to fool the Discriminator. The architecture allows
this because the Generator is only rewarded for fooling the Discriminator, not for diversity, so it never
learns the full range of possibilities.
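The digit example can be made concrete with a toy sketch. Everything here is hypothetical and purely illustrative: `toy_discriminator` is a hand-written stand-in that judges one sample at a time, and the two "generators" are plain functions rather than trained networks. The point is only that a per-sample judge gives a collapsed generator the same perfect score as a diverse one.

```python
import random

DIGITS = list(range(10))  # the ten "modes" of the real data

def diverse_generator(rng):
    """Hypothetical healthy generator: covers all digits."""
    return rng.choice(DIGITS)

def collapsed_generator(rng):
    """Hypothetical collapsed generator: always outputs the digit 6."""
    return 6

def toy_discriminator(sample):
    """Stand-in discriminator: scores ONE sample at a time.
    Returns 1.0 if the sample looks like a real digit, else 0.0."""
    return 1.0 if sample in DIGITS else 0.0

def evaluate(generator, n=1000, seed=0):
    """Return (fraction of samples accepted as real, fraction of modes covered)."""
    rng = random.Random(seed)
    samples = [generator(rng) for _ in range(n)]
    fool_rate = sum(toy_discriminator(s) for s in samples) / n
    coverage = len(set(samples)) / len(DIGITS)
    return fool_rate, coverage

diverse_result = evaluate(diverse_generator)
collapsed_result = evaluate(collapsed_generator)
print("diverse  :", diverse_result)
print("collapsed:", collapsed_result)
```

Both generators fool the stand-in discriminator every time, but the collapsed one covers only one of the ten digits. Nothing in the per-sample score penalises the missing diversity.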
The CycleGAN architecture is more resistant to this. CycleGAN uses two GANs working
together on unpaired datasets (that don’t overlap), such as horses in one domain and zebras in another.
- Generator 1 generates a zebra image from a horse image. Discriminator 1 then decides
whether it is a real zebra or a generated one.
- Generator 2 translates the zebra image generated by Generator 1 back to a horse image.
We call this the cyclic image. We can now calculate the cycle-consistency loss by comparing the
original horse image and the cyclic image.
- This loss forces Generator 1 to preserve the content of the original input (the horse image) and
prevents it from simply generating images that fool the Discriminator: a collapsed output could not
be translated back to each different original.
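The comparison between the original horse image and the cyclic image is commonly called the cycle-consistency loss. Below is a minimal sketch of that comparison, assuming images are flat lists of pixel values and using an L1 (mean absolute) distance. The three "generators" are hypothetical toy functions, not trained networks.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized flat images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_loss(g_forward, g_backward, image):
    """Translate the image to the other domain and back, then compare with the original."""
    translated = g_forward(image)      # horse -> zebra
    cycled = g_backward(translated)    # zebra -> horse (the "cyclic image")
    return l1_distance(image, cycled)

# Hypothetical toy generators (assumptions for illustration only).
def horse_to_zebra(image):
    return [p + 1.0 for p in image]    # keeps the content, shifts the "style"

def zebra_to_horse(image):
    return [p - 1.0 for p in image]    # undoes the shift

def collapsed_horse_to_zebra(image):
    return [0.5] * len(image)          # ignores its input: always the same output

horses = [[0.0, 0.25, 0.5, 0.75], [1.0, 0.5, 0.0, 0.5]]

faithful_losses = [cycle_loss(horse_to_zebra, zebra_to_horse, h) for h in horses]
collapsed_losses = [cycle_loss(collapsed_horse_to_zebra, zebra_to_horse, h) for h in horses]
print("faithful :", faithful_losses)
print("collapsed:", collapsed_losses)
```

The content-preserving generator can be inverted exactly, so its cycle loss is zero. The collapsed generator throws away the input, so no backward generator can recover the different originals and the loss stays high.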
2. In the context of CNNs (Convolutional Neural Networks), describe the following concepts
and how they interact with one another: Kernel, Feature Map, Activation Function and
(Max) Pooling. Referring to these concepts, discuss why CNNs are much more prevalent than
MLPs for most vision (image-based) machine learning tasks.
Kernel or filter: a small matrix of weights that scans the input data and looks for specific
patterns/features by multiplying its weights with the input values and summing the results.
Feature Map: the output of applying the kernel over the picture. Each value in the feature map
indicates how strongly a certain feature or pattern is present at that position in the input data.
(We slide the kernel over the input image, calculate the number that goes into the feature map at each
position, and repeat this until the kernel has covered the whole image.)
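The sliding procedure can be sketched in a few lines of plain Python. This is an illustrative version assuming stride 1 and no padding; the 4x4 image and the vertical-edge kernel are made-up examples.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    return the resulting feature map as a list of rows."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply kernel weights with the image patch and sum the products.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Made-up image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Made-up kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is large only in the middle column, where the kernel sits on the edge. The same four weights are reused at every position.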
Activation Function: introduces non-linearity into the CNN and allows the network to learn complex
patterns and make more accurate predictions based on the input data. For example, in image recognition,
an activation function helps a neuron decide whether a certain visual feature, like a shape, is present in
an image: the neuron activates when it detects the desired feature and remains inactive when it doesn't.
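As a rough illustration of how the activation function interacts with the feature map and with the (max) pooling step named in the question, here is a small sketch. ReLU is assumed as the activation function (the notes do not name a specific one), pooling uses non-overlapping 2x2 windows, and the feature map values are made up.

```python
def relu(feature_map):
    """ReLU activation: keep positive responses, set negative ones to 0."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Max pooling with non-overlapping size x size windows:
    keep only the strongest response in each window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled

# Made-up 4x4 feature map, e.g. the output of one kernel.
feature_map = [
    [-3, 1, 0, 2],
    [4, -1, -2, 0],
    [0, 0, 5, -6],
    [-1, 2, 1, 1],
]

activated = relu(feature_map)
pooled = max_pool(activated)
print(activated)  # [[0, 1, 0, 2], [4, 0, 0, 0], [0, 0, 5, 0], [0, 2, 1, 1]]
print(pooled)     # [[4, 2], [2, 5]]
```

The kernel produces the feature map, the activation decides which responses count, and pooling shrinks the map while keeping the strongest response per region.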