\chapter{Introduction} \label{chap:introduction}
The internet has become a central point of personal data collection~\cite{kretschmer2021cookie}.
Users spend long periods of time on the internet: the average European is online for more than two hours per day~\cite{kretschmer2021cookie}.
Furthermore, they are widely tracked while doing so, with an average of 17 trackers per website~\cite{kretschmer2021cookie}.
This has led to the introduction of regulations such as the General Data Protection Regulation (\emph{GDPR}).
Applicable since May 2018, it provides comprehensive privacy legislation across the European Union (\emph{EU}).
Expanding on previous regulations (such as the ePrivacy Directive), it has defined additional restrictions for data collection, responsibilities for website owners and rights for users~\cite{kretschmer2021cookie}.
Websites need to inform users about data collection and its purposes.
Additionally, a legal basis such as user consent is needed for non-functional cookies which, e.g., may be used for user tracking.
Such regulation is necessary as more than 90\% of websites use cookies capable of user identification.
In the context of browser cookies, cookie notices are used both to declare data collection and its purposes and to obtain user consent~\cite{bouhoula2023automated}.
Bouhoula et al.\ conducted a comprehensive, automatic analysis of the GDPR cookie compliance of 97k popular websites~\cite{bouhoula2023automated}.
Despite the potential fines~\cite{sanchez_rola2019can}, they found violations to be very common, with 65\% of websites ignoring user rejection of cookies~\cite{bouhoula2023automated}.
An overview of large-scale studies regarding the adherence to GDPR rules was compiled by Bouhoula et al.~\cite{bouhoula2023automated}.
It can be challenging for web page operators, users, and regulators to verify compliance of specific websites with these regulations.
Cookie notices often have different designs and implementation details.
This complicates automatic verification of whether a notice defines the cookie purposes and offers possibilities to configure or reject non-necessary cookies.
Possible approaches include heuristics, manual annotation and machine learning models~\cite{kretschmer2021cookie, bouhoula2023automated}.
Similarly, recognizing what cookies are used for is not trivial~\cite{sanchez_rola2019can, bollinger2022automating}.
Those issues become apparent in previous large-scale studies:
Nouwens et al.~\cite{nouwens2020dark} restrict their study to websites implementing one of five popular Consent Management Platforms (\emph{CMPs}).
After collecting data from 680 websites from the United Kingdom, they found that only 11.80\% of the cookie notices met basic GDPR requirements (namely, no pre-ticked optional boxes, a simple way to reject, and explicit consent).
Matte et al.~\cite{matte2020cookiebannersrespectchoice} focus on cookie notices adhering to the Transparency and Consent Framework (\emph{TCF}) specification of IAB Europe.
They describe violations such as \enquote{Implicit consent prior to interaction} in 10\% of the websites.
Similarly, tools designed for users to audit specific websites are fully manual and labor-intensive~\cite{gorin2024edpb}, can only automatically analyze cookie notices by specific CMPs, require IAB TCF compliance~\cite{matte2020cookiebannersrespectchoice}, or provide only rudimentary analysis~\cite{cookie_information2019cookie}.
\subsubsection{Our Work}
In our work, we address the limitations of previous auditing tools.
%We provide a tool that offers an in-depth analysis, can be run on a wide variety of websites and is mostly automatic.
%We use NLP models by Bouhoula et al.~\cite{bouhoula2023automated} for the cookie notice analysis and gradient-boosting models by Bollinger et al.~\cite{bollinger2022automating} for the cookie classification.
Building on the work of Bouhoula et al.~\cite{bouhoula2023automated}, we develop a browser extension that interacts with a given web page to audit its cookie compliance with the GDPR while requiring only minimal user involvement.
\subsubsection{Thesis Outline}
TODO