"Hi Alexa, can I trust you?"

My team and I regularly carry out Data Protection Impact Assessments (DPIAs) according to Art. 35 GDPR with regard to the deployment of Voice Assistants for private and public bodies in Germany. This article shows my personal perspective on technology, business and legal developments regarding voice interfaces of Amazon, Google & Co.

Content: Innovative and recent patents, predictive consumer intention, business models, anatomy of voice interface platforms and services, natural language processing, information and data security, data protection laws in Europe (GDPR) and the USA (COPPA, CCPA etc.), sector specific regulation and noteworthy details for Data Protection Impact Assessments.

Note: A previous version of this article was first published in October 2019 in one of Germany's leading journals for data protection, "Datenschutz-Berater", and can be downloaded from the website of my law firm Spirit Legal. The article does not claim to be exhaustive and only reflects my personal views. If you have further references to interesting legal developments, proceedings, technical and security details or other relevant publications, I am always thankful for comments and will gladly include and/or link to them.

From Alexa and Google Assistant to Cortana, Siri and Bixby: these days, it seems no serious technology provider can do without its own voice-activated assistant and the corresponding hardware. Huge advertising campaigns and the technology’s wide acceptance have led to more than 100 million devices being sold for Amazon’s Alexa platform alone in just a few years.

Private households enjoy the ease of use, and companies are rushing to develop platform-based business models. Technology experts have viewed these devices with scepticism since their market launch in 2015; now academics, the media, regulatory authorities and courts are also turning their attention to the security and data protection issues surrounding intelligent virtual assistants.

Alexa, Google and smart homes

One system to rule them all

Where any self-respecting bachelor would once have had a lava lamp in their apartment, today this has been replaced by an Alexa. Available in Germany since 2016, this “smart” speaker contains seven microphones with which it listens to its surroundings. The number seven has held a special significance for people since Babylonian times. Traditionally it stands for divine perfection – and given this transcendental association, it ties in well with how modern representatives of well-known technology titans perceive themselves.

Alexa, Ring and Nest listen to what customers say, and will answer any question to which the internet knows the answer. Amazon Echo learns, stores and analyses what’s going on in the household. Alexa is always ready and willing to manage our home, make calls, read messages aloud and fulfil all our consumerist desires – credit limit permitting, of course.

In return for all this assistance, Amazon receives extensive information about each and every conversation held near the device. Just like a person. A person who never sleeps and is always eavesdropping. In order to nip any occasional doubts about the sincerity of the intentions behind this technical marvel in the bud, Alexa will in future even speak with the dulcet tones of celebrities like Samuel L. Jackson. And who wouldn’t trust celebrities when it comes to privacy and data security?

New Google and Amazon patents in the offing

We are still at the beginning of a vast transformation, because smart speakers are merely the very first step on the way to the fully fledged smart home of tomorrow. “The next data mine is your bedroom,” as Sidney Fussell aptly put it (The Atlantic, November 2018). Both Amazon and Google have already developed market-ready solutions that will bring comprehensive sensor technology into every home, processing not just the spoken word but also images and other sensitive data (ultrasound, infrared, Bluetooth) from users’ private lives.

The technologies used to process this raw data are able to create inferences from it and analyse users by means of weighted probabilities, depending on what the installed smart home sensors perceive in their immediate surroundings. People already know that autonomous robotic vacuum cleaners can detect objects and map their owners’ homes, not least since Roomba’s intention to sell the floor plans of vacuumed apartments to advertisers caused quite a stir in 2017.

That company has done its homework and is now a big step further: the latest Roomba can even be controlled by Alexa and Google. The ever more closely meshed smart home, powered by countless Internet of Things (IoT) endpoints, is becoming increasingly inevitable as Amazon and Google continue to set and replace industry standards. It’s quite handy that Roomba’s iRobot cloud uses Amazon Web Services (AWS), since it means the data won’t be too far away later on.

Everyone benefits from this interconnectedness: devices like Echo Look and Echo Frames already make no secret of the fact that they use object recognition on clothing to determine (poor) fashion choices and recommend (better) clothes that the user might wish to purchase.

Sensors can also effortlessly make assumptions about disposable income based on nearby electronic devices, which they detect optically or via Bluetooth, as well as use audio signatures to recognise and easily classify people – not necessarily the “user” – in the vicinity of the device. In this way, they can for example make predictions about a person’s gender and age.

Apart from fashion, music and technology tastes, technology providers can and indeed want to use these advances to collect health data, for instance by using sensors to detect that a user might have a cold because they have an unusually hoarse voice today; that they are feeling depressed because of what they say or how they say it; or that they may be suffering from Alzheimer’s disease due to changes in vocalisation and pronunciation that would be barely perceptible to a human ear. With this information, providers can then make appropriate recommendations for users to improve their lifestyle, health and consumer habits accordingly. If you have more than a passing interest in data protection and privacy, details of this sensor technology for households can be found in a number of patents published in the years 2017 and 2018, which also attracted media attention beyond the usual specialist circles.

On the subject of detecting a user’s mood and health, it’s worth taking a look at Cogito's patent no. US 10,276,188 B2, “Systems and methods for identifying human emotions and/or mental health states based on analyses of audio inputs and/or behavioral data collected from computing devices”, as well as Amazon's patent no. US 10,096,31 B1, “Voice-based determination of physical and emotional characteristics of users”.

US Patent: systems and methods for identifying human emotions

US Patent: Privacy-aware personalized content for the smart home

If you are interested in the personalisation of product recommendations and media content, the description of Google’s patent application publication US 2016/0260135 A1, “Privacy-aware personalized content for the smart home”, makes for informative reading. The very mention of the term "privacy" in the title of the patent suggests that the technology is in fact about anything but "privacy".

With its plans, Google is taking smart homes a step further: it has also patented its "Household Policy Manager", which – in homes equipped with the necessary sensor technology – processes all sensor data; runs the entire household, including purchasing all relevant goods and services; and at the same time optimises the professional and private lives of consumers (patent no.: US 20160259308A1).

According to Google’s patent filing,

“As society advances, households within the society may become increasingly diverse, having varied household norms, procedures, and rules. Unfortunately, because so-called smart devices have traditionally been designed with pre-determined tasks and/or functionalities, comparatively fewer advances have been made regarding using these devices in diverse or evolving households in the context of diverse or evolving household norms, procedures, and rules.”

While there is heated debate in Germany and other European countries about “smart metering” and the use of legislation on the operation of metering points to regulate data processing, a new era in information technology is emerging in the shadow of the major utility companies in the energy industry – technology whose rulers need not worry about the threat of enforcement of standards and sanctions. Even today, at the dawn of GDPR, data protection enforcement in Europe is still essentially a paper tiger.

Purchase decisions through algorithms: The end of advertising as we know it

In an interview with the author of the book Platform Capitalism, Nick Srnicek, Tobias Haberkorn fittingly describes the surprising goal of this trend towards the widespread use of sensors by technology companies:

“What tech firms are now pushing for are personal assistants at every instant in the chain of consummation, a type of service where the wish is fulfilled at the very moment it is formed, so that there is no need for advertising anymore.”

So the aim of the smart home is neither a “better life” for the user, nor simply the monetisation of their private conversations and movements, but rather the elimination of conventional advertising, which is associated with wasteful spreading effects. If sensors do the feeling instead of people, and if a machine calculates and decides instead of a person’s own mind, then the lack of emotions and aesthetic perception in the procurement process means there is no longer any room for classical advertising.

Technology and law

Anatomically Alexa

If you are a data protection expert with an interest in technology and would like to find out more about how the Alexa system works, then the “Anatomy of an AI System” project at https://anatomyof.ai is well worth a visit, as it documents and explains the structure and functioning of the technical and human ecosystem around Amazon Alexa right down to processor level.

Structure and functioning of the technical and human ecosystem around Amazon Alexa

With their project “The Amazon Echo as an anatomical map of human labor, data and planetary resources”, the renowned researchers Kate Crawford (Microsoft Research, AI Now Institute) and Professor Vladan Joler (Share Lab, University of Novi Sad) have compiled a remarkable source of both knowledge and contemplation, whose value data protection officers will appreciate not least when preparing data protection impact assessments (DPIA).

Joint controllership: Alexa Skills Kit

Amazon Echo has provided developers with various application programming interfaces (APIs) since June 2015: developers can use these to connect and, via the Alexa Skills Kit, to develop their own voice-controlled applications, called Skills, and also target users of the platform for advertising purposes. Amazon reviews these Skills and publishes them in the Alexa Skills Store, where currently more than 60,000 Skills are available.

Since the providers of Skills must by necessity process user data – including transcribed audio, user ID, location data and, for example, user-specific shopping lists – they are also obliged to comply with the provisions of the GDPR and provide full and transparent information on data processing in accordance with Art. 12, 13 and 14 of the Regulation. In practice, there are considerable shortcomings when it comes to the fulfilment of these information obligations. Back in 2017, a study by Alhadlaq, Tang, Almaymoni and Korolova entitled “Privacy in the Amazon Alexa Skills Ecosystem” found that 75% of all Skills provided no data protection information whatsoever, let alone any that would comply with European law. Fast-forward to today, and the situation still hasn’t changed. You could be forgiven for thinking that the GDPR didn’t exist.

Alexa's "Policy Testing" guidelines unsurprisingly do not extend to privacy policies (last checked February 1st, 2020), from which one can already deduce the level of importance of privacy to Amazon.

It is important to consider how the parties involved are each classified under data protection law: given how Amazon uses the data for its own purposes, including for training its in-house machine learning systems (AI), as well as Amazon’s sovereignty over all means of the data processing, it is abundantly clear that Amazon itself is not just a data processor pursuant to Art. 28 GDPR, but rather a separate data controller, with the full catalogue of legal obligations this entails. However, the joint use of the platform infrastructure and the close exchange of data between the parties, each for their own and partially common purposes, means that developers are considered joint controllers alongside Amazon pursuant to Art. 26 GDPR. In addition to joint and several liability under Art. 26.3 GDPR, such joint controllership also entails the obligation to conclude a joint controller agreement – and to publish the essence of such agreement pursuant to Art. 26.2 Sentence 2 GDPR. A good place for this transparency required by law would be a comprehensive privacy policy.

It should be noted, also in light of the ECJ’s judgment in the Fashion ID case (C-40/17) (in-depth analysis plus checklist for website operators available on our blog), that the responsibilities of both parties, i.e. of the developer and of Amazon, apply to all processing operations related to the developer’s use of the platform, since it is the developer, together with Amazon, that enables, facilitates and intensifies the processing of the voice data. As yet, there is no publicly available information on whether and to what extent Amazon engages in the necessary acts of cooperation to provide information on how it processes developers’ data, or even concludes joint controllership agreements under Art. 26.

Comparable legal challenges arise for companies that, keen not to miss the anticipated boom in the home automation market, use the “Connected Speaker API” to implement the Alexa Voice Service in their own hardware. It’s a pity that this industry standard is by default already in violation of data protection laws.

Echo Dot Kids Edition: A stumbling block

When targeting unsuspecting consumers, Amazon is successfully appealing not only to adults, but also to those most vulnerable members of society: children.

The Amazon Dot 3 Kids Edition, which is available in the bright Disneyesque colours “rainbow” and “frost blue”, was given another makeover in 2019 and now plays its Amazon-licensed, subscription-based music that little bit louder, answers questions in “kid-friendly” language, and offers “kid-friendly” games alongside equally “kid-friendly” product recommendations. Such an intrusion into the living spaces of children and young people poses enormous challenges, not least for the law.

Against the backdrop of the sheer market power with which Amazon is making its Echo platform available to children, it is easy to understand criticism, already voiced in 2015, by the US Campaign for a Commercial-Free Childhood (CCFC) of smart toys like “Hello Barbie”:

“Kids using ‘Hello Barbie’ aren’t only talking to a doll, they are talking directly to a toy conglomerate whose only interest in them is financial.”

It is hardly surprising that, given the speed at which Amazon has chosen to enter the market (some call it “blitzscaling”), the Echo Dot Kids Edition violates current US (federal) laws. At the very least, it arguably violates the Children’s Online Privacy Protection Act (COPPA), because a) it collects data without providing adequate information, b) this collection also takes place without parental consent, and c) it offers no possibility to have the recorded and processed data erased (public request for investigation of Amazon, Inc. before the FTC of 9 May 2019).

[Insufficient because unspecific: notice on the processing of data that is to be collected from children. Screenshot taken during Echo Dot for Kids setup (excerpt; the original notice is longer, but still insufficient). Also a good example of "dark patterns".]

It doesn’t take a genius to work out that these three selected violations of US law might also happen to be violations of European data protection law, specifically: an infringement of mandatory duties to provide information (Art. 12 et seq. GDPR); the unlawful processing of the data of children (Art. 5, 6 GDPR); and a violation of the principles of fairness, data minimisation and data security standards, given the impossibility of erasing the recordings (Art. 5, 32 GDPR).

Besides claims for injunctive relief, this type of behaviour can also result in heavy fines, provided that the competent authorities recognise and understand the circumstances and react appropriately. Data protection authorities in Europe at the moment are inefficient, due in part to how overloaded they are with cases. In this context, class actions and test cases – still relatively new concepts in Europe – could be suitable instruments for sanctioning and remedying legal infringements (see my recent article on dpoblog.eu on individual lawsuits of data subjects demanding cease and desist, information and substantial damages based on national tort law in case of the use of Google Analytics without legal basis).

Crucially, this could help to create a level playing field for legally compliant providers. In the US, meanwhile, consumers are increasingly taking the law into their own hands: in early June 2019, the mother of a ten-year-old girl from Massachusetts filed a privacy class action in Seattle because of Amazon’s well-documented violations (case no.: 2:19-cv-910). Some 15 pages of the lawsuit are freely available, and they make for enlightening reading for both users and legal experts alike. Spoiler alert: in it, the plaintiff criticises not only Amazon’s unwillingness to erase user data, but also another point in the company’s exploitation chain, namely the fact that “Alexa routinely records and voiceprints millions of children”:

[Excerpt from Class Action Complaint and Demand for Jury Trial, case no.: 2:19-cv-910]

Natural language processing (NLP)

Training speech recognition systems is laborious and expensive, but natural language processing (NLP) is a lucrative growth market. Instead of the cost-intensive provision of training data by paid test persons, companies like Amazon, Google and others are increasingly using their own customers as data suppliers or, as in the famous case of the smart doll “My Friend Cayla”, entering into a partnership with a (toy) manufacturer that is conveniently close to the customer. The speech recognition market has huge potential. Nuance Communications from Burlington, Massachusetts, for example, is known for its “Dragon” speech recognition software, which is used in offices around the world.

According to its own figures, the company has a database of more than 60 million stored voiceprints (2017) of people from all over the world: adults, children, doctors, lawyers. The company analyses voices and stores the corresponding characteristics as a digital “fingerprint”. Nuance’s “voice biometrics” solution at www.nuance.com offers commercial enterprises, but also public bodies, the ability to compare their own voice recordings with the biometric voice profiles (“samples”) held by Nuance via interfaces, and thus identify the corresponding speakers. Nuance’s own advertising statement, “Every voice matters: Our system knows who is talking and why”, suggests that Nuance does not stop at merely identifying a speaker, but that it also performs dynamic voice analyses in real time, thus analysing the person’s mood.

Quite apart from any legal reservations about processing biometric data for identification purposes after that data has been collected, especially indirectly, business is booming with the national security sector. Nuance makes its identification and mood analysis systems available to the world’s security authorities, turning them into "new public security weapons".

The promise of "prosecuting criminals using their voice" sounds pleasant as long as you live in the right country with a fair government, a non-criminal president and where you can rely on fundamental principles like the right to a fair trial. Unfortunately, there are not many countries left in the world to which these attributes apply. Voice recognition is ubiquitous; fundamental rights are rare.

With the purchase of any smart assistant, it is not only the likes of Nuance, Amazon, Google and other technology providers that gain access to new and unique voice profiles, but also their customers in corporations and governments around the world. Anonymity is increasingly becoming a luxury commodity. The price? Not using technology.

Defence claims and sanctions in Europe (Germany)

In addition to state sanctions and individual claims under data protection law, it’s worth bearing in mind that, in cases where unauthorised data processing involves a violation of the general individual right to privacy, German law offers its own very specific bases for claims that are not identical to claims under the GDPR, but rather compete with them (see also Hense on Google Analytics in Datenschutz-Berater 2019, p. 204 et seq.).

Applying Sect. 823(1) in conjunction with Sect. 1004 of the German Civil Code (BGB) is always an option for data subjects seeking to stop unauthorised data processing (other European legal systems have similar rules in tort law).

Due to the fact that claims like these are more common before German courts, this approach can in fact be the preferred legal method for preventing unauthorised processing, or even for claiming reasonable damages for the violation of personal rights as immaterial damages – which would make sense in cases of intentional and unlawful processing of children’s data.

Regulation in the US

People’s fascination with intelligent virtual assistants seems to have already subsided somewhat in the US. A number of recent publications deal in great detail with the brief history – and the even briefer legal history – of these systems, in particular with the omnipresent Amazon Alexa. Of particular note, because of its instructive nature, is a lengthy essay entitled “Alexa, what should we do about privacy?” by Anne Pfeifle (Washington Law Review, 2019). For US lawyers, the fun of smart assistants will end at the latest when state institutions, especially criminal prosecutors, decide they want to get hold of collected voice recordings and use them in criminal proceedings. Unlike in Europe, concerns in the US about the violation of the right to privacy are often expressed in the form of swift ad hoc legislation that targets specific sectors and technologies. California has given technology giants reason to be concerned, having recently completed the momentous legislative process for the most comprehensive data protection law of any US state (California Consumer Privacy Act, CCPA; German language content: Hense/Fischer, Datenschutz-Berater 2019, p. 27 et seq.; English language content: blog posts of my esteemed US colleague Odia Kagan regarding CCPA).

Assembly Bill (AB) 1395 provides for far stricter regulation of stand-alone “smart speaker devices” with “integrated virtual assistants” than was previously the case. The only exceptions are integrated systems, e.g. in telephones, tablets and connected cars. Among other things, the law expressly prohibits involuntary “data sharing” with third parties, not only of audio files themselves but also of any transcripts, and requires a separate opt-in process for the permanent storage of both types of data. Given existing data usage practices, this is likely to pose particular challenges for US companies.
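How such rules could translate into system logic can be sketched in a few lines. The following Python is only an illustration under my own assumptions: the `Consent` fields and both function names are hypothetical, and the checks merely mirror the two requirements as summarised above (no involuntary sharing with third parties, separate opt-in for permanent storage). It is not an implementation of the bill's actual wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consent:
    """Hypothetical per-user consent record (field names are assumptions)."""
    permanent_storage_opt_in: bool = False    # separate opt-in for keeping data
    third_party_sharing_opt_in: bool = False  # explicit agreement to sharing


# Audio files and transcripts are treated alike, as described in the text.
PROTECTED_KINDS = {"audio", "transcript"}


def _check_kind(kind: str) -> None:
    if kind not in PROTECTED_KINDS:
        raise ValueError(f"unknown data kind: {kind!r}")


def may_store_permanently(kind: str, consent: Consent) -> bool:
    """Permanent storage is allowed only after a separate opt-in."""
    _check_kind(kind)
    return consent.permanent_storage_opt_in


def may_share_with_third_party(kind: str, consent: Consent) -> bool:
    """Sharing is allowed only if the user has actively agreed to it."""
    _check_kind(kind)
    return consent.third_party_sharing_opt_in
```

The point of the sketch is the default: a freshly created `Consent()` permits neither storage nor sharing, so every use of audio or transcripts has to be preceded by an explicit user decision.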

Practical application: Data protection impact assessments

User control: “Privacy by Default”?

A 2016 publication by the Future of Privacy Forum (FPF) entitled “Always On: Privacy Implications of Microphone-Enabled Devices” is helpful for the individual assessment of risks when using voice-activated assistance systems. In particular, the working paper contains information on the differentiation between the various activation modes (“always on”, “speech-activated”, “manually activated”). These may be of relevance in Germany, not least in view of the country’s telecommunications legislation on covert listening devices (Sect. 90 of the German Telecommunications Act, or TKG). The remarks on user control, microphones and data storage are also useful for European legal experts.
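For a DPIA, the three activation modes can be recorded as a simple classification per device. The following Python is a minimal sketch; the names `ActivationMode`, `LISTENS_WITHOUT_USER_ACTION` and `needs_closer_review` are my own, and the yes/no flag attached to each mode is a deliberate simplification that is not taken from the FPF paper.

```python
from enum import Enum


class ActivationMode(Enum):
    ALWAYS_ON = "always on"
    SPEECH_ACTIVATED = "speech-activated"
    MANUALLY_ACTIVATED = "manually activated"


# Simplified assumption: does the microphone listen without a deliberate
# physical action by the user (e.g. pressing a button)?
LISTENS_WITHOUT_USER_ACTION = {
    ActivationMode.ALWAYS_ON: True,
    ActivationMode.SPEECH_ACTIVATED: True,    # has to listen for the wake word
    ActivationMode.MANUALLY_ACTIVATED: False,
}


def needs_closer_review(mode: ActivationMode) -> bool:
    """Flag devices whose microphone is active without user action."""
    return LISTENS_WITHOUT_USER_ACTION[mode]
```

In an assessment, such a flag would only be a starting point: it tells the assessor which devices to examine more closely, not what the outcome of that examination is.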

System architecture

Apart from taking a look at the “Anatomy of an AI System” project mentioned earlier, anyone using Alexa for purposes not covered by the very narrow “household exemption” or for scientific purposes should make sure they fully understand the legal situation, and also draw up a risk assessment. For the first Amazon Alexa model, at least, tech-savvy legal experts should consider reading the 2016 essay by Clinton, Cook and Banik entitled “A Survey of Various Methods for Analyzing the Amazon Echo”. It provides an in-depth analysis of the hardware and software used, in which reverse engineering was applied to search for known possibilities for developing exploits. If topics like debugging, the Linux kernel, and the vivisection of boot sequences are a little too dry for you, then just bear in mind the authors’ summary: like other Amazon devices, Alexa is vulnerable to hardware attacks via the SD card pinout, hardware rooting and JTAG.

Attacks and attack vectors

In addition to Amazon’s official – albeit scant – documentation on system architecture and data flows, helpful information is provided by a detailed article by Leong entitled “Analyzing the Privacy Attack Landscape” (2018). It contains some valuable points, including a) the fact that by defining specific, easily misunderstood control commands in malicious Skills, it is possible to trigger a transfer of the conversation content to third parties; b) information about the risk of exploits for Alexa firmware; and c) remarks on the risks when using third-party hardware, where in addition to eavesdropping on entire conversations in the vicinity of the Alexa device, it is also possible to manipulate the control commands. Further reading comes from Haack, Severance, Wallace and Wohlwend (2017): “Security Analysis of the Amazon Echo”. This piece deals with the remarkable creation of a security policy for the specific end device, which is of interest for any DPIA from the perspective of appropriate “measures to address the risks”.
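The risk of easily misunderstood control commands mentioned under a) can be approximated in a DPIA by checking whether the invocation names of the Skills in use resemble each other. The following Python is a rough sketch under my own assumptions: the Skill names are invented, the threshold of 0.8 is arbitrary, and plain string similarity from the standard library is only a crude stand-in for how similar two phrases actually sound.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Character-level similarity between two names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_confusable(names, threshold=0.8):
    """Return all pairs of names whose similarity reaches the threshold."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if similarity(a, b) >= threshold
    ]


# Invented invocation names, for illustration only
skills = ["city bank balance", "city banc balance", "weather report"]
suspicious_pairs = find_confusable(skills)
```

With these invented names, only the two bank-related entries are reported as a pair. A realistic check would have to compare pronunciations rather than spellings, which is beyond this sketch.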

Rights of data subjects

Difficulties in dealing with the rights of data subjects are commonplace at large technology companies. In 2018, Amazon sent a zip archive containing 1700 WAV audio files, along with PDFs of the transcripts, to a third party instead of to the data subject.

The latter had submitted a request under Art. 15 GDPR. Not only did this constitute a violation of the data subject’s rights, but the fact that this could even happen also exposes structural deficits on the part of the company with regard to Art. 32 GDPR. Incidents of this kind are reportable and notifiable under Art. 33, 34 GDPR.

Anyone planning the commercial use of Alexa or similar technology – perhaps in hotels or in retail – will also be assuming responsibility for ensuring that Amazon (or whichever other company) respects the rights of data subjects. If violations of data subjects’ rights pursuant to Art. 12 et seq. GDPR as well as violations of data confidentiality – like the one described in the previous paragraph – should cease to be the exception, and instead become the norm, then working with notoriously unreliable service providers would not simply call for a more thorough DPIA: Art. 28, 26 and 32 GDPR would essentially make it impossible to enter into a contract with them.

The right to be forgotten and machine learning (ML): Cui bono?

The permanent storage of raw data (“audio files”) is an issue that should be given special attention in any DPIA. On the one hand, in times when the GDPR has strengthened the rights of data subjects, it should go without saying that companies must also be able to erase data. However, in machine learning systems there is far more to the erasure process than simply pushing a button. Two reasons stand out here. Firstly, machine learning models used for speech recognition are based quantitatively and qualitatively on a large amount of unstructured raw data (“big data”). If this raw data were no longer available due to obfuscation or erasure, this would undermine the validity of the “predictive models” which it has been used to create, thereby jeopardising the ability of those models to achieve their purpose, which is to make data-based, weighted predictions. In the case of biometric and sensitive data, the law has ostensibly resolved this conflict in Art. 17 GDPR by coming out in favour of strong data subject rights, although in practice this may not necessarily be in line with the data subject’s own interests.

Whichever way you look at it, the training of machine learning models inevitably leads to a disparate storage of personal data (images, texts, etc.) within the modelled system itself. These data particles can be extracted by users of the models, especially in the case of “machine learning as a service” (MLaaS). This phenomenon is known as “training data leakage” (see also Ateniese, Mancini, Spognardi et al. (2015): “Hacking Smart Machines with Smarter Ones: How to Extract Meaningful Data from Machine Learning Classifiers.”). Nevertheless, it is neither technically nor commercially feasible to perform targeted searches for such data or data fragments – a fact which may potentially lead to considerable compliance backlogs in the area of research and development in machine learning and artificial intelligence.
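Why erasing the raw data does not erase it from the model can be shown with a deliberately simple toy example. The Python below is a sketch under my own assumptions: the class `NearestNeighbourModel`, the numeric "voice features" and the speaker labels are invented, and a nearest-neighbour classifier is chosen only because it keeps its training samples in plain sight. In real speech models the traces are spread across millions of parameters and cannot be listed like this, which is precisely the difficulty described above.

```python
class NearestNeighbourModel:
    """Toy classifier that 'learns' by keeping its training samples."""

    def __init__(self):
        self._memory = []

    def fit(self, samples):
        # The model keeps its own copy of the (feature, label) pairs.
        self._memory = list(samples)

    def predict(self, feature):
        # Return the label of the stored sample closest to the input.
        nearest = min(self._memory, key=lambda s: abs(s[0] - feature))
        return nearest[1]

    def remembers(self, sample):
        return sample in self._memory


# Invented raw data: (voice feature, speaker label)
raw_data = [(0.9, "speaker_a"), (2.1, "speaker_b"), (3.0, "speaker_c")]
model = NearestNeighbourModel()
model.fit(raw_data)

# Erasure request: the record is removed from the raw data store only.
erased = (2.1, "speaker_b")
raw_data.remove(erased)

still_in_model = model.remembers(erased)   # the model still holds the record
prediction_before = model.predict(2.0)     # and still recognises the speaker

# Only retraining on the reduced data removes the record from the model.
model.fit(raw_data)
gone_after_retraining = not model.remembers(erased)
prediction_after = model.predict(2.0)
```

Before retraining, the model still identifies "speaker_b" although the record has left the raw data store; after retraining it falls back to the nearest remaining speaker. For large models, the equivalent of that second `fit` call is the expensive step.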

Sweet dreams!