
publications

2019

2018

Recent ‘model inversion’ attacks from the information security literature indicate that machine learning models might themselves be personal data, as they can leak the data used to train them. We analyse these attacks and discuss their legal implications.

Where ‘debiasing’ approaches are appropriate, they assume modellers have access to often highly sensitive protected characteristics. We show how, using secure multi-party computation, a regulator and a modeller can build and verify a ‘fair’ model without ever seeing these characteristics, and can verify decisions were taken using a given ‘fair’ model.
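To illustrate the basic idea, here is a minimal sketch using additive secret sharing; it is not the protocol from the paper, and all names and values are hypothetical. Two parties jointly compute an aggregate fairness statistic (a demographic parity gap) while neither ever holds an individual’s protected attribute in the clear.

# Illustrative sketch only (not the paper's protocol): additive secret sharing
# over a prime field, showing how an aggregate fairness statistic can be
# computed without either party seeing individual protected attributes.
# All names and values here are hypothetical.
import random

P = 2_147_483_647  # a large prime modulus for the share arithmetic

def share(value):
    """Split a value into two additive shares modulo P."""
    r = random.randrange(P)
    return r, (value - r) % P

def reconstruct(share_a, share_b):
    """Recombine two additive shares into the original value."""
    return (share_a + share_b) % P

# Protected attribute (0/1 group membership), secret-shared so that neither
# the modeller nor the regulator sees any individual's value in the clear.
attributes = [1, 0, 1, 1, 0, 0, 1, 0]
# Model decisions (0/1), treated as public to the modeller in this sketch.
decisions  = [1, 0, 0, 1, 1, 0, 1, 0]

shares = [share(a) for a in attributes]
modeller_shares  = [s[0] for s in shares]
regulator_shares = [s[1] for s in shares]

# Multiplying a share by a public value and summing shares are local
# operations, so each party computes its share of the aggregates alone.
def local_aggregates(my_shares, public_decisions):
    group_size = sum(my_shares) % P                               # share of |group 1|
    group_pos  = sum(d * s for d, s in zip(public_decisions, my_shares)) % P
    return group_size, group_pos

m_size, m_pos = local_aggregates(modeller_shares, decisions)
r_size, r_pos = local_aggregates(regulator_shares, decisions)

# Only the aggregates are reconstructed, never any individual attribute.
n = len(attributes)
n_group1   = reconstruct(m_size, r_size)
pos_group1 = reconstruct(m_pos, r_pos)
n_group0   = n - n_group1
pos_group0 = sum(decisions) - pos_group1

rate1 = pos_group1 / n_group1
rate0 = pos_group0 / n_group0
print(f"demographic parity gap: {abs(rate1 - rate0):.3f}")

The same linearity that makes this sketch work (addition of shares, and multiplication of shares by public values, are purely local) is what allows aggregate fairness constraints to be checked without ever reconstructing individual-level sensitive data.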

We interviewed 27 public-sector machine learning practitioners about how they cope with challenges of fairness and accountability. Their challenges often differ from those considered in FAT/ML research so far, and include internal gaming, changing data distributions, inter-departmental communication, how to augment model outputs, and how to transmit hard-won social practices.

We presented participants in the lab and online with adverse algorithmic decisions and different explanations of them. We found that participants strongly disliked case-based explanations, in which they were compared to a similar individual, even though these are arguably highly faithful to the way machine learning systems work.

In this workshop paper, we argue that sense-making is important not just for experts but for laypeople, and that expertise from the HCI sense-making community would be well-suited for many contemporary privacy and algorithmic responsibility challenges.

The General Data Protection Regulation has significant effects for machine learning modellers. We outline what human-computer interaction research can bring to strengthening the law and enabling better trade-offs.

Data protection law gives individuals rights, such as to access or erase data. Yet when data controllers slightly de-identify data, they remove the ability to grant these rights, without removing real re-identification risk. We look at this in legal and technological context, and suggest provisions to help navigate this trade-off between confidentiality and control.

In-store tracking, using passive and active sensors, is common. We look at this in technical context, as well as the European legal context of the GDPR and forthcoming ePrivacy Regulation. We consider two case studies: Amazon Go, and rotating MAC addresses.

We outline the European ‘right to an explanation’ debate and consider French law and the Council of Europe Convention 108. We argue there is an unmet need to empower third-party bodies with investigative powers, and elaborate on how this might be done.

We critically examine the Article 29 Working Party guidance that relates most to machine learning and algorithmic decisions, finding it has interesting consequences for automation and discrimination in European law.

2017

FAT/ML techniques for ‘debiasing’ machine-learned models all assume the modeller can access the sensitive data. This is unrealistic, particularly in light of stricter privacy law. We consider three ways in which some understanding of discrimination might still be possible without collecting data such as ethnicity or sexuality.

We consider the so-called ‘right to an explanation’ in the GDPR and in technical context, arguing that even if it (as manifested in Article 22) were enforced from the non-binding recital in European law, it would not trigger for group-based harms or in the important case of decision support. We argue instead for instruments such as data protection impact assessments and data protection by design, and suggest the right to erasure and the right to portability of trained models as avenues worth exploring.

I authored the case studies for the Royal Society and British Academy report which led to the UK Government’s new Centre for Data Ethics and Innovation. I also acted as drafting author on the main report.

This is a preliminary version of the ‘Fairness and accountability design needs’ CHI’18 paper above.

We consider the detection of offensive and hateful speech, looking at a dataset of 1 million annotated comments. Taking gender as an illustrative split (without making any generalisable claims), we illustrate how the labellers’ conception of toxicity matters in the trained models downstream, and how bias in these systems will likely be very tricky to understand.

2016

A conference paper on public sector values in machine learning, and public sector procurement in practice.

2015

Using a case study from the sugar-cane sector, this paper argues that performance-based sustainability standards have significant benefits over technology-based standards, and suggests directions in which this could be explored further.