Natural Language Processing (NLP) has always posed ethical or legal problems. These problems are particularly sensitive in this age of Big Data and of data duplication, areas in which NLP is involved. In addition to legal and economic matters (search for patents and rights associated with data/software), there are military issues (monitoring of conversations) and social issues (the “right to be forgotten” imposed on Google).
The crucial problem today is access to data (including sensitive data) and the protection of citizens' personal privacy. Indeed, our domain produces applications considered to be effective in both areas (data access and protection), although their known limitations are not clear to the general public or to governments.
Diversifying work on corpora has also led the community to be able to process more and more sensitive sources, be it personal data, medical data or even that of a criminal nature.
For privacy protection, anonymizing data, whether oral or written, is as much an industrial concern as an academic one. It sometimes comes with strong coverage constraints depending on the application or research needs, with issues regarding the nature of the resources and of the information to be anonymized, and with legal limits.
Some NLP tools also bear on ethical concerns, such as tools for plagiarism detection, fact checking and speaker identification. In addition, the advent of Web 2.0, and with it the development of crowdsourcing, raises new questions about how to consider the participants in the creation of linguistic resources.
This special issue of the TAL journal aims to highlight the NLP contributions to ethics and data protection and to uncover the limitations of the field both in terms of real possibilities (evaluation) and societal dangers.
We encourage submissions on all aspects related to ethics for and by Natural Language Processing, and in particular on the following problems or tasks:
sensitive corpus processing, including medical, police or personal data
language resource production, in particular using crowdsourcing, and ethics
ethical questions linked to the use of tools or the result of NLP processing
ethical questions related to NLP practices
quality and ways of evaluating applications and/or language resources
anonymization, de-identification and re-identification of NLP corpora
plagiarism detection by NLP
fact checking
paralinguistics and ethics, in particular speaker identification or detection of pathologies
historical perspective of ethics in NLP
definition of ethics as applied to NLP
We also welcome position papers on the subject.
Manuscripts may be submitted in English or French. French-speaking authors are requested to submit in French. Submissions in English are accepted only if at least one of the authors is not a French speaker.
mid-March 2016: Deadline for submission
end of May 2016: Notification to authors after first review
beginning of July 2016: Deadline for submission of revised version
mid-July 2016: Notification to authors after second review
end of September 2016: Deadline for submission of final version
December 2016: Publication
Authors who intend to submit a paper are encouraged to upload their contribution (no more than 25 pages, PDF format) via the "Paper submission" menu on the issue page of the journal. To do so, you will need an account on the Sciencesconf platform. To create an account, go to the Sciencesconf site and click on "create account" next to the "Connect" button at the top of the page. To submit, come back to this page, log in to your account and upload your submission.
TAL performs double-blind reviewing. Your paper should be anonymised.
Style sheets are available for download on the Web site of the journal (http://www.atala.org/IMG/zip/tal-style.zip).
TAL (Traitement Automatique des Langues) is an international journal that has been published by ATALA (Association pour le Traitement Automatique des Langues) for the past 40 years with the support of the CNRS. Over the past few years, it has become an online journal, with the possibility of ordering paper versions. This does not, in any way, affect the selection and review process.