C-PROM: An Annotated Corpus for French Prominence Studies

Mathieu Avanzi, Neuchâtel
Anne-Catherine Simon, Louvain-la-Neuve
Jean-Philippe Goldman, Louvain-la-Neuve
Antoine Auchlin, Genève

This paper presents C-PROM, an annotated corpus for French prominence studies. The corpus covers several regional varieties of French (Belgian, Swiss and metropolitan French) and a range of discourse genres (from oral reading to spontaneous conversation), for a total duration of 70 minutes. It was annotated by two phonetics experts, who followed a strict protocol that takes into account both the problems encountered in prior research on prominence detection in French and elements of the methodology used by scholars working on other languages. We conclude by discussing the average agreement between the two transcribers. The results are encouraging: the F-measure between the two annotators reaches 82.8\%, and the kappa score is 0.77.
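For readers unfamiliar with the two agreement figures quoted above, the following is a minimal sketch of how an F-measure and a Cohen's kappa can be computed from two annotators' binary prominence labels (1 = prominent syllable, 0 = non-prominent). The abstract does not specify the exact computation used for C-PROM, so the per-syllable binary coding, the function names `f_measure` and `cohen_kappa`, and the toy label sequences are all assumptions made for illustration, not the authors' procedure or data.

```python
from typing import Sequence


def f_measure(ref: Sequence[int], hyp: Sequence[int]) -> float:
    """F-measure of `hyp` against `ref`, treating label 1 as the positive class."""
    if len(ref) != len(hyp):
        raise ValueError("both annotations must cover the same syllables")
    tp = sum(1 for r, h in zip(ref, hyp) if r == 1 and h == 1)
    fp = sum(1 for r, h in zip(ref, hyp) if r == 0 and h == 1)
    fn = sum(1 for r, h in zip(ref, hyp) if r == 1 and h == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two binary annotations of the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("annotations must be non-empty and of equal length")
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    p_a1 = sum(a) / n  # proportion of items annotator A marked prominent
    p_b1 = sum(b) / n  # proportion of items annotator B marked prominent
    # Agreement expected by chance, given each annotator's marginal rates.
    expected = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for ten syllables (illustrative only, not C-PROM data).
annotator_a = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
annotator_b = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]

print(f"F-measure: {f_measure(annotator_a, annotator_b):.3f}")
print(f"kappa:     {cohen_kappa(annotator_a, annotator_b):.3f}")
```

With two annotators and no gold standard, the F-measure is symmetric: swapping which annotator is treated as the reference exchanges precision and recall but leaves their harmonic mean unchanged. Kappa additionally corrects for the agreement expected by chance, which matters here because non-prominent syllables are typically the majority class.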