Eurac Research

DIDI

Digital Natives - Digital Immigrants. Writing on Social Network Sites: a corpus-based observation of the current language use in South Tyrol, with particular consideration of the writers' age

    • Project duration: -
    • Project status: finished
    • Funding:
      Provincial P.-L.P. 14. Research projects (Province BZ funding /Project)
    • Total project budget: €200,392.20
    • Institute: Institute for Applied Linguistics

    In the DiDi project we analysed the linguistic strategies employed by users of social network sites (SNS). The data analysis focused on South Tyrolean users and investigated how they communicate with each other. In regions of the German-speaking area where dialect is frequently used in different communicative contexts, regional and social codes are often also used in written computer-mediated communication (CMC). Another interesting but more general aspect of the new media concerns the emerging linguistic and social practices (new literacy). One of the main research questions in DiDi was whether people of different ages use language on SNS in a similar way or in an age-specific manner.

    The study had two aims:
    1. to record the contemporary use of South Tyrolean German in the new media (cf. the DiDi Corpus)
    2. to describe the everyday language use of South Tyrolean SNS users with L1 German with respect to their choice of languages and varieties as well as their use of specific CMC phenomena.

    Please see the "Publications" section for detailed descriptions of the project and its results.

    The DiDi Corpus

    The DiDi corpus has an overall size of around 650,000 tokens gathered from 136 South Tyrolean Facebook users who participated in the DiDi project. It consists of 11,102 Facebook wall posts, 6,507 wall comments and 22,218 private messages. All messages were written by the participants during 2013. Please read the full description of the corpus for further details, as well as the description of the data collection method and the full description of the DiDi project and its research questions.

    As every participant could offer their private messages, their texts on the wall, or both, the corpus comprises wall posts and wall comments from 130 profiles and private messages from 56 profiles; 50 participants granted access to both types of data. The wall posts and comments are freely accessible. For privacy reasons, access to the private messages is restricted: it can be granted for scientific research only, after signing a non-disclosure agreement. If you are interested in the data for scientific purposes, please contact the research team.
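
    The participant figures are consistent with each other: by inclusion-exclusion, 130 profiles with wall data plus 56 profiles with private messages, minus the 50 that contributed both, gives the 136 participants reported for the corpus. The short Python sketch below only re-does this arithmetic, together with the totals for the three text types; the variable names are illustrative and not part of any DiDi tooling.

```python
# Figures as reported in the corpus description (variable names are illustrative).
wall_profiles = 130      # profiles contributing wall posts and wall comments
message_profiles = 56    # profiles contributing private messages
both_profiles = 50       # profiles contributing both types of data

# Inclusion-exclusion: profiles counted in both groups are subtracted once.
total_participants = wall_profiles + message_profiles - both_profiles

texts = {"wall posts": 11_102, "wall comments": 6_507, "private messages": 22_218}
total_texts = sum(texts.values())

# Only wall posts and wall comments are freely accessible.
public_texts = texts["wall posts"] + texts["wall comments"]

print(total_participants, total_texts, public_texts)
```

    This yields 136 participants and 39,827 texts in total, of which 17,609 (wall posts and comments) belong to the freely accessible part.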

    All texts were anonymised to guarantee that the participants' identity cannot be inferred from the texts. The anonymisation covered person names, group names, geographical names and adjectival references, institution names, hyperlinks, mail addresses, phone numbers, bank account numbers, servers, postal codes and other private information. Please read the anonymisation document for the anonymisation keys.
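
    The project's actual anonymisation procedure and keys are described in the anonymisation document. Purely as an illustration of the general idea, the following Python sketch replaces three of the listed categories (hyperlinks, mail addresses, phone numbers) with placeholders. The regular expressions, the placeholder labels and the function name `anonymise` are assumptions made for this example, not the DiDi keys or method.

```python
import re

# Hypothetical patterns and placeholders; NOT the DiDi anonymisation keys.
# Order matters: URLs are replaced first so that later patterns never see them.
PATTERNS = [
    (re.compile(r"https?://\S+|www\.\S+"), "[URL]"),
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[MAIL]"),
    (re.compile(r"\+?\d[\d /-]{6,}\d"), "[PHONE]"),
]

def anonymise(text: str) -> str:
    """Replace every match of each pattern with its placeholder."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

example = ("Schreib mir an max@example.org oder ruf an: +39 0471 123456, "
           "Infos auf https://example.org/x")
print(anonymise(example))
```

    Simple patterns like these only work for categories with a regular surface form; person, group or place names generally need name lists or manual annotation.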

    The corpus offers a wide range of research opportunities for linguists who are interested in CMC in general and, more specifically, in multilingual language use, the use of regional varieties, and code-switching, code-shifting and code-mixing phenomena.

    Access to the DiDi corpus via ANNIS: https://commul.eurac.edu/annis/didi

    Corpus download via Eurac Research Clarin Centre: https://clarin.eurac.edu/

    Publications
    Dialektale Sprachrealitäten über CMC-Korpora erleben: Das DiDi-Korpus zur internetbasierten Kommunikation aus Südtirol im DaZ-Unterricht
    Glaznieks A, Frey JC (2023)
    Journal article
    Korpora Deutsch als Fremdsprache

    https://doi.org/10.48694/kordaf.3839

    Das DiDi‐Korpus: Internetbasierte Kommunikation aus Südtirol
    Glaznieks A, Frey JC (2020)
    Contribution in book
    Deutsch in Sozialen Medien

    https://doi.org/10.1515/9783110679885-019

    https://hdl.handle.net/10863/15720

    Using Data Mining to Repurpose German Language Corpora. An evaluation of data-driven analysis methods for corpus linguistics
    Frey J (2020)
    PhD thesis

    https://hdl.handle.net/10863/17321

    How FAIR are CMC Corpora?
    König A, Frey JC, Stemle EW (2019)
    Presentation/Speech

    Conference: 7th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora19) | Cergy-Pontoise | 9.9.2019 - 10.9.2019

    https://hdl.handle.net/10863/11295

    Comparison of Automatic vs. Manual Language Identification in Multilingual Social Media Texts
    Frey JC, Stemle E, Doğruöz AS (2019)
    Contribution in book
    Building computer-mediated communication corpora for socio-linguistic analysis

    https://hdl.handle.net/10863/10130

    How FAIR are CMC corpora?
    Frey JC, König A, Stemle E (2019)
    Conference proceedings article

    Conference: 7th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora19) | Cergy-Pontoise | 9.9.2019 - 10.9.2019

    More information: https://cmccorpora19.sciencesconf.org/data/pages/proceedings ...

    https://hdl.handle.net/10863/11294

    Das DiDi-Korpus: internetbasierte Kommunikation aus Südtirol
    Frey J, Glaznieks A (2019)
    Presentation/Speech

    Conference: 55. Jahrestagung des Instituts für Deutsche Sprache | Mannheim | 12.3.2019 - 14.3.2019

    https://hdl.handle.net/10863/13382

    DIDI - The DiDi Corpus of South Tyrolean CMC 1.0.0
    Frey JC, Glaznieks A, Stemle EW (2019)
    Database

    More information: http://hdl.handle.net/20.500.12124/7

    The myth of the Digital Native? Analysing language use of different generations in Facebook
    Frey JC, Glaznieks A (2018)
    Conference proceedings article

    Der plurilinguale Sprecher in Facebook. Neue Medien und Pluriliteracy in Südtirol
    Frey JC (2018)
    Presentation/Speech

    Conference: 4th LRI Workshop for young academics "Language Policy - Language Use - Language Standard" | Meran | 7.6.2018 - 8.6.2018

    Becoming a multilingual speaker. New Media and pluriliteracy in South Tyrol
    Frey JC (2018)
    Presentation/Speech

    Conference: Round table "Social Net(work)s in Education and Language Sciences" | Heidelberg | 15.6.2018 - 15.6.2018

    Pluriliteracy on Social Media. The Multilingual Practices of South Tyroleans on Facebook
    Frey JC (2018)
    Presentation/Speech

    Conference: Language, Identity and Education in Multilingual Contexts | Dublin | 2.2.2018 - 4.2.2018

    The myth of the Digital Native: Analysing language use of different generations on Facebook
    Frey JC, Glaznieks A (2018)
    Presentation/Speech

    Conference: 6th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora18) | Antwerp | 17.9.2018 - 18.9.2018

    Sociolinguistic research using the DiDi corpus of South Tyrolean CMC: From corpus-based research designs to computational linguistic challenges
    Frey CF, Stemle EW, Glaznieks A (2018)
    Presentation/Speech

    Conference: 44. Österreichische Linguistiktagung 2018 (ÖLT2018) | Innsbruck | 26.10.2018 - 28.10.2018

    Experteninterview: Wie viel "Emojion" verträgt unsere Sprache?
    Abel A, Frey JC (2018)
    Newspaper
    Zett: Die Zeitung am Sonntag

    Dialekt als Norm? Zum Sprachgebrauch Südtiroler Jugendlicher auf Facebook
    Glaznieks A, Frey JC (2018)
    Contribution in book
    Jugendsprachen/Youth Languages: Aktuelle Perspektiven internationaler Forschung/Current Perspectives of International Research

    https://doi.org/10.1515/9783110472226-038

    https://hdl.handle.net/10863/7699

    The Myth of the Digital Native: Analysing language use of different generations on Facebook
    Frey JC, Glaznieks A (2018)
    Conference proceedings article

    Conference: 6th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora18) | Antwerp | 17.9.2018 - 18.9.2018

    More information: https://www.uantwerpen.be/images/uantwerpen/container49896/f ...

    https://hdl.handle.net/10863/8093

    Think Global, Write Local – Patterns of Writing Dialect on SNS
    Glaznieks A (2017)
    Presentation/Speech

    Geschriebener Dialekt in Südtiroler Facebooktexten
    Glück A, Glaznieks A (2017)
    Presentation/Speech

    A data mining approach to digital age
    Frey J (2017)
    Forlì
    Presentation/Speech

    Conference: DIT Postgraduate Research Workshop | Forlì | 6.7.2016 - 6.7.2016

    Think Global, Write Local: Patterns of Writing Dialect on SNS
    Glaznieks A (2017)
    Conference proceedings article

    https://doi.org/10.5281/zenodo.1041851

    https://hdl.handle.net/10863/7939

    Proceedings of the 5th Conference on CMC and Social Media Corpora for the Humanities
    Stemle E, Wigham C (2017)
    Bolzano: Eurac Research
    Edited book

    More information: https://zenodo.org/record/1040875

    https://doi.org/10.5281/zenodo.1040875

    https://hdl.handle.net/10863/6510

    Connecting Resources: Which Issues have to be Solved to Integrate CMC Corpora from Heterogeneous Sources and for Different Languages?
    Beißwenger M, Wigham CR, Etienne C, Fišer D, Suárez HG, Herzberg L, Hinrichs E, Horsmann T, Karlova-Bourbonus N, Lemnitzer L, Longhi J, Lüngen H, Ho-Dac L, Parisse C, Poudat C, Schmidt T, Stemle E, Storrer A, Zesch T (2017)
    Bolzano, Italy
    Conference proceedings article
    Proceedings of the 5th Conference on CMC and Social Media Corpora for the Humanities

    More information: https://zenodo.org/record/1041877

    https://doi.org/10.5281/zenodo.1041877

    https://hdl.handle.net/10863/7942

    DiDi Corpus
    Stemle EW (2017)
    Duisburg, Germany
    Presentation/Speech

    Conference: Integrating a new type of language resource into the Digital Humanities landscape| French-German colloquium on standards for corpora of computer-mediated communication | Duisburg : 19.6.2017 - 20.6.2017

    More information: https://sites.google.com/view/dhcmc2017/

    https://hdl.handle.net/10863/9186

    Mehrsprachigkeit auf Südtirols Social-Media-Profilen
    Frey J (2016)
    Bozen/Bolzano
    Presentation/Speech

    Conference: Work in Progress Linguistics Colloquium Eurac Research/Free University of Bolzano | Bozen | 11.6.2015 - 11.6.2015

    The DiDi Corpus of South Tyrolean CMC Data: A multilingual corpus of Facebook texts
    Frey J, Glaznieks A, Stemle EW (2016)
    Naples
    Presentation/Speech

    Conference: Third Italian Conference on Computational Linguistics (CliC-it 2016) | Naples | 5.12.2016 - 6.12.2016

    DiDi: A multilingual corpus of non-public South Tyrolean computer-mediated communication
    Frey J (2016)
    Lancaster
    Presentation/Speech

    Conference: UCREL Summer School in corpus-based NLP | | 10.7.2016 - 15.7.2016

    The DiDi Corpus of South Tyrolean CMC Data: A multilingual corpus of Facebook texts
    Frey J, Glaznieks A, Stemle EW (2016)
    Naples
    Conference proceedings article

    Conference: Third Italian Conference on Computational Linguistics (CliC-it 2016) | Naples | 5.12.2016 - 6.12.2016

    More information: http://ceur-ws.org/Vol-1749/paper27.pdf

    https://hdl.handle.net/10863/8949

    "Bitte deutsch schreiben!" Multilingual and diglossic - a linguistic description of South Tyrolean Facebook users
    Glaznieks A, Frey JC (2015)
    Presentation/Speech

    Conference: Multilingualism in the Digital Age | Reading | 19.6.2015 - 19.6.2015

    The DiDi Corpus of South Tyrolean CMC Data
    Frey J, Glaznieks A, Stemle EW (2015)
    Essen
    Presentation/Speech

    Conference: 2nd Workshop of the Natural Language Processing for Computer-Mediated Communication / Social Media| NLP4CMC at GSCL 2015 | Essen : 28.9.2015 - 29.9.2015

    The DiDi Project: Collecting, Annotating, and Analysing South Tyrolean Data of Computer-mediated Communication.
    Stemle EW (2015)
    Rennes
    Presentation/Speech

    Conference: ird-cmc-rennes | International Research Days: Social Media and CMC Corpora for the eHumanities | Rennes : 23.10.2015 - 24.10.2015

    More information: http://ird-cmc-rennes.sciencesconf.org/

    https://hdl.handle.net/10863/9187

    The DiDi Corpus of South Tyrolean CMC Data
    Frey J, Glaznieks A, Stemle EW (2015)
    Essen
    Conference proceedings article

    Conference: 2nd Workshop of the Natural Language Processing for Computer-Mediated Communication / Social Media| NLP4CMC at GSCL 2015 | Essen : 28.9.2015 - 29.9.2015

    https://hdl.handle.net/10863/8928

    Zum Projekt DiDi - Digital Natives - Digital Immigrants
    Frey J (2014)
    Bozen/Bolzano
    Radio-TV

    Wie schreibt Südtirol auf Facebook?
    Frey JC (2014)
    Presentation/Speech

    Conference: 1. LRI Workshop "Sprache - Region - Identität in der computervermittelten Kommunikation | Meran | 13.6.2014 - 14.6.2014

    Code-Switching on Facebook Wall Posts of Bilingual German-speaking South Tyroleans
    Stuckey N, Frey J (2014)
    Vienna
    Presentation/Speech

    Conference: 41. Österreichische Linguistiktagung (ÖLT 2014), Universität Wien | Vienna | 6.12.2014 - 8.12.2014

    Collecting language data of non-public social media profiles
    Frey J, Glaznieks A, Stemle EW (2014)
    Hildesheim
    Presentation/Speech

    Conference: Workshop “NLP 4 CMC| Natural Language Processing for Computer-Mediated Communication / Social Media” at the 12th edition of KONVENS | Hildesheim : 8.10.2014 - 10.10.2014

    Collecting language data of non-public social media profiles
    Frey J, Stemle EW, Glaznieks A (2014)
    Hildesheim: Universitatsverlag Hildesheim, Germany
    Conference proceedings article

    Conference: Workshop “NLP 4 CMC| Natural Language Processing for Computer-Mediated Communication / Social Media” at the 12th edition of KONVENS | Hildesheim : 8.10.2014 - 10.10.2014

    More information: http://www.uni-hildesheim.de/konvens2014/data/konvens2014-wo ...

    https://hdl.handle.net/10863/8891

    The Project DIDI. Writing on Social Network Sites – A Corpus-based Observation of the Current Language Use in South Tyrol, with Particular Consideration of the Writers' Age
    Glaznieks A, Stemle EW (2013)
    Dortmund
    Presentation/Speech

    Conference: International Workshop "Building Corpora of Computer-Mediated Communication| Issues, Challenges, and Perspectives" | Dortmund : 14.2.2013 - 15.2.2013

    Herausforderungen bei der automatischen Verarbeitung von dialektalen IBK-Daten
    Glaznieks A, Stemle EW (2013)
    Darmstadt
    Presentation/Speech

    More information: https://www.researchgate.net/publication/259344920_Herausfor ...

    Our partners
    • Südtiroler Kulturinstitut

    Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution 4.0 International license.