The LST Guide to Your Linguistic Career

2009 Fieldwork Workshop Department of English, National Taiwan Normal University April 11, 2009

Field Data Management Hsiu-chuan Liao National Tsing Hua University

Linguistic Project: Processes
-- recording: media (audio, video, image) and text
-- capture: the encoding and transfer of an analogue recording (as on a cassette or reel-to-reel tape) or text written on paper to the digital domain as a computer file
-- analysis: transcription, translation, annotation, and notation of metadata
-- archiving: creating archived objects, and assigning access and usage rights
-- mobilization: publication and distribution of the material in various forms

Fieldwork: Things to be Noted-I

Fieldwork: Things to be Noted-II

Never record in a compressed format (e.g. MP3). Never record direct to computer hard disk.
-- Such techniques risk irrecoverable data loss.

Fieldnotes should be written in ball-point pen (not pencil and not washable ink!).
-- Ball-point pens seem to be more durable because they are smudge-proof: washable ink will bleed when the paper gets wet, and pencil smudges if you carry your notebook around a lot and the pages rub against each other.

Make note of abbreviations and symbols used in the front of the notebook.

Write on one side of the page only.
-- Use the other side for notes, later corrections, and cross-references.

Start each section of notes with the date, time and place, and speaker.

Do not write on every line.
-- Leave blank lines so that you have room to cross things out or write glosses above words in your notes.

Fieldwork: Things to be Noted-III

Fieldwork: Things to be Noted-IV

Make sure that pieces of paper are not lost.
-- Avoid the temptation of writing vocabulary items on the backs of envelopes, or at the very least stick the envelope into a notebook or copy the information to your fieldnotes as soon as possible.

Go over your notes after the session. Add anything that you remember from the session that isn’t in the notes, and annotate your fieldnotes with hypotheses about the data, questions, comments, and notes on cross-referencing.

Always use a notebook with a binding:
(a) a hardback notebook: durable and stable binding, but quite heavy
(b) a spiral-bound (A5, 100-page) notebook with cardboard or plastic covers: easily balanced on one’s knee and not too heavy

-- Use highlighters or post-it notes and page markers (although bear in mind that highlighters aren’t very archivally friendly) or different colored pens (but that information will be lost when you photocopy your notes).
-- Tag comments in your database electronically (this is preferable).


Fieldwork: Things to be Noted-V
At the very least you will want to organize your data so that you can find things in it later on. Therefore, it is worth investing some time and thought in how you organize your fieldnotes.
-- Don’t rely on your memory; in six months you won’t remember what’s in which recording, or what a particular question mark meant. Did it mean that the gloss is suspicious, or that you aren’t quite sure of the transcription, or that you want to check that the word is in your main database?
-- The media should be durable. It would be a disaster if you lost everything because someone spilt a drink on your laptop, or you lost the piece of paper with your coding system.

Make a Backup-I
Make backup copies of your recordings as soon as possible.
-- This goes for audio recordings and any other files (e.g. your database where you keep all the information about what is in which recording).
Make sure that the backup worked (i.e. that the copy did not become corrupted).
Never directly edit your original recordings.
-- You could accidentally delete part of the recording, or resample it, or a file could become corrupted.

Make a Backup-II
As soon as the media is recorded, make sure you cannot accidentally write over it. When transferring files, make sure that you always know which folder holds the original version and which holds the updated version. Make sure you have audition sheets for the recordings you have made and that your recordings are labeled in an informative way.

Make a Backup-III
There are several ways of backing up your recordings and files. At least two different backups (stored in different places) are recommended.
-- Portable hard drives are not very expensive.
-- Data DVDs are an efficient way of backing up large amounts of sound data.
-- CDs are also good.

Make a Backup-IV
How often should you make backups of your data?
-- Back up your transcription files each day in the field, and burn CDs every three or four days (depending on how much transcription you have done).
-- Back up audio files as soon as they are transferred to computer, and post the backups every few weeks to your postal (snail mail) address.
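One way to check that a backup worked is to compare checksums of the original file and the copy. The following is a minimal Python sketch under stated assumptions, not a prescribed procedure: the function names sha256_of and backup_and_verify are made up for illustration, and SHA-256 is just one reasonable choice of checksum.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(original: Path, backup_dir: Path) -> Path:
    """Copy `original` into `backup_dir` and confirm the copy is identical."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    copy = backup_dir / original.name
    shutil.copy2(original, copy)  # copy2 also preserves timestamps
    if sha256_of(original) != sha256_of(copy):
        raise IOError(f"Backup of {original.name} does not match the original")
    return copy
```

The original recording is only ever opened for reading here, which is in keeping with the advice never to edit original recordings directly.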

Capture-I
Capture: the encoding and transfer of an analogue recording (as on a cassette or reel-to-reel tape) or text written on paper to the digital domain as a computer file.
When using digital capture software, it is important to make sure you use appropriate settings.
It is advisable to transfer fieldnotes from notebooks to computer files, ideally as soon as possible after recording, so you do not forget notes, abbreviations, and comments.


Capture-II
As with recording, it is imperative to name your computer files consistently and clearly; do not rely on directory structure to disambiguate file names.
-- Different naming schemes can be used, but clarity and transparency are the goal.
It is also essential to record the relevant metadata for the data files you create as you make them, ideally in a structured way such as a relational table using standard terminology.
-- metadata: data about data, i.e. structured information about events, recordings, and data files.
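As a rough illustration of metadata kept in a relational table, here is a small Python sketch using the standard sqlite3 module. The table name, column names, and helper functions are assumptions chosen for this example; they do not follow any particular metadata standard, and a real project would adapt them to the terminology its archive expects.

```python
import sqlite3

# Hypothetical schema: one row per recording, keyed by its unique label.
SCHEMA = """
CREATE TABLE IF NOT EXISTS recording (
    recording_id TEXT PRIMARY KEY,   -- unique label, e.g. '20090307a'
    recorded_on  TEXT NOT NULL,      -- date of the session (ISO format)
    place        TEXT NOT NULL,
    speaker      TEXT NOT NULL,
    language     TEXT NOT NULL,
    file_name    TEXT NOT NULL UNIQUE,
    contents     TEXT                -- short summary of what is on the recording
);
"""


def open_catalogue(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the catalogue and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_recording(conn: sqlite3.Connection, **fields) -> None:
    """Insert one recording; every column must be given as a keyword."""
    conn.execute(
        "INSERT INTO recording "
        "(recording_id, recorded_on, place, speaker, language, file_name, contents) "
        "VALUES (:recording_id, :recorded_on, :place, :speaker, :language, "
        ":file_name, :contents)",
        fields,
    )
    conn.commit()


def recordings_by_speaker(conn: sqlite3.Connection, speaker: str) -> list[str]:
    """Return the labels of all recordings made with one speaker."""
    rows = conn.execute(
        "SELECT recording_id FROM recording WHERE speaker = ? ORDER BY recording_id",
        (speaker,),
    )
    return [row[0] for row in rows]
```

Because recording_id is the primary key and file_name is declared UNIQUE, the database itself rejects a second recording with the same label or file name, so uniqueness does not depend on directory structure.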

Ways of Organizing Fieldnotes
The most useful system will depend on how you organize your field sessions.
-- Use several notebooks: one for elicitation, one for transcription of texts, and another for miscellaneous queries.
-- Use a single notebook for everything.

Label the Recordings
Whatever system you use, each recording should be uniquely identified.
-- If you are using reusable media, such as compact flash cards, each session (or ‘episode’) might get its own number.
Whatever the system, stick to it and document it.

Structure of the Project
Work out a basic directory structure before you leave for the field. That way, you will not be trying to organize your files at the same time that you are collecting your first data.
Consider the directory or file structure that you will use: where will you keep all the different files that you will be creating?
(a) Keep all files in a single directory.
(b) A set of directories is preferable: have different directories for audio files; transcriptions; budget forms and other reports; secondary analysis such as articles, assignments, or your dissertation; lexicon files; or other categories of this type that you find useful.
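A set of directories like the one in option (b) can be created in one step before the trip. This is only a sketch: the directory names in PROJECT_DIRS and the function name create_project_tree are hypothetical and should be replaced by whatever categories suit the project.

```python
from pathlib import Path

# Hypothetical directory names, one per category of file.
PROJECT_DIRS = ("audio", "transcriptions", "lexicon", "reports", "analysis")


def create_project_tree(root: Path) -> list[Path]:
    """Create one sub-directory per category under `root` and return them."""
    created = []
    for name in PROJECT_DIRS:
        directory = root / name
        # exist_ok=True makes the call safe to repeat on an existing tree.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created
```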

Organization of the Recordings
Record a summary of what is on the track at the end of the day. Do an audition sheet for your recordings at the end of each day.
Have some sort of numbering system:
-- Number each tape or recording through your career (e.g. 1-1,000).
-- Number by collection: each fieldtrip has a number (1.1-10, 2.1-100, etc.).
-- Number sequentially by language worked on (Tag1-100, Ilk1-100).
-- Use the date of recording (20090307a, 20090307b, 20090411).
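The date-based scheme is easy to generate automatically. The sketch below is one possible reading of it, with an invented function name: it assumes every label is the recording date as YYYYMMDD followed by a letter, so the first recording of a day also gets a suffix (20090411a rather than the bare 20090411 shown in the list).

```python
from datetime import date
from string import ascii_lowercase


def next_recording_id(recorded_on: date, existing_ids: set[str]) -> str:
    """Return the first unused label of the form YYYYMMDD plus a letter."""
    stem = recorded_on.strftime("%Y%m%d")
    for letter in ascii_lowercase:
        candidate = stem + letter
        if candidate not in existing_ids:
            return candidate
    # Assumption: 26 recordings per day is the limit of this simple scheme.
    raise ValueError(f"More than 26 recordings on {stem}; extend the scheme")
```

Whichever numbering system is chosen, generating labels from a single function rather than by hand helps keep the scheme consistent.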

Other Recordings
Decide how previous linguists’ recordings, radio recordings, or recordings made by speakers themselves will be incorporated into your cataloguing system (or whether they will be).
-- Will they get the same numbering system as your own recordings? If so, how will you keep them separate?
-- If they are kept separate, will you be able to find things on them?


Software for Data Processing-I
Three most important aspects of fieldwork software:
(a) You must be able to get your data into the program easily.
(b) You need to be able to find things in your data easily.
(c) You need to be able to get data out of the database: in producing reports for the language community, getting examples out of your database and into the text of your reference grammar, and when converting between programs.

Software for Data Processing-II
Some software programs are easier to use than others. If you find mastering new programs difficult, use one that is on the easier end.
-- There is no point in using a program with multiple capabilities if you know you won’t be using any of them.
Minimize the time you have to spend retyping or reentering data between programs.
-- Ideally you should be able to transcribe your tapes and then move data around, annotate it and include it in a final write-up without having to retype.

Tools for Linguistic Analysis and Processing-I
general purpose software: the user must design the data structures and can write application programs to manipulate the data and carry out various tasks.
-- MS Word and Excel, and FileMaker Pro.
-- Such software is powerful and flexible; however, it stores data in proprietary formats, which are not optimal for long-term storage and access.

Tools for Linguistic Analysis and Processing-II
specific purpose software: designed to be used for particular tasks.
-- Transcriber and EXMARaLDA (EXtensible MARkup Language for Discourse Annotation): for time-aligned audio annotations
-- Shoebox/Toolbox: for text and lexicon annotation
-- Praat: for speech analysis and annotation
-- ELAN: for audio and video annotation
-- IMDI Browser: for cataloguing and administration metadata

Ways to Organize Data-I
The simplest way to organize data is to type everything in a word processor. You could have one file for your fieldnotes, another for your lexicon, and a third for analysis. However, this approach is not recommended.
-- Formatting a dictionary completely by hand, as a Word document, with correct alphabetization, formatting and so on, would be very difficult to do.
-- Even if you want to do something as simple as displaying all the nouns in the data, you will have to go through the examples by hand.

Ways to Organize Data-II
A much better way to store your fieldnotes is in a database.
-- There are programs which let you build a dictionary and export the records in a consistent format to another program.
-- Even in the ‘old days’ before computerized database software, linguists used card files to organize their data before a dictionary typescript was produced.
-- A computer database allows you to do the same thing as the card files.
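To make the contrast with a word processor concrete, the sketch below shows how structured records let a task such as listing all the nouns be done in a few lines. It assumes plain-text records with backslash-coded field markers of the kind Toolbox uses, with \lx for the lexeme, \ps for the part of speech, and \ge for the English gloss. The sample entries and function names are illustrative only, and real Toolbox files have more structure than this handles.

```python
# Illustrative records only; each record starts with an \lx field.
SAMPLE = r"""
\lx aso
\ps n
\ge dog

\lx kain
\ps v
\ge eat

\lx bahay
\ps n
\ge house
"""


def parse_records(text: str) -> list[dict[str, str]]:
    """Split backslash-coded text into records; each lx marker starts a new one."""
    records: list[dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # skip blank lines and anything without a field marker
        marker, _, value = line[1:].partition(" ")
        if marker == "lx":
            records.append({})
        if records:  # ignore fields that appear before the first lx marker
            records[-1][marker] = value.strip()
    return records


def lexemes_with_pos(records: list[dict[str, str]], pos: str) -> list[str]:
    """Return the lexemes whose part-of-speech field equals `pos`."""
    return [record["lx"] for record in records if record.get("ps") == pos]
```

With the sample above, lexemes_with_pos(parse_records(SAMPLE), "n") picks out the two noun entries without anyone reading through the file by hand.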


Fieldwork Database Program
A very commonly used fieldwork database program is Shoebox/Toolbox. (The Toolbox program is downloadable from the following website: www.sil.org/computing/toolbox.)
It allows for the creation of a structured dictionary, semi-automatic interlinearization, fieldnote compilation, and other tools such as corpus searching and wordlist building.
Shoebox/Toolbox text annotation can be exported into rich text format (RTF) to be read by MS Word in order to produce presentation-format PDF documents for printing and distribution.

Interlinearizing
Providing interlinear glossing to texts adds a lot of value to your data.
(a) It makes them much easier for you and other linguists to use.
(b) It provides an implicit working out of your analysis of the language.

Processing Field Data-I
You need to consider what an ‘item’ is: data could be organized around ‘tracks’ or ‘episodes’ within a recording.
-- An episode might be a single session or a story within the session.
-- Each episode would be an item in your collection.
Giving related items the same file name makes associating data easier.
-- recording: 031509-01 (first session on March 15, 2009)
-- audio file: 031509-01.wav
-- transcription: 031509-01.eaf (ELAN)
-- Toolbox: 031509-01

Processing Field Data-II
It is very useful to be able to keep track of which pieces of raw data have been processed, particularly if you are not working on all sessions sequentially.
It’s a good idea to be able to keep track of analyses too.
-- Some people use a separate notebook to jot down ideas and notes about problems that need solving; others have a database; others note them in the fieldnotes themselves.
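When related items share a file name, a short script can report which sessions have been processed. The sketch below assumes that audio files end in .wav and ELAN transcriptions in .eaf, as in the 031509-01 example; the function names are invented for illustration.

```python
from collections import defaultdict
from pathlib import Path


def group_by_session(file_names: list[str]) -> dict[str, set[str]]:
    """Map each shared base name (e.g. '031509-01') to its file extensions."""
    sessions: dict[str, set[str]] = defaultdict(set)
    for name in file_names:
        path = Path(name)
        sessions[path.stem].add(path.suffix.lower())
    return dict(sessions)


def missing_transcriptions(
    file_names: list[str], audio_ext: str = ".wav", transcript_ext: str = ".eaf"
) -> list[str]:
    """Return sessions that have an audio file but no transcription yet."""
    sessions = group_by_session(file_names)
    return sorted(
        session
        for session, extensions in sessions.items()
        if audio_ext in extensions and transcript_ext not in extensions
    )
```

For instance, given 031509-01.wav, 031509-01.eaf and 031509-02.wav, the second function reports that 031509-02 still needs transcribing.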

Archiving-I
Deposit your field recordings, original notes, audition sheets, and any secondary analyses (including conference papers) in an archive.
Also archive anything which is vital to the project and which might not be easy to recover if it’s lost.
-- This would include any fonts which are vital to the project.
Archive anything that you would not want to lose, and anything that cannot be easily recreated from other materials.

Archiving-II
Find out in advance what formatting requirements the archive has.
Make sure that computer files are archived in a format that is recoverable later.
Do your best to ensure that your documents are readable in the future. You should save a copy of your files as RTF, plain text, or HTML as well as in Word or other word processing formats.
Have a printout of important notes on acid-free paper that’s stored somewhere safe.


Data Formats in Different Contexts

         working                              archiving       presentation
text     Word, XLS, FMpro, Shoebox/Toolbox    XML             PDF, HTML
audio    WAV                                  WAV, BWF        MP3, WMA, RA
video    MPEG2                                MPEG2, MPEG4    QuickTime, AVI, WMV

On-line Resources
-- Leipzig Glossing Rules (http://www.eva.mpg.de/lingua/files/morpheme.html)
-- Linguistics computing resources on the internet (http://www.sil.org/linguistics/computing.html)
-- Typological tools for field linguistics (http://www.eva.mpg.de/lingua/tools-at-lingboard/tools.php)
-- Praat: doing phonetics by computer (http://fonsg3.hum.uva.nl/praat/)
-- WordCorr: A tool for comparative-historical linguists (http://www.wordcorr.org/)
-- On-line journal: Language Documentation & Conservation (LD&C) (http://www.nflrc.hawaii.edu/ldc/)
-- The Hans Rausing Endangered Languages Project (http://www.hrelp.org/languages/resources/orel/)

