Thursday, 18 January 2018

Using corpus analysis software to analyze specialized texts

1. What is a corpus?
A corpus (plural: ‘corpora’) can be broadly defined as ‘a collection of naturally-occurring texts in a computer-readable format which can be retrieved and analyzed using corpus analysis software’ (Kennedy, 1998; McEnery & Wilson, 2001; O’Keeffe, McCarthy, & Carter, 2007; Teubert & Čermáková, 2007).

Sources of language corpora

·       Subscribe to a large existing corpus such as the British National Corpus (BNC).

·       Use web concordancing.
·       Compile your own corpora and analyze the data using corpus analysis software:
§  AntConc (for monolingual corpora)
§  WordSmith (for monolingual corpora)
§  ParaConc (for multilingual corpora)

Designing a specialized corpus (based on Bowker and Pearson 2002)
·       Corpus size
§  There are no fixed rules; the size depends on the research purpose, the availability of data, and the time available.
·       Text extracts vs. full text
§  Depends on the aim of corpus compilation.
·       Number of texts
§  Depends on your research focus.
·       Medium
§  Can be spoken texts, written texts, or a mix of both, depending on the research questions.
·       Subject and text type
§  Should mainly focus on the specialized text under investigation.
§  Text type within a specialized subject field may vary from technical to popular texts.
·       Other considerations
§  Authorship: Texts written by experts in a field tend to be more reliable.
§  Language: Specialized texts can be stored and retrieved in the form of monolingual, comparable, or parallel corpora.
§  Publication date: Texts should come from recent publications unless the queries relate to a particular period of time.



Sources of specialized texts

·       Printed materials software

·       Word document texts
·       CD-ROMs
·       Texts on the web
·       Online databases

Getting started with AntConc

·       Download the latest version of AntConc.

·       Create a specialized corpus profile (adapted from Bowker and Pearson 2002: 72).
A sample profile
Size: 56,812 words
Source of corpus data: The internet (www.voanews.com)
Number of texts: 70
Medium: Written
Subject: News about South Korea
Text type: News articles
Authorship: Journalists
Language: English, written mostly by native speakers
Publication date: Recent texts (retrieved in September 2017)
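
The size and number-of-texts entries in a profile like this one can be checked with a few lines of code. The following is only a minimal sketch, not how AntConc itself counts words. It assumes the corpus is stored as one plain-text UTF-8 file per text in a single folder, and it uses a simple regular-expression tokenizer, so its totals may differ slightly from those reported by AntConc or WordSmith. The function names are illustrative.

```python
import re
from pathlib import Path

# Assumption: a "word" is a run of letters, optionally followed by an
# apostrophe and more letters (so "korea's" counts as one word).
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return TOKEN_RE.findall(text.lower())


def corpus_profile(folder):
    """Count the .txt files in a folder and the running words they contain."""
    files = sorted(Path(folder).glob("*.txt"))
    sizes = [len(tokenize(f.read_text(encoding="utf-8"))) for f in files]
    return {"texts": len(files), "words": sum(sizes)}
```

Calling `corpus_profile("voa_corpus")`, where `voa_corpus` is a hypothetical folder name, returns a dictionary with the number of texts and the total word count, which can then be entered in the profile.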

·       Do small-scale research on your own specialized corpora.
Using corpora to do research in English for Specific Purposes (ESP):
§  To identify frequent words or clusters in a specialized corpus.
§  To identify key words in a specialized corpus in comparison with a general corpus for syllabus design, materials development, or terminological studies.
§  To examine language patterning and phraseology of words in a specialized text.
§  To examine the meaning of specialized vocabulary.
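
The research uses listed above can be approximated in code to see what the software is doing behind the scenes. The sketch below is illustrative only and does not reproduce AntConc's implementation. It assumes a simple regular-expression tokenizer, counts clusters as runs of n consecutive words, scores key words with the log-likelihood statistic (one common keyness measure), and prints concordance lines in a basic keyword-in-context (KWIC) layout. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (simplified tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def clusters(tokens, n=2):
    """Count every run of n consecutive tokens (an n-word cluster)."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def log_likelihood(a, b, c, d):
    """Keyness of a word: a, b are its frequencies in the specialized and
    the reference corpus; c, d are the sizes of those two corpora."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the specialized corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a:
        ll += a * math.log(a / e1)
    if b:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(study, reference, top=10):
    """Rank words that are relatively more frequent in the study corpus."""
    s, r = Counter(study), Counter(reference)
    c, d = len(study), len(reference)
    scored = [
        (word, log_likelihood(freq, r[word], c, d))
        for word, freq in s.items()
        if freq / c > r[word] / d  # keep positive key words only
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top]


def kwic(tokens, node, width=4):
    """Return concordance lines showing `width` words on each side of node."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines
```

With `study = tokenize(...)` for the specialized corpus and `reference = tokenize(...)` for a general corpus, `Counter(study).most_common(20)` gives a frequency list, `clusters(study, 3).most_common(20)` gives three-word clusters, `keywords(study, reference)` gives candidate key words, and `kwic(study, "talks")` shows how a chosen word patterns in context.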
