Ministry of Education of the Republic of Belarus

Educational Institution

"F. Skorina Gomel State University"

Faculty of Philology

Course paper

Division of the sentence into phrases

Author:

Student of group K-42

Lapitskaya T.E.

Gomel 2005


Contents

 

Introduction

Presentation

Algorithm for division of the sentence into phrases

Lists used by Algorithm No 2

Some examples of the performance of Algorithm No 2

Conclusion

References

 


Introduction

 

In Text Processing and Machine Translation there is often a need to divide the sentence into smaller units that can be processed more easily than the whole sentence, especially when the sentence is long. To that end we have devised an efficient algorithm based on the assumptions presented in the next section.


Presentation

 

Before dividing the sentence into phrases, we must state how we define the phrase, that is, where it starts and where it ends. For the purposes of the present algorithm (and not for any other, especially theoretical, purposes) the phrase is delimited on its left and on its right by Punctuation Marks and Auxiliary words. The phrase usually starts with an Auxiliary word and ends with the appearance of a Punctuation Mark or an Auxiliary word.
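The delimiting rule can be illustrated with a minimal sketch. It assumes a tiny sample of Auxiliary words (the name `AUX` and the function `split_into_phrases` are hypothetical, not part of the paper's Algorithm No 2), treats any non-alphanumeric character as a Punctuation Mark, and lets consecutive Auxiliary words open a single phrase together. The sample sentence is the one used in the example later in this section.

```python
import re

# Assumption: a small sample of Auxiliary words taken from the paper's Lists.
AUX = {"that", "in", "the", "of", "through"}


def split_into_phrases(sentence, aux=AUX):
    """Cut a sentence at Punctuation Marks and before Auxiliary words."""
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    phrases, current = [], []
    for tok in tokens:
        if re.fullmatch(r"[^\w\s]", tok):
            # A Punctuation Mark closes the current phrase.
            if current:
                phrases.append(" ".join(current))
                current = []
        elif tok.lower() in aux and any(w.lower() not in aux for w in current):
            # An Auxiliary word after ordinary words starts a new phrase.
            phrases.append(" ".join(current))
            current = [tok]
        else:
            # Ordinary word, or an Auxiliary word directly following another one.
            current.append(tok)
    if current:
        phrases.append(" ".join(current))
    return phrases


sentence = ("One US government study estimated that there are 68 large "
            "manufacturing complexes in the region that have significant "
            "idle capacity,")
for phrase in split_into_phrases(sentence):
    print(phrase)
```

This sketch only finds the boundaries; it does not rank the phrases by independence.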

The Auxiliary words, marking the boundaries of the phrases, are presented in tables (Lists). Each table lists Auxiliary words of a particular type. It was observed that some Auxiliary words (as well as some sequences of consecutive Auxiliary words) usually start longer and more independent phrases than others. Consider, for example, the sentence: It is often difficult to seek solutions through the curtailment of consumption.

The Auxiliary word "through", followed by the Article "the" (another Auxiliary word), starts a phrase that ends with the appearance of a Punctuation Mark, while the Auxiliary word "of" starts a sub-phrase which is part of a longer phrase.

In our algorithm (see Algorithm No 2 in Section 3) the subdivision of the sentence into longer phrases, and of the longer phrases into smaller constituent phrases, is expressed by leaving spaces of different length between one phrase and the next. The longer the space left before a phrase, the more self-sufficient and independent the phrase is taken to be. In this study we have established five types of phrases, depending on their relative independence within the sentence. This independence is signalled by a particular Auxiliary word (or words) or by a Punctuation Mark. The longest, most self-sufficient and relatively independent phrase starts and ends with a Punctuation Mark. The second most independent phrase starts with a word from List No 1 and ends with a Punctuation Mark or with the appearance of another Auxiliary word from List No 1. For example:

(6 spaces left) One US government study estimated

(5 spaces left) that there are 68 large manufacturing complexes

(4 spaces) in the region

(5 spaces left) that have significant idle capacity, (end)

The full stop at the start of the sentence is equivalent to six spaces. A smaller space following a larger space to its left means that the phrase starting after the smaller space is dependent on, and a constituent of, the larger phrase. In the example above, the smaller space (4 spaces) shows that the phrase following it is dependent on the previous phrase, "that there are 68 large manufacturing complexes", and explains it (or adds information about it, here location). The five spaces left after "region" signify that the next phrase is dependent on the previous large phrase (the one with a longer space in front of it), in this case "One US government study estimated that there are 68 large manufacturing complexes".

The space left between the phrases depends on the actual Preposition (or Punctuation Mark) used or on the sequence of Punctuation Mark and/or Auxiliary words, as specified (for more details see the instructions for Algorithm No 2 below).
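The dependency rule described above can be sketched in a few lines. The sketch takes the space counts directly from the worked example (6, 5, 4, 5) and assumes one reading of the rule: a phrase depends on the nearest earlier phrase that has a strictly longer space in front of it. The names `parent_indices` and `render` are hypothetical.

```python
# (spaces, phrase) pairs copied from the worked example in this section.
example = [
    (6, "One US government study estimated"),
    (5, "that there are 68 large manufacturing complexes"),
    (4, "in the region"),
    (5, "that have significant idle capacity,"),
]


def parent_indices(phrases):
    """For each phrase, find the nearest earlier phrase with a longer space.

    Returns None for a phrase that no earlier phrase dominates.
    """
    parents = []
    for i, (spaces, _) in enumerate(phrases):
        parent = None
        for j in range(i - 1, -1, -1):
            if phrases[j][0] > spaces:
                parent = j
                break
        parents.append(parent)
    return parents


def render(phrases):
    """Lay the phrases out one per line, preceded by their spaces."""
    return "\n".join(" " * spaces + text for spaces, text in phrases)


print(render(example))
print(parent_indices(example))
```

Under this reading the last phrase attaches to the first one rather than to "in the region", which agrees with the explanation that it depends on the previous large phrase.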


Algorithm for division of the sentence into phrases

(i) Input text.

(ii) Comparing each word entry with the Auxiliary words or Punctuation Marks (presented in Lists) and identifying the Auxiliary words or Punctuation Marks.

(iii) Searching left or right (up to two words) for other Auxiliary words or Punctuation Marks.

(iv) Output result: a phrase.
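The identification and search steps of this flow might look as follows. This is a sketch under stated assumptions, not the 27-instruction algorithm itself: `MARKERS` is a small hypothetical sample of Auxiliary words, any single non-alphanumeric token counts as a Punctuation Mark, and "searching left or right (up to two words)" is read as inspecting the two tokens on each side of an identified marker.

```python
import re

# Assumption: a small sample of Auxiliary words from the Lists.
MARKERS = {"that", "in", "the", "of", "through", "and"}


def is_marker(token):
    """True for a sample Auxiliary word or a single punctuation character."""
    return token.lower() in MARKERS or re.fullmatch(r"[^\w\s]", token) is not None


def markers_with_neighbours(tokens, window=2):
    """List each marker with the other markers found within `window` tokens."""
    found = []
    for i, token in enumerate(tokens):
        if not is_marker(token):
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        neighbours = [tokens[j] for j in range(lo, hi)
                      if j != i and is_marker(tokens[j])]
        found.append((i, token, neighbours))
    return found


tokens = ["solutions", "through", "the", "curtailment", "of", "consumption", "."]
for position, marker, neighbours in markers_with_neighbours(tokens):
    print(position, marker, neighbours)
```

The neighbour list is what would allow a sequence such as "through the" to be treated differently from a lone "of"; how the paper's algorithm uses that information is not shown in this excerpt.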

Note: The algorithm (27 digital instructions in all) is available for free download on the Internet (see Internet Downloads at the end of the book).

Lists used by Algorithm No 2

NB The words not registered in the Lists are recorded as they follow, in the same sequence, after those registered in the Lists.

(i) List No 1: besides, therefore, however, whereas, thus, hence, though, despite, with, nevertheless, throughout, through, during, that, only, but, if, otherwise, again, which, although, thereby, already, against, unless, thereafter etc.

(ii) List No 2: over, as, what, toward(s), for, into, about, by, so, from, at, above, under, beside, below, onto, since, behind, in front of, beyond, around, before, after, then, altogether, among(st), between, beneath etc.

(iii) List No 3: both, neither, none etc.

(iv) List No 4: of, to (as Preposition)

(v) List No 5: the, a, an

(vi) List No 6: so much as, so far as, so far, as long as, as soon as, so long as, in order that, in order to, lest, as well as, and, or, nor etc.

(vii) List No 7: such, than, onto, until, all, near, even, when, while, within, last, next, also, less, more, most, whether, much, once, one, any, many, some, where, another, other, each, then, whose, who, whoever, till, until, what, across, whence, according, due to, owing, whereby, prior, wherever, whenever, already, moreover, likewise, however etc.

(viii) List No 8: out, in, on, down etc.
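For lookup purposes the Lists can be held in a simple table keyed by List number. The sketch below is abridged (only a few entries per List are copied) and the names `LISTS` and `lists_for` are hypothetical. Some entries, such as "however", "already", "then", "what" and "onto", appear in more than one List; the excerpt does not say how such cases are resolved, so the lookup returns every matching List number.

```python
# Abridged copies of the eight Lists; each List ends with "etc." in the paper.
LISTS = {
    1: ["besides", "therefore", "however", "that", "through", "already"],
    2: ["over", "as", "what", "for", "then", "onto", "in front of"],
    3: ["both", "neither", "none"],
    4: ["of", "to"],
    5: ["the", "a", "an"],
    6: ["as well as", "in order to", "and", "or", "nor"],
    7: ["such", "than", "onto", "what", "then", "already", "however"],
    8: ["out", "in", "on", "down"],
}


def lists_for(entry):
    """Return the numbers of all Lists that contain the word or word group."""
    key = " ".join(entry.lower().split())  # normalise case and spacing
    return [number for number, words in LISTS.items() if key in words]


print(lists_for("However"))
print(lists_for("in front of"))
print(lists_for("sentence"))
```

Multi-word entries such as "in front of" are stored as single strings, so a caller has to join candidate word groups before looking them up.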

Some examples of the performance of Algorithm No 2

Below we will present a text divided into phrases according to the instructions for the algorithm:

(i) Many countries also have established or have under construction a free zone, where exporters have access to shipping facilities, a pool of labour and freedom from exchange controls.

(ii) The Caribbean Basin Initiative, a US package of aid and trade incentives to encourage manufacturing, has given an added boost to industrial development in this region.

The analysis of the sentence starts with checking the contents of the memory and sending to print any information stored up to this moment (this is done at the start of each new sentence), and also with ascertaining whether the sentence has ended and recording the analysed word in the memory if it is not yet recorded (a procedure carried out after each word). ............
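The bookkeeping described in this paragraph can be sketched as a small buffer. Everything here is an assumption made for illustration: the class `PhraseMemory`, the use of token positions to decide whether a word is already recorded, and the treatment of ".", "!" and "?" as sentence ends. The `printed` list stands in for the printer.

```python
class PhraseMemory:
    """Hypothetical buffer for the words analysed so far."""

    def __init__(self):
        self.words = []         # words stored for the current sentence
        self.positions = set()  # token positions already recorded
        self.printed = []       # stands in for the printed output

    def flush(self):
        """Send any stored words to the output and empty the memory."""
        if self.words:
            self.printed.append(" ".join(self.words))
        self.words = []
        self.positions = set()

    def record(self, position, word):
        """Record a word unless that token position is already stored."""
        if position not in self.positions:
            self.positions.add(position)
            self.words.append(word)


def sentence_ended(token):
    """Assumption: a sentence ends at a full stop, '!' or '?'."""
    return token in {".", "!", "?"}


def analyse(tokens):
    memory = PhraseMemory()
    for position, token in enumerate(tokens):
        memory.record(position, token)
        if sentence_ended(token):
            # The next word opens a new sentence, so the memory is printed.
            memory.flush()
    memory.flush()  # print whatever remains after the last token
    return memory.printed


print(analyse(["It", "rains", ".", "We", "stay", "."]))
```

Flushing at a sentence end is equivalent here to flushing at the start of the next sentence, which is when the paragraph says the memory is checked.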




