Language: Difference between revisions
Latest revision as of 23:48, 31 May 2015
- This article is about systems of communication. For the capacity of language, see language (mass noun).
A language is a detailed system of communication through arrangement of symbolic elements. While the word is most commonly used to refer to a means of spoken or written communication between ellogous beings, it can also encompass the particular rules and symbols used to write computer programs. A language is also sometimes metonymically known as a "tongue". The study of languages is known as linguistics; a specialist in linguistics is a linguist.
Variations within a language are called dialects. Lines dividing dialects differing according to particular features are called isoglosses. Generally, the defining factor between a language and a dialect is mutual comprehensibility: if two speakers can understand each other, they are speaking different dialects of the same language; if they cannot, they speak different languages. There are some circumstances under which this definition proves problematic, as in linguistic continua, sequences of dialects in which each dialect is mutually comprehensible with the next in the sequence, but the two dialects at the ends of the sequence are mutually incomprehensible. Such phenomena notwithstanding, the definition of a language by mutual comprehensibility remains widely used, though recognized as entailing some questionable cases. Taking the matter further, no two people necessarily attach exactly the same meanings to the words or phrases they use, or construct their linguistic expressions in exactly the same way, which means that, in a sense, each speaker of a given language has his own personal dialect, called an idiolect.
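The difficulty with linguistic continua can be made concrete with a small sketch. The Python below is only an illustration under invented assumptions: five hypothetical dialects labeled A through E are placed at made-up positions along a line, and two dialects are assumed to be mutually comprehensible when their positions differ by at most one step. Grouping dialects into "languages" by chaining pairwise comprehensibility then puts all five in one language, even though the two ends cannot understand each other.

```python
# Hypothetical dialect chain; the labels, positions, and threshold are
# invented for illustration and do not describe any real language.
POSITIONS = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4}
MAX_DISTANCE = 1


def mutually_comprehensible(x: str, y: str) -> bool:
    """Assume two dialects are comprehensible when they are close in the chain."""
    return abs(POSITIONS[x] - POSITIONS[y]) <= MAX_DISTANCE


def group_into_languages(dialects):
    """Group dialects by chaining pairwise comprehensibility together."""
    groups = []
    for d in dialects:
        # Every existing group containing a dialect comprehensible with d
        linked = [g for g in groups if any(mutually_comprehensible(d, m) for m in g)]
        merged = {d}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return groups


languages = group_into_languages(sorted(POSITIONS))
print(len(languages))                      # number of "languages" found
print(mutually_comprehensible("A", "E"))   # the two ends of the chain
```

Because mutual comprehensibility is not transitive, no grouping of this chain is fully satisfactory: either A and E end up in the same language despite being mutually incomprehensible, or the chain must be cut between two dialects that do understand each other.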
The language that a person learns in his childhood is called his native language. Children in multilingual households—that is, households where two or more languages are commonly spoken—may have more than one native language. Languages learned later in life are referred to as second languages, no matter how many there are.
Many languages are panyparic, having arisen apparently independently on many worlds, albeit not necessarily called by the same name. Like that of other panypares, the existence of these panyparic languages is assumed to stem from a poorly understood principle of ontological resonance.
Elements of language
In order for languages to convey unlimited shades and permutations of meanings, linguistic expressions are necessarily built up from base components according to schemata directing how they fit together, and what distinct meanings different combinations convey. The fundamental elements of a language, therefore, are these basic semantic components, and the mechanisms of connecting them. These two factors are not truly independent, however, since the components may change depending on their function in the broader context, and the chosen components may constrain the possible structure of the expression.
Vocabulary
A language comprises many fundamental units of meaning called morphemes. These morphemes may inflect or combine to form words. Some words may consist of a single morpheme (such as "some"); others may be collections of several (such as "collections", which combines the morphemes "collect", "-ion", and "-s"; etymologically "collect" itself originally comprised two separate morphemes, but whether it continues to do so in modern English is debatable). Morphemes that attach to the ends of other morphemes are called "suffixes"; those that attach to the beginnings are called "prefixes". Jointly, suffixes and prefixes are called "affixes", while the "core" part of the word that the affixes attach to is called the "stem". (This is a bit of a simplification; there are a few more subtleties and distinctions involved, and there may also be intermediate levels called roots.) Collectively, the words of a language are known as its vocabulary.
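As a rough illustration of how a word decomposes into a stem and affixes, the following Python sketch splits a word using tiny hand-written affix lists. The lists, the stem set, and the `segment` function are all hypothetical and chosen only to reproduce the "collections" example; real morphological analysis has to deal with spelling changes, ambiguity, and the subtleties mentioned above.

```python
from typing import List, Optional, Set

# Toy affix inventories, invented for this sketch.
PREFIXES = ("un", "re")
SUFFIXES = ("s", "ion", "ed", "ing")


def segment(word: str, stems: Set[str]) -> Optional[List[str]]:
    """Split a word into prefixes + stem + suffixes, or return None if no split works."""
    if word in stems:
        return [word]
    for prefix in PREFIXES:
        if word.startswith(prefix):
            rest = segment(word[len(prefix):], stems)
            if rest is not None:
                return [prefix + "-"] + rest
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            rest = segment(word[:-len(suffix)], stems)
            if rest is not None:
                return rest + ["-" + suffix]
    return None


known_stems = {"collect", "some"}
print(segment("collections", known_stems))  # ['collect', '-ion', '-s']
print(segment("some", known_stems))         # ['some']
```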
Over time, as a language evolves, some new words and morphemes will arise, while old ones pass out of use and are largely forgotten. It frequently happens, however, that such a morpheme that has been otherwise forgotten is still preserved in one or two particular words, or an otherwise forgotten word remains in one or two particular phrases. This phenomenon is called fossilization. Examples of linguistic fossils in English include the word "fro", now found only in the phrase "to and fro", and "shrift", not now used outside the phrase "short shrift". Many other examples exist.
Parts of speech
Words are conventionally categorized into parts of speech depending on their usual function in a sentence. Words that stand for objects or concepts are called nouns; words that describe actions (generally actions taken by an agent specified by a noun) are called verbs. Words that modify nouns are adjectives; words that modify verbs are adverbs (most of which can also modify adjectives and other adverbs). Pronouns stand in for unspecified nouns. Conjunctions join parallel elements, while prepositions address the relations between elements. Articles introduce nouns under certain circumstances, and interjections are exclamations that generally stand by themselves.
Not all languages, however, have the same structure, and trying to shoehorn a rigid system of classification onto a language that it doesn't fit may lead to a drastic misunderstanding of the language. The parts of speech listed above are the ones conventionally used for English, but may not fit other languages well. For that matter, they may not even fit English all that well; with the exception of articles, these parts of speech were originally defined with reference to Latin, at a time when Latin was considered the paragon of all languages and the medium of learning. Overzealous attempts to make English conform to Latin norms have led to the propagation of a number of spurious grammar rules with no basis in the real history or usage of the English language, which sometimes continue to be taught by well-meaning instructors ignorant of their artificiality. So has arisen the unfounded idea that it's ungrammatical to end an English sentence in a preposition (because a preposition couldn't end a sentence in Latin), or to split an infinitive (because Latin infinitives were single words and couldn't be split).
Grammar
Vocabulary alone does not make up a language, though it seems doubtful that a language could exist without it. Also of importance is the way the words are put together. The system of rules and patterns for putting words together into meaningful expressions is called the grammar of a language.
Laymen sometimes use the word "grammar" to refer to all the rules concerning the language, including the orthography of individual words. To a linguist, grammar is usually considered to be separate from orthography, but is still a complex topic. At the very least, it involves both the arrangement of words—syntax—and the construction of words and their alteration under certain contexts—morphology. Sometimes grammar is also construed to include other aspects such as phonetics, phonology, and semantics.
The grammatical analysis of a spoken or written expression to extract its meaning is called "parsing". Humans and other linguistically capable beings seem to have some built-in capacity for parsing language, and in most cases are able to understand sentences and resolve ambiguities without conscious effort. This is not to say that all speakers of a given language do so with perfect grammar, of course, but for the most part they have little difficulty understanding or being understood, even if their language may not be perfectly "grammatically correct" according to widespread usage or the accepted authorities. Parsing is a complex problem, however, and one with which writers of artificial parsers on computers have had only very limited success. Some computer programs have been created that can understand relatively complex expressions within limited regimes, or that fool their users into thinking they understand more than they do by searching for keywords and echoing back parts of the users' own expressions without real comprehension. While progress continues to be made, computer programs approaching human levels of linguistic parsing are still in the future.
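The keyword-based trick described above can be sketched in a few lines of Python. This is a minimal illustration, not a reconstruction of any particular program: the rules, the reply templates, and the `respond` function are invented here. The sketch looks for a keyword pattern and echoes part of the input back inside a canned reply, without parsing the sentence at all.

```python
import re

# Hypothetical keyword rules: (pattern, reply template). Invented for this sketch.
RULES = [
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bI want (.+)", re.IGNORECASE), "What would it mean to you to get {0}?"),
    (re.compile(r"\bmother\b", re.IGNORECASE), "Tell me more about your family."),
]
FALLBACK = "Please go on."


def respond(utterance: str) -> str:
    """Return a canned reply that echoes part of the input; no real parsing occurs."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Captured text (if any) is pasted into the template unchanged.
            return template.format(*match.groups())
    return FALLBACK


print(respond("I feel tired today."))   # Why do you feel tired today?
print(respond("The weather is nice."))  # Please go on.
```

The apparent understanding breaks down as soon as the input falls outside the keyword list or the echoed fragment needs rephrasing (for example, "I feel my brother hates me" would come back with "my" where "your" is needed), which is why such programs work only within narrow limits.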
Channels
Language can be conveyed through a number of different channels. Most means of linguistic communication, however, can be divided into two general categories: transient and fixed. The former comprises linguistic expressions that arrive in sequences of ephemeral morphemes that must be interpreted as they are detected. The most commonly encountered transient language is spoken language, verbal utterances conveying meaning through sound, although transient languages may also be somatic (such as sign languages) or, in principle, may be conveyed through other means such as scent or variations in an electric field. A fixed language is a linguistic expression that is recorded in some static form, generally allowing perusal of the expression at any speed or in any order, although in practice fixed language is still usually decoded sequentially. The most common kind of fixed language is written language, comprising symbols drawn, etched, or otherwise committed onto material substrates for later visual decipherment. However, fixed language may take other forms, such as tactile symbols or perhaps artificially implanted memories. (Nothing prevents an expression in writing or another form of fixed language from being displayed a bit at a time and erased, in such a way as to render it as fleeting as one in a transient language: one common example is that of a scrolling marquee that presents only a few letters at a time of a much longer expression. Still, the fact that writing can, in principle, be set down in a fixed form qualifies it as a type of fixed language, even though particular examples of writing may appear transient.)
Spoken language
The basic units of spoken language are individual sounds or sets of similar sounds called phonemes, several of which may combine together into a morpheme. In human language, phonemes are conventionally divided into vowels and consonants, the former comprising sounds produced by an unobstructed vocal tract, and the latter by a vocal tract which is at some point constricted or stopped to build up pressure and produce different sounds. Though vowels can vary based on the shape of the mouth, the position of the tongue, and other factors, most languages have many more different consonantal phonemes than vocalic. Phonemes in turn combine into syllables, each of which typically comprises one or more vowels gliding into each other, possibly preceded and/or followed by one or more consonants. Two vocalic phonemes that run into each other in the same syllable are called a diphthong; there are also triphthongs comprising three vowel sounds. (Conversely, a single vowel sound occurring by itself is called a monophthong.) Most morphemes of spoken language comprise one or more syllables, though there may in some cases be subsyllabic morphemes that must be combined with other morphemes to form an utterable syllable.
Even among human languages, different languages may have widely different sets of phonemes. Sounds may exist in one language that do not exist in another (and are therefore difficult for a speaker of the latter language to produce, since he wasn't exposed to them in childhood), or two sounds may represent different phonemes in one language but not be differentiated in another (and therefore be difficult to distinguish by speakers of the latter). (Different sounds that represent the same phoneme in a given language are called allophones). In English, for example, consonants are distinguished by voicing—/d/ is voiced; /t/ is unvoiced—by point of articulation—/d/ is alveolar; /b/ is labial; /g/ is velar—and by manner of articulation—/d/ is a plosive; /z/ is a fricative; /n/ is nasal. Vowels are distinguished by length, "closeness", and "backness"—/i/ is a close front vowel, /u/ a close back vowel, /æ/ an open (or near-open) front, and /ɒ/ an open back. Other languages, however, have further factors distinguishing their phonemes. In some languages (including Finnish and Japanese), length is important for consonants as well as for vowels. It's also common for the "roundedness" of a vowel to be a distinguishing factor. (English includes both rounded and unrounded vowels, but in complementary distribution; no two common vocalic phonemes are distinguished only by roundedness. The phoneme /i/, for example, is unrounded, but the corresponding rounded close front vowel, /y/, does not occur in English; conversely, /u/ is rounded, but the corresponding unrounded close back vowel, /ɯ/, does not occur.) Many languages also include grades of phonation aside from just "voiced" or "unvoiced"; Hindi adds the breathy voice in its "aspirated stops", contrasting them with other (modal) voiced consonants, and Korean also includes a faucalized voice. 
Many languages, including Mandarin, Vietnamese, Cherokee, Hmong, Navajo, and Punjabi, also distinguish syllables by their tones, with the pitch or change in pitch of a syllable altering its meaning. (Sometimes phonemes distinguished by tone are called tonemes.) Even those features that are distinctive in English may have further variations; although English does distinguish consonants by place of articulation, it does not distinguish between, for instance, alveolar and retroflex (the latter being slightly further back than the former), whereas other languages, like Hindi, do.
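The English consonant contrasts described above can be pictured as bundles of features. The Python sketch below is only an illustration: the table layout and the `contrast` function are this example's own, and the feature values not stated explicitly in the text (such as /b/ being a voiced plosive) are filled in from standard phonetic descriptions. It reports which features separate two consonants.

```python
# Feature bundles for the English consonants named in the text.
# The three feature names follow the text; the table itself is a sketch.
CONSONANTS = {
    "d": {"voicing": "voiced", "place": "alveolar", "manner": "plosive"},
    "t": {"voicing": "unvoiced", "place": "alveolar", "manner": "plosive"},
    "b": {"voicing": "voiced", "place": "labial", "manner": "plosive"},
    "g": {"voicing": "voiced", "place": "velar", "manner": "plosive"},
    "z": {"voicing": "voiced", "place": "alveolar", "manner": "fricative"},
    "n": {"voicing": "voiced", "place": "alveolar", "manner": "nasal"},
}


def contrast(a: str, b: str) -> dict:
    """Return the features on which two consonants differ, with both values."""
    fa, fb = CONSONANTS[a], CONSONANTS[b]
    return {name: (fa[name], fb[name]) for name in fa if fa[name] != fb[name]}


print(contrast("d", "t"))  # {'voicing': ('voiced', 'unvoiced')}
print(contrast("d", "b"))  # {'place': ('alveolar', 'labial')}
print(contrast("d", "z"))  # {'manner': ('plosive', 'fricative')}
```

A language that also distinguishes, say, alveolar from retroflex consonants or several grades of phonation would simply need more values (or more features) in such a table; two sounds whose bundles are identical under a language's feature set cannot be separate phonemes in that language.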
In many languages, English included, some syllables of a word tend to be given more emphasis than others, represented perhaps by a higher volume or pitch. These syllables are said to be "stressed". Sounds in unstressed syllables, particularly vowels, may be relaxed or otherwise altered; in English, for instance, many unstressed vowels converge upon a neutral mid-central vowel called the schwa, /ə/. Sometimes otherwise identical words are differentiated by which syllable is stressed; in English this often applies to related noun-verb pairs, such as "produce" (things that have been manufactured or grown, accent on the first syllable) and "produce" (to manufacture or otherwise bring into being, accent on the second syllable).
At least in the case of human language, the spoken language was the first form to develop, written language having been created after the fact to mirror spoken language, rather than vice versa. A vestige of this exists in that, for instance, someone conversant with a given language is said to "speak" it, not to "write" it: it's commonplace to say that someone "speaks Italian", for instance, but saying that someone "writes Italian" comes across as odd and unnatural. And indeed, except in the case of extinct languages known only to scholars deciphering ancient texts, it is quite rare for someone to be fully conversant with the written form of a language but utterly incapable of communicating in its spoken form.
The particular combination of phonemes that make up a given word is called its pronunciation. The study of the physical properties of the phonemes of a language is called phonetics. The study of the phonemes in their broader linguistic context is called phonology.
Different people may have slightly different pronunciations of the phonemes, or different patterns of stress, or otherwise have noticeable variations in their manner of speaking. These variations in an individual's speech are collectively known as an accent. Accents often arise from regional variations or from a learner of a second language retaining some aspects of his native language, and can therefore be roughly classified accordingly: an Australian accent differs from a Texan accent, even though both the Australian and the Texan are native English speakers; similarly, the French accent exhibited by a native speaker of French who is learning English will be very different from the "Indian accent" of a native speaker of Hindi. (Actually, the distinctions may become more specific than that; none of the "accents" mentioned here are monolithic, and finer regional variations exist.) Sometimes, however, accents come about by less clear means. Richard Garriott, creator of the Ultima series of computer games, earned the nickname "Lord British" in high school because his classmates thought he sounded like he had a British accent. (In fact, although Garriott was born in England, his parents were American, and moved to Texas shortly after his birth, so it's unlikely that his birthplace had any significant effect on his accent.)
Written language
Written language, rather than encoding meaning in sound, encodes it instead in visual symbols. The fundamental symbols of a written language are called its graphemes. The sequence of graphemes that make up a given word or expression is known as its spelling. Deciphering a sample of written language is called "reading". Most written languages developed as a way to set down in a fixed medium expressions of a language originally spoken, and accordingly the graphemes often correspond to particular phonemes or syllables of the spoken language.
When the graphemes correspond to individual phonemes, the full set of graphemes, or letters, is called an alphabet (after the first two letters of the Greek alphabet, alpha and beta). The letters need not have a one-to-one correspondence to the phonemes; sometimes one letter may encode multiple phonemes (as the English letter x, which often stands for the biconsonantal combinations /ks/ (as in "axe") or /gz/ (as in "exit")), or two or more letters for one phoneme (as the English "sh", which usually stands for the single phoneme /ʃ/—as in fact does the German three-letter combination "sch", as in "schauen"). The latter case, of a sequence of two or more letters standing for a single sound (or for a sequence of sounds not corresponding directly to its constituent letters) is called a multigraph, or more specifically by the number of letters: a digraph if two letters, a trigraph if three, a tetragraph if four, and so on. Furthermore, the correspondence of letters to phonemes need not be wholly consistent within a given language. While some languages do have an exact isomorphy such that the pronunciation is perfectly predictable from the spelling, and vice versa, this is not universally true. In French, for instance, the pronunciation of a word is indeed predictable from the spelling, but the reverse is not the case. English, in particular, is notorious for its irregularity and unpredictability in both directions, to the extent that there are words that are spelled the same but pronounced differently (such as "lead", the metal, and "lead", to conduct), and words that are pronounced the same and spelled differently (such as "lead", the metal, and "led", conducted). Because normal writing systems therefore cannot be used reliably to stand for sounds, special alphabets have been devised specifically to record pronunciations, by far the most widespread of which (on Earth) is the International Phonetic Alphabet, or IPA. 
In addition to the basic letter shapes themselves, graphemes may differ by additional markings placed above, below, within, or beside the letters, called diacriticals.
Not all writing systems possess true alphabets. Some (such as Hebrew and Arabic) have abjads, in which most or all vowels are omitted; some (such as the Devanagari script used for Sanskrit, Hindi, and several other languages, and the Cree syllabics developed for certain Algonquian languages) have abugidas, in which the vowels are specified by marks added to or other modifications of the consonants that form the most prominent part of the symbol system. Some writing systems (including Yi, Linear B and the writing system devised for Cherokee) use syllabaries, in which individual symbols encode not single phonemes, but entire syllables (or intermediate units called morae), and in which, unlike abugidas, there is not necessarily any resemblance between the symbols for syllables (or morae) with the same consonant but different vowel sounds. Some writing systems (such as Sumerian cuneiform and the hanzi characters used in Mandarin and the other Sinitic languages) use logographies, large collections of symbols, or logograms, for individual words or morphemes. Some languages combine multiple systems of writing, such as Japanese, which incorporates two different syllabaries (hiragana and katakana) along with a logography (kanji) derived from the hanzi logograms; or ancient Egyptian hieroglyphics, which combined characteristics of a logography and an abjad.
The graphemes within a word (and in some cases between words) may be disjoint ("block", sometimes called "printing"), as in the Greek and Futhark alphabets, the Hebrew abjad, or the Japanese kana syllabaries, or they may be connected ("cursive"), as in the Arabic abjad or the Devanagari script. Some alphabets, including the Cyrillic as well as the Latin alphabet used to write English (and many other languages), have both a block and a cursive form.
A written language may include other elements than simply the words. Perhaps in part to compensate for the lack of the inflections possible to shade meanings in spoken language, written languages also include punctuation, extra symbols to mark the divisions between sentences and phrases or other notable elements of an expression. Some, but not all, written languages also incorporate spacing between the words, or some other method of demarking where one word ends and the next begins. (In some cases, a grapheme may have a different form depending on whether it's in the beginning, middle, or end of a word, as is the case in the Arabic abjad, for some letters of the Hebrew abjad, and for the letter sigma in the Greek alphabet.) Some written languages have other variations in the graphemes or arrangement to indicate other factors; for instance, in the Latin, Greek, Cyrillic, and Armenian alphabets, among others, each letter comes in two forms, called the upper and lower case, with some protocols concerning when each is to be used.
The complete systematics of a written language, including its spelling and punctuation, is known as its orthography.
Quasilanguages
In addition to the detailed languages used for communication between ellogous beings, the word "language" is also used sometimes to refer to other, more limited systems of communication for particular purposes, which could perhaps be referred to as quasilanguages. Among these are formal languages, strict, well-defined sets of symbols and their interactions designed to unequivocally specify statements within some restricted regime. In losing their ambiguity, these systems also sacrifice their potential for open metaphor and for broadening to arbitrary meanings. The most widely used formal languages are programming languages, quasilanguages used to specify instructions to a computer.
Many other phenomena that can communicate ideas or information are also sometimes referred to as languages, though with dubious accuracy. Sometimes music is said to be a language, on the grounds that it communicates emotional content, but even if one grants the nonobvious proposition that a musical piece will communicate the same or similar emotions to all listeners, it's still not the case that music can be used to communicate concepts beyond those of mood and emotion. No work of wordless music could univocally convey an expression like "saber-toothed cat" or "I intend to take a ten-kilometer walk next Tuesday". Mathematics is sometimes said to be a language, with even less justification. Mathematical notation may be a formal language, but mathematics itself is a field of study and a collection of ideas and notations described by mathematical notation but not synonymous with it.
Some alogous beings have means of communication which, while not as expansive as human languages and not capable of expressing broad abstract concepts, still are capable of conveying a limited set of ideas, and could perhaps be considered quasilanguages. Vervet monkeys have different alarm sounds corresponding to different kinds of dangers and for various other specific circumstances. The complex sounds produced by dolphins and other whales have not been fully deciphered, but seem likely to compose at least a fairly robust quasilanguage. The communication of many animals through scented secretions could also possibly be sometimes considered a type of olfactory quasilanguage. Such limited and possibly precursive quasilinguistic systems are sometimes known as protolanguages.
Origins
When and exactly how the first true language formed is impossible to know... and probably impossible to exactly specify, given that there can be no absolute, objective dividing point between a quasilanguage and a true language. Scholars variously date the development of true language on Earth anywhere from a hundred thousand to two million years ago. Some researchers have proposed that somatic language actually preceded spoken language, but the evidence is inconclusive. It's not even clear whether language formed once among some ancestral population and then spread and diversified, or whether it was developed independently in many locations; some linguists promote each view. Be all that as it may, since the first language arose, whenever and wherever that was, languages have multiplied and modified until today on Earth there are thousands of languages of a couple hundred apparently unrelated phyla. Often some languages become widely spoken as second languages to allow communication among diverse people—such a language being called a lingua franca. Nevertheless, even where such lingue franche exist people may still have widely different native languages.
Natural languages
Natural languages are those that developed more or less spontaneously, presumably over many generations and a long period of time. Languages, like organisms, evolve over time, as new words are coined for new situations, old words fall out of favor for various reasons, some grammatical rules are simplified and others, for other reasons, recomplicated. A population speaking the same language tends to develop different regional or social dialects; if the population spreads over a wide area, the dialects may develop into separate languages, especially if different subpopulations become relatively isolated. So did the Romance languages such as Spanish, French, Italian, Romanian, and Portuguese all develop from Latin, and so did Latin itself apparently develop from Proto-Indo-European in parallel with (among others) Sanskrit; North, East, and West Germanic; and the ancestors of the Celtic languages.
While natural languages generally derive their basic grammar and their function words (such as prepositions, pronouns, et cetera) from their parent languages, they often freely "borrow" content words (nouns, verbs, and adjectives) from other languages. English, for example, is of Germanic ancestry, and its grammar and function words reflect this, but the majority of recent borrowings have been from Greek, Latin, or French, and many words have been borrowed from completely unrelated languages from Algonquin (caucus, raccoon) to Zulu (impala, mamba). Direct borrowing is by no means the only route to word formation: some new words are formed by compounding old words together; others by analogy with existing constructions; others by calque, the morpheme-by-morpheme translation of a foreign term; still others by onomatopoeia, the imitation of a sound; and others by yet other mechanisms, not excluding arbitrary invention.
Pidgins and creoles
When speakers of different languages are forced together by circumstance, or find it necessary to interact for trade or other purposes, often they end up creating a simplified language combining features of each of their native languages to use as a lingua franca. Such a language is called a pidgin. Pidgins have highly simplified grammar, and do not lend themselves to the expression of complicated concepts; they are perhaps intermediary between a quasilanguage and a true language. Because of these deficiencies, children who grow up around pidgins tend to extend them into more useful true languages. Such a language is called a creole. Unlike pidgins, creoles are complete, versatile languages with enough power and flexibility to allow as great a breadth of expression as any other natural language, and are often spoken as native languages.
Some languages now well established may have originated as creoles. One linguist proposed that the Germanic language group, of which English is a member, may have had such an origin, although this theory has been heavily criticized.
Constructed languages
Some languages are simply artificially created within a relatively short span of time, by a single individual or small group. Such a language is called a constructed language, sometimes shortened to "conlang". Such a language may be created for a number of reasons. Sometimes a language is created because the inventor thinks he can improve on existing natural languages in some way, and wants to promote his own language as a new lingua franca, or even as a replacement for extant tongues. These languages rarely if ever catch on as well as their inventor hopes, perhaps not least because they are not as ideal as their inventor thinks, with the patches over some of their flaws exposing other new flaws elsewhere, or with the flaws simply not patched as well as the inventor believes in the first place. On Earth, the most successful constructed language of this type is Esperanto, which boasts hundreds of native speakers and at least tens of thousands of speakers as a second language—not negligible, but certainly not nearly enough to fulfill its inventor's ambitions. Other languages are created as part of a fictional work; the languages J. R. R. Tolkien created for The Lord of the Rings are famous examples, a more recent example being the Klingon language created for the Star Trek franchise. Still other conlangs are created just for the amusement of the creator or as a sort of a work of art, with no particular thought to putting them to a broader use.
Constructed languages may be a priori—constructed "from scratch", with no vocabulary borrowed from or based on existing languages—or a posteriori—constructed as a variation of an existing language, or as a combination of several languages. Esperanto is an example of an a posteriori language; Klingon and Tolkien's languages are a priori. These terms refer only to the vocabulary of a constructed language; an a priori language can still copy an existing language's grammar (although knowledgeable creators of what they intend to be alien languages often take pains to try to make sure their grammar is as distinct as their vocabularies)—while, conversely, an a posteriori language can have original rules of grammar—or can borrow grammar from an entirely different parent language from the one from which it borrows its vocabulary.
Language and Culture
Certainly a culture is reflected in its language, and vice versa. A language will have no words for concepts its culture does not have; a language that developed among a people with little technology on an isolated tropical island will not have a word for glacier, or car. Furthermore, how a language differentiates similar concepts can tell something about the culture. English uses the same word ("uncle") for both "mother's brother" and "father's sister's husband". Other languages differentiate these relationships, which may be a reflection on some of their attitudes toward kinship.
However, the stronger theory that linguistic limitations are an absolute determiner of thought, and that speakers of languages without a word for a particular concept are unable to conceive of that concept, has been discarded by modern linguists. Sometimes called the Sapir-Whorf hypothesis, this idea had considerable popularity in the early twentieth century. Upon examination, however, it became clear that some proponents of the theory had vastly overstated the hold of language on thought: speakers of languages without words for large numbers are nevertheless cognizant of the differences between large quantities, a lack of a specific word for an emotion does not prevent that emotion from being felt, and so on.
The link between language and culture, while real, continues to be frequently exaggerated. For instance, the oft-repeated myth that the Eskimos have fifty words for snow (or some other large number) is just that—a myth, or at best a severe tomatism. One problem with the claim is that "Eskimo" is an umbrella term lumping together diverse peoples who speak many different (albeit related) languages. If all these languages collectively have fifty words for snow, that's unremarkable; one might just as well say that the Europeans have fifty words for snow. If one confines oneself to a specific Aleut, Inuit, or Yupik language, there's still a sense in which the claim is true, but it's not a very meaningful one. These languages are agglutinative; they are capable of forming very long words by stringing together lengthy sequences of morphemes. It's possible, therefore, to construct a virtually unlimited number of words concerning snow, but the same could be said for words concerning shoes, or clouds, or rocks. If one confines oneself strictly to root morphemes, and not compounds and inflected forms, then the number of roots for snow in an Eskimo language is comparable to that in English, and possibly less. (The Scandinavian Saami languages do, in fact, have hundreds of separate words for snow, but clearly this isn't a universal rule.)
Taboo words
One of the clearest ways that language illuminates cultures is in what words a particular culture designates as taboo words: words that should not be said, at least not in polite company, or outside of certain circumstances. In English (and other modern Western languages), most taboo words deal with sex, genitalia, and excretion. This is not to say that these words are never said; on the contrary, they find frequent use as expletives, utterances in rage or frustration. They are widely considered rude, however, and discouraged in formal discourse; the use of such words is held to make a work unfit for children's ears (or those of sensitive adults). Some words are more taboo than others; expletives referring to sex are considered particularly jarring, such that a film using "one of the harsher sexually derived words" too many times almost always receives an R rating from the MPAA, though there's more leeway when the word is used as an expletive than with its literal meaning; a single usage in the latter case may be enough to raise the rating. To a lesser degree, some terms referring to religion are also taboo, largely due to the Biblical prohibition against taking God's name in vain. Contrary to the MPAA's prohibition against sexual terms (but in keeping with the reason for their taboohood), these words tend to be considered perfectly innocuous and even laudable when used with their literal meaning, but offensive (albeit to most people only mildly so at worst) when used as expletives.
Other cultures may have entirely different taboo words and phrases than English. Among many cultures, it is forbidden to mention the names of the recently deceased. In some cases, even homophones of these names are avoided. In other cases, it is not the names of the dead that are to be avoided, but those of particularly honored and venerated persons. In still other cultures, words that are completely acceptable in speech with some people may be forbidden in conversation with others, especially in-laws of the opposite gender.
Of course, it isn't always possible to avoid speaking of the subjects that these taboo words describe. Rather, synonyms are used, either previously more obscure words chosen because they haven't picked up the baggage of the taboo words or new words coined to fill the role. Such words chosen to avoid taboo words are called euphemisms. Often euphemisms have a learned sound compared with the more earthy and primitive sound of the taboo words; in English, for instance, words dealing with sex, genitalia, or excretion that come from Greek or Latin are usually considered unobjectionable (given an appropriate context), while the corresponding taboo words come from Germanic roots. (Contrary to popular folk etymologies, none of the common English taboo words in fact originated as acronyms.) Over time, however, it may happen that a euphemism develops the same connotations as the original word and itself becomes taboo, and a new euphemism must be created or chosen to fill in for the old.
Figures of speech
Another common reflection of culture in language is the particular figures of speech it develops. Any language can use metaphor and simile, respectively an implied or explicit comparison of one object or concept with another. Figures of speech can also involve the rearrangement of words to give different emphasis, the repetition or omission of words, and any other of the myriad ways in which words can express something other than their literal meaning. It is perhaps metaphors that most represent societal norms, especially those metaphors that become solidified into idioms whose literal meaning is forgotten in everyday use. These common metaphors can show what concepts people tended to associate with each other, or include relics of past practices and norms.
Games and wordplay
While a vital tool for communication and the working of complex societies, language also lends itself to entertainment, much of which falls under the category of what is called wordplay. Many games, puzzles, and amusements have been devised that draw on language's structure and complexity. Ambiguities and misleading wordings have been taken advantage of for ages to create riddles; homophones and other similarities have been put into service for puns and rebuses. Some people challenge themselves to perform such feats as finding multiple meaningful rearrangements of the same letters—called anagrams—or creating written phrases and works that contain all the letters of the alphabet—pangrams—or that purposely omit one or more of them—lipograms. One form of word puzzle that has become particularly popular is the crossword, which involves following clues to fill in an interlocking grid of words.
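The three letter-based feats mentioned above have simple mechanical definitions, which the following minimal Python sketch tries to capture. The function names are hypothetical, and the sketch assumes the 26-letter basic Latin alphabet used for English, ignoring case, spaces, and punctuation; other alphabets would need a different letter set.

```python
import string
from collections import Counter

ALPHABET = set(string.ascii_lowercase)  # assumption: 26-letter English alphabet


def letters_of(text: str) -> str:
    """Lowercase the text and keep only the basic Latin letters."""
    return "".join(ch for ch in text.lower() if ch in ALPHABET)


def is_anagram(a: str, b: str) -> bool:
    """True if b is a genuine rearrangement of the letters of a."""
    la, lb = letters_of(a), letters_of(b)
    # Same letters with the same counts, but not the identical sequence.
    return la != lb and Counter(la) == Counter(lb)


def is_pangram(text: str) -> bool:
    """True if every letter of the alphabet appears at least once."""
    return ALPHABET <= set(letters_of(text))


def is_lipogram(text: str, omitted: str) -> bool:
    """True if none of the letters in `omitted` appears in the text."""
    return not (set(letters_of(omitted)) & set(letters_of(text)))
```

For instance, `is_anagram("listen", "silent")` and `is_pangram("The quick brown fox jumps over the lazy dog")` both hold under these definitions. Whether a rearrangement is "meaningful" is a judgment the sketch does not attempt.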
Linguistic features can be used for purposes of art and euphony. Words can be arranged so their stressed syllables fall into particular patterns, called meter, comprising units of one or more syllables called feet that are defined by the number of syllables and their stresses. A stressed syllable followed by an unstressed constitutes a trochee, an unstressed followed by a stressed an iamb, two unstressed a pyrrhic, and two stressed a spondee, for example. Work written in regular meter is called verse. Verse also often incorporates rhyme, the use of words with similar end sounds; alliteration, the juxtaposition of words with similar beginning sounds; assonance, the repetition of similar vowel sounds; and consonance, the repetition of similar consonantal sounds; these are sometimes found in non-metrical works as well. Literary works that make heavy use of verse and/or frequent or elaborate figures of speech are called poetry, as opposed to prose, works written without such overt artistic aspirations.
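As a rough illustration of how feet are defined by syllable count and stress, the sketch below names the feet in a hand-supplied stress pattern. It uses the conventional names for the two-syllable feet (trochee for stressed-unstressed, iamb for unstressed-stressed, pyrrhic, spondee) plus the three-syllable dactyl (stressed followed by two unstressed). The notation ("S" for stressed, "u" for unstressed) and the function name are assumptions made for this example, and working out where the stresses fall in real text is not attempted.

```python
# Stress patterns are written by hand: "S" = stressed, "u" = unstressed.
FEET = {
    "Su": "trochee",
    "uS": "iamb",
    "uu": "pyrrhic",
    "SS": "spondee",
    "Suu": "dactyl",
}


def name_feet(pattern: str, foot_length: int) -> list[str]:
    """Split a stress pattern into feet of equal length and name each one."""
    if foot_length <= 0 or len(pattern) % foot_length != 0:
        raise ValueError("pattern does not divide evenly into feet")
    feet = [pattern[i:i + foot_length]
            for i in range(0, len(pattern), foot_length)]
    # Feet not listed in the table are reported as "unknown".
    return [FEET.get(foot, "unknown") for foot in feet]
```

A line of five iambs (iambic pentameter) would be written `"uSuSuSuSuS"` and passed with `foot_length=2`.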
Translation
Reëxpressing something in a different language from the one in which it was originally spoken or written is called translation. Translation is rarely a simple process, since languages differ not only in their vocabulary but also in their grammar, and even words with the same essential meaning may have different connotations or associations. Wordplay and figures of speech may translate particularly poorly. In general, a good translation rarely if ever involves a word-for-word substitution; the best translators are great writers in their own right, and their translations are more or less a matter of writing a new work that captures, to the best of their ability, the meaning and subtleties of the original. Even so, a translation can never exactly duplicate all of the original work's qualities.
Simpler than translation, though still not necessarily trivial, is transliteration, the expression of a word or name from one language according to the alphabet and orthographical principles of another. Since languages don't all share the same phonemes, some sacrifices still need to be made, but given that the number of phonemes in a given language is finite, it's generally possible to hammer out a workable system, though it may require either unintuitive digraphs or diacriticals to stand for sounds that don't exist in the target language. Nevertheless, differences in transliteration may render two representations of the same name completely unrecognizable as being related. If two languages share the same alphabet, it may be possible to just port a word over directly without any respelling, though if the source language has different orthographical rules this may be confusing—it's such direct borrowing of unrespelled words that is partly responsible for the notorious irregularity of English spelling.
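A transliteration scheme of this kind amounts to a lookup table from the graphemes of one script to those of another. The sketch below uses a simplified, classical-style romanization of the Greek alphabet, chosen purely for illustration and not taken from any particular standard. It shows digraphs ("th", "ph", "ch", "ps") and diacriticals ("ē", "ō") standing in for letters with no single Latin counterpart. The table and function name are assumptions, and accented Greek letters are passed through unchanged.

```python
# Illustrative table only: a simplified classical-style romanization.
GREEK_TO_LATIN = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "ē", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s",  # word-final form of sigma
    "τ": "t", "υ": "y", "φ": "ph", "χ": "ch", "ψ": "ps", "ω": "ō",
}


def transliterate(word: str) -> str:
    """Replace each Greek letter with its Latin stand-in, one at a time."""
    # Characters missing from the table (spaces, accented letters) pass through.
    return "".join(GREEK_TO_LATIN.get(ch, ch) for ch in word.lower())
```

Because the mapping is applied letter by letter, it ignores context-dependent conventions that fuller romanization schemes handle, which is one way two transliterations of the same name can end up looking quite different.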
Taxonomy
Although not common on Earth, on some worlds it's an accepted practice to categorize languages according to a formal linguistic taxonomy, not unlike the etorical taxonomy used to categorize living beings. Though the fact that so many languages freely borrow words from their neighbors might seem to pose a problem for cladistic analysis, to many linguists this is no greater an issue than the phenomenon of horizontal gene transfer among biological organisms.
Though taxonomic systems differ, according to the most common linguistic taxonomic system the smallest taxonomic unit—corresponding to the species in etorical taxonomy—is the index, above which—the analogue of the etorical genus—is the type. The succeeding levels, in ascending order, include the complex, the branch, the family, the stock, the phylum, and finally the bole.
Vivilinguistics
In a few rare places (Earth not among them), the majority of linguists believe languages to be literal living beings, and so classify them according to standard etorical taxonomy. A few scattered linguists elsewhere also hold to a similar belief, though they may or may not be aware of the larger like-minded community. The treatment of languages as living beings is called vivilinguistics. Vivilinguists argue that languages, like other living things, are born, die, reproduce, evolve; they react to changes in their environment; in short, they fulfill the qualifications for life, and the fact that they are intangible and conceptual should not be a hindrance in considering them living. While unified in considering languages to be living beings, however, vivilinguists remain divided on the details; apolinguists hold each language to be a separate being, while the pherolinguists believe that it is each person's idiolect that should be regarded as a living entity, essentially a symbiote of the speaker. A fringe group within the already fringe theory of vivilinguistics, the monolinguists hold that it is language as a whole, not individual languages, that is really a coherent entity. And some vivilinguists, the climacolinguists, combine vivilinguistics with hathroics to conclude that the apolinguists, the pherolinguists, and the monolinguists are all right, and that languages at all these levels are living things simultaneously.
Opponents of vivilinguistics have mustered a number of counterarguments, holding up phenomena such as dialect continua and idioglossias as proving that the idea of a language as a distinct living entity is incoherent. Vivilinguists generally dismiss these claims as showing only that languages as living beings are counterintuitive, not that they are impossible. One of the best known arguments against vivilinguistics is the Vürzer Paradox. According to this argument, if languages are alive and evolving, they may eventually develop intelligence; if they develop intelligence, they may develop language of their own; and eventually we are left with a preposterous infinite chain of linguistic entities. Overall, vivilinguists find the Vürzer Paradox uncompelling on a number of grounds, and even their disputants now generally recognize this particular argument as weak and contrived; despite its notoriety among laymen, it's not now taken seriously in the scholarly community.
See also
Language on Wikipedia