ozarque ([info]ozarque) wrote,

Linguistics; conlangs; Language Construction 101A

LANGUAGE CONSTRUCTION 101A

Suppose you want to construct a language that might be of some practical use for communication in your fictional universe ... here's how it's done. It's not difficult.

STEP ONE: Decide whether you want a polysynthetic language (where you construct meanings by assembling lots of small meaningful pieces into larger chunks) or an isolating language (where words are made up of only a few meaningful pieces). Polysynthetic is quicker and easier.
STEP TWO: Choose an order for verb, subject, and object. Six are mathematically possible; pick one.
STEP THREE: Choose the structure and assembly rules for your syllables. For example, you could decide that all syllables must contain a vowel; that none can begin with more than one consonant; that all can end with either a vowel or a consonant; and that no more than twelve syllables may be in a single word.
STEP FOUR: Choose a set of phonemes (chunks of sound that can change meaning). The fact that we understand "bat" and "sat" as two different words proves that the sounds of "b" and "s" in those words are two different English phonemes. Hawaiian has eleven, English has about thirty-five, and all human languages choose from the same set. You could pick sounds no human language uses, of course, if you're constructing a language for ETs, but you couldn't be sure that your human readers would be able to pronounce them.
STEP FIVE: Set up an inventory of syllables that your rules will allow. Like this.... a, e, i, o, u, ba, be, bi, bo, bu, bab, beb, bib, bob, bub, baba, bebe, bibi, bobo, bubu, bubab... and so on, till you've listed as many as you feel you need to get started with. If you only allow three-syllable words in your language and you have only seven phonemes, the list will be shorter than if you're using ten syllables and twenty phonemes.
STEP SIX: Decide how you want to handle your basic grammar markings. How will you mark something as plural? As past time? As completed or ongoing? How will you mark something as subject, object, possessive, et cetera? Write the rules you need to do these things. Suppose you decide to mark these basics by adding syllables to your words; then you'll have rules like these: "Adding 'ba' at the end of a word makes it plural." "Adding 'ga' at the beginning of a verb marks past time." And so on.
STEP SEVEN: Assign meanings to your listed syllables for your core vocabulary. That is, for words like "house, person, child, tree, fire, make, eat..." and for words you invent because they're as basic to your fictional culture as "fire" is to human culture.
STEP EIGHT: Make your basic decisions about syntax. How will you make a sentence negative? How will you construct a question or command? How will you combine two or more sentences into a single bigger sentence? Once again you could do this by using syllables. Like ... "Adding 'fo' to the last word in a sentence marks it as a question." No human language does these things by repeating words, but for an ET language you could decide to have a rule that said "A sentence in which every word is repeated twice is a question." Human beings would find that cumbersome, but your ETs might not; that's up to you.
STEP NINE: Take some simple text ... a short folktale is a good choice ... and start translating it into your language. This serves as a diagnostic probe to let you know what you need to add or change.
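
Steps three through six and step eight can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the three consonants and five vowels are an arbitrary toy inventory, the syllable shapes (V, VC, CV, CVC) are one reading of the example rules in step three with at most one final consonant, and the function names are made up for illustration. The affixes 'ba', 'ga' and 'fo' are the sample rules quoted in the steps.

```python
from itertools import product

# Step four (assumed toy inventory): a handful of phonemes.
CONSONANTS = ["b", "g", "f"]
VOWELS = ["a", "e", "i", "o", "u"]


def syllables():
    """Steps three and five: list every syllable the rules allow.

    Assumed rules: every syllable has exactly one vowel, at most one
    consonant before it, and at most one consonant after it.
    """
    shapes = [
        (VOWELS,),                         # V
        (VOWELS, CONSONANTS),              # VC
        (CONSONANTS, VOWELS),              # CV
        (CONSONANTS, VOWELS, CONSONANTS),  # CVC
    ]
    result = []
    for shape in shapes:
        result.extend("".join(parts) for parts in product(*shape))
    return result


def pluralize(word):
    """Step six: adding 'ba' at the end of a word makes it plural."""
    return word + "ba"


def past(verb):
    """Step six: adding 'ga' at the beginning of a verb marks past time."""
    return "ga" + verb


def question(words):
    """Step eight: adding 'fo' to the last word marks a question."""
    return words[:-1] + [words[-1] + "fo"]


inventory = syllables()
print(len(inventory))                 # 5 + 15 + 15 + 45 = 80 syllables
print(pluralize("bab"), past("bo"))   # babba gabo
print(" ".join(question(["bab", "gabo"])))
```

Changing the two lists at the top is the quickest way to see how the size of the inventory grows with the number of phonemes, which is the point made at the end of step five.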

And there you are... a usable language.

Suzette

  • 40 comments

[info]mylittleredgirl

March 19 2005, 16:30:44 UTC 7 years ago

Oh my, I adore you so much. I think you just summed up all the required undergraduate linguistics courses I took in nine easy steps. (And with a geeky usefulness, too!) I tried to write out a rule list like this, but was bogged down with all the information... but now I see how streamlined it can really be to begin with.

... I will now spend my free time for the next few weeks making up languages. *claps hands with geeky joy*

[info]leora

March 19 2005, 21:48:29 UTC 7 years ago

Are you tempted, as I am, to try to make your rules clear, feasible, but as incredibly annoying as you can?

Want to make a sentence a question? No problem - just form the correct way to say it as a statement, then reverse the word order. Have fun, you poor, fictional people! Mwahahahaha

[info]interactiveleaf

March 19 2005, 16:35:12 UTC 7 years ago

Why is Rule #2 necessary? I believe that many human languages don't place much emphasis on word order; am I wrong about this?

[info]pthalogreen

March 19 2005, 17:03:42 UTC 7 years ago

Many human languages have free word order (such as Hungarian), but even these languages generally have a word order into which the words tend to fall anyway. In Hungarian, for example, the following sentences all mean the same thing (John killed Peter):

1. János ölt Pétert
2. János Pétert ölt
3. Pétert ölt János
4. Pétert János ölt
5. Ölt János Pétert
6. Ölt Pétert János

They all have a slightly different connotation. Sentence 1 is completely unmarked and is SVO word order. Sentence 2 could mean "John killed Peter (and not someone else)". Sentence 3 is very similar to Sentence 2, and I'm afraid I can't quantify the difference even though I know there is one; I think the emphasis might be even stronger. (I'm not a native speaker.) Sentence 4 is more "It was John that killed Peter" (and not Jason that killed Peter). Sentences 5 and 6 are much less common but are grammatically acceptable, with the emphasis being on the verb this time.
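
The six sentences are exactly the six possible orderings of subject, verb and object. As a rough illustration (the function name and the role labels are invented here; the three words are the ones from the list), they can be generated mechanically:

```python
from itertools import permutations

# The three words from the example sentences, keyed by role.
WORDS = {"S": "János", "V": "ölt", "O": "Pétert"}


def orders():
    """Return (label, sentence) pairs for all 3! = 6 orderings."""
    result = []
    for order in permutations("SVO"):
        sentence = " ".join(WORDS[role] for role in order)
        # Capitalise the first letter, since the verb is lowercase.
        sentence = sentence[0].upper() + sentence[1:]
        result.append(("".join(order), sentence))
    return result


for label, sentence in orders():
    print(label, sentence)
```

The code only enumerates the orders; it says nothing about which ones are marked or what each emphasises.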

Anonymous

March 19 2005, 17:04:26 UTC 7 years ago

(michael farris)

I'll take a try at this one. Word order is important in every human language (that I've come across and I've never heard of any where it's not). But word order accomplishes different things.

In English, among other things, it marks subject (the doer, before the verb) and object (the doee, after the verb).

The woman sees the boy.
is different from:
The boy sees the woman.

Who sees who(m) is determined here by the position of the nouns relative to the verb.

In Polish (and many other languages) this doesn't happen, the words for 'woman' and 'boy' will have separate forms depending on whether they are subject or object (for 'woman' those are kobieta and kobietę respectively). In theory, you can order the words any old way and the sentence will be interpreted according to the forms of the nouns and not word order.
In practice however, there are some rules, for example Polish word order follows a general rule that you mention older information first and then newer information (such as a new topic or the answer to a question) so the different word orders will be perceived as referring to different situations. This is similar to stressing different words in an English sentence (often, the most heavily stressed word in English would be moved to the end of the sentence in Polish).


[info]genderfur

March 19 2005, 17:07:14 UTC 7 years ago

English does. German does. I don't know about "many" but I know about English & German. In both cases, changing the word order can change the meaning of the sentence, or the denotation of certain elements in the sentence.

These two sentences have exactly the same words, not even variations of form:
Did you see the hat? Did the hat see you?

[info]pthalogreen

March 19 2005, 17:13:49 UTC 7 years ago

English and German are very closely related, so it's difficult to look at the two of them and extrapolate something for "all the world's languages."

The reason "Did you see the hat? Did the hat see you?" works in English is that English does not mark the accusative case (except with I/me, we/us, who/whom). In a language which does mark the accusative case, you will find that no matter what the word order, the same object gets seen: "Ön látta a kalapot?" "A kalapot látta Ön?"

[info]coraa

March 19 2005, 17:42:17 UTC 7 years ago

My understanding of this (which may be wrong; I'm not a linguist) is that even languages that have no necessary word order -- that is, the meaning is comprehensible no matter what order the words are in -- still have conventional word order which is followed unless there's a really good reason to break it.

For instance, Latin is a classic example of a language in which it's perfectly clear which word the subject is no matter where you put it in the sentence. However, when I was learning Latin I was taught that sentences were almost always written in Subject Object Verb order, unless there was a distinct reason to do otherwise (fitting a poetic meter, for instance). A non-SOV ordered sentence would be understandable but would sound weird, for lack of a better description.

[info]griffen

March 19 2005, 20:14:30 UTC 7 years ago

*nods* But not all languages use a Subject + Object + Verb method of ordering words (and I mean in any of those six combinations). ASL, for example, uses a "topic-comment" structure. So depending on what you want to emphasize, that's what goes first:

YESTERDAY - STORE - ANN - ME - GO. (Falls into a Subject-Object-Verb form, which is actually the most common in ASL)

YESTERDAY - ANN - ME - GO - STORE. (Falls into a Subject-Verb-Object form, the least common in ASL)

These are two different sentences and the structure makes them different. In the first sentence, the emphasis is on the store - it's the topic - and the comment describes who went to it. In the second it's the people being discussed - they're the topic - and the comment is what those people did (and most ASL speakers would still order the words "STORE - GO" at the end). (The word "yesterday" is a tense-word; in ASL you always sign the time you're discussing first, to establish a tense for the verbs.)

So although ASL may *sometimes* fall into a Subject-Verb-Object structure, it ain't necessarily so for most of its sentences.

[info]pecunium

March 19 2005, 19:48:37 UTC 7 years ago

For languages which have a more free word order (like Latin, or Russian) one tends to find declension, which means the parts of speech are defined by varying the word.

In Russian, by way of example, Peter loves Anna becomes "Piotr lyubit Annu."

Anna loves Peter becomes "Anna lyubit Petra."

In English the object of the affection immediately follows the verb. By arranging the words in a different order we get completely different meanings.

Making the word order different in Russian gives shades of meaning, "Annu Piotr lyubit," is more akin to, "Anna is loved by Peter" than the (not wrong, but not really right) translation, "Peter loves Anna."

TK
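
To make the contrast concrete, here is a toy sketch of reading roles off the word forms instead of the word order. The lexicon is an assumption built only from the transliterated forms quoted in this comment, and the function names are invented; a real system would analyse the endings, not look whole words up.

```python
# Toy lexicon: each form carries its own role, so position is irrelevant.
LEXICON = {
    "piotr": ("Peter", "subject"),
    "petra": ("Peter", "object"),
    "anna": ("Anna", "subject"),
    "annu": ("Anna", "object"),
    "lyubit": ("loves", "verb"),
}


def roles(sentence):
    """Map each role to its English gloss, ignoring word order."""
    found = {}
    for word in sentence.lower().strip(".").split():
        gloss, role = LEXICON[word]
        found[role] = gloss
    return found


def plain_gloss(sentence):
    """Render the roles in English order (subject, verb, object)."""
    r = roles(sentence)
    return f"{r['subject']} {r['verb']} {r['object']}"


print(plain_gloss("Piotr lyubit Annu"))   # Peter loves Anna
print(plain_gloss("Annu Piotr lyubit"))   # Peter loves Anna
print(plain_gloss("Anna lyubit Petra"))   # Anna loves Peter
```

The gloss deliberately throws away the shade of meaning that the reordering adds; it only shows that who loves whom survives any shuffle.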

[info]pthalogreen

March 19 2005, 16:56:23 UTC 7 years ago

ooh fun.

I created a language once. I was only 10, and it was a visual gestural language, and I didn't have a linguistic understanding of English (the only language I spoke at the time) so I had no idea that "be, is, am, are" are the same lexical item in English because they sounded so different to me. I knew how to use them perfectly well, but that lack of knowledge caused a problem and it ended up being a visual gestural code that followed English grammar. But I was very proud of it at the time. :) My mother probably would've been proud of me if she understood what I had done better, but since I stopped speaking English in favour of my language, she was more frustrated than anything.

I love polysynthetic languages. Hungarian is agglutinative and everything makes so much logical sense. Even the irregular verbs are all irregular in the same way.

Hungarian is SVO, but the word order is also free. Switching the order usually marks sentence stress (the focus of the sentence is right before the verb). In an unmarked sentence, the verbal prefix takes up the focus position. With negation, question formation, imperative and emphasis, the verbal prefix is removed from the verb and comes after the verb: "kimegyek" (I go out), "Nem megyek ki" (I don't go out), because you can't have more than one item in the focus position at the same time. But technically all 6 orders for subject, verb and object are acceptable. You can tell from the case ending on the nouns which is the subject and which is the object anyway.

Hungarian requires that each syllable have a vowel and that there not be too many consonants together at once. I think the rule is generally 2 consonants together maximum, with a syllable break in between, but foreign words that have come into the language don't follow this rule. Hungarian took the word for school from Slavic, but because at the time Hungarian had a rule that no word could start with more than one consonant, a vowel was added to the beginning of the word. So "iskola." But words that came into Hungarian much later, like strand from German (beach, river bank), don't obey that rule.

I think an artificial language should have some irregularities built in as well to more accurately mirror human language.

Hungarian has 40 phonemes. a á b c cs d dz dzs e é f g gy h i í j k l ly m n ny o ó ö ő p r s sz t ty u ú ü ű v z zs. The digraphs and trigraphs are considered to be one letter and they constitute a single sound.

Hungarian has 18 cases. 9 of them are locative. in the box, into the box, out of the box, on the box, onto the box, off of the top of the box, next to the box, towards next to the box, away from next to the box. Possessive suffixes go on the possessed item, not on the possessor. So "a házam" (my house. ház = house, -m = my) Consequently, there's verb "to have" Instead you say "Van egy macskám" There-is one cat-my. Or you can say "Nekem van egy macskám" where nekem is the first person singular dative pronoun. So "to-me there-is a cat-my"

Hungarian also has a thing where you can take the case endings off a word and conjugate them like you'd conjugate a verb. So "Szív-em-ben" = Heart-my-in or In my heart (it's written without the hyphens but I put hyphens between the morphemes for clarity) and you can also say bennem (in me) benned (in you) benne (in it/him/her). It's very productive.

Hungarian normally marks an interrogative question through intonation alone. "Wh questions" get a wh word (except most of them start with m in Hungarian). In Serbian they can use the word "dali" at the beginning of a sentence to mark a question, in addition to intonation, but it isn't required. Slovene has the same thing, with "ali". Both mean "if", if I remember correctly.

</ramble>

[info]pthalogreen

March 19 2005, 17:09:53 UTC 7 years ago

Forgot one important word: "consequently, there's verb 'to have'" should have been "consequently, there's no verb 'to have'".

Also, most languages that I'm familiar with have the Wh questions all starting with the same letter and the answers always starting with the same letter (th in English).

Where? there. What? that. When? then.

Hu: Mikor? amikor. Melyik? Amelyik. Miért? Azért.

Slovene: Koliko? toliko. Kako? tako. Kdo? To.

[info]elfwreck

March 19 2005, 17:22:57 UTC 7 years ago

The Tower of Babel story is a favorite of many conlangers (also seen here); it's widely known, it's about languages, and it works with concepts that probably don't directly relate to how or why the language was created, so it forces the creator to stretch a bit. One gets to decide things like "do I translate this as literally as possible, or do I pretend it's a tale from this language's folklore, which may have entirely different concepts?"

[info]conlangs has advice & random commentary for people interested in making their own languages, or just interested in the idea.

[info]phrawzty

March 19 2005, 17:28:44 UTC 7 years ago

That was super! thx. :)

[info]jmkelly

March 19 2005, 19:12:33 UTC 7 years ago

Maybe this should go in Language Construction 101B, but Step 6 could be enlarged quite a bit. I'm not a linguist, but from studying Homeric Greek and reading a little about other languages, I'm aware that there are limitless possibilities in what can be conveyed by grammar markings. There's a slew of concepts beyond past/present/future, completed/ongoing, and punctative/frequentative; there can be different number concepts, like Homeric Greek's dual, or an "at-least-one" marker (I wish we had one of those in English). There are markers like Navajo's, which (as I vaguely recall) address intention and success: "He went to town" and "He intended to go to town but didn't get there" can be distinguished solely by markers. (Navajo has a ball in other ways too, with different verb forms for different kinds of subjects.) And there's a marker in Maidu for lack of firsthand knowledge: the verb suffix c'oj "must always occur in sentences wherein the narrator is reporting something which he did not, himself, see happen" (William Shipley: The Maidu Indian Myths and Stories of Hanc'ibyjim).

[info]pecunium

March 19 2005, 20:05:04 UTC 7 years ago

I think lots of those are in the latter half of 1B, maybe the first half of 2A.

TK

[info]ceileidh

March 20 2005, 01:20:45 UTC 7 years ago

This is an awesome site (the constructed language kit) for conlangs. It also has links to conworld sites...

http://www.zompist.com/kit.html

[info]elliejane

March 20 2005, 22:24:43 UTC 7 years ago

Hello, can I just say thank you for this link? Particularly for the words "conworld sites". I didn't know there were conworld sites out there! Although I once promised myself I would never start world building, I find I do need to know a few things for something I'm trying to work on. This is going to be an immense help!

[info]ceileidh

March 21 2005, 11:28:34 UTC 7 years ago

You're welcome. 8 )

I hope you find this useful. I'm going to use some of the ideas from this site - conlang - for a school project.

[info]dsgood

March 20 2005, 01:57:56 UTC 7 years ago

Some of those rules seem to apply only to spoken languages. Human languages also include visual (most sign languages) and tactile (deaf-blind sign languages) ones.

Alternate history idea: Most humans use sign languages; only the blind use spoken languages.

And then there are languages using other senses -- including ones humans don't usually have.

[info]mylittleredgirl

March 20 2005, 03:04:20 UTC 7 years ago

Yes -- in the case of telepathic aliens, how would they structure conversation? I've read theories that say that human thought is constrained by speech and language in some ways... so would those aliens have a telepathic language? Would some or all of the above rules still need to apply (in a corresponding telepathic way)?

Oooooooh. Cool. *goes off to think about that*

[info]naath

March 21 2005, 20:04:29 UTC 7 years ago

That would be cool but... hard to put in a book. Jean Auel has all her Homo neanderthalensis speaking sign (because they can't speak) but that means that she just writes what they 'say' in English. Personally I think that sign languages are less likely as long as the race concerned can speak, because spoken words can be heard (by those who can hear) at a distance, when you're looking away, etc., whereas to 'listen' to sign you have to be looking at the person you are talking to, which is fairly restricting.

[info]qiihoskeh

March 20 2005, 13:24:08 UTC 7 years ago

STEP ONE

I'm confused. Did you really mean to say polysynthetic? I thought that required things like object incorporation and marking all core arguments on the verb and stuff like that. But maybe it doesn't matter in this context.

[info]ozarque

March 20 2005, 14:42:33 UTC 7 years ago

Re: STEP ONE

I understand -- and I apologize. The confusion is entirely understandable, and is my fault. I think you're right that it doesn't matter in this context, but that may be excessively sloppy (and lazy) of me. I just hate the thought of getting into a discussion -- in this context -- of the many proposed definitions of "polysynthetic" and "agglutinating" and the proposed distinctions between the two ... and so on. I'm sure you know what that would be like. All I intend to mean by "polysynthetic language" is a language in which most of the heavy lifting is done in the morphology.

Thanks for your comment, and if you feel that further (and rigorous) exploration of the issue is needed, just let me know. It would have to wait until April, but it can be done.

Suzette

[info]daszeria

May 9 2005, 06:14:33 UTC 7 years ago

"Polysynthetic is quicker and easier."

Suzette, I have to disagree here; most of my conlangs would qualify as polysynthetic, but I've also done analytic/isolating ones. Designing a realistic and expressive morphology is no easier than designing a system of syntactical rules; it may _seem_ easier because you can cheat and make an agglutinating morphology that's perfectly regular, but a realistic morphology isn't simple.