Tag Archive: vocabulary


After I answered a question on the amount of vocabulary, Google apparently thinks that I am an expert on all things vocabulary-related. Why else would a search for “how to create words for a conlang” hit my blog?

But okay, unknown googler, I will answer your question. Of course, this is my personal take on it and it is probably inadequate for your needs. Feel free to comment which parts you personally consider good, bad or ugly.

Generally, you are in one of 3 stages:

  1. just decided to start
  2. already formed the first sentences in $CONLANG
  3. some form of basis exists

These 3 stages differ. The first words are very important for the character of the language. Later in the development, they can be relegated to legacy terms or even struck off from the vocabulary*, but the damage will already have been done. These words are going to appear in each and every one of the first example sentences you are going to use. They will greatly influence how you perceive the language. Which words you will coin for this depends on your personal preferences and the culture of the language. A stone age culture might want a term for ‘hunt’, a future tech language might prefer a word for ‘compile’ for the first example sentences. Use something typical. Build a unique feeling for the culture with the first sentence you are going to write. Make sure that the words fit your phonotactics (or change them while you still can) and fit your phonoaesthetics. Try to speak them, shout them, sing them. No, this is not a joke. Singing words makes sure that you avoid too horrible insults to pronounceability, taste and common sense. At this point, developing a feeling for the language is most important.

After you have finished the first few sentences and have a stable-ish grammar**, your priorities can change. You probably still have to work on hitting the exact aesthetic quality, but it will become easier. It will, however, now become very important not to simply relex English. You now need to think about what meanings a word has. If you are fluent in a different language, it might help you. But here is a random list of things to keep in mind to prevent relexing:

  • Work from sentences. At this point, to look at a list (like Swadesh) probably means to take the English assumptions and meanings and translate them 1:1. Thus, better think about how to use each word in a sentence.
  • Think about what the word means. Take ‘spoon’ and ‘fork’: they are both implements to take food and move it into your mouth. The difference is the shape. Maybe you want to have a word for this purpose and specify the exact kind of implement differently. Or think of the difference between ‘to buy’ and ‘to acquire’. OTOH, if you have the feeling that a word mushes two meanings together, think of the meanings separately (right can be the antonym of left and of wrong).
  • Think about your culture: If your conculture uses chopsticks for eating (or the right hand), the words for ‘fork’ and ‘spoon’ might be complicated terms or loan words from a language of a culture which uses them.
  • Base your vocabulary distinctions at least partially on a foreign language; best, of course, is one you are fluent in. If a language which kinda fits the culture already exists, an online dictionary might come in handy: translate a term into a target language (which fits the perceived quality of the language), take one of the translated terms and re-translate it into English. Or ask someone who knows the language.
  • Consider usage: Maybe a certain word which your L1 uses intransitively is used transitively in $CONLANG, maybe it is the other way around. Maybe there is a choice where English does not allow one, maybe a word which is almost exclusively used in passive voice (frex: to get relegated) is used in active voice in $CONLANG (German uses absteigen), or the other way around (English uses to assign a grade, but rejistanian uses ‘rala’sidekhir runa, which literally means “to get reached a grade”), maybe the meaning is more general, more specific, more polite, more vulgar and maybe it has different connotations.
  • Consider existing vocabulary. Given existing Unabsteigbarkeit of your language*** you might be able to derive or compound the word. Remember here that it has to fit the culture. Remember also, that many languages do compound verbs (Rejistanian has many verbal compounds with ‘visko for “to speak”. Some examples are ‘ytinvisko=to.change-to.speak: to translate, or ‘idavisko=to.turn.into-to.speak: to declare). Or express it idiomatically.
  • There are always onomatopoetic expressions to fall back on (‘iaia for ‘waking up with a hangover’ for example is supposed to sound as it feels). And even when you are not using a strictly onomatopoetic word, think about whether the sounds represent what the meaning represents. Maybe create your own onomatopoetic rules (rejistanian for example often uses the u as the only vowel in stems with an unpleasant meaning).
  • Maybe there is no term. This does not mean that there is a Sapir-Whorf component involved, there can be many reasons why ‘they have no word for it’. Many English speaking countries do have democratic governments despite the lack of a word for ‘kandidieren’ (they say: to run for office) and they seem to dislike work as well but lack a word for ‘Feierabend’ (the end of the workday as well as the time after work). Maybe $CONLANG has no word for something $NATLANG considers important (rejistanian for example lacks a word for ‘art’, mainly because its definition is so wishy-washy that I cannot get to the associated concepts behind it).
  • Write it down. Not only the general idea, but also things like connotations and usage.

From this level on, you can start using The Method, described below; however, still remember that the first words shape the character of the language rather much. An example of this is probably ‘sidekhir. Its original meaning is “to reach”. Not only has its meaning expanded into many different areas (to arrive at a place, to get a mark, to get/change into a state), but its at that time dubiously-legal became quite common. (This is not necessarily a bad thing. Maybe you will like the place you reach when something seemingly random influences your language.)

You have made it into the next stage? You don’t know? Well, if you find that you actually can say things without constantly coining vocabulary, you are out of the hard area and probably have established enough feeling for the language to think of the pitfalls of relexing automagically. Depending on how different the culture of $CONLANG is from your own, you might always have issues reaching its state well enough to easily figure out how $CONLANG says it, but you are much less unsure about these things. Now you can think of areas of meaning and fill the gaps in vocabulary. At this point, you can probably start to use lists. I personally still abhor it. Lists, however, are not the only thing. Languages also have terms which might not have a direct equivalent in other languages. Maybe you think that your culture considers certain things important enough to name them (‘xikila means to qualify via 2 different routes and it became important after it happened in my soccer leagues not only once but twice), maybe you personally want a name for something (rejistanian for example has ‘kamandi (to let others down out of laziness, incompetence or bad motives) and ‘selka (to contribute your share of the work or more) because I thought that group work required these expressions). Maybe you want to include inside jokes. There is nothing wrong with that: Klingon does it, Rejistanian does it, Kamakawi does it to a point. I personally use The Method to help assign sounds to a meaning.

The Method works like this:

  1. open your (alphabetically sorted) $CONLANG to $NATLANG dictionary in a text editor, make the window small enough only to show about 25 lines (using a textmode editor would be ideal)
  2. close your eyes and randomly scroll in the file
  3. open your eyes and look at the first and the last word in it
  4. the new word needs to fit in between there somehow. This will mean that certain changes to meaning are required to fit the ‘feel’ of the word. It also means that the distribution of initial sounds is more natural. The areas which have already many words will gain words quicker than the other areas.
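For those who would rather let a script do the scrolling, here is a minimal sketch of the steps above in Python. It is only my guess at how one might automate it, not a fixed recipe: the function names `pick_window` and `fits_between` are made up for this example, the headwords are simply words mentioned in this post with the leading verb marker dropped, and plain string sorting stands in for whatever alphabetical order your dictionary really uses.

```python
import random

# Headwords taken from this post, with the leading verb marker dropped.
# A real lexicon would be read from the dictionary file instead.
LEXICON = ["iaia", "kamandi", "redy", "runa", "selka", "sidekhir",
           "slani", "takani", "tore", "visko", "xikila", "ytinvisko"]


def pick_window(headwords, window=25, rng=random):
    """Steps 1-3: return the first and last headword of a random window."""
    words = sorted(headwords)
    if len(words) < 2:
        raise ValueError("need at least two headwords")
    # A short dictionary fits on the screen completely.
    size = min(window, len(words))
    # "Scroll" to a random position that still shows a full window.
    start = rng.randrange(len(words) - size + 1)
    return words[start], words[start + size - 1]


def fits_between(candidate, first, last):
    """Step 4: does the candidate sort between the two boundary words?"""
    return first <= candidate <= last


if __name__ == "__main__":
    first, last = pick_window(LEXICON, window=5)
    print(f"The new word must sort between {first!r} and {last!r}")
```

Because the window is picked by line and not by letter, stretches of the alphabet which already hold many entries are hit more often, which is the effect described in step 4.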

And don’t forget: have fun doing it! :)

* if you are the kind of person who does that. I am not.
** is it ever really stable? That was a rhetorical question.
*** I want to establish the word Unabsteigbarkeit for the ability of a language (including surrounding culture) to build new words via affixes. Toki Pona has the lowest Unabsteigbarkeit out there (it is completely isolating), Esperanto is in a completely different league (the word Unabsteigbarkeit is ne-mal-promoci-ebl-ec-o in Esperanto, just without the dashes which are just inserted to show the affixes), pun intended.


There are also new IRC quotes for you:

I fully agree with malvarma: Why does everyone seem to love Quenya?

( malvarma) I think I will learn a language that sounds pretty to me.
( B-rat) learn sindarin or quenya!
( malvarma) Klingon sounds pretty, but it’s too hard.
( malvarma) I think quenya sounds like dreck.

And here is a quote just for the lulz of it:

( malvarma) ithkuil does sound nice. lojban sounds like a nerd mating call.


And the word of the day? It is vylisni’het, which means “lip”. Since I know no good example sentence with it, here is a bad one.

Example: Vylisni’het’ny’il min’redy takani. (lip-PL-GEN2S 3PL-be.red mature: Your lips are in an arousing shade of red)

Someone googled for exactly this phrase and found this blog. I will not ask ‘why, oh google, why?’. Instead, I will try to look at the issue a bit. The answer of course is: it depends. And in case you are not doing a language like Toki Pona, which has a fixed set of vocabulary as part of its design, the answer will probably be ‘many’! I have more than 1700 words in rejistanian (not all are stems), but still I feel that I am not finished. Sure, I can say things like that there is no fixed order in which the home team is listed in traditional rejistanian sports, or that the train was too late and thus I was unable to come on time, but I am not sure that I could talk about everything I talk about in real life in rejistanian. I know that if I was confronted with a malfunctioning car in Rejistania, I would not be able to understand what exactly was wrong with it based only on the description of the mechanic. I know that I would not be able to ask in a rejistanian cosmetics store which foundation, eye shadow, lipstick, concealer, etc, they would recommend because none of these words exist. So from personal experience: more than 1800 words. Of course, it depends on the language. A language used by stone-age tribes needs no word for carburetor or gasket. A language used by aliens with tentacles (like the Rikchik) needs no word for finger. It also depends on what a language is supposed to be used for. A naming language will not require more than about a hundred words. A language which will only be used for a specific purpose only needs the vocabulary for this purpose. However, when you want to rickroll people, ask the referee about the location of his seeing-eye dog, order food in a restaurant, discuss the latest election, tell about that new band you discovered or convince people that Bielefeld does not exist and what is really there, then you need words, lots of them.

So, while I cannot give numbers, I can tell people that the only way to deal with the creation of vocabulary is to grin and bear it. As soon as a basis is done, there will not be the need to create words all the time. If the language has a clear purpose, it will reach the point where you can see a sentence and immediately know that you coined all its words far quicker, mostly because you have a clear direction into which to direct your effort without being distracted by attempting to explain to people that Bielefeld does not exist and what really is there in your constructed language. It also helps to have lots of ‘Unabsteigbarkeit’ in your language, ie the ability to create a word from affixes (intolerability is a good English example), as well as much compounding, but neither is a cure-all. Some compounds make little sense unless you remember the reason behind them, so they have to be documented just as well.

I guess my significant other has a much more laconic way to answer the question though:

(Rejistania) In the category ‘who googles such slani and finds the RWotD’, the prize goes to that person who googled ‘how many words does a conlang need?’
(Allanea) the answer is simple
(Allanea) M
(Allanea) O
(Allanea) A
(Allanea) R

He is perfectly right. I can imagine that even after a century of using and improving rejistanian, the future me will find new lexical gaps. And IMHO that is a great thing.

eljanicator provided some numerical values, which probably work well as ballpark numbers:

I don’t remember where I read it exactly, but I’ve heard that a very limited special-purpose language needs at least 100 words, a trade/diplomacy pidgin needs at least 500, a fully functional language for everyday communication in a wide variety of subjects needs at least 2000, and most modern-day real-world languages have at least 6000. Many have considerably more. The very “largest” ones have a few hundred thousand, though in that case most people who speak it only actually know a small fraction of the total, as the bulk of the language consists of highly specialized or exotic words that most people don’t really need or encounter in ordinary life.


The word of the day for today is ytanu’het which means neck or rather the same as the German word ‘Hals’, ie: everything between the level of head and shoulders.

Example: Ytanu’het’xe mi’tore. (neck-GEN1S 3S-hurt: I have a sore throat)
