The Generator
There is a Python script that generates a list of syllables. Its first lines define the phonemes of the language (consonants and vowels) and the names of the consonant groups. After that come the functions that generate the different files, and then the main body, which parses the command line.
usage: generate_all_syllables.py [-h] [--table] [--list] [--markdown] [--output OUTPUT]
Print matrix in different formats
options:
-h, --help show this help message and exit
--table Print as table
--list Print as list
--markdown Print as markdown
--output OUTPUT Output file
There are four possible output files: list, table, markdown list and markdown table. The markdown version of the list loads the symbol definitions from the .txt file. The current algorithm for loading the definitions is not efficient, but it works fast enough.
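As a rough illustration of what such a generator does, here is a minimal sketch that enumerates every syllable and prints it as a list or as a table. It is not the code of generate_all_syllables.py: the function names (`all_syllables`, `as_list`, `as_table`) and the table layout are assumptions made for this example, and only the phoneme inventory comes from the Phonetics section.

```python
from itertools import product

# Phoneme inventory as listed in the Phonetics section.
CONSONANTS = "kghjcyxltnsrpmfw"
VOWELS = "aeiuo"


def all_syllables():
    """Return every CV syllable, followed by every VC syllable."""
    cv = [c + v for c, v in product(CONSONANTS, VOWELS)]
    vc = [v + c for v, c in product(VOWELS, CONSONANTS)]
    return cv + vc


def as_list(syllables):
    """Format the syllables one per line."""
    return "\n".join(syllables)


def as_table(consonants=CONSONANTS, vowels=VOWELS):
    """Format one row per consonant: its CV syllables, then its VC syllables.

    This layout is an assumption, not the layout of the real table file.
    """
    rows = []
    for c in consonants:
        cells = [c + v for v in vowels] + [v + c for v in vowels]
        rows.append(" ".join(cells))
    return "\n".join(rows)


if __name__ == "__main__":
    print(len(all_syllables()))          # 160
    print(as_table().splitlines()[0])    # ka ke ki ku ko ak ek ik uk ok
```

The count of 160 matches the 4 * 4 * 5 * 2 primes described in the Grammar section.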
About the Language
Phonetics
All possible syllables are displayed in this file.
Consonants
There are 16 consonants, which can be classified by two traits: a place of articulation and a manner of articulation. There are 4 places of articulation: back (Velar), up (Palatal), down (Alveolar) and front (Bilabial).
- The first group is back, as these sounds are produced closest to the lungs;
- Next comes up, with the body of the tongue lifted;
- Then down, with the body of the tongue lowered and the tip lifted;
- Finally front, pronounced with the frontmost part of the mouth: the lips.
There are 4 manners of articulation:
- plosive: it completely blocks the airflow,
- nasal: it allows airflow through the nose,
- fricative: it allows airflow through the mouth,
- approximant: it represents the phonemes that could be considered semi-vowels.
Each consonant phoneme lies at the intersection of one place and one manner of articulation. Let's list all the consonant phonemes:
| Letter | Place | Manner | IPA | English | Occurs in words |
|---|---|---|---|---|---|
| k | back | plosive | k | k | staCKer |
| g | back | nasal | ŋ or ŋɡ | ng | stiNG |
| h | back | fricative | x or h | kh, h | Scottish loCH, Russian Хурма (khurma) or Hot |
| j | back | approximant | j | y | Yellow |
| c | up | plosive | tɕ or t̠ʃ | ch, ć | Polish pisaĆ, Russian Через (cherez) or CHill |
| y | up | nasal | ɲ or ɲc | ny, ñ | espaÑol, caNYon or Russian коНЬКи (kon'ki) |
| x | up | fricative | ɕ or ʃ | sh, ś | Sure, Russian Щепка (sh'epka), Polish coŚ or SHame |
| l | up | approximant | ʎ or l | ll, l | miLLion or Lamb |
| t | down | plosive | t | t | Tall |
| n | down | nasal | n or nd | n, nd | Neck or eND |
| s | down | fricative | s or θ | s, th | Sad or THink |
| r | down | approximant | ɹ or r | r, rr | Ray or Spanish peRRo |
| p | front | plosive | p | p | Put |
| m | front | nasal | m or mb | m, mb | Mold, laMB |
| f | front | fricative | ɸ or f | f | Japanese Fuhai or Fall |
| w | front | approximant | w or ʋ | w | Well or Indian Vine |
As you can see, some variation is allowed, and not every phoneme matches its standard IPA definition. Namely, "j" is treated as the back approximant here, and the trill "r" is allowed as the down approximant. This is by design, so that more people can pronounce every phoneme easily (I doubt that everyone can pronounce the real velar approximant). It is also worth noting that nasal phonemes may be pronounced as two-consonant clusters, namely ng, n'k', nd and mb, as not every language has a full inventory of nasal consonants. These clusters should be relatively easy to pronounce and distinguish in speech.
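The place-by-manner grid in the table can be written down as a small lookup structure. The sketch below is only an illustration: the names `PLACES`, `MANNERS`, `GRID`, `classify` and `consonant` are invented for this example and are not taken from the generator script.

```python
PLACES = ("back", "up", "down", "front")
MANNERS = ("plosive", "nasal", "fricative", "approximant")

# One string per place (in PLACES order); within a string the letters
# follow MANNERS order, exactly as in the consonant table.
GRID = ("kghj", "cyxl", "tnsr", "pmfw")


def classify(letter):
    """Return the (place, manner) pair of a consonant letter."""
    if len(letter) == 1:
        for place, row in zip(PLACES, GRID):
            column = row.find(letter)
            if column != -1:
                return place, MANNERS[column]
    raise ValueError(f"not a consonant of the language: {letter!r}")


def consonant(place, manner):
    """Return the letter found at the intersection of a place and a manner."""
    return GRID[PLACES.index(place)][MANNERS.index(manner)]


if __name__ == "__main__":
    print(classify("x"))                # ('up', 'fricative')
    print(consonant("front", "nasal"))  # m
```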
Vowels
Vowels, like consonants, have two classifications: openness and roundness. There are 5 vowels: two rounded, three unrounded, two open, two close and one middle. Let's list all the vowel phonemes:
| Letter | Openness | Roundness | IPA | Occurs in words |
|---|---|---|---|---|
| a | open | unrounded | a or ɑ | Polish jAjo or fAther |
| e | middle | unrounded | e̞ or ɛ | Spanish bEbé or bEd |
| i | close | unrounded | i | frEE |
| u | close | rounded | u | bOOt |
| o | open | rounded | o̞ or ɔ | Spanish tOdo or bOy |
Generally the vowels are clear, like in Spanish, and don't form diphthongs. It is advised to put a glottal stop between adjacent vowels, but it should not sound like the "k" consonant phoneme.
Writing Systems
The main writing system is the Latin alphabet, which is used on this page (you can see it in the Phonetics section). However, if you wish, you can create your own script, or use mine. It is relatively easy to create a syllabary or abugida for the language, as consonants and vowels are already classified, and you only need to assign graphemes to the classes.
In my custom script, there are four basic letters: <, c, b and ɛ. This set is for the back consonants (it faces left). The shape encodes the manner of articulation:
- Shape < is for plosives, as it's sharp and pointy, just like plosives;
- Shape c (or u for down) is for nasals, as it's closed (rounded) on one side and open on the other, just like the mouth is closed and the nose is open;
- Shape b (d for up, p for down and q for front) is for fricatives, as it looks like an obstruction in the way, or like a whistle;
- Shape ɛ (w for down, m for up) is for approximants, as it consists of two parts approaching each other.
The direction encodes the place of articulation: facing left (<) is for back consonants (k, g, h, j), up (m) for up (c, y, x, l), down for down (t, n, s, r), and right for front (p, m, f, w).
For the vowels, you add (or don't add) diacritics:
- a bar (-) for unrounded,
- a v-shaped mark for rounded,
- placed on top for open,
- placed on the bottom for close.
- For "e" you don't add any diacritics.
Example: the word kelanuifor (you don't need to know what it means, we're discussing only phonetics here) would look like <m̄ṷ'q̱w̌.
Its syllables are ke, la, nu, if, or. To separate forward (CV) syllables from backward (VC) ones, we can use '.
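The mapping from syllables to the custom script can be sketched in a few lines. Several things here are assumptions. The text gives no glyphs for the up, down and front plosives, the up and front nasals, or the front approximant, so the ones in `GLYPH` (^, v, >, n, ɔ and 3) are placeholders. The Unicode combining marks in `MARK` were picked to resemble the example and may not be the exact code points used there. The names `GLYPH`, `MARK` and `to_script` are made up for this sketch.

```python
# Consonant letter -> base glyph. Glyphs marked "assumed" are placeholders
# that the description of the script does not specify.
GLYPH = {
    "k": "<", "c": "^", "t": "v", "p": ">",   # plosives (^, v, > assumed)
    "g": "c", "y": "n", "n": "u", "m": "ɔ",   # nasals (n, ɔ assumed)
    "h": "b", "x": "d", "s": "p", "f": "q",   # fricatives
    "j": "ɛ", "l": "m", "r": "w", "w": "3",   # approximants (3 assumed)
}

# Vowel -> combining diacritic (code points are assumptions).
MARK = {
    "a": "\u0304",  # open unrounded: bar on top
    "e": "",        # middle: no diacritic
    "i": "\u0331",  # close unrounded: bar on the bottom
    "u": "\u032d",  # close rounded: v-like mark on the bottom
    "o": "\u030c",  # open rounded: v-like mark on top
}

SEPARATOR = "'"  # placed between the CV part and the VC part


def to_script(cv_syllables, vc_syllables):
    """Render pre-split CV and VC syllables in the custom script."""
    out = []
    for cons, vowel in cv_syllables:
        out.append(GLYPH[cons] + MARK[vowel])
    out.append(SEPARATOR)
    for vowel, cons in vc_syllables:
        out.append(GLYPH[cons] + MARK[vowel])
    return "".join(out)


if __name__ == "__main__":
    # kelanuifor = ke, la, nu + if, or
    print(to_script(["ke", "la", "nu"], ["if", "or"]))
```

The function takes syllables that are already split, so it stays independent of how a word is segmented.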
Grammar
Overview
The language has an oligosynthetic grammar. Oligosynthesis = oligo (few) + synthesis (combination) = combination of few. There are 4 * 4 * 5 * 2 = 160 primes in the language (4 places of articulation, 4 manners, 5 vowels and two possible orders, CV and VC), which combine into different words. The word structure is [CV]n[VC]m, where n ≥ 1 and m ≥ 1.
The stress, or tone change, falls in the middle of the word, where the CV syllables end and the VC syllables begin. So a word sounds like "tatatAEpepep". Alternatively, you can pronounce CV syllables with a rising tone and VC syllables with a falling tone, like "ta/ta/ta/et\et\et\". These measures help the listener hear where the semantic part ends and the grammatical part begins.
CV syllables at the beginning of the word are semantic: they carry the meaning of the word. VC syllables at the end of the word are grammatical: they modify the meaning or tell the role of the word in a sentence. For example: hecaohas = heca + ohas, heca = he (perceive) + ca (eye) = see, ohas = oh (do, verb) + as (past tense) = did in the past, hecaohas = saw.
Grammatical VC syllables can transform one part of speech into another (for example, a verb into a noun), modify the meaning (ka-person + il-all -> kail - all people, humanity), and indicate tense, case, part of speech, conditions, questions and much more. In that regard, you can say that the language has agglutinative features: there are many affixes (building blocks) with predictable meanings that modify the word. An oligosynthetic language differs from an agglutinative one in that the number of these building blocks is fixed and usually small (160 in our case).
The root of a word consists of its own building blocks: in the "hecaohas" example, the root "heca" (see) can be analyzed as "perceive with an eye". This is a feature of analytic languages, such as Chinese or Vietnamese.
In such languages you usually don't invent words on the go by combining smaller words: there are stable, idiomatic ways of saying things. The same is true for this language: you don't need to think every time about how to combine small parts to get the needed meaning, and the listener doesn't have to analyze every root. Instead, once you've learned (or invented) a complex word, you reuse it as is.
The fact that roots consist of small building blocks makes words easier to learn: when you see a new word, you might already be able to tell what it means without looking it up in a dictionary. However, if you can't guess the meaning, or aren't sure you've guessed it right, or need a word and don't remember it, you can consult the dictionary to see an explication of the root you need. Most probably, the explication will use the building blocks from the word (like "to see" = "to perceive the light from objects through your eyes"). Explications only use one-syllable CV roots, so once you've learned all 80 of them (and the grammar, of course), you can understand any word in the dictionary. This is the concept of semantic primes: every complex meaning can be explained through universal indivisible building blocks.
Usually the root starts with the most important syllable, followed by syllables that narrow the meaning until it's precise enough. So "to see" is "heca", not "cahe", because seeing is a process of perceiving, done by means of an eye. "cahe" would mean something like an eye that can perceive, which could be useful in some situations, but is definitely not a synonym for "see".
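The word structure [CV]n[VC]m makes it easy to split a word mechanically into its semantic and grammatical parts. The following is a minimal sketch under that rule. The function names `split_word` and `tone_marked` are invented here, and the sketch rejects anything that does not have at least one CV and one VC syllable.

```python
CONSONANTS = set("kghjcyxltnsrpmfw")
VOWELS = set("aeiuo")


def split_word(word):
    """Split a word shaped [CV]n[VC]m (n >= 1, m >= 1).

    Returns (cv_syllables, vc_syllables): the semantic part and the
    grammatical part of the word.
    """
    if len(word) < 4 or len(word) % 2:
        raise ValueError(f"not a well-formed word: {word!r}")
    cv, vc = [], []
    for i in range(0, len(word), 2):
        first, second = word[i], word[i + 1]
        if first in CONSONANTS and second in VOWELS and not vc:
            cv.append(first + second)   # still in the semantic part
        elif first in VOWELS and second in CONSONANTS and cv:
            vc.append(first + second)   # grammatical part has started
        else:
            raise ValueError(f"not a well-formed word: {word!r}")
    if not vc:
        raise ValueError(f"word has no VC suffix: {word!r}")
    return cv, vc


def tone_marked(word):
    """Mark CV syllables with a rising tone (/) and VC syllables with a falling tone (\\)."""
    cv, vc = split_word(word)
    return "".join(s + "/" for s in cv) + "".join(s + "\\" for s in vc)


if __name__ == "__main__":
    print(split_word("hecaohas"))   # (['he', 'ca'], ['oh', 'as'])
    print(tone_marked("hecaohas"))  # he/ca/oh\as\
```

The stress falls at the boundary between the two returned lists, so for hecaohas it sits between "ca" and "oh".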
Parts of Speech
Three different parts of speech can be identified: verb, noun and adjective. Verbs tell what action is being performed. Nouns tell what objects are involved in this action: who does it, who receives it, who helps, who watches, where and when, and much more. Adjectives describe verbs and nouns: what color the object is, how big it is, how fast and strong the action is, etc. The CV* part of a word, without any suffixes, is a Base that belongs to no part of speech. Which part of speech a word belongs to can be seen from its ending (the last suffix). Suffixes can:
- assign a part of speech to a Base (the CV* part without VC suffixes). Example: case suffixes, which say that an object of some class is the subject or the object in the sentence: home-at = home (base) -> at home (noun, where? locative case);
- change one part of speech into another (for example, Adjective -> Verb: happy-quality-be = happiness (base) -> happy (adjective) -> to be happy (verb));
- preserve the part of speech of the word and only modify the meaning (like question, negation or tense suffixes: move-do-future = move (base) -> to move (verb) -> will move (still a verb, but in the future));
- return a word from a part of speech back to a Base (example: run-do-"er"-be = run (base) -> to run (verb) -> runner (base, because it has no case suffix) -> to be a runner (verb once again)).
What each suffix does to each part of speech is described in its definition in the dictionary.
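The four kinds of suffix behaviour can be modelled as a small state machine that starts at Base and follows one transition per suffix. The sketch below only covers the glossed examples from the list above. The English glosses ("at", "quality", "be", "do", "future", "er") stand in for real VC suffixes, whose actual behaviour is defined in the dictionary, and the names `RULES` and `part_of_speech` are assumptions made for this example.

```python
# (current part of speech, suffix gloss) -> resulting part of speech.
# Only the transitions shown in the examples are listed.
RULES = {
    ("base", "at"): "noun",            # case suffix: home-at
    ("base", "quality"): "adjective",  # happy-quality
    ("adjective", "be"): "verb",       # happy-quality-be
    ("base", "do"): "verb",            # move-do, run-do
    ("verb", "future"): "verb",        # tense keeps the part of speech
    ("verb", "er"): "base",            # run-do-er returns to a Base
    ("base", "be"): "verb",            # run-do-er-be
}


def part_of_speech(suffix_glosses):
    """Follow the suffix chain from a Base and return the final part of speech."""
    state = "base"
    for gloss in suffix_glosses:
        try:
            state = RULES[(state, gloss)]
        except KeyError:
            raise ValueError(f"suffix {gloss!r} is not defined after a {state}") from None
    return state


if __name__ == "__main__":
    print(part_of_speech(["at"]))              # noun
    print(part_of_speech(["do", "er", "be"]))  # verb
```

Since only the last state matters, this mirrors the rule that the part of speech of a word can be read off its final suffix.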
Word Order
Although all words are marked with their role in a sentence, it is advised to follow a head-final word order. Head-final means that the most important word in a phrase goes last. But what is the most important word in a phrase, you might ask. It's not just the word you currently think is most important, or some new word, or the most beautiful one. The most important word (the head) in a sentence is the verb. The head in a noun phrase is the noun. The head in a verb phrase is the verb. So, the order should be: modifier-of-subject subject (modifier-of-object object)* adverb verb. Here (smth)* means that smth is optional and may be repeated any number of times.
Modifiers of nouns can be sentences themselves. In that case, these subsentences start with placeholders for the modified noun, and the verbs in them are marked.
If several words share the same role in a sentence (for example, several subjects or verbs), then the first one is not marked, and all subsequent ones are marked. These twins should go in a row:
adj adj-and subj adj subj-and adj subj-and adj obj-case1 adj obj-case1-and adj obj-case1-and adj obj-case2 adj adj-and adj-and obj-case2-and adv verb adv adv-and verb-and
Two sentences can be connected (for example, if something then something). What English expresses with particles like "if", "then", "more than", etc. can be expressed with grammatical VC suffixes on the word. So, an "if-then" structure would look like this: subj obj verb-if - subj' obj' verb'-then. (The order of these two parts can be changed if that is more natural or logical.)
The order of objects of different cases (with different suffixes) is free: if you want to name the direct object of a verb before the place where the action happens, you can do so. If you want to do it the other way around, you are free to do that too. So, "subj obj-direct obj-location verb" means the same as "subj obj-location obj-direct verb", though the emphasis may differ.
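The basic head-final template can be illustrated with a small helper that puts the parts of a clause in the recommended order. This is a sketch only: `build_clause` and its parameters are hypothetical names, the words are plain placeholder strings, and twins, subsentences and connected sentences are not handled.

```python
def build_clause(subject, verb, objects=(), adverb=None, subject_modifiers=()):
    """Arrange words as: modifier-of-subject subject (modifier-of-object object)* adverb verb.

    `objects` is a sequence of (modifiers, object) pairs, kept in the order
    given, because the order of objects of different cases is free.
    """
    words = [*subject_modifiers, subject]
    for modifiers, obj in objects:
        words.extend(modifiers)   # modifiers precede their head noun
        words.append(obj)
    if adverb is not None:
        words.append(adverb)      # the adverb precedes its head, the verb
    words.append(verb)            # the verb, head of the sentence, goes last
    return " ".join(words)


if __name__ == "__main__":
    print(build_clause(
        "subj", "verb",
        objects=[((), "obj-direct"), ((), "obj-location")],
    ))  # subj obj-direct obj-location verb
```

Swapping the two entries in `objects` gives "subj obj-location obj-direct verb", the equally valid alternative order.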