Entries tagged with “Sami”

An imperative puzzle — 4 Jul 2010

Some of this is sort of adapted from a comment left elsewhere on the internets for someone asking about imperatives in languages. While musing over the data in Finnish and Northern Sámi, I noticed what appears to be an interesting puzzle: 2nd person imperatives are different from the imperatives formed for all other persons, in that non-2nd person imperatives all appear to be descended from an optative mood, while 2nd person imperatives are morphologically distinct. Perhaps this is analogous to the English imperative strategy, in which the 2nd person imperative is a bare verb stem: Go!, Sleep!; while other persons are formed periphrastically: May he go, Let him sleep.

Finnish and Northern Sámi

In Finnish and closely related languages, the second person singular imperative is formed with a bare verb stem, while other persons and numbers have additional morphemes, most of which include -k- (said by some to be a historical present tense marker).

(1)     mennä 'go'; mene-n 'I go'

                sg.         pl.
        1.      --          menkäämme
        2.      mene        menkää
        3.      menköön     menkööt

The negative imperative is formed with the help of an auxiliary negative verb, älä (2), which has similar morphology.

(2)             sg.         pl.
        1.      --          älkäämme
        2.      älä         älkää
        3.      älköön      älkööt

According to Maija Länsimäki, these ko/kö morphemes are originally from the optative. While this doesn't directly say anything about the plural 1st and 2nd persons, it seems like there's a chance that they are either related by way of optative, or connected to the present marker theory (2nd person imperative of tulla was originally *tulek).

What is just as interesting about this pattern is when the negative verb occurs with other verbs, e.g., don't go:

(3)             sg.             pl.
        1.      --              älkäämme menkö
        2.      älä mene        älkää menkö
        3.      älköön menkö    älkööt menkö

The same -ko/-kö appears on the verb. Is this a form of optative agreement, or something else? If these forms are connected, is the -ko/-kö marker found in questions (Nauroiko Mikko? 'Did Mikko laugh?') also related, or is this just a coincidence brought on by the small phoneme inventory in Finnish?
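To make the split in (3) concrete, here is a minimal Python sketch that assembles the negative imperative of mennä from the forms in tables (2) and (3). The function name and the dictionary layout are my own illustration, not part of any existing tool, and the forms are hard-coded rather than generated by rule.

```python
# Negative auxiliary forms, copied from table (2).
NEG_AUX = {
    ("1", "pl"): "älkäämme",
    ("2", "sg"): "älä",
    ("2", "pl"): "älkää",
    ("3", "sg"): "älköön",
    ("3", "pl"): "älkööt",
}


def negative_imperative(person, number, bare_stem="mene", ko_form="menkö"):
    """Combine the negative auxiliary with the matching main-verb form.

    The defaults are the forms of mennä 'go' given in table (3).
    """
    aux = NEG_AUX[(person, number)]
    # Only 2nd person singular takes the bare stem; every other
    # cell of the paradigm takes the -ko/-kö form of the main verb.
    main = bare_stem if (person, number) == ("2", "sg") else ko_form
    return f"{aux} {main}"


for cell in NEG_AUX:
    print(cell, negative_imperative(*cell))
```

The single `if` is the whole puzzle in miniature: one cell of the paradigm behaves differently from all the others.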

A similar pattern is found in Northern Sámi as well, slightly extended because NS distinguishes singular, dual and plural number (4). The paradigm is exactly the same for the negative auxiliary (5); however, NS does not have anything similar to the -ko/-kö which occurs on the main verb in Finnish negative imperatives (the main verb occurs in one form for all persons and numbers).

(4)     mannat 'to go'
                sg.         du.          pl.
        1.      mann-on     mann-u       mann-ot
        2.      mana        mann-i       mann-et
        3.      mann-os     mann-os-ka   mann-os-et

 
(5)     ale 'Neg'
                sg.         du.         pl.
        1.      allon       allu        allot
        2.      ale         alli        allet
        3.      allos       alloska     alloset

Here we see that 2nd person singular offers a bare stem, and that all non-2nd person imperatives have a round vowel (o/u often alternate in NS, and in precisely this situation) which is specific to these forms only. The availability of dual in the paradigm allows us to see that there is something about 2nd person here that separates it from the other persons: and perhaps this is a difference of mood.
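The round-vowel observation can be checked mechanically against table (4). The following is a small Python sketch under the assumption that the hyphenated segmentation in (4) is right; the names are hypothetical, and the 2nd person singular is treated as having no suffix at all.

```python
# Imperative suffixes of mannat 'to go', as segmented in table (4).
IMPERATIVE_SUFFIXES = {
    ("1", "sg"): "on", ("1", "du"): "u",     ("1", "pl"): "ot",
    ("2", "sg"): "",   ("2", "du"): "i",     ("2", "pl"): "et",
    ("3", "sg"): "os", ("3", "du"): "os-ka", ("3", "pl"): "os-et",
}

ROUND_VOWELS = set("ou")


def starts_with_round_vowel(suffix):
    """True if the suffix begins with o or u (an empty suffix never does)."""
    return suffix[:1] in ROUND_VOWELS


# Collect the persons that have at least one round-vowel suffix.
persons_with_round = sorted(
    {person for (person, _), suffix in IMPERATIVE_SUFFIXES.items()
     if starts_with_round_vowel(suffix)}
)

# Check that every 1st and 3rd person cell has the round vowel.
all_non_second_round = all(
    starts_with_round_vowel(suffix)
    for (person, _), suffix in IMPERATIVE_SUFFIXES.items()
    if person != "2"
)

print(persons_with_round, all_non_second_round)
```

The same check can be run on the negative auxiliary in (5) by swapping in its endings.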

Estonian, as best I can find, also has a similar pattern in the negative imperative auxiliary, but I can't find out how the main verbs go for non-2nd person imperatives. Anyone...?

(6)     minema 'go'
                sg.        pl.
        1       --         --
        2       mine       minge
        3       --         --

 
(7)     ära 'Neg'
                sg.        pl.
        1       --         ärgem
        2       ära        ärge
        3       ärgu       ärgu

Generalities?

This at least establishes that this pattern is similar in Finnish, Northern Sámi and Estonian (and apparently English), but what does it mean? One could assume from all of this that 'true' imperatives are restricted only to 2nd person, and other persons may be expressed with other moods for semantic reasons... 2nd person imperatives are only applied directly to the listener from the speaker and are commands, while 1st and 3rd person imperatives may refer to someone perhaps outside of the conversation and as such speakers may only wish for things that these persons may do.

Since I haven't Googled around yet, these are only my musings. May someone reading this come forward with more knowledge!

1 comment

Constraint Grammar — 7 Jun 2010

School's out! Woohoo! Now it's time to get working.

Since finishing exams, I've been spending the last week or so working with Constraint Grammar as part of my Google Summer of Code project in machine translation from Finnish to Northern Sámi. It's enlightening and interesting and there's much to learn, but it seems to give me precisely the kind of puzzles that I like to solve. Constraint Grammar is a syntactic formalism developed by Fred Karlsson (the author of the first Finnish grammar book I studied, which quite possibly changed my life). Its essential goal is to disambiguate word forms that are ambiguous: forms that look the same but have separate morphological uses or separate meanings.

An example:

minä lu-i-n kaksi kirja-a

1pSg.Nom READ-Prt-Sg1 TWO BOOK-Part

'I read two books.'

This all makes perfect sense to us, because we know what words are meant; however luin could mean "I read", or "with/by bones". Since the latter meaning is obviously not the one that we want for the sentence, Constraint Grammar provides a rule-based formalism for selecting the intended meaning based on the surrounding context. This isn't easy of course, because one actually needs quite a few rules to produce a fully disambiguated sentence, and natural sentences aren't always as simple as the one given above. Following is the full analysis of each word:

"<minä>"
    "minä" Pron Pers Sg Nom
    "mikä" Pron Interr Sg Ess
"<luin>"
    "lukea" V Act Ind Prt Sg1
    "luu" N Pl Ins
"<kaksi>"
    "kaksi" Num Card Sg Nom
"<kirjaa>"
    "kirja" N Sg Par
    "kirjata" V Act Ind Prs Sg3
    "kirjata" V Ind Prs ConNeg 
    "kirjata" V Act Imprt Sg2

As we can see, there are quite a few readings that need to be removed (the rules are listed in CG formalism below): the word minä can have its personal pronoun reading chosen because it precedes a verb with 1st person singular marking (line 1237); luin gets its verbal reading selected (as opposed to the 'bone' reading) because it follows a pronoun (line 1645); and finally kirjaa 'book+Part' is selected because it follows a number (line 1187).

1187: SELECT (Par) (-1C Num) (-1 Nom)
1237: SELECT (Pron "minä") (*1 Sg1 LINK NOT *-1 CLB?) (NOT 1 CLB?)
1645: SELECT (Sg1) (-1C MINA) (-1 Nom)

2094: MAP (@SUBJ>) TARGET Nom (0 WORD LINK *1 (Act))
2109: MAP (@<OBJ) TARGET Par IF (0 WORD LINK *-1 V BARRIER S-BOUNDARY2) ;
2115: MAP (@+FMAINV) TARGET VFIN IF (NEGATE *0 VERB BARRIER S-BOUNDARY2 OR CC) ;

Following this disambiguation, several tags are added for later convenience... One tag, @SUBJ>, tells us that the word is the subject of the sentence, preceding the verb; @+FMAINV tells us that the word is the main verb; @X tells us there is more work to be done yet; and @<OBJ says that the word is an object following its verb. The tags are shortcuts for passing along information for the generation part of the translation, in which words are produced based on the analysis. The full disambiguation is next, but note that the tags and analysis may not be correct yet; I'm just pulling this from the project as-is. Lines beginning with a semicolon (;) are those which are dropped from the analysis.

"<minä>"
    "minä" Pron Pers Sg Nom @SUBJ> SELECT:1237 MAP:2094 
;   "mikä" Pron Interr Sg Ess SELECT:1237 
"<luin>"
    "lukea" V Act Ind Prt Sg1 @+FMAINV SELECT:1645 MAP:2115 
;   "luu" N Pl Ins SELECT:1645 
"<kaksi>"
    "kaksi" Num Card Sg Nom @X MAP:2348 
"<kirjaa>"
    "kirja" N Sg Par @<OBJ SELECT:1187 MAP:2109 
;   "kirjata" V Act Ind Prs Sg3 SELECT:1187 
;   "kirjata" V Ind Prs ConNeg SELECT:1187 
;   "kirjata" V Act Imprt Sg2 SELECT:1187
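For readers who have not seen Constraint Grammar before, the three SELECT rules can be imitated in plain Python. This is only a rough sketch of the idea, not how a real CG compiler works: the barrier and LINK conditions of rule 1237 are left out, MINA is assumed to be a set matching the lemma minä, every rule is tried once per word from left to right, and all function names are made up for the illustration.

```python
def reading(lemma, tags):
    """One possible analysis of a word form."""
    return {"lemma": lemma, "tags": set(tags.split())}


def make_sentence():
    """The ambiguous cohorts for 'minä luin kaksi kirjaa', as listed in the post."""
    return [
        ("minä", [reading("minä", "Pron Pers Sg Nom"),
                  reading("mikä", "Pron Interr Sg Ess")]),
        ("luin", [reading("lukea", "V Act Ind Prt Sg1"),
                  reading("luu", "N Pl Ins")]),
        ("kaksi", [reading("kaksi", "Num Card Sg Nom")]),
        ("kirjaa", [reading("kirja", "N Sg Par"),
                    reading("kirjata", "V Act Ind Prs Sg3"),
                    reading("kirjata", "V Ind Prs ConNeg"),
                    reading("kirjata", "V Act Imprt Sg2")]),
    ]


def has(tag):
    """Build a test that checks whether a reading carries the given tag."""
    return lambda r: tag in r["tags"]


def select(readings, wanted):
    """SELECT: keep only the matching readings; if none match, change nothing."""
    kept = [r for r in readings if wanted(r)]
    return kept or readings


def disambiguate(cohorts):
    cohorts = [(form, list(rs)) for form, rs in cohorts]
    for i, (form, rs) in enumerate(cohorts):
        prev = cohorts[i - 1][1] if i > 0 else []
        later = [r for _, later_rs in cohorts[i + 1:] for r in later_rs]
        # Simplified 1237: pick the pronoun minä if a Sg1 form follows somewhere.
        if any(has("Sg1")(r) for r in later):
            rs = select(rs, lambda r: r["lemma"] == "minä" and "Pron" in r["tags"])
        # Simplified 1645: pick Sg1 if the previous word is unambiguously
        # minä (-1C) and has a nominative reading.
        if prev and all(r["lemma"] == "minä" for r in prev) \
                and any(has("Nom")(r) for r in prev):
            rs = select(rs, has("Sg1"))
        # Simplified 1187: pick the partitive if the previous word is
        # unambiguously a numeral (-1C) and has a nominative reading.
        if prev and all(has("Num")(r) for r in prev) \
                and any(has("Nom")(r) for r in prev):
            rs = select(rs, has("Par"))
        cohorts[i] = (form, rs)
    return cohorts


result = disambiguate(make_sentence())
print([(form, [r["lemma"] for r in rs]) for form, rs in result])
```

Note that the order matters here just as in the real grammar: luin can only be resolved by the careful (-1C) condition once minä has already lost its mikä reading.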

So, there's more work to be done. As I dig further in, I may post a few recipes if there are tricky problems that arise. I'll be running some newspaper sentences through the grammar to see what additional things need to be worked out; the rules work fine for short sentences, but it may be that they'll not hold up when applied to much more complex sentences. As you can see in the lines of code produced above, there are BARRIERs involved, which delimit the ability of the rule to search its surroundings. More of these will likely pop up as weirder sentences are tested.

As it turns out though, the above analysis for this sentence is actually enough to produce a good translation. Once all the words are disambiguated, they're sent off to a generator, which produces the following (with slashes representing dialectal variation):

$ echo "minä luin kaksi kirjaa" | fin-sme
        mun/mon lohken guokte girjji/girjje

The sentence also shows the connection between the two languages that the project concerns; if you squint, you can see their relatedness.

3 comments

Tromsø Sámi Week // Sámi vahkku // Samisk uke 2010 — 16 Jan 2010

Sámi Week in Tromsø (February 1. - 7.) is an event celebrating Sámi culture. Part of the significance for such an event is that Tromsø is very much a Sámi city, in that the Sámi language and cultural events are a part of city life, and Tromsø is one of the larger cities in Sápmi. There are also many famous Sámi people who make the city their home, or one of their homes.

Although I've never been around to participate in this event before, I'm excited to see it this year. The official site for the event, which is in English only for last year's festivities, shows that last year was a week full of films, art exhibits, a cooking course, and a reindeer race, amongst the many other fun events.

The program for 2010 is up (and available in English), but digging around the internet before the program was available, I also found this:

Also occurring during the last week of January and first week of February is the Nordlys classical music festival.

0 comments

Threat of massacre at Tromsø's Kongsbakken school, Sámi students targeted — 8 Jan 2010

This was the big news today in Norway, and a few people I know around the world have messaged me today asking if I'd heard. So far, I haven't seen any English coverage of the story, so I translated NRK's article on the issue. I added some links for more context for people not familiar with Norway or its culture.

From NRK:

Today, all 650 students at Kongsbakken high school (nor. videregående skole) were evacuated due to a perceived threat of a school massacre posted on the internet. The threat to Kongsbakken high school was aimed at Sámi people. "Scary stuff", said one Sámi student at the school.

The threat was published on Thursday under the title "School massacre" on the American internet forum, 4chan. The principal (rector) at Kongsbakken high school, Ivar Odd Størkersen, was tipped off about the threat by a journalist working at Aftenposten. As such, he immediately contacted the police.

The threat was in the form of an illustration, including a picture of a gun, a sword, the school, and a map of Tromsø. The illustration further contained the text "Targets: Sami people". The threat was later removed from 4chan for unknown reasons.

"We take a connection between weapons and the school building as a serious and severe threat. It is sad that some have felt the need to connect the school with such a serious incident," said Rector Størkersen to NRK.

"I have no idea why Sámi people are listed as a target. There are many Sámi people in Tromsø, and there are many students with Sámi background who are students at Kongsbakken. I will not speculate on what the purpose of this threat is," said Rector Ivar Odd Størkersen.

Frightening and scary

"It is frightening and scary. Such a threat must be taken seriously," said Mihka Solbakk (18), a Sámi student at Kongsbakken.

Students who arrived at the school this morning were met by police and locked doors. The school management had decided to evacuate all students.

"I believe there are many who are afraid now," said Solbakk.

Mihka Solbakk reports that he has spoken with many of his Sámi classmates today and many of them have expressed that it is frightening that such a threat was directed against Sámi students. He has otherwise not experienced that his ethnic background has been a problem, either at school or in town.

"Of course, you get to hear the occasional derogatory comment here and there."

"Some might joke that Sámi people are inferior, you never really know if anyone believes it. It may even be that this [threat] is meant as a joke," says Solbakk.

Solbakk made it clear otherwise that such threats should be taken seriously, and he thinks that the administration of Kongsbakken has handled the matter in a good way. "There have been a number of school shootings in the U.S. and in Finland in recent months, so it's no wonder that the administration takes these attacks seriously," he says.

Losing sleep

"I received a text message about the threat when I woke up this morning. Obviously I was a little scared," said Odd Ivar Solbakk, Mihka Solbakk's father.

His son was still sleeping when the message about the threat came, and so he would not be at school until after lunchtime. Odd Ivar Solbakk said to NRK Sámi Radio that news that the threat was directed against Sámi people was a special piece of news to hear at the crack of dawn.

"Tromsø is a city with many prolific Sámis and the Sámi people are very important here. It has never seemed as if there were a threat to the Sámi youth in the city," said Solbakk.

As a parent of a student at the school, Solbakk was proud of the way the school administration and the police have handled this matter. Having learned the full extent of the situation, he says he is impressed with the response and feels the school's forthrightness is something that characterizes a good school environment.

"A threat so close to us as this is something that would interfere with sleep; it's unpleasant," says Odd Ivar Solbakk.

Investigating the matter

Police chief Truls Fyhn in the Troms Police Department said in a press release that an investigation has been initiated to get more information from the site that the threat was posted to, after the police were made aware of the case on Thursday night.

"The investigation is still continuing," he wrote.

"Kripos is also involved in the investigation," said Tromsø police chief, Kurt Pettersen.

School administration at Kongsbakken expects that school will resume as normal beginning on Monday. Later today, the police will have a meeting with the principal and the others in the school administration to inform them of the investigation.

My own thoughts

This is hugely surprising news. On one hand, 4chan (see Wikipedia if you don't know what it is) is full of ridiculous things and so this may have been a joke, but on the other hand, with recent school and mall shootings in Finland and the U.S., one can never be sure. I have not observed much anti-Sámi discrimination or racism here in Tromsø, aside from errant comments people make that are mostly just stupid, but then I'm not Sámi, and I haven't lived here for more than 6 months, so I know basically nothing. The history of Sámi-Norwegian relations as I know it hasn't been pretty, but things are otherwise much, much better now.

Some added context is that Kongsbakken (King's Hill) is a fairly prestigious, arts-oriented, and more liberal-minded school.

I'm hoping this was all just a joke, but it looks like it's gotten damn serious.

Updates

More English-language news is available here, from My Little Norway.

0 comments

2nd Annual Sámi Linguistics Symposium — 2 Jan 2010

The symposium was quite a success, I thought. It was the first I attended, and nice to see what's going on and what others are researching. Subject matter of the talks ranged, touching on graphic design, the successes and failures behind the development of writing systems, the varying web-presence between Inari Sámi and Skolt Sámi; and the syntax and semantics of reflexive pronouns, tone and intonation, and issues involving machine translation and lexical databases.

One of the things that I felt was an important theme was that when doing academic research, it is important to return something back to the communities you are working with. Otherwise, if you're working with endangered languages, how can you help them improve their situation? Some research produces results that are usable by these communities, and some research does not. Of course, it always means extra work to do so, but isn't it good to have an effect on things?

Anyway, brief update, but, here are some projects and resources to check out for those interested. I'll add some more things from time to time, as they come up.

0 comments

Northern Sámi syllable parser — 2 Mar 2009

Back in one of my phonology classes while I was working on my undergraduate degree in Linguistics, I wrote a paper that was an Optimality Theoretic account of stress-based coda strengthening in Northern Sámi. Although the OT model worked perfectly for the data set I had collected, I was somewhat unhappy with such a small data set and wished to prove my point on a larger scale. In the couple of years since then, I've gained some better programming skills and the internet has changed so that more data is available. As such, what follows is an extension of this previous paper. At some point I might track that down and revise and post it, but since that will likely be a little while, what follows will be a more complete discussion of the phenomenon that assumes less prior knowledge.

In order to process large amounts of data, I created a syllable parser based on some rules I had made for a programmatic account of Standard Finnish. A syllable parser for Northern Sámi could play a couple of roles. As an analytical tool, it can handle larger amounts of data in less time than a human working at this task. The parser could also be used predictively, and could aid language/spell checking, translation, or localization. In localization, the syllable parser could make sure that the right suffix is applied to the right kind of word.
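To give a feel for what such a parser does, here is a deliberately tiny Python sketch of one well-known syllabification rule for Standard Finnish: a syllable boundary falls before any consonant that is directly followed by a vowel. This is not the parser described in this post. The function name is hypothetical, input is assumed to be lowercase, boundaries between adjacent vowels (as in tunturia) are not handled, and Northern Sámi would need its own vowel and consonant inventories.

```python
VOWELS = set("aeiouyäö")


def syllabify(word):
    """Split before every consonant that is directly followed by a vowel."""
    syllables, current = [], ""
    for i, ch in enumerate(word):
        next_is_vowel = i + 1 < len(word) and word[i + 1] in VOWELS
        starts_cv = ch not in VOWELS and next_is_vowel
        # Only break once the current chunk already contains a vowel,
        # so that word-initial consonants stay attached to the first syllable.
        if starts_cv and any(c in VOWELS for c in current):
            syllables.append(current)
            current = ""
        current += ch
    syllables.append(current)
    return syllables


for word in ["kirja", "kaksi", "menkäämme"]:
    print(word, "-".join(syllabify(word)))
```

Even this one rule is enough to tell open syllables from closed ones, which is the kind of information a suffix-selection step in localization would need.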

Read more
0 comments

Facebook in Northern Sámi / Facebook davvisámegillii — 17 Jan 2009

A month or so ago, I talked to a friend of mine who works at Facebook, and as a result, a new localization option was opened in Facebook's Translations application: Northern Sámi. Some of you might ask what Northern Sámi is, so before I talk about the project, here's a quick introduction to the vital details in a format that is less intensive with regards to linguistic terminology.

Northern Sámi is spoken in Northern Scandinavia by an estimated 15,000 - 35,000 people (depending on who you ask). It is a Finno-Ugrian language, which makes Finnish and Hungarian its better-known relatives, though neither is closely related to it. If you were to compare the relation of Finnish and Northern Sámi to relations within Indo-European, you might say that Finnish is to Portuguese as Northern Sámi is to Russian. Northern Sámi is most closely related to about 8 to 10 other Sámi languages, which are also spoken around Northern Scandinavia and the Kola Peninsula of Russia. Of these languages, Northern Sámi has the most speakers.

If forced to pick a few interesting points about the language, I would have to go with the following:

Dual numbers — Northern Sámi contains verb conjugations and pronouns that describe 'we two', 'you two' and 'they two', in addition to the singular and plural. The following examples show this, but also show that English only differentiates between singular and plural.

Márit lea gávpis.
'Márit is at the store.'

Márit ja Máhtte leaba gávpis.
'Márit and Máhtte are at the store.'

Márit, Máhtte ja Elle leat gávpis.
'Márit, Máhtte and Elle are at the store'.

Detailed terminology for reindeer and snow. A good summary is available in this PDF.

A three-way contrast between consonant and vowel length.

Interdentals! Ththththththththtthththththththth. There aren't a lot of Finno-Ugrian languages that have these sounds. In fact, the only other language variety I can think of right now outside of Northern Scandinavia with interdentals is a version of the Rauma dialect of Finnish (Southwest Finland) as spoken by now elderly speakers. Interdental consonants (like in 'think') used to be more prominent in Finnic languages about a thousand years ago, but have since become less common.
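Going back to the dual number for a moment, the three Márit sentences earlier in this list can be generated by a few lines of Python. This is only a toy sketch: the function names are made up, and it covers nothing beyond the three 3rd person forms of the copula that appear in those examples.

```python
# 3rd person present forms of the copula, taken from the example sentences.
COPULA_3RD = {"sg": "lea", "du": "leaba", "pl": "leat"}


def number_of(subjects):
    """Map the number of people to singular, dual, or plural."""
    if len(subjects) == 1:
        return "sg"
    if len(subjects) == 2:
        return "du"
    return "pl"


def at_the_store(subjects):
    """Build 'X is/are at the store' with the right copula form."""
    if not subjects:
        raise ValueError("need at least one subject")
    if len(subjects) == 1:
        joined = subjects[0]
    else:
        # Names are joined with commas, and 'ja' ('and') before the last one.
        joined = ", ".join(subjects[:-1]) + " ja " + subjects[-1]
    return f"{joined} {COPULA_3RD[number_of(subjects)]} gávpis."


print(at_the_store(["Márit"]))
print(at_the_store(["Márit", "Máhtte"]))
print(at_the_store(["Márit", "Máhtte", "Elle"]))
```

An English version of the same function would only need two cases, which is exactly the contrast the examples are meant to show.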

The Facebook internationalization project in Northern Sámi, since it began, has grown to 25 translators. Some of them are highly active in providing translations, and some of them are highly active in voting on translations to make sure that the best translation "wins". Recently, the project reached a new phase (translating phrases), which has been going much faster than even the first phase (establishing a glossary of terminology), even though this second phase contains much more work. While I cannot predict how long this second phase will take, I can say that (copy/pasting) there are 23,796 phrases left to translate as of this date.

The reason I feel that a Northern Sámi-localized version of Facebook is important is because Facebook is about keeping people in touch with each other. What better a way to accomplish this, than to do it in Facebookers' own languages? Not only that, but this goal becomes immediately more awesome when it is also improving the usefulness of a minority language to its speakers. This is important for the survival of a language, because in order for a language to survive a language must continue to be useful to its speakers, and they must want to speak it. Part of this is maintaining prestige, and part of this is making sure that the language can continue to be used in a changing and globalizing environment.

In this case, Facebook is just a piece of the puzzle, and part of a more general point: since it is a prominent social networking site (which is constantly gaining users, and has an active population larger than Russia), it is naturally an important part of some people's methods of keeping in touch. If this one resource is available to users in their own language, this service has increased the usefulness of that language and reduced the need to interact with that service in another, non-native language. With more services and media (books, news, TV, etc.) becoming available, a language has an even better chance at surviving.

Now, Northern Sámi isn't as endangered (or just plain isn't endangered) as some of its closest relatives, but the availability of Facebook in Northern Sámi can serve as a sign that something like this is just as possible for other minority languages too.

If you're interested in participating, check out Facebook's Translations application.

Update (3/9/9): Since the original time of posting, the number of translators working on this project has just about doubled. w00t!

0 comments

Moving in — 1 Jan 2009

Moving in... So, prepare for little bugs. If anything explodes and gives an error, drop a comment with the URL that was problematic. If other inconsistencies occur, mention those too. I'm working on ironing those out but I only have one set of eyes! Blog posts may be a bit sparse to start with, but check out the Selection of Truly Exciting Finnish Words... I'm populating that with more words than there will be blog entries for a while.

The content of this blog is not necessarily meant to be Northern Sámi-centric, but it just happens to be what I'm working on more lately, as will be explained in future posts. The reason for this is not that I am culturally Northern Sámi myself, but rather, I am a student of linguistics who has taken an interest in this language and its respective culture and language family. I'm basically a big nerd for Finno-Ugric languages, and not ashamed to admit it.

Things may slowly end up getting tweaked through use. For instance, the Selection of Truly Exciting Finnish Words is currently in its infancy, but I expect it to grow. Some words are not as thoroughly populated with interesting tidbits, or are there as a placeholder for more information. Word tags also contain a decent amount of information, for example: consonant gradation and the ghost consonant tags. Drop comments where comments are welcome; they'll only help improve things.

The sanasto itself does not store all word forms individually, and instead they are generated by a series of rules. I will be tweaking the underlying code that handles this over the course of time, so for any of you Finnish speakers out there, please tell me if you notice odd inflections, or are aware of additional variation that is available in certain words (e.g., tunturia/tuntureita). Be advised that Standard Finnish may accept one thing, but this may not be true of the wealth of Finnish dialects.

Happy reading and word-sleuthing!

4 comments