aspell — spelling with foreign dictionaries 

Introduction

Regardless of whether you are a spelling genius or a spelling disaster, like myself, it is useful to have a spellchecker handy. There are a number of tools available for this purpose. For single words, Google is quite useful. There are many other websites with spellchecking and lexical capabilities, such as the Free Dictionary, Cambridge Dictionary, dictionary.com, and WordNet, to name just a few. One could also use heavyweight tools such as OpenOffice Writer. Instead, I use a lightweight shell program called aspell, which can check spelling interactively from the shell as shown below.

$ aspell -a # ©2007 dsplabs.com.au
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3)
Polish dictionary
*
*

Note that the leading stars correspond to the spellchecked words and indicate that they have been spelled correctly. On the other hand, if we try to spellcheck poilsh dictionary we get

$ aspell -a # ©2007 dsplabs.com.au
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3)
poilsh dictionary
& poilsh 29 0: Polish, polish, palish, plush, polisher, plash, …
*

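where the first word, poilsh, is flagged as a spelling mistake by a leading ampersand and followed by a list of suggested alternative spellings, while the second word, dictionary, is marked with a star since it is spelled correctly. The two numbers after the flagged word are the number of suggestions and the offset of the word within the input line.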
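
The pipe-mode output is easy to post-process with standard shell tools. The following is a minimal sketch, not part of aspell itself: summarise_pipe_output is a hypothetical helper name, and it assumes the ispell-compatible format shown above, in which lines starting with an ampersand carry suggestions and lines starting with a hash mark words for which no suggestions were found. The sample input is copied from the session above, with the truncated suggestion list shortened.

```shell
#!/bin/sh
# Hypothetical helper: condense `aspell -a` output to "word -> first suggestion".
# Lines starting with "*" (correct words) and the "@(#)" banner are ignored.
summarise_pipe_output() {
    awk '
        /^&/ {                       # format: & word count offset: s1, s2, ...
            word = $2
            sub(/^[^:]*: /, "")      # drop everything up to the suggestion list
            split($0, s, ", ")
            print word " -> " s[1]
        }
        /^#/ { print $2 " -> (no suggestions)" }   # format: # word offset
    '
}

# Demo on canned output, so it runs even without aspell installed.
summarise_pipe_output <<'EOF'
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3)
& poilsh 29 0: Polish, polish, palish, plush, polisher, plash
*
EOF
# prints: poilsh -> Polish
```

With aspell installed, the same helper can be placed at the end of a pipeline, for example: echo "poilsh dictionary" | aspell -a | summarise_pipe_output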

It is also possible to spellcheck entire documents with aspell as follows.

$ aspell -c file_to_spellcheck.txt

When a spelling mistake is encountered, aspell goes into an interactive mode allowing the user to make spelling corrections.
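
When no interaction is wanted, for instance in a script, aspell also has a list command that reads text from standard input and prints the misspelled words. The sketch below wraps it in a small function; misspelled_words is a hypothetical name chosen for illustration, and the file name is the placeholder used above.

```shell
#!/bin/sh
# Hypothetical helper: print each misspelled word of a file once, sorted.
# Relies on `aspell list`, which reads text on stdin and writes one
# misspelled word per line.
misspelled_words() {
    aspell list < "$1" | sort -u
}

# Usage (requires aspell and a dictionary for the default language):
#   misspelled_words file_to_spellcheck.txt
```

Sorting with -u removes duplicates, so a word that is misspelled many times is reported only once.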

Foreign languages
How about spellchecking in a language other than the default one? Well, let's try to spellcheck the Polish translation of Polish dictionary, i.e. polski slownik.

$ aspell -a # ©2007 dsplabs.com.au
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3)
polski slownik
& polski 29 0: pol ski, pol-ski, Pulaski, Polanski, Polk, pols, …
& slownik 12 7: Salonika, slink, slunk, Slinky, slinky, Sloan, …

We got spelling errors since the default dictionary is an English one. It is possible to specify a different dictionary using the -d option. However, an error will occur if that dictionary has not been installed or is unavailable:

$ aspell -dpl -a # ©2007 dsplabs.com.au
Error: The file "/usr/lib/aspell-0.60/pl" can not be opened for reading.
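
Before passing -d, it can help to ask aspell which dictionaries it knows about. The sketch below assumes the dump dicts command, which prints one installed dictionary name per line; has_dict is a hypothetical helper name, not an aspell feature.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named dictionary is installed.
# `aspell dump dicts` prints one dictionary name per line (en, en_GB, pl, ...).
has_dict() {
    aspell dump dicts | grep -qx -- "$1"
}

# Usage (requires aspell):
#   if has_dict pl; then aspell -d pl -a; else echo "pl dictionary not installed"; fi
```

The -x flag makes grep match whole lines only, so asking for en does not accidentally match en_GB.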

Installing foreign dictionaries

How do you install a foreign dictionary? Simple… let's walk through a dictionary installation using the Polish dictionary as an example. The Polish aspell dictionary can be obtained from http://www.kurnik.org/dictionary. After downloading, extract the archive as follows:

$ tar xjf alt-aspell6-pl-6.0_20070912-0.tar.bz2 # ©2007 dsplabs.com.au
$ ls -la # ©2007 dsplabs.com.au
total 42080
drwx------ 11 kamil kamil     4096 Sep 12 18:28 .
drwx------ 18 kamil kamil     4096 Aug 29 15:37 ..
drwx------  3 kamil kamil     4096 Sep 12 15:00 aspell6-pl-6.0_20070912-0
-rw-------  1 kamil kamil   557834 Sep 12 18:26 alt-aspell6-pl-6.0_20070912-0.tar.bz2

Enter the extracted directory:

$ cd aspell6-pl-6.0_20070912-0/ # ©2007 dsplabs.com.au
$ ls -la # ©2007 dsplabs.com.au
total 2452
drwx------  3 kamil kamil    4096 Sep 12 15:00 .
drwx------ 11 kamil kamil    4096 Sep 12 18:28 ..
drwx------  2 kamil kamil    4096 Sep 12 15:00 doc
-rwx------  1 kamil kamil    2477 Sep 12 15:00 configure
-rw-------  1 kamil kamil     149 Sep 12 15:00 Copyright
-rw-------  1 kamil kamil     340 Sep 12 15:00 info
-rw-------  1 kamil kamil    1745 Sep 12 15:00 Makefile.pre
-rw-------  1 kamil kamil  241083 Sep 12 15:00 pl_affix.dat
-rw-------  1 kamil kamil 2214173 Sep 12 15:00 pl.cwl
-rw-------  1 kamil kamil      71 Sep 12 15:00 pl.dat
-rw-------  1 kamil kamil      70 Sep 12 15:00 pl.multi
-rw-------  1 kamil kamil      72 Sep 12 15:00 polish.alias
-rw-------  1 kamil kamil    2539 Sep 12 15:00 README

Run configure:

$ ./configure # ©2007 dsplabs.com.au
Finding Dictionary file location … /usr/lib/aspell-0.60
Finding Data file location … /usr/lib/aspell-0.60

After configure finds where aspell libs reside on your system, run make install:

$ sudo make install # ©2007 dsplabs.com.au
/usr/bin/prezip-bin -d < pl.cwl | /usr/bin/aspell  --lang=pl create master ./pl.rws
Warning: Removing inapplicable affix 'm' from word ka.
Warning: Removing inapplicable affix 'y' from word nied?wiedzi.
Warning: Removing inapplicable affix 'B' from word owa?.
Warning: Removing inapplicable affix 'G' from word owa?.
Warning: Removing inapplicable affix 'J' from word owa?.
Warning: Removing inapplicable affix 'E' from word przepiszcze?.
Warning: Removing inapplicable affix 'Y' from word ty.
mkdir -p /usr/lib/aspell-0.60/
cp pl.rws pl.multi polish.alias /usr/lib/aspell-0.60/
cd /usr/lib/aspell-0.60/ && chmod 644 pl.rws pl.multi polish.alias
mkdir -p /usr/lib/aspell-0.60/
cp pl.dat pl_affix.dat /usr/lib/aspell-0.60/
cd /usr/lib/aspell-0.60/ && chmod 644 pl.dat pl_affix.dat

Have a look in the aspell libs directory. You should now have the Polish dictionary files (amongst others) in there.

$ ls -la /usr/lib/aspell-0.60/ # ©2007 dsplabs.com.au
total 12660
drwxr-xr-x   2 root root    4096 Sep 12 18:29 .
drwxr-xr-x 207 root root  139264 Jun 24 14:02 ..
…
-rw-r--r--   1 root root      90 Jul 12  2006 en_GB-ise.multi
-rw-r--r--   1 root root     110 Jul 12  2006 en_GB-ise-w_accents.multi
-rw-r--r--   1 root root   93056 Jul 12  2006 en_GB-ise-w_accents-only.rws
-rw-r--r--   1 root root     111 Jul 12  2006 en_GB-ise-wo_accents.multi
-rw-r--r--   1 root root   93056 Jul 12  2006 en_GB-ise-wo_accents-only.rws
…
-rw-r--r--   1 root root  241083 Sep 12 18:29 pl_affix.dat
-rw-r--r--   1 root root      71 Sep 12 18:29 pl.dat
-rw-r--r--   1 root root      70 Sep 12 18:29 pl.multi
-rw-r--r--   1 root root 6642032 Sep 12 18:29 pl.rws
-rw-r--r--   1 root root      72 Sep 12 18:29 polish.alias
…
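
Instead of scanning the listing by eye, the presence of the installed files can be checked with a few lines of shell. This is only a sketch: missing_dict_files is a hypothetical helper, the file names are the ones copied by make install in the output above, and the directory is whatever configure reported on your system.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given files are missing from a
# dictionary directory. Returns 0 if all are readable, 1 otherwise.
missing_dict_files() {
    dir=$1
    shift
    rc=0
    for f in "$@"; do
        if [ ! -r "$dir/$f" ]; then
            echo "missing: $dir/$f"
            rc=1
        fi
    done
    return "$rc"
}

# Usage, with the directory and file names from the install output above:
#   missing_dict_files /usr/lib/aspell-0.60 pl.rws pl.multi polish.alias pl.dat pl_affix.dat
```

The function prints nothing when every file is in place, so it can be used directly in an if statement.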

Using foreign dictionaries

You can now use the Polish dictionary as follows:

$ aspell -dpl -a # ©2007 dsplabs.com.au
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3)
polski slownik
*
& slownik 53 7: słownik, ?liwnik, sokownik, Salonik, …

After correcting slownik to słownik, there are no more mistakes.

$ aspell -dpl -a # ©2007 dsplabs.com.au
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3)
polski słownik
*
*

Other languages can be specified in a similar fashion, e.g. for a French dictionary we would use -dfr, assuming that the French aspell dictionary files are installed. If they are not, locate their installation sources on the Internet and follow the above procedure. It should be very similar.

Also, note that an on-line Polish-English dictionary can be found at dict.pl.








3 Responses to “aspell — spelling with foreign dictionaries”

  1. anne Says:

    Hello,

    Is it possible to use aspell with dictionaries in different languages? I have tried to use the add-extra-dicts option but it complains that the default language is, say, English. I can use either a French, a Dutch, an English or whatever other dict, but not all at once, which is what I need.

    Thanks for your help

    Best wishes

    anne

  2. Kamil Says:

    Hi Anne, thanks for this great question! Unfortunately, as far as I can ascertain, aspell cannot be used with multiple dictionaries in different languages, whilst multiple dictionaries in the same language are quite ok.

    Note that running aspell --extra-dicts=pl -a produces the following error:

    Error: Expected language "en" but got "pl".

    Worry not, however! The beauty and the power of Linux come from its shell tools… and my mate James, from awklores, is writing a simple awk script for you right now. The script will list all words in an input file that do not occur in any of the specified dictionaries, along with the spelling suggestions.

    I presume you use aspell like so:
    cat spellme.txt |aspell -a
    or equivalently:
    aspell -a < spellme.txt
    or by pasting text into the terminal after running:
    aspell -a

    On the other hand, emulation of aspell's interactive mode for multiple languages would be a 'bit' more tricky, so we'll leave that one for another time.

  3. Kamil Says:

    Here is a very preliminary version of the multi-dictionary aspell bash script: mspell as well as a test file I used: spellme.txt. Here is the listing for both:

    cat mspell spellme.txt

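    #!/bin/bash
    # mspell: for each input line, report the words flagged by both the
    # English and the Polish dictionary, with the suggestions from each.

    while IFS= read -r line; do

        # Keep only the "&" lines: misspelled words that have suggestions.
        ENGLISH=$(printf '%s\n' "$line" | aspell -a --lang=en | awk '/^&/')
        POLISH=$(printf '%s\n' "$line" | aspell -a --lang=pl | awk '/^&/')

        awk -v en="$ENGLISH" -v pl="$POLISH" \
        'BEGIN{
            n_en = split(en, en_lines, "\n");
            n_pl = split(pl, pl_lines, "\n");
            for(e = 1; e <= n_en; e++){
                split(en_lines[e], en_words, " ");
                for(p = 1; p <= n_pl; p++){
                    split(pl_lines[p], pl_words, " ");
                    # Field 2 is the flagged word; report it only when
                    # both dictionaries flagged the same word.
                    if(en_words[2]==pl_words[2])
                        printf "\nEnglish Suggestions: %s\nPolish Suggestions:  %s\n", en_lines[e], pl_lines[p];
                }
            }
        }'

    done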
    
    be or nnot to bee ttto blah yeah
    be or mehhhh to bee ttto blah yeah
    be or nnot to ttto blah yeah
    

    Here are ways you can run mspell:

    ./mspell < spellme.txt

    English Suggestions: & nnot 3 6: not, knot, snot
    Polish Suggestions:  & nnot 5 6: nnoto, NOT, not, knot, Note?
    
    English Suggestions: & ttto 3 18: Otto, Tito, Toto
    Polish Suggestions:  & ttto 7 18: Otto, TTTM, Tito, tato, toto, ttta, tuto
    
    English Suggestions: & ttto 3 20: Otto, Tito, Toto
    Polish Suggestions:  & ttto 7 20: Otto, TTTM, Tito, tato, toto, ttta, tuto
    
    English Suggestions: & nnot 3 6: not, knot, snot
    Polish Suggestions:  & nnot 5 6: nnoto, NOT, not, knot, Note?
    
    English Suggestions: & ttto 3 14: Otto, Tito, Toto
    Polish Suggestions:  & ttto 7 14: Otto, TTTM, Tito, tato, toto, ttta, tuto
    

    or

    ./mspell

    be or nnot to bee ttto blah yeah
    be or mehhhh to bee ttto blah yeah
    be or nnot to ttto blah yeah
    
    English Suggestions: & nnot 3 6: not, knot, snot
    Polish Suggestions:  & nnot 5 6: nnoto, NOT, not, knot, Note?
    
    English Suggestions: & ttto 3 18: Otto, Tito, Toto
    Polish Suggestions:  & ttto 7 18: Otto, TTTM, Tito, tato, toto, ttta, tuto
    
    English Suggestions: & ttto 3 20: Otto, Tito, Toto
    Polish Suggestions:  & ttto 7 20: Otto, TTTM, Tito, tato, toto, ttta, tuto
    
    English Suggestions: & nnot 3 6: not, knot, snot
    Polish Suggestions:  & nnot 5 6: nnoto, NOT, not, knot, Note?
    
    English Suggestions: & ttto 3 14: Otto, Tito, Toto
    Polish Suggestions:  & ttto 7 14: Otto, TTTM, Tito, tato, toto, ttta, tuto
    

    Please check for updates in the future as we'll polish this code over time.
