Collation – sorting like clockwork

July 25, 2014

Sorting character strings alphabetically makes it quick to find the element you are looking for in a list. This matters especially in software solutions that handle large amounts of data.

To do this, the elements must be placed in the correct sort order. But which sort order is the right one?

In this article, I would like to answer this question and show you a way to collate strings within conzept 16.

Alphabets

Depending on the country and language, different rules define the correct sequence for alphabetical sorting. Most Western languages are based on the Latin alphabet with its 26 letters A to Z. Many of them add further letters, often with diacritical marks, such as the umlauts.

The German alphabet also contains the umlauts Ä, Ö and Ü as well as the ß (Eszett). These are sorted alphabetically according to their base characters, i.e. Ä, Ö and Ü with A, O and U and ß with S.

Character sets

In software, these different alphabets are combined in character sets. For example, the character set “Windows-1252” (Windows code page 1252) contains the letters required for most Western languages. In addition to German, this also includes French, Portuguese and Swedish, for example.

The Unicode character set, on the other hand, contains the letters of most living languages.

Sorting sequence

Within a character set, the letters are not stored in alphabetical order. For correct alphabetical sorting, the characters of the character set must therefore be rearranged into a sequence that depends on the country and language.

A selection of collation sequences used in practice can be found at collation-charts.org. The site lists tables for various software products and languages that describe the sort sequence used, for example the one for the German version of Windows Vista.

conzept 16

In conzept 16, sorting can be implemented using a table key or the container elements CteTree and CteNode, for example. However, character strings are then sorted by the position of their characters in the character set, not according to a country- or language-specific alphabet:

Franca
Franz
François (ç sorts after z)
Jakob
Julia
Jákup (á sorts after u)
Raul
Ray
Raúl (ú sorts after y)
Zoe
Zofia
Zoë (ë sorts after f)
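The effect is not specific to conzept 16. As a minimal illustration in Python (not conzept 16 code), sorting the same names by their Windows-1252 byte values reproduces the order above. The variable names are chosen for this sketch only.

```python
# Names from the list above, deliberately shuffled.
names = [
    "Zofia", "Raúl", "Franz", "Jákup", "Franca", "Zoë",
    "Julia", "Ray", "François", "Jakob", "Raul", "Zoe",
]

# Comparing the Windows-1252 bytes mimics sorting by position in the character set.
by_code_point = sorted(names, key=lambda s: s.encode("cp1252"))

for name in by_code_point:
    print(name)
```

The accented letters have byte values above 0x7F, so they all land behind the plain letters a to z.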

This would be a possible correct order for the German language:

Franca
François (ç sorts with c)
Franz
Jakob
Jákup (á sorts with a)
Julia
Raul
Raúl (ú sorts with u)
Ray
Zoe
Zoë (ë sorts with e)
Zofia

To achieve this sorting sequence, the character string must be converted into a sortable form: each letter is replaced by its position in the sorted alphabet.

The Str.Collate() function – which you can download at the end of the article – does exactly that.
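The principle can be sketched in a few lines of Python. This is not the code of Str.Collate(); the alphabet string, the names SORTED_ALPHABET, POSITION and collate, and the treatment of unknown characters are assumptions made for illustration. Each accented letter is simply placed directly after its base letter.

```python
# Hypothetical sorted alphabet: every accented letter follows its base letter.
SORTED_ALPHABET = (
    " AaÁáÀàÂâÄäBbCcÇçDdEeÉéÈèÊêËëFfGgHhIiÍíÎîÏïJjKkLlMmNnÑñ"
    "OoÓóÔôÖöPpQqRrSsßTtUuÚúÙùÛûÜüVvWwXxYyZz"
)

# Position of each character in the sorted alphabet.
POSITION = {ch: index for index, ch in enumerate(SORTED_ALPHABET)}


def collate(text):
    """Return a sortable key: each character replaced by its alphabet position."""
    # Characters missing from the alphabet are placed after all listed ones.
    return tuple(
        POSITION.get(ch, len(SORTED_ALPHABET) + ord(ch)) for ch in text
    )


names = [
    "Franca", "Franz", "François", "Jakob", "Julia", "Jákup",
    "Raul", "Ray", "Raúl", "Zoe", "Zofia", "Zoë",
]

german_order = sorted(names, key=collate)
for name in german_order:
    print(name)
```

With this assumed alphabet, the result matches the German order listed above. A different language only needs a different alphabet string.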

Example

// Load sort sequence for Germany
Str.CollationLoad(_Str.Collation_deDE);

tCteTree # CteOpen(_CteTree);

tCteTree->CteInsertItem(Str.Collate('Franca'), 0, 'Franca');
tCteTree->CteInsertItem(Str.Collate('François'), 0, 'François');
tCteTree->CteInsertItem(Str.Collate('Franz'), 0, 'Franz');

...

// Unload sort sequence
Str.CollationUnload();

The Str.CollationLoad() function loads the alphabet passed to it. It generates a collation sequence from all 255 different characters of the conzept 16 internal character set, which contains all characters of Windows code page 1252. You can easily extend this function with additional sorting sequences. The Str.Collate() function uses the loaded sequence to convert a character string into a sortable form. The Str.CollationUnload() function releases the loaded sort alphabet again.
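How a collation sequence covering the whole character set might be built can be sketched in Python as follows. This is an illustration under assumptions, not the content of the downloadable procedure: the alphabet, the names build_table and collate, and the decision to sort all unlisted characters before the letters are hypothetical, and for simplicity the table covers all 256 byte values.

```python
CODEPAGE = "cp1252"

# Hypothetical German sort alphabet; accented letters follow their base letter.
ALPHABET_DE = (
    "AaÁáÀàÂâÄäBbCcÇçDdEeÉéÈèÊêËëFfGgHhIiÍíÎîÏïJjKkLlMmNnÑñ"
    "OoÓóÔôÖöPpQqRrSsßTtUuÚúÙùÛûÜüVvWwXxYyZz"
)


def build_table(alphabet):
    """Build a 256-entry table mapping each code-page byte to its sort weight."""
    listed = alphabet.encode(CODEPAGE)
    # Bytes not named in the alphabet keep their relative order and sort first.
    unlisted = bytes(b for b in range(256) if b not in listed)
    table = bytearray(256)
    for weight, byte in enumerate(unlisted + listed):
        table[byte] = weight
    return bytes(table)


def collate(text, table):
    """Translate text into bytes whose byte order is the desired sort order."""
    return text.encode(CODEPAGE).translate(table)


table_de = build_table(ALPHABET_DE)  # corresponds to loading a sort sequence
names = ["Franz", "François", "Franca", "Zofia", "Zoë", "Zoe"]
sorted_names = sorted(names, key=lambda s: collate(s, table_de))
print(sorted_names)
```

Because the result is a plain byte string, it can be stored and compared like any other key value, which is what makes the approach usable with key fields and container elements.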

Conclusion

You can use this function to sort character strings in conzept 16 according to any alphabet. You can save the resulting sort values in a key field of a table or a container element, for example.

Although this option is not a native solution, it can be extended and adapted to your own needs.

Download

Click here to download: SysStrCollate.prc (5.28 KB)

You must be logged in to download the file.
